Snorkel AI scores $35M Series B to automate data labeling in machine learning

One of the more tedious aspects of machine learning is providing a set of labels to teach the machine learning model what it needs to know. Snorkel AI wants to make it easier for subject matter experts to apply those labels programmatically, and today the startup announced a $35 million Series B.

It also announced a new tool called Application Studio that provides a way to build common machine learning applications using templates and predefined components.

Lightspeed Venture Partners led the round with participation from previous investors Greylock, GV, In-Q-Tel and Nepenthe Capital. New investors Walden and BlackRock also joined in. The startup reports that it has now raised $50 million.

Company co-founder and CEO Alex Ratner says that data labeling remains a major roadblock to advancing machine learning and artificial intelligence in many industries because it is costly and labor-intensive, and subject matter experts struggle to carve out the time to do it.

“The not so hidden secret about AI today is that in spite of all the technological and tooling advancements, roughly 80 to 90% of the cost and time for an average AI project goes into just manually labeling and collecting and relabeling this training data,” he said.

He says that his company has developed a way for subject matter experts to add labels programmatically, an approach he says cuts the time and effort required to label data from months to hours or days, depending on the complexity of the data.

As the company has developed this methodology, customers have been asking for help with the next step of the machine learning process: taking that training data and the model and building an application. That’s where Application Studio comes in. It helps companies take that next step after data labeling, whether the application is a contract classifier at a bank or a network anomaly detector at a telco.

“It’s not just about how you programmatically label the data, it’s also about the models, the preprocessors, the post processors, and so we’ve made this now accessible in a kind of templated and visual no-code interface,” he said.

The company’s products are based on research that began at the Stanford AI Lab in 2015. The founders spent four years in the research phase before launching Snorkel in 2019. Today, the startup has 40 employees. Ratner recognizes the issues that the technology industry has had from a diversity perspective and says he has made a conscious effort to build a diverse and inclusive company.

“What I can say is that we tried to prioritize it at a company level, the full team level and at a board level from day one, and to also put action behind that. So we’ve been working with external firms for internal training and audits and strategy around DEI, and we’ve made pipeline diversity a non-negotiable requirement of any of our contracts with recruiting firms,” he said.

Ratner also recognizes that automation can hard-code bias into machine learning models, and he’s hopeful that simplifying the labeling process will make it much easier to detect bias when it occurs.

“If you start with a dozen or two dozen of what we call labeling functions in Snorkel, you still need to be vigilant and proactive about trying to detect bias, but it’s easier to audit what taught your model to change it by just going back and looking at a couple of hundred lines of code.”
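To make the idea of labeling functions concrete, here is a minimal sketch in plain Python, loosely modeled on the bank contract classifier mentioned above. It does not use Snorkel's software. The function names, keyword rules and sample documents are all hypothetical, and the simple majority vote is a stand-in chosen for illustration, not a description of how Snorkel combines labels.

```python
from collections import Counter

# Hypothetical label values for a toy "is this a contract?" task.
ABSTAIN = None
CONTRACT = "contract"
OTHER = "other"


def lf_party_language(text):
    """Vote CONTRACT when the text uses typical contract phrasing."""
    return CONTRACT if "hereinafter" in text.lower() else ABSTAIN


def lf_signature_block(text):
    """Vote CONTRACT when the text mentions a signature."""
    return CONTRACT if "signed by" in text.lower() else ABSTAIN


def lf_newsletter(text):
    """Vote OTHER when the text looks like a marketing email."""
    return OTHER if "unsubscribe" in text.lower() else ABSTAIN


# The full "training signal" is this short, auditable list of rules.
LABELING_FUNCTIONS = [lf_party_language, lf_signature_block, lf_newsletter]


def label(text):
    """Combine the votes by simple majority.

    Returns ABSTAIN when no function votes or when the top two labels tie.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


# Made-up sample documents, for demonstration only.
docs = [
    "This agreement, hereinafter the Contract, is signed by both parties.",
    "Click unsubscribe to stop receiving this newsletter.",
    "Lunch at noon?",
]

for doc in docs:
    print(label(doc), "<-", doc)
```

Because each rule is an ordinary function, auditing what taught the model means reading this short list of rules and editing the one that introduces a skew, which is the kind of review Ratner describes.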
