Snorkel AI scores $35M Series B to automate data labeling in machine learning

Image Credits: Anna Bassi / EyeEm / Getty Images

One of the more tedious aspects of machine learning is providing a set of labels to teach the machine learning model what it needs to know. Snorkel AI wants to make it easier for subject matter experts to apply those labels programmatically, and today the startup announced a $35 million Series B.

It also announced a new tool called Application Studio that provides a way to build common machine learning applications using templates and predefined components.

Lightspeed Venture Partners led the round with participation from previous investors Greylock, GV, In-Q-Tel and Nepenthe Capital. New investors Walden and BlackRock also joined in. The startup reports that it has now raised $50 million.

Company co-founder and CEO Alex Ratner says that data labeling remains a huge challenge and roadblock to moving machine learning and artificial intelligence forward in many industries because it is costly and labor-intensive, and because subject experts struggle to carve out the time to do it.

“The not so hidden secret about AI today is that in spite of all the technological and tooling advancements, roughly 80 to 90% of the cost and time for an average AI project goes into just manually labeling and collecting and relabeling this training data,” he said.

He says his company has developed a way to simplify this process so that subject experts can add labels programmatically, an approach he says cuts the time and effort required to apply labels from months to hours or days, depending on the complexity of the data.

As the company has developed this methodology, customers have been asking for help with the next step of the machine learning process, which is taking that training data and the model and building an application. That’s where Application Studio comes in. The application could be a contract classifier at a bank or a network anomaly detector at a telco, and the tool helps companies take that next step after data labeling.

“It’s not just about how you programmatically label the data, it’s also about the models, the preprocessors, the post processors, and so we’ve made this now accessible in a kind of templated and visual no-code interface,” he said.

The company’s products are based on research that began at the Stanford AI Lab in 2015. The founders spent four years in the research phase before launching Snorkel in 2019. Today, the startup has 40 employees. Ratner recognizes the issues that the technology industry has had from a diversity perspective and says he has made a conscious effort to build a diverse and inclusive company.

“What I can say is that we tried to prioritize it at a company level, the full team level and at a board level from day one, and to also put action behind that. So we’ve been working with external firms for internal training and audits and strategy around DEI, and we’ve made pipeline diversity a non-negotiable requirement of any of our contracts with recruiting firms,” he said.

Ratner also recognizes that automation can hard-code bias into machine learning models, and he’s hopeful that simplifying the labeling process will make it much easier to detect bias when it happens.

“If you start with a dozen or two dozen of what we call labeling functions in Snorkel, you still need to be vigilant and proactive about trying to detect bias, but it’s easier to audit what taught your model to change it by just going back and looking at a couple of hundred lines of code.”
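To make the idea of a labeling function concrete, here is a minimal sketch in plain Python. It is an illustration only, not Snorkel's API: the function names, keywords and label values are hypothetical, and a simple majority vote stands in for whatever method the company actually uses to combine the functions' outputs.

```python
from collections import Counter

# Hypothetical label values for a contract classifier; ABSTAIN means "no opinion".
ABSTAIN, OTHER, CONTRACT = -1, 0, 1


def lf_mentions_agreement(text):
    # Heuristic: contracts usually call themselves an "agreement".
    return CONTRACT if "agreement" in text.lower() else ABSTAIN


def lf_has_party_clause(text):
    # Heuristic: "hereinafter" is typical legal boilerplate.
    return CONTRACT if "hereinafter" in text.lower() else ABSTAIN


def lf_looks_like_newsletter(text):
    # Heuristic: marketing email rather than a contract.
    return OTHER if "unsubscribe" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [
    lf_mentions_agreement,
    lf_has_party_clause,
    lf_looks_like_newsletter,
]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # the functions disagree evenly
    return ranked[0][0]


if __name__ == "__main__":
    docs = [
        "This Agreement is made between Acme, hereinafter the Lender, and Bob.",
        "Click unsubscribe to stop receiving these emails.",
        "Lunch menu for Tuesday.",
    ]
    for doc in docs:
        print(weak_label(doc), "|", doc)
```

Each function is a short, readable rule, which is the property Ratner points to: when a model behaves badly, the rules that produced its training labels can be reviewed and edited directly.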
