Heartex raises $25M for its AI-focused, open source data labeling platform

Image Credits: Bryce Durbin / TechCrunch

Heartex, a startup that bills itself as an “open source” platform for data labeling, today announced that it landed $25 million in a Series A funding round led by Redpoint Ventures. Unusual Ventures, Bow Capital and Swift Ventures also participated, bringing Heartex’s total capital raised to $30 million.

Co-founder and CEO Michael Malyuk said that the new money will be put toward improving Heartex’s product and expanding the size of the company’s workforce from 28 people to 68 by the end of the year.

“Coming from engineering and machine learning backgrounds, [Heartex’s founding team] knew what value machine learning and AI can bring to the organization,” Malyuk told TechCrunch via email. “At the time, we all worked at different companies and in different industries yet shared the same struggle with model accuracy due to poor-quality training data. We agreed that the only viable solution was to have internal teams with domain expertise be responsible for annotating and curating training data. Who can provide the best results other than your own experts?”

Software developers Malyuk, Maxim Tkachenko and Nikolay Lyubimov co-founded Heartex in 2019. Lyubimov was a senior engineer at Huawei before moving to Yandex, where he worked as a backend developer on speech technologies and dialogue systems.

Heartex’s dashboard. Image Credits: Heartex

The ties to Yandex, a company sometimes referred to as the “Google of Russia,” might unnerve some, particularly in light of accusations by the European Union that Yandex’s news division played a sizable role in spreading Kremlin propaganda. Heartex has an office in San Francisco, California, but several of the company’s engineers are based in the former Soviet republic of Georgia.

When asked, Heartex says that it doesn’t collect any customer data and open sources the core of its labeling platform for inspection. “We’ve built a data architecture that keeps data private on the customer’s storage, separating the data plane and control plane,” Malyuk added. “Regarding the team and their locations, we’re a very international team with no current members based in Russia.”

Setting aside its geopolitical affiliations, Heartex aims to tackle what Malyuk sees as a major hurdle in the enterprise: extracting value from data by leveraging AI. There’s a growing wave of businesses aiming to become “data-centric” — Gartner recently reported that enterprise use of AI grew a whopping 270% over the past several years. But many organizations are struggling to use AI to its fullest.

“Having reached a point of diminishing returns in algorithm-specific development, enterprises are investing in perfecting data labeling as part of their strategic, data-centric initiatives,” Malyuk said. “This is a progression from earlier development practices that focused almost exclusively on algorithm development and tuning.”

If, as Malyuk asserts, data labeling is receiving increased attention from companies pursuing AI, it’s because labeling is a core part of the AI development process. Many AI systems “learn” to make sense of images, videos, text and audio from examples that have been labeled by teams of human annotators. The labels enable the systems to extrapolate the relationships between the examples (e.g. the link between the caption “kitchen sink” and a photo of a kitchen sink) to data the systems haven’t seen before (e.g. photos of kitchen sinks that weren’t included in the data used to “teach” the model).

The trouble is, not all labels are created equal. Labeling data like legal contracts, medical images and scientific literature requires domain expertise that not just any annotator has. And — being human — annotators make mistakes. In an MIT analysis of popular AI datasets, researchers found mislabeled data like one breed of dog confused for another and an Ariana Grande high note categorized as a whistle.

Image Credits: Heartex

Malyuk makes no claim that Heartex completely solves these issues. But in an interview, he explained that the platform is designed to support labeling workflows for different AI use cases, with features that touch on data quality management, reporting and analytics. For example, data engineers using Heartex can see the names and email addresses of annotators and data reviewers, which are tied to labels that they’ve contributed or audited. This helps to monitor label quality and — ideally — to fix problems before they impact training data.

“The angle for the C-suite is pretty simple. It’s all about improving production AI model accuracy in service of achieving the project’s business objective,” Malyuk said. “We’re finding that most C-suite managers with AI, machine learning, and/or data science responsibilities have confirmed through experience that, with more strategic investments in people, processes, technology, and data, AI can deliver extraordinary value to the business across a multitude of diverse use cases. We also see that success has a snowball effect. Teams that find success early are able to create additional high-value models more quickly building not just on their early learnings but also on the additional data generated from using the production models.”

In the data labeling toolset arena, Heartex competes with startups including AIMMO, Labelbox, Scale AI and Snorkel AI, as well as Google and Amazon (which offer data labeling products through Google Cloud and SageMaker, respectively). But Malyuk believes that Heartex’s focus on software as opposed to services sets it apart from the rest. Unlike many of its competitors, the startup doesn’t sell labeling services through its platform.

“As we’ve built a truly horizontal solution, our customers come from a variety of industries. We have small startups as customers, as well as several Fortune 100 companies. [Our platform] has been adopted by over 100,000 data scientists globally,” Malyuk said, while declining to reveal revenue numbers. “[Our customers] are establishing internal data annotation teams and buying [our product] because their production AI models aren’t performing well and recognize that poor training data quality is the primary cause.”
