
Tonic is betting that synthetic data is the new big data to solve scalability and security


Image Credits: Vertigo3d / Getty Images

Big data is a sham. For years now, we have been told that every company should save every last morsel of digital exhaust in some sort of database, lest management lose some competitive intelligence against … a competitor, or something.

There is just one problem with big data though: It’s honking huge.

Processing petabytes of data to generate business insights is expensive and time-consuming. Worse, all that data hanging around paints a big, bright red target on the back of the company for every hacker group in the world. Big data is expensive to maintain, expensive to protect and expensive to keep private. And the payoff might not be all that much in the end: oftentimes, well-curated and carefully chosen data sets can provide faster and better insight than endless quantities of raw data.

What should a company do? Well, they need a Tonic to ameliorate their big data sins.

Tonic is a “synthetic data” platform that transforms raw data into more manageable and private data sets usable by software engineers and business analysts. Along the way, Tonic’s algorithms de-identify the original data and create statistically identical but synthetic data sets, which means that personal information isn’t shared insecurely.

For instance, an online shopping platform will have transaction history on its customers and what they purchased. Sharing that data with every engineer and analyst in the company is dangerous, since that purchase history could have personally identifying details to which no one without a need-to-know should have access. Tonic could take that original payments data and transform it into a new, smaller data set with exactly the same statistical properties, but not tied to original customers. That way, an engineer could test their app or an analyst could test their marketing campaign, all without triggering concerns about privacy.
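To make that idea concrete, here is a toy sketch in Python. It is not Tonic's algorithm, which the company has not detailed here; the function name, the field names and the sample records are all hypothetical, and the approach (drawing amounts from a normal distribution fitted to the originals, and sampling categories at their observed frequencies) only approximates the original statistics rather than matching them exactly.

```python
import random
import statistics


def synthesize_purchases(rows, sample_size, seed=0):
    """Build a smaller, de-identified data set from `rows` (toy example).

    Each row is a dict with the hypothetical keys
    "customer_id", "category" and "amount".
    """
    rng = random.Random(seed)
    amounts = [row["amount"] for row in rows]
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    categories = [row["category"] for row in rows]

    synthetic = []
    for _ in range(sample_size):
        synthetic.append({
            # Random token with no link back to any real customer.
            "customer_id": f"synthetic-{rng.getrandbits(32):08x}",
            # Picking from the full list keeps category frequencies
            # roughly in line with the original data.
            "category": rng.choice(categories),
            # Amounts follow a normal fit to the originals, floored at zero.
            "amount": round(max(0.0, rng.gauss(mean, stdev)), 2),
        })
    return synthetic


# Hypothetical purchase history, invented for illustration.
original = [
    {"customer_id": "alice@example.com", "category": "books", "amount": 12.50},
    {"customer_id": "bob@example.com", "category": "games", "amount": 59.99},
    {"customer_id": "carol@example.com", "category": "books", "amount": 8.25},
    {"customer_id": "dave@example.com", "category": "garden", "amount": 31.00},
    {"customer_id": "erin@example.com", "category": "games", "amount": 44.10},
]

sample = synthesize_purchases(original, sample_size=3)
```

An engineer could point a test suite at `sample` without ever seeing a real customer identifier. A production system would also have to preserve relationships across tables and guard against re-identification, which this sketch does not attempt.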

Synthetic data and other ways to handle the privacy of large data sets have garnered massive attention from investors in recent months. We reported last week on Skyflow, which raised a round to use polymorphic encryption to ensure that employees only have access to the data they need and are blocked from accessing the rest. BigID takes a more overarching view of just tracking what data is where and who should have access to it (i.e. data governance) based on local privacy laws.

Tonic’s approach has the benefit of helping solve not just privacy issues, but also scalability challenges as data sets get larger and larger in size. That combination has attracted the attention of investors: This morning, the company announced that it has raised $8 million in a Series A led by Glenn Solomon and Oren Yunger of GGV, the latter of whom will join the company’s board.

The company was founded in 2018 by a quartet of founders: CEO Ian Coe worked with COO Karl Hanson (the two first met in middle school) and CTO Andrew Colombi while all three were at Palantir, and Coe also previously worked with the company’s head of engineering, Adam Kamor, at Tableau. That training at some of the largest and most successful data infrastructure companies in the Valley forms part of the product DNA for Tonic.

Tonic’s team. Photo via Tonic.

Coe explained that Tonic is designed to prevent some of the most obvious security flaws that arise in modern software engineering. In addition to saving data pipelining time for engineering teams, Tonic “also means that they’re not worried about sensitive data going from production environments to lower environments that are always less secure than your production systems.”

He said that the idea for what would become Tonic originated while troubleshooting problems at a Palantir banking client. They needed data to solve a problem, but that data was super sensitive, and so the team ended up using synthetic data to bridge the difference. Coe wants to expand the utility of synthetic data to more people in a more rigorous way, particularly given the legal changes these days. “I think regulatory pressure is really pushing teams to change their practices” around data, he noted.

The key to Tonic’s technology is its subsetter, which evaluates raw data and starts to statistically define the relationships between all the records. Some of that analysis is automated depending on the data sources, and when it can’t be automated, Tonic’s UI can help a data scientist onboard data sets and define those relationships manually. In the end, Tonic generates synthetic data sets usable by everyone inside a company who consumes that data.

With the new round of funding, Coe wants to continue doubling down on ease-of-use and onboarding and proselytizing the benefit of this model for his clients. “In a lot of ways, we’re creating a category, and that means that people have to understand and also get the value [and have] the early-adopter mindset,” he said.

In addition to lead investor GGV, Bloomberg Beta, Xfund, Heavybit and Silicon Valley CISO Investments participated in the round, as well as angels Assaf Wand and Anthony Goldbloom.
