Startups

AI-driven audio cloning startup gives voice to Einstein chatbot

Comment

You’ll need to prick up your ears for this slice of deepfakery emerging from the wacky world of synthesized media: A digital version of Albert Einstein — with a synthesized voice that’s been (re)created using AI voice cloning technology drawing on audio recordings of the famous scientist’s actual voice.

The startup behind the “uncanny valley” audio deepfake of Einstein is Aflorithmic (whose seed round we covered back in February).

While the video engine powering the 3D character rending components of this “digital human” version of Einstein is the work of another synthesized media company — UneeQ — which is hosting the interactive chatbot version on its website.

Alforithmic says the “digital Einstein” is intended as a showcase for what will soon be possible with conversational social commerce. Which is a fancy way of saying deepfakes that make like historical figures will probably be trying to sell you pizza soon enough, as industry watchers have presciently warned.

The startup also says it sees educational potential in bringing famous, long-deceased figures to interactive “life”.

Or, well, an artificial approximation of it — the “life” being purely virtual and Digital Einstein’s voice not being a pure tech-powered clone either; Alforithmic says it also worked with an actor to do voice modelling for the chatbot (because how else was it going to get Digital Einstein to be able to say words the real-deal would never even have dreamt of saying — like, er, “blockchain”?). So there’s a bit more than AI artifice going on here too.

“This is the next milestone in showcasing the technology to make conversational social commerce possible,” Alforithmic’s COO Matt Lehmann told us. “There are still more than one flaws to iron out as well as tech challenges to overcome but overall we think this is a good way to show where this is moving to.”

In a blog post discussing how it recreated Einstein’s voice the startup writes about progress it made on one challenging element associated with the chatbot version — saying it was able to shrink the response time between turning around input text from the computational knowledge engine to its API being able to render a voiced response, down from an initial 12 seconds to less than three (which it dubs “near-real-time”). But it’s still enough of a lag to ensure the bot can’t escape from being a bit tedious.

How to choose and deploy industry-specific AI models

Laws that protect people’s data and/or image, meanwhile, present a legal and/or ethical challenge to creating such “digital clones” of living humans — at least not without asking (and most likely paying) first.

Of course historical figures aren’t around to ask awkward questions about the ethics of their likeness being appropriated for selling stuff (if only the cloning technology itself, at this nascent stage). Though licensing rights may still apply — and do in fact in the case of Einstein.

“His rights lie with the Hebrew University of Jerusalem who is a partner in this project,” says Lehmann, before ‘fessing up to the artist licence element of the Einstein “voice cloning” performance. “In fact, we actually didn’t clone Einstein’s voice as such but found inspiration in original recordings as well as in movies. The voice actor who helped us modelling his voice is a huge admirer himself and his performance captivated the character Einstein very well, we thought.”

Turns out the truth about high-tech “lies” is itself a bit of a layer cake. But with deepfakes it’s not the sophistication of the technology that matters so much as the impact the content has — and that’s always going to depend upon context. And however well (or badly) the faking is done, how people respond to what they see and hear can shift the whole narrative — from a positive story (creative/educational synthesized media) to something deeply negative (alarming, misleading deepfakes).

Concern about the potential for deepfakes to become a tool for disinformation is rising, too, as the tech gets more sophisticated — helping to drive moves toward regulating AI in Europe, where the two main entities responsible for “Digital Einstein” are based.

Earlier this week a leaked draft of an incoming legislative proposal on pan-EU rules for “high risk” applications of artificial intelligence included some sections specifically targeted at deepfakes.

Under the plan, lawmakers look set to propose “harmonised transparency rules” for AI systems that are designed to interact with humans and those used to generate or manipulate image, audio or video content. So a future Digital Einstein chatbot (or sales pitch) is likely to need to unequivocally declare itself artificial before it starts faking it — to avoid the need for internet users to have to apply a virtual Voight-Kampff test.

For now, though, the erudite-sounding interactive Digital Einstein chatbot still has enough of a lag to give the game away. Its makers are also clearly labelling their creation in the hopes of selling their vision of AI-driven social commerce to other businesses.

amp

An optimistic view of deepfakes

More TechCrunch

Replacing Sutskever is Jakub Pachocki, OpenAI’s director of research.

Ilya Sutskever, OpenAI co-founder and longtime chief scientist, departs

Intuitive Machines made history when it became the first private company to land a spacecraft on the moon, so it makes sense to adapt that tech for Mars.

Intuitive Machines wants to help NASA return samples from Mars

As Google revamps itself for the AI era, offering AI overviews within its search results, the company is introducing a new way to filter for just text-based links. With the…

Google adds ‘Web’ search filter for showing old-school text links as AI rolls out

Blue Origin’s New Shepard rocket will take a crew to suborbital space for the first time in nearly two years later this month, the company announced on Tuesday.  The NS-25…

Blue Origin to resume crewed New Shepard launches on May 19

This will enable developers to use the on-device model to power their own AI features.

Google is building its Gemini Nano AI model into Chrome on the desktop

It ran 110 minutes, but Google managed to reference AI a whopping 121 times during Google I/O 2024 (by its own count). CEO Sundar Pichai referenced the figure to wrap…

Google mentioned ‘AI’ 120+ times during its I/O keynote

Firebase Genkit is an open source framework that enables developers to quickly build AI into new and existing applications.

Google launches Firebase Genkit, a new open source framework for building AI-powered apps

In the coming months, Google says it will open up the Gemini Nano model to more developers.

Patreon and Grammarly are already experimenting with Gemini Nano, says Google

As part of the update, Reddit also launched a dedicated AMA tab within the web post composer.

Reddit introduces new tools for ‘Ask Me Anything,’ its Q&A feature

Here are quick hits of the biggest news from the keynote as they are announced.

Google I/O 2024: Here’s everything Google just announced

LearnLM is already powering features across Google products, including in YouTube, Google’s Gemini apps, Google Search and Google Classroom.

LearnLM is Google’s new family of AI models for education

The official launch comes almost a year after YouTube began experimenting with AI-generated quizzes on its mobile app. 

Google is bringing AI-generated quizzes to academic videos on YouTube

Around 550 employees across autonomous vehicle company Motional have been laid off, according to information taken from WARN notice filings and sources at the company.  Earlier this week, TechCrunch reported…

Motional cut about 550 employees, around 40%, in recent restructuring, sources say

The keynote kicks off at 10 a.m. PT on Tuesday and will offer glimpses into the latest versions of Android, Wear OS and Android TV.

Google I/O 2024: Watch all of the AI, Android reveals

Google Play has a new discovery feature for apps, new ways to acquire users, updates to Play Points, and other enhancements to developer-facing tools.

Google Play preps a new full-screen app discovery feature and adds more developer tools

Soon, Android users will be able to drag and drop AI-generated images directly into their Gmail, Google Messages and other apps.

Gemini on Android becomes more capable and works with Gmail, Messages, YouTube and more

Veo can capture different visual and cinematic styles, including shots of landscapes and timelapses, and make edits and adjustments to already-generated footage.

Google Veo, a serious swing at AI-generated video, debuts at Google I/O 2024

In addition to the body of the emails themselves, the feature will also be able to analyze attachments, like PDFs.

Gemini comes to Gmail to summarize, draft emails, and more

The summaries are created based on Gemini’s analysis of insights from Google Maps’ community of more than 300 million contributors.

Google is bringing Gemini capabilities to Google Maps Platform

Google says that over 100,000 developers already tried the service.

Project IDX, Google’s next-gen IDE, is now in open beta

The system effectively listens for “conversation patterns commonly associated with scams” in-real time. 

Google will use Gemini to detect scams during calls

The standard Gemma models were only available in 2 billion and 7 billion parameter versions, making this quite a step up.

Google announces Gemma 2, a 27B-parameter version of its open model, launching in June

This is a great example of a company using generative AI to open its software to more users.

Google TalkBack will use Gemini to describe images for blind people

Google’s Circle to Search feature will now be able to solve more complex problems across psychics and math word problems. 

Circle to Search is now a better homework helper

People can now search using a video they upload combined with a text query to get an AI overview of the answers they need.

Google experiments with using video to search, thanks to Gemini AI

A search results page based on generative AI as its ranking mechanism will have wide-reaching consequences for online publishers.

Google will soon start using GenAI to organize some search results pages

Google has built a custom Gemini model for search to combine real-time information, Google’s ranking, long context and multimodal features.

Google is adding more AI to its search results

At its Google I/O developer conference, Google on Tuesday announced the next generation of its Tensor Processing Units (TPU) AI chips.

Google’s next-gen TPUs promise a 4.7x performance boost

Google is upgrading Gemini, its AI-powered chatbot, with features aimed at making the experience more ambient and contextually useful.

Google’s Gemini updates: How Project Astra is powering some of I/O’s big reveals

Veo can generate few-seconds-long 1080p video clips given a text prompt.

Google’s image-generating AI gets an upgrade