This week in AI: Experiments, retirements, and extinction events

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of the last week’s stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

YouTube has begun experimenting with AI-generated summaries for videos on the watch and search pages, though only for a limited number of English-language videos and viewers.

Certainly, the summaries could be useful for discovery — and accessibility. Not every video creator can be bothered to write a description. But I worry about the potential for mistakes and biases embedded by the AI.

Even the best AI models today tend to “hallucinate.” OpenAI freely admits that its latest text-generating-and-summarizing model, GPT-4, makes major errors in reasoning and invents “facts.” Patrick Hymel, an entrepreneur in the health tech industry, wrote about the ways in which GPT-4 makes up references, facts and figures without any identifiable link to real sources. And Fast Company tested ChatGPT’s ability to summarize articles, finding it . . . quite bad.

One can imagine AI-generated video summaries going off the deep end, given the added challenge of analyzing the content contained within the videos. It’s tough to evaluate the quality of YouTube’s AI-generated summaries. But it’s well established that AI isn’t all that great at summarizing text content.

YouTube subtly acknowledges that AI-generated descriptions are no substitute for the real thing. On the support page, it writes: “While we hope these summaries are helpful and give you a quick overview of what a video is about, they do not replace video descriptions (which are written by creators!).”

Here’s hoping the platform doesn’t roll out the feature too hastily. But considering Google’s half-baked AI product launches lately (see its attempt at a ChatGPT rival, Bard), I’m not too confident.

Here are some other AI stories of note from the past few days:

Dario Amodei is coming to Disrupt: We’ll be interviewing the Anthropic co-founder about what it’s like to have so much money. And AI stuff too.

Google Search gains new AI features: Google is adding contextual images and videos to its Search Generative Experience (SGE), the generative AI-powered search feature announced at May’s I/O conference. With the updates, SGE now shows images or videos related to the search query. The company is also reportedly pivoting its Assistant project to a Bard-like generative AI.

Microsoft kills Cortana: Echoing the events of the Halo series of games from which the name was plucked, Cortana has been destroyed. Fortunately this was not a rogue general AI but an also-ran digital assistant whose time had come.

Meta embraces generative AI music: Meta this week announced AudioCraft, a framework to generate what it describes as “high-quality,” “realistic” audio and music from short text descriptions, or prompts.

Google pulls AI Test Kitchen: Google has pulled its AI Test Kitchen app from the Play Store and the App Store to focus solely on the web platform. The company launched the AI Test Kitchen experience last year to let users interact with projects powered by different AI models such as LaMDA 2.

Robots learn from small amounts of data: On the subject of Google, DeepMind, the tech giant’s AI-focused research lab, has developed a system that it claims allows robots to effectively transfer concepts learned on relatively small datasets to different scenarios.

Kickstarter enacts new rules around generative AI: Kickstarter this week announced that projects on its platform using AI tools to generate content will be required to disclose how the project owner plans to use the AI content in their work. In addition, Kickstarter is mandating that new projects involving the development of AI tech detail info about the sources of training data the project owner intends to use.

China cracks down on generative AI: Multiple generative AI apps have been removed from Apple’s China App Store this week, thanks to new rules that’ll require AI apps operating in China to obtain an administrative license.

Inworld lands fresh investment: The generative AI platform for creating NPCs has raised new funding.

Stable Diffusion releases new model: Stability AI launched Stable Diffusion XL 1.0, a text-to-image model that the company describes as its “most advanced” release to date. Stability claims that the model’s images have “more vibrant” and “accurate” colors, along with better contrast, shadows and lighting, compared to artwork from its predecessor.

The future of AI is video: Or at least a big part of the generative AI business is, as Haje has it.

AI.com has switched from OpenAI to X.ai: It’s extremely unclear whether it was sold, rented, or is part of some kind of ongoing scheme, but the coveted two-letter domain (likely worth $5 million to $10 million) now points to Elon Musk’s X.ai research outfit rather than the ChatGPT interface.

Other machine learnings

AI is working its way into countless scientific domains, as I have occasion to document here regularly, but you could be forgiven for not being able to list more than a few specific applications offhand. This literature review at Nature is as comprehensive an accounting of areas and methods where AI is taking effect as you’re likely to find anywhere, as well as the advances that have made them possible. Unfortunately it’s paywalled, but you can probably find a way to get a copy.

A deeper dive into the potential for AI to improve the global fight against infectious diseases can be found here at Science, and a few takeaways can be found in UPenn’s summary. One interesting part is that models built to predict drug interactions could also help “unravel intricate interactions between infectious organisms and the host immune system.” Disease pathology can be ridiculously complicated, so epidemiologists and doctors will probably take any help they can get.

Asteroid spotted, ma’am. Image Credits: UW

Another interesting example, with the caveat that not every algorithm should be called AI, is this multi-institutional work algorithmically identifying “potentially hazardous” asteroids. Sky surveys generate a ton of data and sorting through it for faint signals like asteroids is tough work that’s highly susceptible to automation. The 600-foot 2022 SF289 was found during a test of the algorithm on ATLAS data. “This is just a small taste of what to expect with the Rubin Observatory in less than two years, when HelioLinc3D will be discovering an object like this every night,” said UW’s Mario Jurić. Can’t wait!

A sort of halo around the AI research world is research being done on AI — how it works and why. Usually these studies are pretty difficult for non-experts to parse, and this one from ETHZ researchers is no exception. But lead author Johannes von Oswald also did an interview explaining some of the concepts in plain English. It’s worth a read if you’re curious about the “learning” process that happens inside models like ChatGPT.

Improving the learning process is also important, and as these Duke researchers find, the answer is not always “more data.” In fact, more data can hinder a machine learning model, said Duke professor Daniel Reker: “It’s like if you trained an algorithm to distinguish pictures of dogs and cats, but you gave it one billion photos of dogs to learn from and only one hundred photos of cats. The algorithm will get so good at identifying dogs that everything will start to look like a dog, and it will forget everything else in the world.” Their approach used an “active learning” technique that identified such weaknesses in the dataset, and proved more effective while using just 1/10 of the data.
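Reker’s dogs-and-cats analogy is easy to reproduce in miniature. The sketch below is not the Duke team’s active learning method; it is a toy illustration with made-up numbers (10,000 dogs, 100 cats) and a hypothetical `evaluate` helper, showing how a model that calls everything a dog can look accurate on paper while never finding a single cat.

```python
def evaluate(labels, predictions):
    """Return (overall accuracy, recall on the 'cat' class)."""
    pairs = list(zip(labels, predictions))
    accuracy = sum(1 for y, p in pairs if y == p) / len(pairs)
    cat_pairs = [(y, p) for y, p in pairs if y == "cat"]
    cat_recall = sum(1 for _, p in cat_pairs if p == "cat") / len(cat_pairs)
    return accuracy, cat_recall


# Made-up, heavily imbalanced dataset: far more dogs than cats.
labels = ["dog"] * 10_000 + ["cat"] * 100

# A degenerate "model" that has learned to call everything a dog.
predictions = ["dog"] * len(labels)

accuracy, cat_recall = evaluate(labels, predictions)
print(f"accuracy={accuracy:.4f}, cat recall={cat_recall:.2f}")
```

Accuracy comes out at roughly 99% even though the model never identifies a cat, which is why piling on more examples of the majority class does little to fix the underlying weakness.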

A University College London study found that people were only able to discern real from synthetic speech 73% of the time, in both English and Mandarin. We’ll probably all get better at this, but in the near term the tech will likely outstrip our ability to detect it. Stay frosty out there.
