This AI can spoof your voice after just three seconds

Artificial intelligence (AI) is having a moment right now, and the wind continues to blow in its sails with the news that Microsoft is working on an AI that can imitate anyone’s voice after being fed a short three-second sample.

The new tool, dubbed VALL-E, has been trained on roughly 60,000 hours of voice data in the English language, which Microsoft says is “hundreds of times larger than existing systems.” Using that knowledge, its creators claim it needs only a short vocal sample to understand how to replicate a user’s voice.

man speaking into phone
Fizkes/Shutterstock

More impressive, VALL-E can reproduce the emotions, vocal tones, and acoustic environment found in each sample, something other voice AI programs have struggled with. That gives it a more realistic aura and brings its results closer to something that could pass as genuine human speech.

When compared to other text-to-speech (TTS) competitors, Microsoft says VALL-E “significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity.” In other words, VALL-E sounds much more like real humans than rival AIs that encounter audio inputs that they have not been trained on.

On GitHub, Microsoft has published a small library of samples generated using VALL-E. The results are mostly very impressive, with many samples that reproduce the lilt and accent of the speakers’ voices. Some of the examples are less persuasive, indicating VALL-E is probably not a finished product, but overall the output is convincing.

Huge potential — and risks

A person conducting a video call on a Microsoft Surface device running Windows 11.
Microsoft/Unsplash

In a paper introducing VALL-E, Microsoft explains that VALL-E “may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker.” Such a capable tool for generating realistic-sounding speech raises the specter of ever-more convincing deepfakes, which could be used to mimic anything from a former romantic partner to a prominent international personality.

To mitigate that threat, Microsoft says “it is possible to build a detection model to discriminate whether an audio clip was synthesized by VALL-E.” The company says it will also use its own AI principles when developing its work. Those principles cover areas such as fairness, safety, privacy, and accountability.

VALL-E is just the latest example of Microsoft’s experimentation with AI. Recently, the company has been working on integrating ChatGPT into Bing, using AI to recap your Teams meetings, and grafting advanced tools into apps like Outlook, Word, and PowerPoint. And according to Semafor, Microsoft is looking to invest $10 billion into ChatGPT maker OpenAI, a company it has already plowed significant funds into.

Despite the apparent risks, tools like VALL-E could be especially useful in medicine, for instance, to help people to regain their voice after an accident. Being able to replicate speech with such a small input set could be immensely promising in these situations, provided it is done right. But with all the money being spent on AI — both by Microsoft and others — it’s clear it’s not going away any time soon.

Alex Blake
Alex Blake has been working with Digital Trends since 2019, where he spends most of his time writing about Mac computers…
Amazon’s AI agent will make it even easier for you to part with your money
Amazon Nova Act performing task in a web browser.

The next big thing in the field of artificial intelligence is agentic AI, which is essentially an AI tool that can automate certain multi-step processes for users, such as interacting with a web browser for tasks like booking tickets or ordering groceries.

Amazon certainly sees a future there. After giving a massive overhaul to Alexa and introducing a new Alexa+ assistant, the company has today announced a new AI agent called Nova Act. Amazon says Nova Act is designed to “complete tasks in a web browser.” Amazon won’t be the first to reach this milestone, as a few other AI companies have already attempted it.

3 open source AI apps you can use to replace your ChatGPT subscription
Phone running Deepseek on a laptop keyboard.

The next leg of the AI race is on, and it has expanded beyond the usual players, such as OpenAI, Google, Meta, and Microsoft. Despite the dominance of the tech giants, more open-source options have now taken the spotlight, bringing a new focus to the AI arena.

Various brands, such as DeepSeek, Alibaba, and Baidu, have demonstrated that AI functions can be developed and executed at a fraction of the cost. They have also secured solid business partnerships and chosen to provide, or continue providing, AI products to consumers as free or low-cost open-source models, while larger companies double down on a proprietary, for-profit trajectory, hiding their best features behind a paywall.

Opera One puts an AI in control of browser tabs, and it’s pretty smart
AI tab manager in Opera One browser.

Opera One browser has lately won a lot of plaudits for its slick implementation of useful AI features, a clean design, and a healthy bunch of chat integrations. Now, it is putting AI in command of your browser tabs, and in a good way.
The new feature is called AI Tab Commands, and it essentially allows users to handle their tabs using natural language commands. All you need to do is summon the onboard Aria AI assistant, and it will handle the rest like an obedient AI butler.
The overarching idea is to let the AI handle multiple tabs, and not just one. For example, you can ask it to “group all Wikipedia tabs together,” “close all the Smithsonian tabs,” or “shut down the inactive tabs.”

A meaningful AI for web browsing
Handling tabs is a chore in any web browser, and if internet research is part of your daily job, you know the drill. Having to manually move around tabs using a mix of cursor and keyboard shortcuts, naming them, and checking through the entire list of tabs is a tedious task.
Deploying an AI to do it locally, using only natural language commands, is a lovely convenience and one of the nicest implementations of AI I’ve seen lately. Interestingly, Opera is also working on a futuristic AI agent that will get browser-based work done using only text prompts.
Coming back to the AI-driven tab management, the tab handling itself unfolds locally, and no tab data is sent to servers, which is a neat assurance. “When using Tab Commands and asking Aria to e.g. organize their tabs, the AI only sends to the server the prompt a user provides (e.g., “close all my YouTube tabs”) – nothing else,” says the company.
To summon the AI tab manager, users can hit the Ctrl + Slash (/) shortcut, or the Command + Slash combo on macOS. It can also be invoked with a right-click on the tabs, as long as there are five or more currently open in a window.
https://x.com/opera/status/1904822529254183166?s=61
Aside from closing or grouping tabs, AI Tab Commands can also be used to pin tabs. It also accepts exception commands, such as “close all tabs except the YouTube tabs.” Notably, this feature is also making its way to Opera Air and the gaming-focused Opera GX browser.
Talking about grouping together related tabs, Opera has a neat system called tab islands, instead of color-coded tab groups at the top, as is the case with Chrome or Safari. Opera’s implementation looks better and works really well.
Notably, the AI Tab Commands window also comes with an undo shortcut, for scenarios where you want to revert the actions, like reviving a bunch of closed tabs. Opera One is now available to download on Windows and macOS devices. Opera also offers Air, a browser that puts some zen into your daily workflow.
