A brand is a kind of fictional persona, and like a person it has distinctive characteristics, including a voice. A brand's voice lets users instantly recognize its personality by ear. Today, Amazon launched Brand Voice, a fully automated service within Amazon Polly, its cloud service that converts text into realistic speech. Brand Voice provides customers with custom-built voices. As Amazon's head of AI voice Rafal Kuklinski and senior product manager Ankit Dhawan explained in a blog post, Brand Voice allows companies to differentiate themselves from other brands by incorporating a unique sonic signature into their products and services. "Every company can have their own unique sonic brand," they wrote.
Amazon partnered with KFC to give the latter's brand mascot, Colonel Sanders, a voice with a Southern US English accent, and launched it on Amazon Alexa. It also designed an Australian English voice for National Australia Bank, which migrated its contact center to Amazon Connect, Amazon's omnichannel cloud contact center product.
Late last year, Amazon detailed its work on AI-generated speech in a research paper ("Effect of Data Reduction on Sequence-to-Sequence Neural TTS"), in which researchers described a system that could learn a new speaking style from just a few hours of training data. A voice actor might need dozens of hours of recording to achieve the same result.
Amazon's AI model consists of two parts. The first is a neural network that converts a sequence of phonemes into a sequence of spectrograms, which are visual representations of a sound's frequency spectrum over time. The second is a vocoder, which converts the spectrograms into a continuous audio signal. The training method combines a large amount of neutral-style speech data with a smaller amount of data in the desired style, along with an AI system that can distinguish between speaking styles. Amazon already uses it internally to generate new voices for Alexa.
This technology has clear commercial value. The voices behind brands (for example, Progressive's character Flo, played by actress Stephanie Courtney) are often tasked with recording a phone tree for an interactive voice response system or an e-learning script for a corporate training video. Speech synthesis can make actors more productive by cutting down on ancillary recording and pickup sessions, freeing them up for creative work.
Brand Voice and Amazon's other text-to-speech services put the company in direct competition with Google, which recently launched 31 AI-synthesized WaveNet voices and 24 new standard voices for its Cloud Text-to-Speech service. Amazon has another notable competitor in Microsoft, which offers three AI-generated preview voices and 75 standard voices through the Azure Speech Service API. Brand Voice also competes with offerings from a number of startups, such as Voicery, which offers customized digital voices that sound impressively human. Text-to-speech startup iSpeech has similar voice tools, as do Modulate, Respeecher, Resemble AI, Descript and Bengaluru-based DeepSync. (Source: Cross-border Sellers Teahouse)