Tag: Ziad Sultan

Spotify’s AI Voice Translation Pilot Means Your Favorite Podcasters Might Be Heard in Your Native Language

Across cultures, countries, and communities, the stories we share bring us together. And more often than not, it is the voices of the speakers that lend as much weight to the stories as the narratives themselves. For more than 15 years, Spotify’s global platform has empowered creators from all walks of life to share their work with audiences around the world. At its core, this has been made possible through technology that’s leveraged the power of audio to overcome barriers to access, borders, and distance. But with recent advancements, we’ve been wondering: Are there more ways we can bridge the language gap so that these voices can be heard worldwide?

Today, we’re excited to pilot Voice Translation for podcasts, a groundbreaking feature powered by AI that translates podcasts into additional languages—all in the podcaster’s voice. 


This Spotify-developed tool leverages the latest innovations, one of which is OpenAI’s newly released voice generation technology, to match the original speaker’s style, making for a more authentic listening experience that sounds more personal and natural than traditional dubbing. A podcast episode originally recorded in English can now be available in other languages while keeping the speaker’s distinctive speech characteristics.

As part of the pilot, we’ve worked closely with podcasters Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett to generate AI-powered voice translations in other languages—including Spanish, French, and German—for a select number of catalog episodes and future episode releases. We’re also looking forward to including other shows, such as Dax Shepard’s eff won with DRS, The Rewatchables from The Ringer, and Trevor Noah’s new original podcast, which launches later this year.

“By matching the creator’s own voice, Voice Translation gives listeners around the world the power to discover and be inspired by new podcasters in a more authentic way than ever before,” says Ziad Sultan, VP of Personalization. “We believe that a thoughtful approach to AI can help build deeper connections between listeners and creators, a key component of Spotify’s mission to unlock the potential of human creativity.”

Voice-translated episodes from pilot creators will be available worldwide to Premium and Free users. We’re starting by releasing an initial bundle of translated episodes in Spanish, with French and German rolling out in the coming days and weeks:

  1. Lex Fridman Podcast – “Interview with Yuval Noah Harari”
  2. Armchair Expert – “Kristen Bell, by the grace of god, returns”
  3. The Diary of a CEO with Steven Bartlett – “Interview with Dr. Mindy Pelz”

These translations begin rolling out today on the Now Playing View of supported episodes. Can’t wait and want to hear the episodes right away? Head to the dedicated Voice Translations Hub, which we’ll update with even more voice-translated episodes over the coming weeks and months.

Today is just the beginning. We’re excited to empower creators to bring their storytelling to more listeners across the world. The creator and audience feedback from the pilot will provide important insights for future expansion, iterations, and innovations. As the number of people (100M+) regularly listening to podcasts on Spotify continues to grow, we’ll continue exploring new ways to overcome barriers to storytelling.

Stay tuned to Spotify for Podcasters as we aim to expand access for more creators and languages.

Behind the Scenes of Spotify’s New AI DJ

Since launching Spotify’s brand-new AI DJ in beta a few weeks back, Premium listeners across the U.S. and Canada have had the chance to experience our personalization capabilities in a whole new way. We’ve already seen so much love for DJ—both on-platform and across social media—and we’re not even out to 100% of users yet. On days when users tune in, fans spend 25% of their listening time with DJ, and they keep coming back for more, with more than half of first-time listeners coming back to listen to DJ the very next day.* 

 

Today at Stream On, Spotify Co-President and Chief R&D Officer Gustav Söderström and Spotify Head of Cultural Partnerships Xavier “X” Jernigan, whose voice is the first model for DJ, got the chance to highlight the new feature even further. With DJ, we’re reimagining how listeners hear and discover the music they love, as DJ transforms Spotify from a music tool into a living, breathing, interactive music experience.

What went into building DJ? For the Record connected with some of the minds behind the new feature, including VP of Personalization Ziad Sultan, Head of Product Design for Personalization Emily Galloway, Product Director Zeena Qureshi, and Head of Global Hits J.J. Italiano, to better understand the synergy between the humans and technology that brought DJ to life. 

Spotify boasts years of expertise in personalization. How did we leverage this to create DJ?

Ziad Sultan: Personalization is at the heart of what we do. When we ask our listeners what they like most about Spotify, more than 81% cite our personalization. That’s because we have a bit of a secret sauce: We combine state-of-the-art technology with human passion and expertise.

We applied that same recipe to DJ. The result is a delightful music-listening experience that is deeply personalized to each individual listener, yet at a scale that the world has not really seen. Never before has listening felt so completely personal to each and every user. And it’s made possible by a powerful combination of three things: Spotify’s personalization technology, generative AI in the hands of the world’s best music curators, and a stunningly realistic AI voice that brings it all to life. 

Can you tell us more about how you designed the experience? 

Emily Galloway: DJ is an entirely new way to listen, and a brand-new format, so there wasn’t a formula to follow when we were making decisions. We had to answer some core experiential questions like: “How do we take you on a journey with both familiar and unfamiliar music?” “How do we evoke feelings of nostalgia?” and “What does it mean to give context to music listening?” But the most important question was, “How might we help fans and creators form a deeper, more meaningful connection?” I’m really proud of where we’ve landed: making personalization and AI more human than ever.

We know listeners are excited as well—they feel something different. We continue to see connection and discovery as the main themes of DJ. We found that when listeners hear commentary alongside personal music recommendations, they’re more willing to try something new, to listen to a song they may have otherwise skipped. For Spotify, that brings us closer to our goal of deepening artist and fan connections.  

Why did you decide to use a human-like, AI-powered voice for this experience?

Zeena Qureshi: We know that human voice helps people form connections, and the same is true when it comes to DJ. We found that having the voice sound human is key for users to foster a deeper connection with DJ, as human voice provides familiarity and instant context. Incorporating voice traits such as pacing, projection, emotion, and emphasis results in a DJ that’s emotional and highly realistic.

Last year, Spotify acquired Sonantic and its unique patented algorithms, making all this possible. Sonantic is now Spotify’s dynamic AI voice platform that creates compelling, nuanced, and stunningly realistic voices from text. 

How does DJ leverage the expertise of our music editors? 

J.J. Italiano: DJ is built from human editorial expertise married with cutting-edge technology—that is Spotify’s superpower. The editorial team, which consists of hundreds of experts across the globe who know music and culture inside and out, can now harness this power to help tell artists’ stories and better contextualize their songs. To help arm DJ with knowledge and expertise, we created a Writers’ Room with music experts, culture experts, data curators, scriptwriters, and generative AI. Adding in this context gives the listener a deeper connection and experience when hearing an artist or song—and I’m very excited to bring it to listeners!

Ziad Sultan: Putting generative AI technology in the hands of our music experts allows them to scale their expertise like never before. Taking a Writers’ Room approach allows us to ensure that the commentary is accurate, relevant, and enriching to the product experience. We’re very excited about this approach that builds on our years of experience combining human expertise with world-class technology. That’s how DJ is able to deepen the connection between fans and their favorite artists, as well as help them discover new ones.

Ready to dive in? Learn where to find your DJ in your own Spotify app. 

*Results are based on eligible users (Premium users in the U.S. and Canada on mobile) and collected from February 22 to March 1.

Spotify to Acquire Sonantic, an AI Voice Platform

As a leader in all things audio, Spotify is always searching for new ways to create unique experiences that our users will love. So today we’re excited to share our intention to acquire Sonantic, a dynamic AI voice platform that creates compelling, nuanced, and stunningly realistic voices from text. 

Listeners come to Spotify for all of the best audio content in the world—and we believe that Sonantic’s technology will allow us to create high-quality experiences for our users by building on our existing technical capabilities. 

“We’re really excited about the potential to bring Sonantic’s AI voice technology onto the Spotify platform and create new experiences for our users,” says Ziad Sultan, Spotify’s Vice President of Personalization. “This integration will enable us to engage users in a new and even more personalized way.”

“We’re looking forward to joining Spotify and continuing to build exciting voice experiences,” said Sonantic co-founders Zeena Qureshi and John Flynn in a joint statement. “We believe in the power voice has and its ability to foster a deeper connection with listeners around the world, and we know we can be better than ever on the world’s largest audio platform.”

At Spotify, we’ve identified several potential opportunities for text-to-speech capabilities across our platform, and we believe that over the long term, high-quality voice will be important to growing our share of listening. For example, this voice technology could allow us to give context to users about upcoming recommendations when they aren’t looking at their screens. Using voice in these moments can reduce barriers to creating new audio experiences—and open up the doors to even more new opportunities.