🤨The Shady Secrets Behind AI's Meteoric Rise🚀

🥷What to do to protect your podcast from transcript theft 🥷

Podcast Radio, The Leading Podcast Industry Magazine

In the high-stakes race to develop the world's most advanced AI systems, major tech companies like OpenAI and Meta are going to extraordinary lengths to acquire the massive troves of data needed to train their increasingly complex models. From transcribing millions of YouTube videos without permission to considering the outright acquisition of book publishers, these firms are stretching ethical and legal norms in pursuit of a competitive edge.

  • OpenAI transcribed over 1 million hours of YouTube videos without consent to train GPT-4

    • It’s argued this fell under "fair use" but violated YouTube's terms of service

    • The training data also included copyrighted podcasts, audiobooks, and literary works used without licenses

  • Meta faced similar data shortages after exhausting online text sources

  • Many of these companies are now proposing licensing deals with book publishers, or even considering acquiring a publisher outright

  • Google has also gathered YouTube transcripts but claims it did so under creator agreements on the platform it owns

Commentary: The exponential growth of large language models is rapidly outpacing the creation of new online training data, driving companies toward controversial measures. With solutions like synthetic data generation still unproven at scale, respecting data rights and ethical principles may become increasingly difficult as the thirst for AI supremacy intensifies.

Stay vigilant about these developments, and use these large language models to your own advantage: have them help you understand the user agreements you accept when you post your podcast to these platforms. Check whether you're giving away rights that allow the platforms to train their models on your content.
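Before pasting a long user agreement into a chatbot, a quick first pass can narrow down which sentences deserve a closer read. The sketch below is one rough way to do that in Python. The phrase list and the `flag_clauses` helper are assumptions made up for illustration, not a legal checklist, and a match only means "read this part carefully," not that the platform is training on your show.

```python
import re

# Hypothetical phrases that often signal AI-training or broad licensing language.
# This list is an illustrative assumption, not legal advice.
FLAG_PATTERNS = [
    r"machine learning",
    r"artificial intelligence",
    r"\btrain(?:ing)?\b.*\bmodels?",
    r"perpetual",
    r"irrevocable",
    r"sublicens(?:e|able)",
    r"derivative works?",
]


def flag_clauses(agreement_text: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, matched_patterns) for every sentence that hits a pattern."""
    # Naive sentence split: break on whitespace that follows ., ;, ! or ?
    sentences = re.split(r"(?<=[.;!?])\s+", agreement_text.strip())
    flagged = []
    for sentence in sentences:
        hits = [p for p in FLAG_PATTERNS if re.search(p, sentence, re.IGNORECASE)]
        if hits:
            flagged.append((sentence, hits))
    return flagged


if __name__ == "__main__":
    # Made-up sample text, not taken from any real platform's agreement.
    sample = (
        "You retain ownership of your content. "
        "You grant us a perpetual, irrevocable license to use your content "
        "to train machine learning models. "
        "We may update these terms at any time."
    )
    for sentence, hits in flag_clauses(sample):
        print(f"REVIEW: {sentence}")
        print(f"  matched: {', '.join(hits)}")
```

Any sentence this flags is a good candidate to hand to a large language model with a plain question such as "does this clause let the platform use my audio or transcripts to train AI?"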

Follow-up topics to look into: emerging ethical frameworks and legal regimes to govern AI development and data usage; the potential impact of an AI "data drought" by 2028 on the field's trajectory; and alternative approaches to responsibly developing AI technologies.

Podcast Radio, by AudioTuner

Check out Sidekick to tap into an assistant that creates chapters for your episodes, writes search-friendly descriptions, and uses blog posts to boost the ways new listeners can find your podcast.