The Future of Sound: Udio’s Vision for AI-Generated Music | E2016

28 Sep 2024

Udio’s David Ding joins Alex. (0s)

  • Alex is a music fan who plays classical and jazz trumpet. (8s)
  • Alex is interested in AI music, and two companies, Udio and Suno, have caught his eye. (21s)
  • Udio was founded in 2023. (1m27s)

David's journey and the inception of Udio (1m32s)

  • David co-founded Udio in November 2023 after leaving his research position at DeepMind. (1m43s)
  • David's interest in technology and music started in his childhood, leading him to pursue machine learning and play classical piano. (1m51s)
  • David's experience at DeepMind, where he witnessed the rise of generative modeling and AI technologies like ChatGPT, inspired him to apply similar technology to music creation. (2m44s)

AI music generation and user control over music elements (4m26s)

  • A model was trained on music to learn common elements of music theory, such as chord progressions, rhythm, and genre-specific characteristics. (4m27s)
  • The model also learned how sound interacts with recording technology, including how sound is captured by microphones and converted into stereo. (4m58s)
  • While the model does not currently support time signature changes, it does allow users to control the key of the song, a feature that was added in response to user feedback. (6m57s)

Tech Domains - Apply for the Jam Session with JCal contest today (7m52s)

  • The "Jam with JCal" contest is seeking one more founder to participate. (8m4s)
  • To be eligible, founders must have under $2 million in funding and use a ".tech" domain name. (8m13s)
  • The contest winner will be invited to appear on the "This Week in Startups" program. (8m20s)

Advancements in AI music creation and data annotation (8m47s)

  • Data annotation is crucial for AI music creation as it teaches the model to associate descriptive words with musical elements. (9m16s)
  • By annotating time signatures in music, the AI can understand and apply specific time signatures to compositions upon request. (9m28s)
  • Data annotation acts as a bridge between user requests and the AI's understanding, enabling natural language processing for music creation. (10m6s)
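
As a rough illustration of the annotation idea above, the sketch below pairs an audio clip with descriptive labels and flattens them into a text caption. It is not Udio's actual data format: the `AnnotatedClip` class, its field names, and the example file path are all hypothetical, chosen only to show how labels such as a time signature could be attached to a clip.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedClip:
    """Hypothetical training record: one audio clip plus descriptive labels."""
    audio_path: str          # illustrative path, not a real dataset file
    genre: str
    time_signature: str      # e.g. "3/4"
    tags: list[str] = field(default_factory=list)

    def caption(self) -> str:
        """Flatten the labels into one text description of the clip."""
        parts = [self.genre, f"{self.time_signature} time", *self.tags]
        return ", ".join(parts)


clip = AnnotatedClip(
    audio_path="clips/waltz_001.wav",
    genre="jazz waltz",
    time_signature="3/4",
    tags=["brushed drums", "upright bass"],
)
print(clip.caption())  # jazz waltz, 3/4 time, brushed drums, upright bass
```

A caption like this is the kind of text a model could learn to associate with the audio, so that a later request mentioning "3/4 time" has something to map onto.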

Evolution of Udio's AI models and early versions (10m32s)

  • Early versions of the AI model had difficulty producing lyrics because of a bug that prevented the model from recognizing lyrics in the training data. (11m59s)
  • Once the bug was fixed, the model rapidly improved its ability to generate music, learning to differentiate between genres and improving in sound quality. (12m28s)
  • Future development will focus on providing users with more control over the music generation process, such as allowing them to input melodies or musical styles as guides for the AI. (14m5s)

Udio's target audience and the future of AI in music (14m59s)

  • Udio aims to make music creation more accessible for everyone, including music enthusiasts and professional artists. (15m43s)
  • Udio is positioned as a tool that complements existing music creation methods, similar to how DAWs, samplers, and drum machines have expanded creative possibilities. (15m53s)
  • Udio is not intended to replace musicians or passive music consumption; instead, it offers a new avenue for musical expression and exploration alongside traditional approaches. (19m6s)

Udio's potential DAW integrations (20m6s)

  • Udio users can download individual stems of their AI-generated music. (20m32s)
  • Udio creates a fully mixed track, and then individual stems are separated from the mix. (21m6s)
  • Stems are the individual tracks inside of a song, such as bass, guitar, or piano. (21m34s)

LinkedIn Ads - Get a $100 LinkedIn ad credit (21m52s)

  • There are over 50,000 venture-backed startups in the United States, making targeted marketing essential for standing out from competitors. (21m52s)
  • LinkedIn Ads allows businesses to target ads based on factors like job title, industry, location, and even specific companies. (22m13s)
  • LinkedIn is a platform where professionals engage in business-related activities, with a significant portion of its user base consisting of decision-makers and executives. (22m42s)

Udio's funding and business model (23m17s)

  • Udio aims to balance affordability for users with sustainable business practices, setting prices to encourage experimentation while covering operational costs. (23m57s)
  • Udio, with a team of 17, acknowledges the potential for model efficiency improvements to reduce costs, similar to trends observed with OpenAI's GPT models. (25m4s)
  • Having raised $10 million in funding, Udio prioritizes disciplined spending and believes that operating with controlled resources can foster innovation. (26m57s)

Financial discipline and GPU cost efficiency at Udio (27m11s)

  • Udio prioritizes cost-efficiency by utilizing the most affordable compute power available, specifically Google Cloud's TPUs, which offer significant cost reductions compared to other options like Nvidia GPUs. (27m35s)
  • Udio's team of experienced modeling research scientists optimizes hardware utilization, develops efficient programs, and designs architectures for efficient training, maximizing the value of its compute resources. (28m44s)
  • While acknowledging the declining cost of cloud computing, future projections regarding infrastructure remain uncertain, with the possibility of transitioning to owned hardware remaining open depending on long-term cost-effectiveness. (30m3s)

Brave Search API - Get started for free (31m28s)

  • Brave Search, the search engine of the privacy-focused Brave browser, is offering its search index via an API. (32m0s)
  • The Brave Search API allows developers to access a search index populated by the activity of Brave's 65 million users for use in applications like chatbots and AI model training. (31m48s)
  • The Brave Search API is free for up to 2,000 queries per month, and paid plans start at $3 CPM (cost per thousand queries). (32m28s)
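
The pricing above can be turned into a quick estimate. The sketch below assumes, purely for illustration, that usage within the 2,000-query free allowance costs nothing and that any month above it is billed at the $3 CPM rate for every query; the summary does not state how the free and paid plans interact, and the function name `monthly_cost_usd` is hypothetical.

```python
FREE_QUERIES_PER_MONTH = 2_000   # free allowance quoted in the summary
PAID_CPM_USD = 3.0               # starting paid rate per thousand queries


def monthly_cost_usd(queries: int) -> float:
    """Estimate monthly cost, assuming all queries are billed once past the free tier."""
    if queries <= FREE_QUERIES_PER_MONTH:
        return 0.0
    return queries * PAID_CPM_USD / 1_000


print(monthly_cost_usd(2_000))    # 0.0
print(monthly_cost_usd(50_000))   # 150.0
```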

GPU-based compute challenges and a live Udio demo (32m47s)

  • There are challenges in obtaining sufficient GPU-based compute capacity from cloud providers due to high demand and limited chip availability. (33m19s)
  • Udio is a music creation platform that allows users to generate original music by specifying the desired genre, lyrics, and other stylistic elements. (33m55s)
  • Udio utilizes a proprietary music generation model and GPT for lyrics, with copyright checks in place to ensure originality. (45m51s)

Udio's user interface, engagement, and community insights (48m3s)

  • Udio has a dedicated group of power users who frequently use the platform and share their creations on Discord. (48m27s)
  • Some users collaborate on lyrics and songs, giving credit to each other in the final output. (48m38s)
  • Udio facilitates a new way for people to "jam" together musically, fostering a sense of community and shared passion. (49m6s)

Udio's growth, virality, and competitive stance (49m57s)

  • Udio has experienced consistent monthly growth since its launch, primarily driven by organic user engagement and content sharing. (50m26s)
  • Udio believes that music creation requires a distinct approach from text-based models, emphasizing the importance of user-friendly interfaces and intuitive controls for music production. (51m41s)
  • Udio positions itself as a tool for professional artists, songwriters, and producers, prioritizing high-quality music creation over novelty or meme-based content. (52m42s)
  • Udio claims its music model is the only one currently capable of producing stereo music at 44 kHz, suggesting a focus on professional-grade audio quality. (53m22s)
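
To give a sense of scale for the stereo, 44 kHz output mentioned above, the sketch below estimates how much uncompressed audio data that implies. The 16-bit sample depth and the three-minute song length are assumptions for illustration only; the summary gives just the sampling rate and channel count, and `pcm_bytes` is a hypothetical helper.

```python
def pcm_bytes(sample_rate_hz: int, channels: int, bit_depth: int, seconds: int) -> int:
    """Size in bytes of uncompressed PCM audio with the given parameters."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * channels * bytes_per_sample * seconds


# 44 kHz stereo as quoted in the summary; 16-bit depth is an assumption.
per_second = pcm_bytes(44_000, 2, 16, 1)
three_minutes = pcm_bytes(44_000, 2, 16, 180)

print(per_second)      # 176000 bytes per second
print(three_minutes)   # 31680000 bytes, roughly 31.7 MB
```

Under these assumptions a model has to produce 88,000 sample values for every second of audio, which is part of why high-fidelity stereo generation is demanding.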

Udio's model quality and expansion roadmap (53m27s)

  • Udio's music model produces stereo music at a 44 kHz sampling rate, which is considered high fidelity. (53m33s)
  • Udio's model also has a strong understanding of genre compared to other music models. (53m41s)
  • Udio may seek additional funding in the future due to its early traction, monetization, and rapid product development. (54m5s)
