Date: October 28, 2024
Meta has released an open implementation of the viral generate-a-podcast feature in Google’s NotebookLM.
Meta has spent time honing an AI feature that helped Google’s NotebookLM go viral. To help users absorb complex information more easily, Meta is releasing an open implementation of the generate-a-podcast feature. Launched under the name NotebookLlama, it resembles Google’s NotebookLM in its features, its restrictions, and its potential.
Like NotebookLM, the newly launched podcast generator can produce back-and-forth, podcast-style digests of uploaded text files or other sources you provide to the AI model. Its text-to-speech engine then turns the script into audio that imitates the dramatized delivery of human-hosted podcasts.
The process Meta describes is a straightforward multi-step pipeline built on its own family of large language models, Llama. NotebookLlama first creates a transcript from a file, such as a PDF of a news article or blog post, while preserving its context using Llama 3.2-1B-Instruct. Then, it feeds that transcript to Llama 3.1-70B-Instruct, where the podcast script is generated.
To add the dramatization that makes a podcast interesting, engaging, and conversational, the script draft is processed by Llama 3.1-8B-Instruct, which produces a crisper script. Then, a text-to-speech model converts the final script into an audio podcast delivered in a natural, conversational tone.
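The staged process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Meta's actual code: the `Stage` class, the `run_pipeline` and `split_turns` helpers, the instruction strings, and the `echo_model` stand-in are all hypothetical names invented here, and the model labels only record the parameter sizes mentioned in the article. A real setup would replace `echo_model` with a call to an actual Llama model and pass the resulting turns to a text-to-speech engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Any callable taking (model_label, prompt) and returning generated text.
# Hypothetical interface; it is not part of NotebookLlama.
Generate = Callable[[str, str], str]


@dataclass
class Stage:
    name: str
    model: str        # size label only; an assumption made for illustration
    instruction: str  # illustrative wording, not Meta's real prompt


STAGES = [
    Stage("clean", "1B", "Clean up the extracted text without changing its meaning."),
    Stage("script", "70B", "Write a two-speaker podcast script from this text."),
    Stage("dramatize", "8B", "Rewrite the script so it sounds more conversational."),
]


def run_pipeline(source_text: str, generate: Generate,
                 stages: List[Stage] = STAGES) -> str:
    """Feed the output of each stage into the next one, in order."""
    text = source_text
    for stage in stages:
        prompt = f"{stage.instruction}\n\n{text}"
        text = generate(stage.model, prompt)
    return text


def split_turns(script: str) -> List[Tuple[str, str]]:
    """Split 'Speaker: line' text into (speaker, utterance) pairs for a TTS step."""
    turns = []
    for line in script.splitlines():
        speaker, sep, utterance = line.partition(":")
        if sep and utterance.strip():
            turns.append((speaker.strip(), utterance.strip()))
    return turns


def echo_model(model: str, prompt: str) -> str:
    """Stand-in for a real model: tags the prompt body with the model label."""
    body = prompt.split("\n\n", 1)[1]
    return f"[{model}] {body}"


if __name__ == "__main__":
    print(run_pipeline("Article text goes here.", echo_model))
    print(split_turns("Host: Welcome back.\nGuest: Glad to be here."))
```

Keeping the model call behind a single callable makes it easy to swap models per stage or to test the chaining logic without loading any weights.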
The podcast generator is still in its infancy, and unlike Google’s NotebookLM, it has been released as an open implementation. While it shows considerable potential, its limitations make it hard to pin down a practical use case for the product. One of those limitations is the not-so-natural conversational tone and voice of the generated audio.
“The text-to-speech model is the limitation of how natural this will sound. Also, another approach of writing the podcast would be having two agents debate the topic of interest and write the podcast outline. Right now we use a single model to write the podcast outline,” said Meta on its official NotebookLlama GitHub page.
Another limitation is one of the most persistent in AI models built so far. Both NotebookLlama and NotebookLM are prone to hallucinations, even when the user provides the exact sources to generate a podcast from. These hallucinations can arise when the model fills in context on its own or compensates for gaps in its understanding of the source.
However, Meta’s NotebookLlama addresses one common problem for AI chatbot users: it can surface answers to questions they did not know to ask. In simple words, it takes a two-person conversational approach to explore different angles and identify important aspects in the form of a Q&A or debate. While the technology has a long way to go before becoming a flagship product, it shows promise for future developments.
By Arpit Dubey
Arpit is a dreamer, wanderer, and tech nerd who loves to jot down tech musings and updates. Armed with a Bachelor's in Business Administration, a knack for crafting compelling narratives, and a sharp specialization in everything from Predictive Analytics to FinTech (and let’s not forget SaaS, healthcare, and more), Arpit crafts content that’s as strategic as it is compelling. With a Logician mind, he is always chasing sunrises and tech advancements while secretly preparing for the robot uprising.