The Daily AI: Issue #17

"Meta is getting meta."

Welcome to The Daily AI Show Newsletter, your deeper dive into AI that goes beyond the latest news. In this issue:

  • Find out how DIY AI apps could be the scalable solution for businesses

  • Why digital twins and simulations are the Matrix we need right now

  • Learn how multimodal RAGs are helping businesses get deeper insights from their documents

  • Forget 2025: find out what our Q4 predictions are for AI

Plus, we discuss OpenAI’s big people problem, Google’s slick video obsession, Meta’s push into augmented reality, several Hoover Dams’ worth of power for data centers, our top news from the week, and a lot more.

It’s Sunday morning.

Let’s dive into all the AI updates before Skynet shows up and makes them irrelevant.

The DAS Crew

Why It Matters

Our Deeper Look Into This Week’s Topics

Are On-Demand DIY Apps The Real Future?

The rise of AI is making personalized, on-demand apps a reality, allowing individuals and businesses to create custom applications without the need for coding expertise. This shift could significantly change how we interact with technology and streamline workflows.

Imagine a future where you can wake up to a list of bespoke app suggestions tailored specifically to your needs, whether it’s managing your daily schedule, planning a vacation, or even preparing your taxes.

The real innovation lies in the ability of these AI tools to understand and predict what kind of app you might need based on your behavior and data. For example, if you’re planning a trip, an AI could generate a travel app that integrates your flight details, hotel reservations, and local attractions all in one place.

It’s not just about convenience—it’s about creating a seamless experience that adapts to your life in real-time.

These DIY apps go beyond simple task management. They can include financial management, learning platforms, or even mental wellness tools, all created on the fly to suit your personal or business needs.

As AI becomes more integrated into our devices, we’re likely to see a surge in such personalized applications, designed and built just for us, when we need them.
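
To make that concrete, here is a minimal sketch of what the generation step might look like under the hood. The AppRequest structure, the prompt, and the generate_app helper are our own illustrative assumptions, not any vendor’s product; the only real dependency is OpenAI’s standard Python client.

```python
# Hypothetical sketch of on-demand app generation: an LLM turns a user's
# current context into a small, single-purpose web app. The prompt and
# AppRequest shape are illustrative assumptions, not a real product's API.
from dataclasses import dataclass

from openai import OpenAI


@dataclass
class AppRequest:
    goal: str                # what the user is trying to accomplish
    data_sources: list[str]  # context the app should pull together


def generate_app(request: AppRequest) -> str:
    """Ask the model for a self-contained HTML/JS app matching the request."""
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    prompt = (
        f"Build a single-file HTML/JavaScript app that helps with: {request.goal}. "
        f"It should display and organize: {', '.join(request.data_sources)}. "
        "Return only the HTML."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# The travel app described above, generated on demand.
trip_app_html = generate_app(AppRequest(
    goal="plan a 5-day trip to Lisbon",
    data_sources=["flight details", "hotel reservation", "local attractions"],
))
```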

WHY IT MATTERS

  • Personalized Efficiency: On-demand apps tailored to individual needs can streamline workflows, making everyday tasks more manageable and less time-consuming.

  • Cost-Effective Solutions: Businesses and individuals can avoid the cost of expensive custom software by using AI to build what they need on the spot.

  • Enhanced User Experience: Apps designed specifically for your requirements mean a more intuitive and satisfying user experience, reducing the need for multiple, disjointed applications.

  • Scalable Applications: These AI-generated apps can scale with your needs, adapting to changes in your schedule, workload, or personal life without the need for manual updates.

  • Future of App Development: This trend could revolutionize app development, shifting the focus from static, one-size-fits-all solutions to dynamic, user-specific experiences that evolve in real-time.

Is That A Simulation . . . Or Just Your Digital Twin?

The convergence of virtual worlds and AI technology is unlocking unprecedented opportunities for industries to simulate real-world scenarios with precision and scale.

Digital twins, virtual replicas of physical and information systems, are becoming critical tools for industries ranging from urban planning to corporate operations. With the ability to create these dynamic, data-driven models, businesses can test, optimize, and even predict outcomes in a risk-free virtual environment before implementing changes in the real world.

One standout example is the use of Unreal Engine for urban planning. By creating a digital twin of a city, stakeholders can simulate various scenarios—such as traffic flow during peak hours or the impact of new infrastructure—without disrupting daily life. These simulations allow city planners to anticipate potential problems and make data-informed decisions, saving time, money, and resources.

In the corporate world, companies like NVIDIA are leveraging AI-driven simulations to streamline factory floor operations and optimize workflows. Their Omniverse platform enables enterprises to build detailed digital twins of their facilities, simulating everything from equipment layout to logistics flows to human tasks. These models support efficiency analysis and ‘what if’ experiments while reducing the need for costly trial-and-error in physical spaces.

AI simulations are also paving the way for more interactive, human-centric applications. With tools like Project SID, autonomous agents in virtual worlds can replicate complex human behaviors and social interactions.

This opens up new possibilities for businesses to model customer behavior, optimize service delivery, and even train employees in a safe and controlled environment.
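
A full Omniverse or Unreal Engine twin is out of scope for a newsletter, but the core loop is simple enough to sketch: mirror the system, run a ‘what if’ experiment, compare outcomes. The toy Python simulation below models queue length at one intersection under two signal timings; the arrival rate and cycle length are invented for illustration.

```python
import random


def simulate_intersection(green_seconds: int, cycle_seconds: int = 60,
                          arrival_prob: float = 0.4, hours: int = 1) -> float:
    """Toy digital twin of one intersection: returns the average queue length."""
    random.seed(42)  # reproducible runs so the timings compare fairly
    queue, queue_total = 0, 0
    for t in range(hours * 3600):           # one step per second
        if random.random() < arrival_prob:  # a car arrives
            queue += 1
        if t % cycle_seconds < green_seconds and queue > 0:
            queue -= 1                      # one car clears per green second
        queue_total += queue
    return queue_total / (hours * 3600)


# 'What if' experiment: compare two signal timings before touching real roads.
for green in (20, 35):
    avg = simulate_intersection(green)
    print(f"green={green}s -> average queue of {avg:.1f} cars")
```

With a 20-second green, clearance capacity falls below the arrival rate and the queue grows all hour; at 35 seconds it stays stable. That gap, found in seconds of compute, is the kind of answer planners would otherwise buy with real-world disruption.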

WHY IT MATTERS

  • Risk-Free Experimentation: Digital twins allow businesses to test and refine processes in a virtual setting, reducing the risk and cost associated with physical trials.

  • Enhanced Decision-Making: Simulating real-world scenarios enables data-driven decisions, whether it's urban planning, factory optimization, or corporate strategy.

  • Operational Efficiency: By modeling workflows and logistics, companies can identify inefficiencies and implement changes without disrupting operations.

  • Human Behavior Modeling: AI-driven simulations can replicate complex human interactions, offering new insights into customer behavior and employee training.

  • Future-Ready Enterprises: As AI simulation technology continues to advance, it will become a cornerstone for businesses looking to innovate and stay competitive in a rapidly changing environment.

LlamaCloud Wants To Be Your Multimodal RAG-Time Gal

Multimodal Retrieval-Augmented Generation (RAG) is emerging as a powerful tool for enhancing AI's ability to handle complex, real-world scenarios. By integrating multiple data types—text, images, video, and even audio—into its retrieval and generation process, multimodal RAG allows AI systems to access and process a broader range of information. This enables more accurate and context-aware responses, addressing a key limitation of traditional language models.

The capability to pull relevant data from PDFs, images, and other sources in real-time is a game-changer for businesses dealing with intricate information workflows.

For example, imagine a real estate platform that not only understands textual property descriptions but can also process and retrieve information from floor plans, photos, and even audio descriptions. This holistic approach offers clients a more comprehensive and intuitive experience.

One of the standout features of platforms like LlamaIndex’s LlamaCloud is the ability to parse and analyze complex documents such as PDFs, extracting useful information while discarding irrelevant layout data. This means businesses can automate the extraction of crucial data from documents, like financial reports or technical manuals, and integrate it into their RAG systems for quick, accurate retrieval.
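
For readers who want to try this, here is a minimal sketch using the llama-parse and llama-index Python packages. These libraries evolve quickly, so treat the parameter names as approximate; the PDF filename and query are placeholders, and the indexing step assumes an embedding backend (by default, an OpenAI API key).

```python
# Minimal sketch: parse a complex PDF with LlamaParse (the parsing service
# behind LlamaCloud), then index the result for retrieval-augmented queries.
from llama_parse import LlamaParse             # pip install llama-parse
from llama_index.core import VectorStoreIndex  # pip install llama-index

# LlamaParse extracts the useful text and tables, discarding layout noise.
parser = LlamaParse(result_type="markdown")  # assumes LLAMA_CLOUD_API_KEY is set
documents = parser.load_data("financial_report.pdf")  # placeholder file

# Standard RAG from here: embed the parsed chunks, then query against them.
index = VectorStoreIndex.from_documents(documents)  # uses the default embedder
query_engine = index.as_query_engine()
print(query_engine.query("What was the operating margin last quarter?"))
```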

However, it’s not without challenges.

Converting multimodal data into a format usable by AI is still clunky and resource-intensive. The AI must break images down into text descriptions and weave those into the textual data coherently, which can slow response times. Current systems also struggle to apply the technology in real time, especially when processing video and complex audio.

WHY IT MATTERS

  • Improved Accuracy: By incorporating multiple data types, multimodal RAG can provide more contextually accurate responses, reducing the chances of AI-generated misinformation.

  • Enhanced Data Retrieval: Businesses can leverage this technology to automate the extraction and retrieval of complex information from diverse sources, improving efficiency.

  • Broader Applications: From real estate to technical support, multimodal RAG enables a more nuanced understanding of queries, helping users interact with complex data in intuitive ways.

  • Scalable Solutions: Although still in its early stages, the technology holds promise for scalable, multi-purpose AI solutions that can adapt to various industry needs.

  • Challenges Ahead: The complexity and processing demands of multimodal RAG mean that further technological advancements are needed to make this a seamless, real-time tool for businesses.

HEARD AROUND THE SLACK COOLER
What We Are Chatting About This Week Outside the Live Show

OpenAI Keeps Bleeding Talent

We have all been talking about Mira Murati’s recent departure from OpenAI. Mira was the CTO, and she is far from the only big name to leave in the past 12 months.

Here are the other notable names:

  • Helen Toner - 11/23 - Former Board Member - left after failed ousting of Sam Altman

  • Tasha McCauley - 11/23 - Former Board Member - left after failed ousting of Sam Altman

  • Andrej Karpathy - 2/24 - Co-founder - research scientist who has since founded Eureka Labs

  • Jan Leike - 5/24 - Co-head of the Superalignment team - left after disagreeing with “core priorities”

  • Ilya Sutskever - 6/24 - Co-founder and Chief Scientist - was part of the failed ousting of Altman - now Chief Scientist at Safe Superintelligence

  • Peter Deng - early ‘24 - VP of Product - no reason reported for his departure after joining in 2023

  • John Schulman - 8/24 - Co-founder - left to work at Anthropic, maker of Claude

  • Bob McGrew - 9/24 - Chief Research Officer since August and with the company since 2017 - no reason given for his departure

  • Barret Zoph - 9/24 - VP of Research - said he decided to leave “based on how I want to evolve the next phase of my career.”


Source: Business Insider

Ethan Mollick Having a Slight Dig at Sam Altman and OpenAI

Eran shared a post with the group from Ethan Mollick, one of our favorite voices in AI right now.

It was in response to Sam Altman’s recent blog post that many saw as a call to raise money to help build more data centers and means of power production.

“So Many Slick Videos”

This was Karl’s response to Google’s Gemini at Work digital event this week. Unfortunately, it is a pattern we keep seeing from Google. While OpenAI and Meta have started giving live demos to show their AI tools and products in action, Google, Apple, and a few others opt for highly produced “Super Bowl” ads that are flashy but often leave real early adopters feeling duped.

Our take: Have the flashy ads, but start with the live demos. That is how you will capture the early adopters and champions while also gaining the attention of the larger population.

Our Top AI Predictions for Q4

  • Google's AI Push Continues: Google has already made significant strides this year with the release of Gemini 1.5 Pro and NotebookLM updates. However, Beth speculated that "we could see at least one more big release from Google," possibly in the form of improved integrations across their ecosystem or a more robust version of Gemini. This could include enhancements like better AI tools for developers or new functionalities in Google Workspace, making AI more accessible and useful in day-to-day business tasks.

  • Apple’s AI Moves Still Pending: While Apple introduced iOS 18 and the iPhone 16 with promises of advanced AI capabilities, the full suite of Apple Intelligence is yet to be rolled out. Brian noted, “We’re unlikely to see the complete Apple Intelligence update before the end of the year.” It’s expected that Apple may wait until early 2025 to fully launch features like advanced Siri capabilities that can interact seamlessly with apps like ChatGPT, potentially transforming how users interact with their devices.

  • OpenAI’s Next Steps: OpenAI has been relatively quiet since launching the o1-preview model for enhanced reasoning capabilities. Jyunmi pointed out that “o1 coming out of preview within a month seems likely,” but beyond that, no major updates like GPT-5 or DALL-E 4 are expected this year. OpenAI may focus on refining existing models and improving user experience with the recently released advanced voice features.

  • Personal AI Agents Still in Early Stages: Despite the hype, truly autonomous personal AI agents remain out of reach. Karl emphasized that current solutions from companies like Salesforce and HubSpot are more akin to “smarter automations” than true autonomous agents. We may see incremental improvements in these systems, but a fully autonomous agent that can operate independently across multiple platforms is unlikely to emerge this year.

  • Perplexity’s Growth and Innovation: Perplexity AI has been gaining traction with its innovative search capabilities and user-friendly interface. With recent funding rounds and a successful marketing campaign, they are expected to introduce new features and enhancements to solidify their position as a strong alternative to traditional search engines. The integration of ads and further refinements in navigational searches could be in the cards for Q4.

  • Meta’s Local AI Models: Meta’s release of Llama 3.2, which includes lightweight models capable of running locally on devices, signals a push towards more on-device AI capabilities. Brian noted, “We’ll see more on-device local options, possibly from other major players like Anthropic.” This trend could redefine how users interact with AI, prioritizing privacy and speed by keeping interactions local rather than cloud-based.

Did you know?

There are over 10,000 data centers in the world.

Nearly 50% of them are in the U.S.

We would need five Hoover Dams to run all 10,000 of them for a year.

This Week’s Conundrum
A difficult problem or question that doesn't have a clear or easy solution.

The AI Relationship Paradox:

AI companions and virtual assistants are becoming increasingly sophisticated, capable of holding conversations, offering emotional support, and even mimicking human behavior in relationships. Some people find solace in these interactions, using AI to combat loneliness or as a substitute for human companionship.

However, there are concerns that relying on AI for emotional support could lead to unhealthy attachments, social isolation, and a diminished capacity for genuine human relationships.

The conundrum: Should we encourage the development and use of AI companions to help people cope with loneliness and mental health challenges, even if it risks creating an over-reliance on machines and weakening real human connections? Or should we limit AI's role in personal relationships to preserve the authenticity and complexity of human interaction, even if it means leaving some people without needed support?

The News That Caught Our Eye

Advanced Voice Mode Finally Arrives for ChatGPT Users

After much anticipation, OpenAI has rolled out Advanced Voice Mode, a feature that lets users communicate more naturally with ChatGPT. The update brings smoother, more conversational interactions and a variety of voices to choose from. It’s not just about utility; it’s about making the experience more engaging and interactive. Brian said, “I spent 30 minutes last night just playing with voice impressions and storytelling features. It’s a game-changer for how we interact with AI.”

Google's DataGemma Tackles AI Hallucinations

Google has introduced DataGemma, a model designed to reduce AI hallucinations by cross-referencing responses with a trusted data set called Data Commons. This approach helps ensure more accurate and reliable outputs. It is a major step forward: using a real-world data set to validate AI responses could set a new standard for minimizing hallucinations in large language models.
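
We have not seen the model’s internals, but the grounding idea behind it is easy to sketch: have the model state a checkable claim, then compare it against a trusted source before trusting the output. In the Python sketch below, lookup_trusted_stat is a hypothetical stand-in for a real Data Commons query, and the numbers are toy data.

```python
# Sketch of the grounding idea only, not Google's implementation: verify a
# model's numeric claim against a trusted data set before accepting it.

def lookup_trusted_stat(place: str, variable: str) -> float | None:
    """Hypothetical stand-in for a Data Commons lookup (toy data)."""
    trusted = {("California", "population"): 39_030_000}
    return trusted.get((place, variable))


def verify_claim(place: str, variable: str, model_value: float,
                 tolerance: float = 0.05) -> str:
    """Flag a model's figure if it strays too far from the trusted value."""
    actual = lookup_trusted_stat(place, variable)
    if actual is None:
        return "unverified: no trusted data available"
    if abs(model_value - actual) / actual <= tolerance:
        return f"verified (trusted source says {actual:,})"
    return f"possible hallucination: trusted source says {actual:,}"


# A model claiming California has 52 million residents gets flagged here.
print(verify_claim("California", "population", 52_000_000))
```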

Duolingo Adds AI-Powered Mini Games and Video Calls

Duolingo has introduced new features including AI-powered adventures and video call capabilities to make language learning more engaging. Users can now practice conversations in a simulated environment and even have live, interactive video calls with AI. Brian expressed mixed feelings: “It’s a step in the right direction, but Duolingo could do so much more with AI. I’d love to see more open-ended conversations and less reliance on their existing, somewhat annoying characters.”

Microsoft Partners with Three Mile Island for AI Data Center Power

Microsoft has signed a 20-year deal for exclusive access to 835 megawatts of power from the Three Mile Island nuclear plant to fuel its AI data centers. The deal shows just how serious tech giants are about securing sustainable energy for the growing demands of AI. It’s a return to nuclear power as a reliable, large-scale energy source.

Salesforce Launches Agentforce to Automate Complex Workflows

Salesforce has unveiled Agentforce, a new AI tool that can automate complex tasks like scheduling installations and following up on sales leads. The tool is highly customizable, integrating smoothly into the existing Salesforce ecosystem. This isn’t just another AI feature—Salesforce is making a big play in AI-driven CRM automation, and it’s poised to redefine how businesses handle customer interactions.

James Cameron Joins Stability AI’s Board to Guide AI-CGI Integration

In a surprising move, filmmaker James Cameron has joined the board of Stability AI, aiming to bridge the gap between AI and CGI in the film industry. His focus will be on merging traditional CGI techniques with AI advancements to create new possibilities in visual storytelling. Jyunmi said, “Having someone like Cameron on board could accelerate the integration of AI into Hollywood, making CGI and AI indistinguishable in the future.”

Google’s Gemini Models Get Major Updates

Google has updated its Gemini models with improved performance and reduced costs, making them more accessible for developers. The new versions boast a 20% boost in math-related benchmarks and a 50% reduction in cost for input and output under 128,000 tokens. While this might not sound as flashy as some other announcements, these updates make Google’s models more attractive for cost-conscious developers looking for robust API solutions.

New Memory Model 'Letta' Promises Long-Term AI Memory

A new company, Letta, has emerged from stealth, promising to bring long-term memory capabilities to AI models. Unlike current memory implementations, Letta’s technology can be integrated into any model, making it adaptable for various applications. This could be a breakthrough for customer service AI and other long-term interaction scenarios. It’s all about creating a more ‘human-like’ memory for AI.
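
Letta’s actual API is not public enough for us to show, but the general shape of a model-agnostic memory layer is worth sketching: store past exchanges outside the model, retrieve the most relevant ones for each new query, and prepend them to the prompt. Everything below is an illustrative assumption, and a production system would use vector embeddings rather than simple string similarity.

```python
# Generic long-term memory layer (illustrative only, not Letta's API):
# keep exchanges outside the model and inject relevant ones per query,
# so any chat model gains "memory" that survives across sessions.
from difflib import SequenceMatcher


class MemoryStore:
    def __init__(self) -> None:
        self.entries: list[str] = []

    def remember(self, text: str) -> None:
        self.entries.append(text)

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Toy relevance score; real systems would use vector embeddings.
        return sorted(self.entries,
                      key=lambda e: SequenceMatcher(None, query, e).ratio(),
                      reverse=True)[:k]


def build_prompt(memory: MemoryStore, user_message: str) -> str:
    """Prepend remembered context so the model can use it."""
    context = "\n".join(memory.recall(user_message))
    return f"Relevant past interactions:\n{context}\n\nUser: {user_message}"


memory = MemoryStore()
memory.remember("Customer prefers email over phone support.")
memory.remember("Customer's last ticket was about a billing error.")
print(build_prompt(memory, "Follow up on the billing issue"))
```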

Did You Miss A Show Last Week?

Enjoy the replays here on YouTube or take us with you in podcast form on Apple Podcasts or Spotify.

How'd We Do?

Let us know what you think of this newsletter so we can continue to make it even better.
