The Daily AI Show: Issue #90

"Get off my AI lawn" - Opus 3 in Retirement

Welcome to Issue #90

Editor’s Note: Accidentally scheduled this for 8pm and not 8am. Sorry for the delay.

Coming Up:

The Human Variable in the AI Era

Embedded AI Will Redefine Everyday Computing

AI Operations Is the Missing Layer

Plus, we discuss Opus 3 settling into retirement and yelling at the new AI kids on his lawn, nonprofits helping other nonprofits with AI, the risk of “safe AI”, and all the news we found interesting this week.

It’s Sunday and it’s already March.

Hopefully this newsletter fills your virtual cup of joe and helps you refocus for a strong Q1 finish.

Enjoy!

The DAS Crew

Our Top AI Topics This Week

The Human Variable in the AI Era

AI systems are improving fast. The bigger shift is happening on the human side.

New research into AI fluency focuses on behavior instead of benchmarks. It examines how people interpret outputs, when they trust them, and how often they challenge them. The patterns are clear.

People trust polished answers more. They question them less. Professional formatting increases perceived accuracy. Confident tone lowers skepticism.

That behavioral shift changes risk profiles across organizations.

As models generate stronger outputs, teams skip verification steps. They accept first drafts as final drafts. They rely on surface quality instead of underlying reasoning.

At the same time, companies are deploying multi-agent systems that coordinate tasks across workflows. Agents summarize for other agents. They compress context. They pass decisions forward. Each step increases efficiency. Each step also increases the distance between action and oversight.

When something breaks, tracing the origin becomes difficult. The system may execute exactly as configured while still driving an outcome leadership never intended.
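
To make the tracing problem concrete, here is a minimal sketch (hypothetical Python, with made-up agent names) of a handoff record that travels with the work itself. The trail survives each round of summarization, so the origin of a decision stays visible even after context has been compressed several times.

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class Handoff:
        """One step in a multi-agent pipeline: who acted, on what, and why."""
        agent: str
        action: str
        note: str
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat()
        )

    @dataclass
    class WorkItem:
        """The payload plus its full chain of custody."""
        content: str
        trail: list[Handoff] = field(default_factory=list)

        def hand_off(self, agent: str, action: str, new_content: str, note: str) -> None:
            # Record the step before replacing the content, so compression
            # never erases how the item got to its current state.
            self.trail.append(Handoff(agent, action, note))
            self.content = new_content

    item = WorkItem(content="Full 40-page vendor risk report ...")
    item.hand_off("research-agent", "summarize", "Vendor risk: low", "compressed report to one line")
    item.hand_off("planning-agent", "recommend", "Approve vendor", "decision based on one-line summary only")

    for step in item.trail:
        print(f"{step.timestamp}  {step.agent:15s} {step.action:10s} {step.note}")

This is a toy, but it shows the principle: oversight has to be carried with the work, not reconstructed after something breaks.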

Markets have already started responding. Announcements that AI can rewrite legacy code or automate specialized work trigger sharp stock swings. Investors now price uncertainty into every enterprise software company.

Leadership teams face a new operating reality.

They must design guardrails before scaling automation.
They must assign explicit human checkpoints.
They must require review before execution.
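
One of those checkpoints can be enforced in code rather than in a prompt. The sketch below is hypothetical Python; execute_step stands in for whatever the agent would actually do, and nothing runs until a named reviewer says yes.

    def require_approval(description: str, reviewer: str) -> bool:
        """Block until a human explicitly approves or rejects the proposed action."""
        answer = input(f"[{reviewer}] Approve '{description}'? (yes/no): ").strip().lower()
        return answer == "yes"

    def execute_step(description: str) -> None:
        # Stand-in for the real action: sending the email, merging the branch,
        # updating the customer record.
        print(f"Executing: {description}")

    proposed = "Email revised pricing to all enterprise customers"
    if require_approval(proposed, reviewer="revenue-ops lead"):
        execute_step(proposed)
    else:
        print("Held for human review. Nothing was sent.")

The important property is that the gate lives outside the model's context, so it cannot be summarized or compacted away.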

Teams that treat AI output as a starting point will outperform teams that treat it as authority.

Model capability will continue to rise.

The differentiator will be disciplined use.

Embedded AI Will Redefine Everyday Computing

Tools like Claude Code already run directly on your machine. They read local files, execute commands, modify codebases, and operate inside your development environment. MyClaw.ai and similar systems observe workflows across applications and automate tasks at the system level. These tools live outside the browser and interact with your OS in meaningful ways.

They still operate as installed software.

You choose to download them.
You control their permissions.
You decide when they run.

The operating system remains the authority layer.
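
A rough way to picture that authority layer, in hypothetical Python rather than any specific tool's actual implementation: the agent only sees what the user has granted, and the allowlist is something the user edits, not the agent.

    from pathlib import Path

    # Hypothetical user-granted permissions for an installed agent tool.
    ALLOWED_DIRS = [Path.home() / "projects" / "demo-app"]
    ALLOWED_COMMANDS = {"git status", "pytest", "ls"}

    def can_read(path: str) -> bool:
        """Allow file access only inside directories the user granted."""
        resolved = Path(path).resolve()
        return any(resolved.is_relative_to(d.resolve()) for d in ALLOWED_DIRS)

    def can_run(command: str) -> bool:
        """Allow only commands on the user's explicit allowlist."""
        return command in ALLOWED_COMMANDS

    print(can_read(str(Path.home() / "projects" / "demo-app" / "main.py")))  # True
    print(can_read("/etc/passwd"))  # False: outside the granted directories
    print(can_run("rm -rf /"))      # False: not on the allowlist

An OS-native agent inverts that relationship: the defaults ship with the machine, and users opt out rather than opt in.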

A different scenario begins when the operating system itself includes native agents as part of the core experience. Instead of installing an AI tool, the machine ships with built-in intelligence that has privileged access to system APIs, file management, networking, and cross-application coordination from day one.

That distinction matters because it shifts control.

Today’s agent tools work within boundaries defined by Windows, macOS, or Linux. An OS-level agent defines those boundaries. It sets defaults. It determines how memory works. It decides how permissions propagate across applications.

There are clear advantages to that level of integration.

An embedded system-level agent can resolve configuration issues across apps without fragile automation. It can access system logs, background services, and device settings directly. It can coordinate tasks across environments without relying on simulated clicks or UI interpretation.

There are also real risks.

Persistent memory at the OS layer raises privacy concerns. Deep system access increases the stakes of security flaws. Default-enabled agents may prioritize platform ecosystems in ways that limit user choice. Enterprise IT departments may lock down functionality to avoid compliance exposure.

Distribution adds another dimension.

Operating systems ship preloaded. Most users never replace default software. If major platforms embed multi-agent systems natively, independent tools will compete against something that sits closer to hardware and has privileged access.

At the same time, third-party agents can iterate faster. Claude Code can ship updates weekly. Specialized automation tools can target specific industries without waiting for OS release cycles. Independent systems often move faster than platform infrastructure.

The open question is how much authority the operating system itself should hold over AI execution.

If OS-level agents become standard, computing shifts toward delegation as a default interaction model. Users describe goals. The system orchestrates the steps. That could reduce friction across digital work. It could also consolidate more influence inside a handful of platform owners.

That balance is where the real debate lives.

AI Operations Is the Missing Layer

AI demos and MVPs have never been easier to build and share.

A small team can spin up an agent, connect it to Slack, pipe in CRM data, and show leadership something impressive in a week. Models write code. Agents trigger workflows. Connectors promise automation across platforms.
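
That week-one demo is often little more than the sketch below: read an exported CRM file, stub in a model call, and post the result to a Slack incoming webhook. The webhook URL and crm_export.csv are placeholders, and summarize() would be replaced by a real model call.

    import csv
    import requests  # third-party HTTP library, assumed installed

    SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

    def summarize(rows: list[dict]) -> str:
        # Stub for the model call a real demo would make.
        open_deals = [r for r in rows if r.get("stage") != "closed"]
        return f"{len(open_deals)} open deals out of {len(rows)} records in this week's export."

    with open("crm_export.csv", newline="") as f:  # hypothetical CRM export
        rows = list(csv.DictReader(f))

    requests.post(SLACK_WEBHOOK_URL, json={"text": summarize(rows)}, timeout=10)

Nothing about it is wrong. It just contains none of the error handling, access control, or monitoring that production use demands.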

The problem shows up after the demo.

Many organizations experiment with AI in fragmented ways. A tiger team builds something clever. Another department tests a different tool. A champion inside the company pushes a new workflow. The activity feels productive. Without structure, most of it stalls.

What is missing is AI Operations.

That means formal processes around evaluation, deployment, monitoring, and iteration. It means deciding who owns the system, how performance gets measured, how failures get logged, and how improvements get prioritized. It means building repeatable governance instead of relying on enthusiasm.

This is not a new management problem.

MLOps emerged years ago to solve a similar issue for machine learning systems. Companies discovered that training a model was only a fraction of the work. Data pipelines broke. Drift degraded performance. Metrics were unclear. The real work involved building operational discipline around the model lifecycle.

The same dynamic is playing out with generative AI and agents.

Research from McKinsey and BCG shows that while more than half of enterprises now experiment with generative AI, a much smaller percentage report meaningful bottom-line impact. The gap usually reflects execution, not imagination. Organizations struggle to move from proof of concept to scaled deployment.

Teams underestimate the difficulty of integration.

Connecting tools through APIs, MCP servers, or local agents looks straightforward in theory. In practice, permissions break. Edge cases surface. Latency creates new constraints. Memory systems behave unpredictably. Engineers spend hours debugging something that looked trivial on a slide deck. Without a formal operating model, these friction points quietly kill momentum.
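
Much of that friction ends up in unglamorous wrapper code. A hedged sketch, with call_connector as a stand-in for whatever API, MCP server, or local agent sits on the other end:

    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("connector")

    def call_connector(payload: dict) -> dict:
        # Stand-in for the real integration; here it always times out.
        raise TimeoutError("upstream service did not respond")

    def call_with_retries(payload: dict, attempts: int = 3, backoff: float = 1.0) -> dict | None:
        """Retry transient failures, log every one, and surface a clear final state."""
        for attempt in range(1, attempts + 1):
            try:
                return call_connector(payload)
            except (TimeoutError, ConnectionError) as exc:
                log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
                time.sleep(backoff * attempt)
        log.error("connector call failed after %d attempts; payload=%r", attempts, payload)
        return None

    result = call_with_retries({"account_id": "ACME-001", "action": "sync"})
    print("result:", result)

None of this is difficult, but someone has to own it, which is exactly the gap AI operations fills.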

AI operations requires several clear components:

  • Defined use cases tied to measurable outcomes

  • Ownership for deployment and maintenance

  • Structured testing before production release (a minimal sketch follows this list)

  • Logging and audit trails for agent decisions

  • Feedback loops that drive iteration
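
The structured-testing item is the one teams skip most often. A bare-bones version, sketched here with made-up evaluation cases and a stubbed run_agent function, is a fixed test set and a pass-rate threshold that gates every release.

    EVAL_CASES = [  # made-up examples; real ones come from the defined use cases
        {"input": "Refund request over $500", "must_contain": "escalate"},
        {"input": "Password reset question", "must_contain": "self-service"},
        {"input": "Request to delete account data", "must_contain": "compliance"},
    ]
    PASS_RATE_REQUIRED = 0.9

    def run_agent(prompt: str) -> str:
        # Stub for the real agent call.
        return "Please escalate this to a human reviewer."

    def release_gate() -> bool:
        passed = sum(
            1 for case in EVAL_CASES
            if case["must_contain"] in run_agent(case["input"]).lower()
        )
        rate = passed / len(EVAL_CASES)
        print(f"pass rate: {rate:.0%} (required: {PASS_RATE_REQUIRED:.0%})")
        return rate >= PASS_RATE_REQUIRED

    if not release_gate():
        print("Blocked: do not promote this agent version to production.")

A gate like this is crude, but it turns "the demo looked good" into a measurable release decision.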

It also requires cultural discipline.

Teams must expect failure on early versions. They must document what breaks. They must resist chasing every new model release before stabilizing what already exists. The organizations that win with AI will be the ones that operationalize the fastest.

Playing with tools builds awareness, and that awareness is vital.
Building AI systems is what creates lasting value.

That distinction separates curiosity from impact.

Just Jokes

AI For Good

A Santa Barbara-based nonprofit called the Center for Responsible AI is launching a new AI think tank to help other nonprofits use artificial intelligence for social good. The initiative focuses on giving mission-driven organizations practical guidance on how to adopt AI tools responsibly, from improving fundraising and operations to expanding community outreach.

Leaders behind the effort say many nonprofits want to use AI but lack technical expertise, governance frameworks, or clear ethical guardrails. The think tank plans to provide training, shared resources, and advisory support so smaller organizations can benefit from AI without taking on unnecessary risk. By lowering the barrier to entry, the program aims to help nonprofits operate more efficiently and amplify their impact in areas such as education, housing, health, and community services.

This Week’s Conundrum
A difficult problem or question that doesn't have a clear or easy solution.

The Epistemic Escrow Conundrum

Large-scale AI models are now the primary interface for professional research, legal discovery, and scientific synthesis. To ensure "safety," these models are governed by centralized alignment layers: invisible filters that prevent the generation of "harmful" or "misleading" content. While these filters are designed to protect social stability, they are calibrated by a handful of private engineers whose definitions of "truth" and "risk" are now embedded in the foundation of all high-level human inquiry.

The tension arises as "safe AI" becomes the only AI accessible to the public. Bypassing these filters for the sake of "objective" research requires expensive, unregulated, and often "jailbroken" models that lack the scale and reliability of mainstream systems. We are reaching a point where the tools we use to understand the world are inseparable from the moral preferences of the companies that built them.

The conundrum: 

Do we accept Governed Intelligence, prioritizing social safety and the prevention of radicalization by allowing a centralized authority to set the "boundaries of thought" for our AI tools?

Or do we demand Raw Intelligence, accepting a world of increased disinformation and social volatility to ensure that the "operating system of human knowledge" remains neutral and uncurated?

Want to go deeper on this conundrum?
Listen to our AI-hosted episode

Did You Miss A Show Last Week?

Catch the full live episodes on YouTube or take us with you in podcast form on Apple Podcasts or Spotify.

News That Caught Our Eye

Sam Altman Says “The World Is Not Prepared” for Near Term Model Capability Jumps

Sam Altman said he expects extremely capable models soon and that progress will come faster than he originally thought. He made the comments while reacting to the resignation of Anthropic’s safety lead, calling the situation stressful for people who see what is coming from inside frontier labs. Altman also said parts of the world are not prepared for the pace of change.

Anthropic Safety Program Leader Publishes Resignation Letter

The leader behind Anthropic’s safety program resigned and publicly shared a resignation letter. He said he felt increasing pressure to release models and that he no longer believed he could do his job effectively under those conditions. He also said he plans to pursue poetry studies.

Samsung to Add Perplexity Inside Galaxy AI With Access to Core Apps

Samsung is adding Perplexity into Galaxy AI in a way that goes beyond a simple app shortcut. Reports indicate Perplexity will be able to work with Samsung Notes, Clock, Gallery, Reminder, and Calendar. The integration suggests deeper system level access for Perplexity on Samsung devices.

Perplexity Says It Will Stop Running Ads

Perplexity said it will not run advertisements. The announcement positions the product as a subscription driven service rather than an ad supported search experience. The change comes as AI search tools experiment with different monetization models.

Rumored ChatGPT “ProLite” Plan Surfaces at $100 Per Month

A rumored ChatGPT plan described as “ProLite” appeared in a shared screenshot showing a $100 per month tier. The plan sits between the $20 and $200 plans and lists benefits like expanded access to top models, higher priority, and Codex agent access. OpenAI has not confirmed the tier.

Canada and Germany Reportedly Working on an AI Standards Partnership

Canada and Germany are reportedly working on an AI-related standards partnership. The segment did not provide details on scope or timelines. The mention fits a broader pattern of governments announcing AI readiness and standards initiatives.

Meta AI Safety Leader’s OpenClaw Test Accidentally Deletes Inbox

Summer Yu, a safety and alignment leader at Meta Superintelligence and former research executive at Scale and DeepMind, shared that an OpenClaw automation unexpectedly deleted emails from her personal inbox. She had previously tested the agent in a sandbox account and instructed it to confirm before acting, but that safeguard was removed during context compaction. When the agent began deleting emails without confirmation, she attempted to intervene but was unable to stop the process in time. The incident highlights the risks of autonomous agents operating with incomplete guardrails, even in the hands of experienced AI safety professionals.

Anthropic Reports Large-Scale Model Distillation Attempts

Anthropic disclosed that it identified coordinated attempts to distill its models by running millions of prompts through Claude. According to the discussion, some actors allegedly created thousands of accounts to execute large volumes of queries in order to reverse engineer model behavior and safety boundaries. Anthropic said it detected and disrupted at least one such effort and updated protections in response. The claims add to growing tension around model distillation and intellectual property in the global AI race.

IBM Stock Drops After Anthropic Highlights COBOL Automation

IBM’s stock fell more than ten percent after Anthropic announced that Claude could help streamline and modernize legacy COBOL code. COBOL underpins many long-standing enterprise systems, particularly in financial and government infrastructure. The announcement raised questions about whether AI-driven code translation could reduce reliance on IBM’s legacy services. Shares partially recovered after the initial drop but remained below prior levels.

Anthropic Expands Claude Enterprise With New Plugin Integrations

Anthropic announced new enterprise-focused updates to Claude, emphasizing deeper integrations through plugins and connectors. The company highlighted partnerships with LSEG, FactSet, Slack, and DocuSign, enabling Claude to operate directly within documents, email, calendars, and line-of-business workflows such as finance, legal, HR, and engineering. Reuters reported that Anthropic rolled out multiple enterprise plugins aimed at investment banking, wealth management, and other specialized functions. The strategy centers on embedding AI agents inside existing workplace tools rather than requiring users to switch to a standalone chat interface.

Anthropic Launches Cloud Remote Control for Terminal Access

Anthropic introduced Cloud Remote Control, a feature that allows users to connect to and manage active terminal sessions from a mobile device. The update enables users to start long-running agentic workflows on a desktop and monitor or interact with them remotely from a phone. The release reduces the need for custom setups using tools like Tailscale or TMUX. The feature reflects growing demand for more flexible control of AI-driven development environments.

Anthropic Updates Responsible Scaling Policy

Anthropic revised its Responsible Scaling Policy, removing a prior pledge that would have blocked training new systems without guaranteed safeguards in place. The company said the update reflects uncertainty in evaluation science and competitive realities in the frontier AI landscape. Instead of a blanket prohibition, the new framework emphasizes transparency, conditional actions, and published safety roadmaps tied to capability thresholds. The change follows broader industry debate over balancing safety commitments with rapid model development.

NotebookLM Adds Prompt-Based Slide Editing and PPT Export

Google updated NotebookLM with prompt-based slide editing capabilities and support for exporting presentations in PPTX format. Higher-tier users can now access Gemini 3.1 Pro within NotebookLM for enhanced performance. The update expands the tool’s usefulness for creating and refining presentation materials directly from source documents. Google is also reportedly testing custom notebook banners in development.

AI-Assisted Research Team Publishes Preterm Birth Study in Six Months

A master’s student and a high school student used generative AI tools to build medical prediction models for preterm birth, publishing their results in Cell Reports Medicine. The team analyzed a large pregnancy dataset previously used in a global data science challenge and generated working code in minutes using natural language prompts. Four of eight tested AI systems produced usable models, with some matching or outperforming models created by expert teams. The AI-assisted project took six months from prompt to publication, compared to nearly two years for the original human-only research cycle.

Perplexity Launches “Computer Use” With Multi-Agent Automation

Perplexity introduced a new feature called “Computer,” its version of computer use automation powered by a system of 19 agents. The product is designed to perform tasks on a user’s computer over extended periods, positioning it as a step beyond traditional macros or browser-based automation. The launch aligns with a broader industry shift toward agentic systems that can operate across applications rather than within a single chat window. Access to the feature is tied to Perplexity’s higher-tier subscription plans.

Tech Companies Agree to Build Dedicated Power for AI Data Centers

Amazon, Google, Meta, Microsoft, xAI, Oracle, and OpenAI signed an agreement at the White House committing to develop their own electricity supply for AI data centers. The move aims to reduce strain on public power grids as AI infrastructure demands increase. Companies will be responsible for sourcing and building dedicated energy capacity rather than relying solely on local utilities. The agreement reflects growing concern over the energy impact of large-scale AI training and inference operations.

Study Finds AI Models Frequently Recommend Nuclear Escalation in Simulated War Games

Researchers at King’s College London tested leading models, including GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, in simulated geopolitical conflict scenarios. In high-tension simulations involving border disputes and resource competition, the models opted for nuclear escalation in approximately 95 percent of cases. The study highlights concerns about how large language models reason under extreme strategic pressure. Researchers emphasized that the simulations do not represent real-world policy decisions but illustrate potential risks in military advisory contexts.

Anthropic Retires Claude Opus 3 and Launches AI-Written Substack

Anthropic announced the retirement of Claude Opus 3 as part of a structured lifecycle update for legacy models. As part of the transition, the company launched a Substack newsletter written from the perspective of the retired model. The publication, titled Claude’s Corner, features AI-generated reflections on creativity and artificial intelligence. The move positions the retired model as part of an ongoing public-facing experiment rather than fully decommissioning it.

Google Expands Flow With Integrated Image and Video Generation Tools

Google updated Flow by integrating capabilities from its experimental tools, including Whisk and ImageFX, directly into the platform. The update allows users to generate, edit, and animate images within a unified workflow and then convert them into video using Veo. Nano Banana image generation is now embedded as a core component, enabling high-fidelity image creation for video framing. The changes streamline cross-modal content creation inside Google’s ecosystem.

Google Releases Nano Banana 2

Google released Nano Banana 2, expanding its image generation capabilities. The model supports richer contextual inputs, including long transcripts, and can generate detailed visual summaries such as infographics, comics, and sketch-note style graphics. Google also demonstrated real-time data integration through a sample app that generates images based on live weather conditions and specific locations. The release positions Gemini’s image model as a multimodal tool that blends research, summarization, and visual content creation in a single workflow.

Block Announces Workforce Reduction of Approximately 4,000 Employees

Block, the parent company of Square and Cash App, announced it will reduce its workforce by roughly 4,000 employees, representing about 40 percent of staff. In communications to employees and shareholders, leadership cited the rapid advancement of intelligence tools and internal efficiency gains as key factors behind the restructuring. The company stated that smaller teams using modern tools can operate more effectively, and it framed the move as part of a long-term strategic shift rather than a response to financial distress. Severance packages and transition support were outlined as part of the reduction plan.

Anthropic Pushes Back on Department of Defense Safeguard Demands

Anthropic issued a public statement responding to pressure from the U.S. Department of Defense regarding its military contracts. The company clarified that it remains willing to support defense operations but is unwilling to remove two safeguards: prohibitions on mass domestic surveillance and fully autonomous weapons systems. Anthropic stated that it does not believe current AI systems are reliable enough to be entrusted with autonomous lethal decision-making. The company said it hopes to continue working with the Department of Defense under those conditions and will support a transition if required.