Aakash Gupta Product Growth

By Kevin O'Donnell

May 26, 2026

About this collection

A collection of 200+ sources from Aakash Gupta's Product Growth podcast and newsletter, covering the shift to AI-native product management. It features interviews with product leaders from companies like Webflow, Braintrust, Arize, and others on how AI is changing how products get built, evaluated, and scaled. Topics include the move from MVPs to Minimal Viable Outputs, how vibe coding tools are letting PMs prototype without engineering bottlenecks, why evaluation pipelines are replacing PRDs as the core product artifact, the Model Context Protocol and what it means for agentic workflows, and how fractional and solo operators are using AI to punch above their weight.

Try asking:

  • "How should a PM evaluate whether an AI feature is actually working?"
  • "What's replacing the traditional PRD in AI-native product teams?"
  • "How are solo founders using AI agents to scale without hiring?"
  • "What is MCP and why does it matter for product managers?"

Curated Sources

How to Develop Your Product Strategy, with Satyajeet Salgar | Director of Product, Google

Product strategy centers on defining the specific game a team intends to win and establishing how to keep score. Many strategy documents fail because they are either just roadmaps or vague mission statements. A high-quality strategy requires a sharp diagnosis of the current situation and a pointed plan of action. It is a point-in-time document that must evolve as the product or market changes. If a strategy hasn't changed in years, the team likely isn't adjusting fast enough. Foundational frameworks like Richard Rumelt's 'Good Strategy, Bad Strategy' help in identifying specific, non-fuzzy goals. At the leadership level, the PM role shifts from individual execution to organizational unblocking. This involves building credibility across teams to ensure the best decisions are made, even if the PM isn't the one making them. The 70-20-10 rule for resource allocation provides a useful guide: 70% of resources go to the core business, 20% to the next big thing, and 10% to high-risk, exponential-return bets. A 100% success rate in product launches is actually a 'bug' because it suggests the team is becoming risk-averse and not swinging for the fences. Manufacturing 'apple pie' principles on the fly is a common mistake. Principles should be learned behaviors that help people make decisions when the leader isn't in the room, such as YouTube's focus on watch time or creator economy growth. Experimentation serves as a tool to answer hard product questions cheaply. Instead of relying solely on intuition, PMs should find the fastest ways to validate or invalidate their beliefs. Looking forward, AI is expected to empower PMs by automating tasks like data analysis, mockups, and prototyping. This shift may lead to a merging of roles, where PMs at smaller companies handle design and engineering tasks more directly, returning to the 'do-it-all' nature of the early PM role. 
Tools like Claude are already becoming defaults for product thinkers to increase their efficiency and creative output.
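
The 70-20-10 rule mentioned above is simple arithmetic, and a minimal Python sketch makes the proportions concrete. The function name, the bucket labels, and the example budget of 40 engineers are illustrative assumptions, not figures from the interview.

```python
from typing import Dict


def allocate_70_20_10(total: float) -> Dict[str, float]:
    """Split a resource budget using the 70-20-10 rule."""
    # Shares come from the rule itself; the bucket names are hypothetical.
    shares = {"core": 0.70, "next_big_thing": 0.20, "high_risk_bets": 0.10}
    return {name: round(total * share, 2) for name, share in shares.items()}


# Example: a team of 40 engineers (an assumed headcount).
print(allocate_70_20_10(40))
```

Under these assumptions, a 40-person team would put 28 people on the core business, 8 on the next big thing, and 4 on high-risk bets.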

Key Takeaways

  • Strategy is a dynamic diagnosis rather than a permanent plan. If a strategy remains static for several years, it usually means the team is failing to acknowledge new market realities or product history.
  • A perfect track record indicates a lack of ambition. Salgar suggests that teams should intentionally take low-confidence, high-reward bets, as avoiding failure entirely means missing out on exponential growth opportunities.
  • Senior leadership is about creating a culture of product. As PMs move toward director roles, their value shifts from making specific product calls to unblocking the organization and ensuring cross-functional alignment.
  • AI will likely trigger a collapse of specialized roles. By lowering the barrier to technical tasks like prototyping and analysis, AI tools will allow PMs to become more integrated 'builders' who handle multiple aspects of product development.

The Ultimate Guide to Positioning

Anthony Pierri, co-founder of Fletch PMM, breaks down the mechanics of effective SaaS positioning and homepage design. He defines positioning as the mental frame of reference that helps customers understand a product's value, timing, and utility. Pierri identifies two main ways to anchor a product: category-based positioning, where you compete in an established market like Figma in design tools, and use case-based positioning, where you own a specific activity like Calendly for scheduling. He introduces the Positioning Triangle, which consists of the target segment, competitive alternatives, and unique differentiation. A major theme is the danger of messaging fifth-order effects. Pierri argues that while a sales rep can build a complex business case for how a tool eventually increases revenue, a website homepage cannot. When companies lead with vague outcomes like driving business transformation, they become undifferentiated and lose credibility. Instead, they should focus on first-order benefits, which are the immediate, tangible results of using the software. Regarding homepage strategy, Pierri views the site as a storefront rather than a Wikipedia page. Its job is to entice visitors to take the next step, not to document every feature. He notes that users scan rather than read, so headlines must lead with differentiated product capabilities. He also discusses the degradation of clarity that happens as companies grow, often caused by internal stakeholders fighting for homepage real estate or hiring brand agencies that prioritize consensus over strategy. For early-stage startups, he recommends a bootstrap approach: focusing on one ICP and dominating a single segment before expanding, much like Amazon started solely with books. Finally, he emphasizes that product management and product marketing are two sides of the same coin and should be aligned from the earliest stages of feature planning.

Key Takeaways

  • Clarity comes from specificity. If your headlines only mention saving time or making money without explaining how the product works, you are invisible to the customer.
  • The Bootstrap Mindset is a strategic advantage. VC-backed startups often fail by trying to hit ten industries at once; focusing on the next five customers in one segment is more effective.
  • Pricing is a positioning lever. Startups often overprice their software compared to Microsoft or OpenAI anchors, creating a credibility gap that marketing cannot fix.
  • Homepages are for scanning. A successful page allows a user to understand the product's core value just by reading the headlines while scrolling quickly.
  • Product marketing and product strategy are interdependent. Positioning choices should influence the product roadmap, and new features should shift how the product is positioned in the market.

If You Don’t Understand AI Evals, Don’t Build AI

Ankur Goyal, CEO of Braintrust, argues that evaluation frameworks are the most critical part of building AI products, effectively replacing traditional Product Requirement Documents (PRDs). In this new paradigm, an eval is a quantifiable tool that the entire team can run to measure product quality. The core framework for this is Data-Task-Scores. Data consists of a set of inputs, the Task generates an output from the model, and Scores normalize the result between 0 and 1. This normalization is essential for comparing performance across different models and prompt versions over time. A major theme is the distance principle, which states that the further a team is from their end user, the more they need formal evals. While engineers building for other engineers can sometimes rely on subjective vibe checks, teams in specialized fields like healthcare or complex B2B SaaS must use rigorous scoring. Goyal notes that while prompts are ephemeral and change with every model update, evals represent a long-term investment in understanding user needs that survives model swaps. The discussion also covers practical implementation, such as using Linear's MCP server to build evals from scratch. Goyal suggests starting with imperfect, auto-generated data to iterate quickly rather than waiting for a perfect golden dataset. He also emphasizes the importance of having evals that fail, as these provide a roadmap for future development. For production-ready AI, teams should build an offline-to-online flywheel where the same scoring functions are applied to production logs, creating a continuous loop of improvement based on real-world usage.
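
The Data-Task-Scores structure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the Braintrust SDK: the dataset, the canned `task` function (a stand-in for a real model call), and the `exact_match` scorer are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Dict, List

# Data: inputs paired with expected outputs (hypothetical examples).
DATA: List[Dict[str, str]] = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3 * 3", "expected": "9"},
]


def task(prompt: str) -> str:
    """Task: stand-in for a model call. A real task would query an LLM."""
    canned = {"2 + 2": "4", "capital of France": "paris", "3 * 3": "6"}
    return canned.get(prompt, "")


def exact_match(output: str, expected: str) -> float:
    """Score: 1.0 on a case-insensitive match, otherwise 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    data: List[Dict[str, str]],
    task_fn: Callable[[str], str],
    score_fn: Callable[[str, str], float],
) -> float:
    """Run the task on every row and return the mean score in [0, 1]."""
    scores = [score_fn(task_fn(row["input"]), row["expected"]) for row in data]
    return mean(scores)


print(f"eval score: {run_eval(DATA, task, exact_match):.2f}")
```

Because the score is normalized to the range 0 to 1, swapping `task` for a different model or prompt version yields a number that can be compared directly against earlier runs. The deliberately failing third row reflects the point about keeping evals that fail.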

Key Takeaways

  • Evals act as a living, quantifiable PRD that allows teams to move beyond subjective assessments to a standardized measurement of AI product quality.
  • The Data-Task-Scores framework provides a necessary normalization for AI performance, ensuring that product benchmarks remain valid even as underlying LLMs evolve.
  • The distance principle highlights that domain-specific AI requires more robust evaluation because the developer's intuition often fails to align with the specialized needs of the end user.
  • Building an offline-to-online flywheel is the most effective way to bridge the gap between development and production, using real-world logs to refine evaluation datasets.

The OpenClaw Guide no PM is Talking About (Masterclass for AI PMs)

Naman Pandey provides a masterclass on OpenClaw, an open-source AI agent with 245,000 GitHub stars and 2 million weekly visitors. This agent is designed to run as a continuous daemon, distinguishing it from reactive chatbots like ChatGPT because it maintains persistent memory and executes autonomous tasks via scheduled cron jobs. The installation process is streamlined into three terminal commands: NPM install, openclaw onboard, and hatch. Successful installation is indicated by the absence of red text in the terminal, though yellow warnings can be safely ignored. A critical component of its utility is the Slack integration, which requires a specific Reinstall to Workspace step in the Slack API console to ensure permission changes take effect. The platform enables several high-value product management automations. It can act as a team knowledge base by indexing documents in a local workspace folder, allowing team members to query PRDs and FAQs directly via Slack. For workflow efficiency, OpenClaw can automate morning stand-ups by scanning channels for updates and blockers. It also serves as a competitive intelligence tool, monitoring external websites and reviews to generate SWOT analyses. Furthermore, it can synthesize Voice of the Customer reports from multiple sources like Reddit and Google Reviews. A more advanced use case involves smart bug routing, where the agent identifies the customer tier from a CSV and prioritizes enterprise issues for engineering while routing lower-tier bugs to design. Deployment options range from local hosting to dedicated hardware like a Mac Mini, with a strong emphasis on running regular automated security audits to monitor file access and firewall status. The system distinguishes between skills and tools, providing a framework for expanding the agent's capabilities. While a VPS offers 24/7 uptime, local deployment on a dedicated Mac Mini is often the most recommended balance between reliability and having a physical kill switch. 
The guide also compares OpenClaw to alternatives like Claude Cowork, highlighting its unique position as a persistent, autonomous assistant rather than a standard chat interface.
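
The smart bug routing use case can be illustrated with a short Python sketch. It does not use any OpenClaw API; the CSV column names, the tier labels, and the choice to treat unknown customers as lower tier are assumptions made for illustration.

```python
import csv
import io
from typing import Dict

# Hypothetical customer-tier export; the column names are assumptions.
TIERS_CSV = "customer,tier\nAcme,enterprise\nBolt,starter\n"


def load_tiers(csv_text: str) -> Dict[str, str]:
    """Parse the CSV into a customer -> tier lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["customer"]: row["tier"] for row in reader}


def route_bug(customer: str, tiers: Dict[str, str]) -> Dict[str, str]:
    """Send enterprise bugs to engineering with high priority, others to design."""
    tier = tiers.get(customer, "unknown")  # unknown customers count as lower tier
    if tier == "enterprise":
        return {"team": "engineering", "priority": "high"}
    return {"team": "design", "priority": "normal"}


tiers = load_tiers(TIERS_CSV)
print(route_bug("Acme", tiers))
print(route_bug("Bolt", tiers))
```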

Key Takeaways

  • OpenClaw shifts the AI paradigm from reactive chat to proactive daemon execution. This allows for autonomous background tasks like 3 a.m. data scraping or report generation.
  • The Knowledge Base functionality turns static documentation into an interactive Slack-based resource. You simply drop PRDs or FAQs into a local directory for the bot to index.
  • Automating PM rituals like stand-ups and bug routing changes the job. PMs move from manual data aggregation to high-level decision-making based on AI-synthesized insights.
  • Deployment choices are central to the agentic workflow. A dedicated Mac Mini is often the best balance between 24/7 uptime and having a physical kill switch for safety.

Automate Your Entire Work Life With Claude Code — No Coding Needed

Dave Killeen, Field CPO at Pendo, demonstrates DEX, a personal operating system built within Claude Code. This system centralizes a professional's entire workday into a single terminal interface. By leveraging the Model Context Protocol (MCP), DEX connects to diverse data sources like CRMs, calendars, LinkedIn, and over 100 newsletters. The workflow replaces traditional morning routines with a single command that synthesizes account intelligence and daily priorities. A core component is the use of living markdown files that act as long-term memory for projects and people. These files accumulate context from every interaction, ensuring the AI gets smarter over time. The system also utilizes session start hooks to inject quarterly goals and personal preferences into every new chat, preventing the AI from starting with a blank slate. Beyond daily tasks, the setup includes a career MCP server for tracking promotion readiness and an accountability feature where the AI provides harsh truths based on system audits. This shift moves the user's focus from manual execution to high-level strategic taste and goal setting.
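
The living markdown files and session-start injection can be sketched generically in Python. This is not DEX's implementation or Claude Code's hook configuration; the file layout and the function names `append_note` and `build_session_context` are hypothetical.

```python
from datetime import date
from pathlib import Path


def append_note(memory_file: Path, note: str, day: date) -> None:
    """Append a dated bullet so the file accumulates context over time."""
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- {day.isoformat()}: {note}\n")


def build_session_context(memory_dir: Path) -> str:
    """Concatenate every markdown memory file into one preamble.

    A session-start step could prepend this text to a new chat so the
    assistant does not begin from a blank slate.
    """
    parts = []
    for path in sorted(memory_dir.glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        parts.append(f"## {path.stem}\n{body}")
    return "\n\n".join(parts)
```

A typical use would call `append_note` after each meeting and `build_session_context` once at the start of a session, so the memory grows between sessions without manual re-briefing.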

Key Takeaways

  • Context compounding is the most valuable feature of a personal AI OS. By using session start hooks and living markdown files, the system avoids the repetitive onboarding phase of typical LLM chats, making the AI a true long-term partner.
  • MCP servers represent a shift from probabilistic AI to deterministic tool use. Instead of hoping the AI understands an API, MCP provides structured guardrails that ensure consistent data retrieval from professional tools like CRMs and calendars.
  • The bottleneck in AI-driven product work has shifted from technical execution to strategic taste. When an AI can build a mobile app in under an hour, the competitive advantage lies in knowing exactly what to build and setting precise, high-quality goals.

Everyone is building in Claude Code. No one is running evals.

The current landscape of AI development focuses heavily on building, but the real competitive advantage lies in running evaluations and closing the self-improvement loop. Using Claude Code, product managers can now build, instrument, and evaluate agents within a single session. The core framework for this process is build, trace, eval, and improve. Tracing acts as the step-by-step playback of an agent's actions, providing the evidence base needed for meaningful evaluation. Without visibility into every LLM call and tool interaction, developers cannot accurately diagnose where an agent fails. A critical shift is occurring in the role of the product manager. As code becomes cheaper to produce, product taste becomes the primary differentiator. In AI-native teams, the distinction between a PM and an engineer is disappearing. PMs are expected to be triple threats who can identify customer pain, prototype a solution in Claude Code, and manage the observability layer. This requires a move away from manual task management toward building internal agents that automate repetitive workflows, such as scoring GitHub issues or synthesizing customer feedback from Gong and Slack. Instrumentation has evolved into a one-command task. Tools like Arize Phoenix allow Claude Code to automatically identify LLM calls and wire them to a tracing layer without heavy engineering support. Once traces are flowing, developers can use vibe evals as a starting point. While these initial evaluations are often suggested by the LLM itself, they must be refined through human judgment to align with specific business logic or product policies. The ultimate goal is a self-improvement loop where the system identifies failure categories, proposes prompt fixes, and ships updates after human approval. For enterprises, the major hurdle is data silos. Success depends on building a context graph that gives agents access to unified data across CRM, analytics, and communication channels.

Key Takeaways

  • Tracing must precede evaluation because it provides the granular evidence needed to understand agent behavior. You cannot evaluate what you cannot see, so every intermediate output and tool call must be visible before writing an eval.
  • The unit of evaluation is the span, not the entire trace. Breaking down a complex agent interaction into discrete steps allows for more precise debugging, such as checking if a specific scoring step was accurate rather than judging the entire run as good or bad.
  • Self-improvement loops are the next frontier for competitive AI teams. By grouping eval failures into categories and using agents to propose prompt improvements, teams can move from manual debugging to an automated refinement process that only requires human sign-off.
  • Context graphs are the primary unlock for enterprise AI. Agents are only as effective as the data they can access, and the teams winning in the enterprise space are those unifying silos like GitHub, Slack, and CRM data into a single context layer.
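
The span-level evaluation and failure-grouping ideas above can be sketched in Python as follows. This is a toy model, not the Arize Phoenix API: the `Span` class, the example trace, and the assumed 1 to 10 priority range are all illustrative.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    name: str    # step name, e.g. "score_issue"
    output: str  # what the step produced


# Hypothetical trace of an agent that scores GitHub issues.
TRACE = [
    Span("fetch_issue", "Login button unresponsive on iOS"),
    Span("score_issue", "priority=11"),
    Span("draft_reply", ""),
]


def check_span(span: Span) -> Optional[str]:
    """Return a failure category for one span, or None if it passes."""
    if not span.output:
        return "empty_output"
    if span.name == "score_issue":
        value = int(span.output.split("=")[1])
        if not 1 <= value <= 10:  # assumed valid range for the example
            return "score_out_of_range"
    return None


def failure_categories(trace: List[Span]) -> Counter:
    """Evaluate each span separately and group failures by category."""
    return Counter(c for c in map(check_span, trace) if c is not None)


print(failure_categories(TRACE))
```

Judging each span on its own shows that fetching succeeded while scoring and drafting failed, which a single pass/fail verdict on the whole trace would hide. The category counts are the kind of input a self-improvement loop could use when proposing prompt fixes for human approval.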

The Claude Code Setup for Non-Technical PMs That Nobody Shows You

Product managers are moving away from administrative tasks in Jira and Linear to become active builders using a modern AI stack. Andre Albuquerque outlines a four-level framework designed to help non-technical PMs start shipping code. Level one starts with using Lovable for personal projects. This builds confidence in a safe environment where mistakes won't break production systems. Level two introduces a bridge between Lovable and Claude Code by connecting both tools to the same GitHub repository. This setup allows PMs to use Claude Code for complex logic while using Lovable for visual QA and simple deployments. Level three moves into production-grade development using Cursor and Vercel. Cursor provides a powerful IDE with a free debugging agent to help users when they get stuck, while Vercel handles hosting and provides preview URLs for different branches. Level four focuses on advanced agentic engineering using CLAUDE.md files. This memory file acts as a culture document for the AI, setting persistent rules and defining specific roles. A central concept is the PM orchestrator agent pattern. In this model, the orchestrator does not write code. Instead, it decides which specialized agent should handle research, design, or implementation. The strategy emphasizes fixing the agent's instructions rather than just patching the code. This ensures the AI inherits the fix for all future sessions. To get started, PMs are encouraged to get collaborator access to a low-risk repository and ship a fix for a backlog ticket by the end of their first week.
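
The PM orchestrator pattern can be illustrated with a small Python sketch in which the orchestrator only routes work and never writes code itself. The agent functions and the keyword-based routing are hypothetical stand-ins; a real setup would presumably let a model make the routing decision.

```python
from typing import Callable, Dict, List, Tuple


def research_agent(task: str) -> str:
    return f"[research] notes for: {task}"


def design_agent(task: str) -> str:
    return f"[design] mockup for: {task}"


def implementation_agent(task: str) -> str:
    return f"[implementation] patch for: {task}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "research": research_agent,
    "design": design_agent,
    "implementation": implementation_agent,
}

# Keyword routing is an assumption standing in for a model-based decision.
ROUTES: List[Tuple[str, Tuple[str, ...]]] = [
    ("research", ("investigate", "compare", "interview")),
    ("design", ("mockup", "layout", "flow")),
]


def orchestrate(task: str) -> str:
    """Pick a specialized agent for the task; the orchestrator does no work itself."""
    lowered = task.lower()
    for agent_name, keywords in ROUTES:
        if any(word in lowered for word in keywords):
            return AGENTS[agent_name](task)
    return AGENTS["implementation"](task)  # default when nothing else matches


print(orchestrate("Investigate churn drivers"))
print(orchestrate("Fix the null check in billing"))
```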

Key Takeaways

  • The transition to a builder PM requires shifting focus from fixing features to fixing agent instructions so the AI learns from every mistake.
  • Connecting Lovable and Claude Code to a single GitHub repo creates a unique workflow where PMs can handle deep logic and visual QA simultaneously.
  • The CLAUDE.md file functions as a persistent memory layer that prevents the AI from repeating errors and encodes specific team methodologies.
  • Successful AI-native teams spend significant time orchestrating specialized agents rather than manually writing or correcting individual lines of code.

Complete Course: Claude for PMs (Cowork + Code + Dispatch)

Pawel Huryn demonstrates a sophisticated AI operating system for product managers using Anthropic's latest tools. The core shift involves moving away from standard Claude Chat toward agentic environments like Cowork, Claude Code, and Dispatch. Cowork provides direct file access and connects to external apps like Gmail and Slack via Model Context Protocol (MCP) servers. A central concept is the Skills marketplace, where PMs can install baseline prompts and iterate on them five to six times until the agent achieves 99% accuracy. This system uses progressive disclosure, where the agent only loads full instructions once a task matches a specific skill name, keeping the context window clean. The workflow relies on a CLAUDE.md file that acts as a router rather than a storage unit, pointing the agent to domain knowledge stored in separate files. Huryn introduces a framework for self-improving knowledge consisting of three categories: confirmed rules, tracked hypotheses with evidence, and rejected patterns to prevent retesting failed approaches. For technical execution, Claude Code manages larger projects and handles complex tasks like generating HTML infographics and component libraries without manual coding. To optimize costs and performance, the system favors Vercel's Agent Browser over the standard Chrome MCP, as the latter is significantly more expensive due to frequent screen-shotting. Finally, Dispatch enables mobile management of these agentic threads, allowing PMs to run parallel tasks like competitor analysis or email management while on the go.
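
The three-category knowledge framework (confirmed rules, tracked hypotheses with evidence, rejected patterns) can be sketched as a small Python data structure. The class and method names are assumptions for illustration, not Huryn's actual file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Hypothesis:
    statement: str
    evidence: List[str] = field(default_factory=list)


@dataclass
class Knowledge:
    confirmed: List[str] = field(default_factory=list)
    hypotheses: Dict[str, Hypothesis] = field(default_factory=dict)
    rejected: Set[str] = field(default_factory=set)

    def propose(self, statement: str) -> bool:
        """Track a hypothesis unless it was already confirmed or rejected."""
        if statement in self.rejected or statement in self.confirmed:
            return False  # prevents retesting a settled idea
        self.hypotheses.setdefault(statement, Hypothesis(statement))
        return True

    def add_evidence(self, statement: str, note: str) -> None:
        self.hypotheses[statement].evidence.append(note)

    def resolve(self, statement: str, holds: bool) -> None:
        """Promote a hypothesis to a rule or move it to the rejected set."""
        self.hypotheses.pop(statement)
        if holds:
            self.confirmed.append(statement)
        else:
            self.rejected.add(statement)
```

The rejected set is what keeps the agent from spending effort on approaches that have already failed: `propose` returns `False` for anything previously resolved.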

Key Takeaways

  • Agentic workflows outperform standard chat interfaces by integrating directly with local files and external APIs through MCP, significantly reducing manual data transfer.
  • The highest ROI for PMs comes from building a library of Skills that are iterated upon multiple times until they produce expert-level outputs from first principles.
  • A robust AI operating system requires a structured memory that tracks confirmed rules, active hypotheses, and rejected patterns to avoid repeating past mistakes.
  • Strategic use of CLAUDE.md as a routing layer rather than a data dump prevents context window bloat and keeps agent performance high.

Stop Applying to AI PM Jobs Until You Watch This Safety & Ethics Mock

AI Product Manager interviews often include a safety and ethics round that catches even senior candidates off guard. The SHIR framework is a primary tool for these discussions. It stands for Severity, Harm scope, Immediacy, and Reversibility. Using this framework helps you pause and think through a structured response instead of rushing to a binary ship-or-kill decision. Most candidates jump straight to a solution, but senior PMs are expected to evaluate the depth of the risk first. Common mock scenarios include medical chatbots that contradict clinical guidelines, hiring tools with demographic bias, and AI agents that autonomously book travel or send emails. For agents, safety relies on three specific pillars. First is scope, which involves setting spending caps and category limits. Second is confirmation, which uses push notifications or undo windows depending on how high the stakes are. Third is reversibility, which includes pending states and anomaly detection. This layered approach ensures that even if an agent makes a mistake, the impact is contained and fixable. In the interview loop, safety is often baked into the product sense rubric rather than being a standalone topic. If you haven't mentioned safety within the first 40 minutes of a 60-minute interview, you are likely losing points that are hard to recover. When discussing business trade-offs, it is effective to frame safety issues as headline risks. A VP might worry about a $50 million revenue hit from pulling a feature, but they will worry more about a $5 billion brand risk if the AI gives dangerous advice. This converts a short-term financial argument into a long-term strategic one. Liability for AI behavior almost always lands on the platform provider because they designed the guardrails, so framing your answers around risk reduction through scope limits is essential. 
Anthropic is known for the toughest safety rounds in the industry, often lasting an hour and focusing on concepts like constitutional AI and the company's founding story. Practice involves both situational and historical behavioral answers to demonstrate a deep commitment to safe deployment.
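
The three agent-safety pillars (scope, confirmation, reversibility) can be expressed as a layered check. The Python sketch below is one possible arrangement; the dollar thresholds, the allowed categories, and the returned status strings are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Policy:
    spend_cap: float = 500.0       # scope: per-action spending cap (assumed)
    allowed_categories: FrozenSet[str] = frozenset({"flight", "hotel"})
    confirm_above: float = 100.0   # confirmation threshold (assumed)


def review_action(category: str, amount: float, policy: Policy = Policy()) -> str:
    """Decide how an agent's proposed purchase should be handled."""
    # Pillar 1, scope: refuse anything outside category or spending limits.
    if category not in policy.allowed_categories or amount > policy.spend_cap:
        return "blocked"
    # Pillar 2, confirmation: higher stakes need explicit user approval.
    if amount > policy.confirm_above:
        return "needs_confirmation"
    # Pillar 3, reversibility: low-stakes actions run in a pending state
    # that the user can undo.
    return "pending_with_undo"


print(review_action("flight", 80))
print(review_action("hotel", 250))
print(review_action("flight", 900))
```

Ordering the checks from hardest limit to softest means a mistake by the agent is either stopped outright, surfaced to the user, or left in a state that can be reversed.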

Key Takeaways

  • The SHIR framework acts as a cognitive buffer to help you evaluate risks like severity and reversibility before proposing a solution.
  • Safety should be integrated into your general product sense answers early on rather than waiting for a specific safety question.
  • Framing ethical risks as massive brand liabilities is the best way to align safety concerns with executive business goals.
  • AI agent safety requires a layered defense of scope limits, tiered confirmation flows, and technical reversibility.

Google PM Runs 7 Claude Code Agents to Build Apps (0 Employees)

Gabor Mayer, a Product Manager at Google, demonstrates a sophisticated workflow for building production-ready mobile apps using a team of 21 specialized Claude Code agents. This approach moves beyond simple vibe coding by implementing a structured development lifecycle that mirrors a traditional engineering team. The process starts with a System Analyst agent that uses Model Context Protocol (MCP) to interact with Confluence and Jira. Instead of typing brief prompts, Mayer uses voice dictation to provide high-density specifications. This captures significantly more nuance than typing, which the System Analyst then breaks down into detailed tickets and documentation. A critical component of this setup is the use of specialized agents for different roles. For instance, a dedicated agent handles visual direction by linking Figma designs to Jira tickets. This ensures the final code matches the intended UI rather than defaulting to generic AI styles. Another agent, dubbed the Spaghetti Agent, focuses exclusively on code maintainability by checking for circular references and naming conventions. This modularity prevents context compression, a common failure point where a single LLM loses track of details like security requirements or specific color palettes when overwhelmed with a massive prompt. The workflow demonstrates building a hockey rules app from zero to TestFlight. It emphasizes that the actual coding phase is often the fastest part of the process. The majority of the effort goes into specification, design, and ticket organization. By using MCPs to connect Claude Code to external tools like Jira and Figma, the agents maintain a source of truth across the entire stack. This system allows a solo founder or PM to act as an orchestrator, managing a complex team that encodes institutional knowledge into reusable markdown files. 
This ensures that lessons learned on one project are automatically applied to the next, creating a scalable and repeatable engine for app development.
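
One check attributed to the Spaghetti Agent, detecting circular references, can be sketched as a depth-first search over a module dependency graph. The summary does not say how Mayer's agent performs this check, so this is only one standard way to do it, and the example graph is hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of module names, or None."""
    path: List[str] = []  # modules on the current DFS path
    done = set()          # modules fully explored with no cycle found

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            # Reached a module already on the path: slice out the cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for dep in deps.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for module in deps:
        cycle = visit(module)
        if cycle:
            return cycle
    return None


# Hypothetical import graph: ui -> state -> api -> ui is circular.
graph = {"ui": ["state"], "state": ["api"], "api": ["ui"], "utils": []}
print(find_cycle(graph))
```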

Key Takeaways

  • Context compression is the primary killer of AI-generated code. When a single agent handles a massive spec, it silently drops low-priority details like edge cases or specific brand colors. Breaking tasks into small, agent-led modules is the only way to maintain production quality.
  • The System Analyst is the most vital role in an agentic team. By forcing the AI to ask clarifying questions and document decisions in Confluence before writing a single line of code, you prevent the hallucination loops that occur when agents work with partial information.
  • Agentic workflows shift the PM's value from writing requirements to orchestrating context. Using tools like MCP to link Jira, Figma, and GitHub allows agents to operate with a shared understanding, making the PM an editor and architect rather than a manual task manager.
  • Reusable agent files act as a repository for institutional knowledge. By encoding API workarounds and project-specific quirks into agent markdown files, the system becomes more intelligent with every project, preventing the repetition of past mistakes.

Inside a $400K AI Product Sense Interview (Amazon, Meta, Google, OpenAI)

AI product management interviews at top-tier firms like OpenAI, Anthropic, and Google DeepMind now hinge on a specific AI product sense round. While behavioral rounds might get a candidate an offer, the product sense performance dictates the final leveling and compensation. In 2026, median PM compensation at OpenAI is around $800,000, while senior PMs at Google see roughly half a million. This round tests a candidate's ability to design for non-deterministic systems where outputs vary, models hallucinate, and safety is a core requirement rather than an afterthought. Success requires moving beyond standard frameworks like CIRCLES to a custom structure focusing on mission, ecosystem mapping, and user segmentation. In a mock case study for scaling Claude Code weekly active users, the strategy prioritized segments like knowledge automators over just traditional coders. Key technical nuances include distinguishing between model layer requests and application layer changes, such as leveraging a million-token context window. Strategic context is vital. Candidates should enter interviews with data on the product's current run rate and competitive landscape. The interviewers look for candidates who can defend their prioritization logic while remaining flexible enough to pivot when new constraints, like specific surface areas or safety protocols, are introduced. Ultimately, the goal is to demonstrate a deep understanding of how AI products function differently from traditional software, specifically regarding cost, hallucinations, and the probabilistic nature of LLMs.

Key Takeaways

  • AI product sense is the primary lever for compensation and leveling. High-level offers at firms like OpenAI and Anthropic are directly tied to how well a PM navigates non-deterministic system design during this specific interview round.
  • The shift from deterministic to probabilistic design is the biggest hurdle for traditional PMs. Success requires accounting for model hallucinations, variable costs per query, and the inherent unpredictability of AI outputs within the core product logic.
  • Effective AI PMs must bridge the gap between model and application layers. Solutions should not just be UI changes. They need to specify whether a feature requires a model team adjustment or can be handled at the application level using existing infrastructure like large context windows.

How to Become a Builder PM (n8n, Claude Code, OpenClaw)

Mahesh Yadav, a former PM at Google and Meta who earned a $1.3M total compensation package, outlines the transition from traditional product management to the Builder PM role. A Builder PM is defined by the ability to identify customer needs and ship a functional first version to the initial ten customers without relying on a developer. This shift is powered by an agentic stack comprising n8n, Claude Code, and the OpenClaw pattern. Effective agents require four core components: intelligence (the model), tools (actions), memory (session context), and knowledge (proprietary data). Any agent that fails to deliver usually lacks one of these four pillars. n8n serves as a foundational visual tool for understanding agent architecture, allowing PMs to see how different nodes interact in a multi-agent system. It is particularly useful for building evaluation pipelines and initial multi-agent workflows. However, the release of Claude Code in 2025 significantly altered the landscape by integrating context, action, and evaluation into a single loop. This tool effectively replaced several categories of AI startups by providing direct computer control, including file system access and bash command execution. This capability allows AI to move from performing three-minute tasks to handling autonomous jobs lasting several hours, effectively turning the AI from an assistant into an autonomous worker. The OpenClaw pattern represents a shift toward delegation through existing channels like WhatsApp, utilizing model-agnostic sandboxing. This approach is expected to be adopted by major cloud providers like Google and AWS. For PMs, the career implications are significant. AI PM interviews at senior levels have moved away from theoretical product sense questions toward live building exercises and AI system design. Success in this new environment requires proficiency with tools like Claude Code to demonstrate technical agency. 
Yadav’s own career trajectory highlights the financial upside of this technical pivot, noting that his compensation doubled every eighteen months by focusing on AI-centric roles. He ultimately left Big Tech because the slow approval cycles hindered the rapid innovation possible with these new builder tools.
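The four-pillar checklist in the summary above can be turned into a quick diagnostic. The following is a minimal sketch, not Yadav's own tooling: the `AgentSpec` class, its field names, and the example values are all hypothetical and exist only to illustrate checking which pillar is missing.

```python
from dataclasses import dataclass, field

# The four pillars named in the summary; the identifiers are illustrative.
PILLARS = ("intelligence", "tools", "memory", "knowledge")


@dataclass
class AgentSpec:
    """Hypothetical description of an agent, one field per pillar."""
    intelligence: str = ""                               # model identifier
    tools: list[str] = field(default_factory=list)       # actions the agent can take
    memory: str = ""                                     # where session context lives
    knowledge: list[str] = field(default_factory=list)   # proprietary data sources


def missing_pillars(spec: AgentSpec) -> list[str]:
    """Return the pillars left empty, in the order listed in PILLARS."""
    return [name for name in PILLARS if not getattr(spec, name)]


# An agent with a model and one tool, but no memory or knowledge configured.
draft = AgentSpec(intelligence="some-model", tools=["send_email"])
print(missing_pillars(draft))  # ['memory', 'knowledge']
```

An empty result does not prove the agent works; it only rules out the most common structural gap described above.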

Key Takeaways

  • Claude Code has effectively commoditized the agentic loop by combining context, action, and evaluation, making many standalone AI startups redundant.
  • The real power of modern AI agents lies in computer control, where access to the file system and terminal allows for long-horizon autonomous work rather than simple chat interactions.
  • The Builder PM role represents a return to high-velocity shipping where the barrier between product discovery and technical execution is nearly eliminated by low-code and agentic tools.

This One Thing is Stopping You From $500K as an AI PM

This mock interview features Aakash Gupta and Aman Goyal walking through a high-stakes AI system design challenge: building a churn reduction agent for a telecom provider. The session highlights the specific technical and product skills required for senior AI PM roles at companies like OpenAI and Meta. A key focus is the initial discovery phase, where the candidate must use clarifying questions to define the scope, user segments, and specific pain points before proposing any technical solutions. The technical architecture is built around three core pillars: the model, the data, and the memory. While many candidates focus solely on the model, the discussion emphasizes that data is the primary differentiator and memory is what allows the agent to improve over time. A significant portion of the interview covers the trade-off between traditional machine learning models like XGBoost and modern Large Language Models (LLMs). For structured data tasks like predicting churn probability, XGBoost is often preferred for its cost-effectiveness and interpretability, while LLMs are better suited for the intervention phase, such as generating personalized retention offers or interacting with customers. The interview also stresses the importance of live system diagramming to visualize data flows from collection to intervention. Candidates are expected to address production concerns early, including latency, scaling, and the choice between on-prem and cloud infrastructure. Evaluation is another critical component, requiring a framework that tracks model performance (recall, hallucinations) alongside business outcomes like retention rates and escalation frequency. The final feedback emphasizes that technical fluency and the ability to ground abstract designs in real-world contexts are what separate top-tier AI PMs from traditional product managers.
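The split between a cheap tabular model for prediction and an LLM for intervention can be sketched as a two-stage pipeline. This is an illustration under stated assumptions: `churn_probability` is a hand-weighted stand-in for a trained model such as XGBoost, `draft_retention_offer` is a stand-in for an LLM call, and the feature names, weights, and threshold are invented for the example.

```python
import math


def churn_probability(customer: dict) -> float:
    """Stand-in for a trained tabular model; the weights are made up, not learned."""
    score = (
        -2.0
        + 0.8 * customer["support_tickets_90d"]
        + 1.5 * customer["dropped_call_rate"]
        - 0.05 * customer["tenure_months"]
    )
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing into [0, 1]


def draft_retention_offer(customer: dict) -> str:
    """Stand-in for the LLM step; a real system would call a model here."""
    return f"Hi {customer['name']}, here is a retention offer picked for your plan."


def run_pipeline(customers: list[dict], threshold: float = 0.5) -> dict[str, str]:
    """Score everyone cheaply, then generate offers only for at-risk customers."""
    return {
        c["name"]: draft_retention_offer(c)
        for c in customers
        if churn_probability(c) >= threshold
    }


def recall(flagged: set[str], churned: set[str]) -> float:
    """Share of customers who actually churned that the model flagged."""
    return len(flagged & churned) / len(churned) if churned else 0.0


customers = [
    {"name": "Asha", "support_tickets_90d": 4, "dropped_call_rate": 0.5, "tenure_months": 6},
    {"name": "Ben", "support_tickets_90d": 0, "dropped_call_rate": 0.0, "tenure_months": 48},
]
offers = run_pipeline(customers)
print(sorted(offers))  # ['Asha']
```

The design point is cost: the expensive generative step runs only for the small at-risk segment, while recall on the first stage is tracked separately from business outcomes such as retention.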

Key Takeaways

  • The Model, Data, Memory framework is the essential architecture for agentic AI, where data serves as the competitive moat and memory enables long-term system improvement.
  • Strategic AI PMs must demonstrate model pragmatism by choosing traditional ML like XGBoost for structured prediction tasks while reserving LLMs for complex reasoning or natural language interventions.
  • Success in high-level AI interviews depends on the ability to live-diagram system architectures, showing how data flows from raw inputs to model predictions and final automated actions.

Designing With AI With Designers of Figma & Codex

Design and code are merging into a single, fluid workflow through AI-driven integrations like the Model Context Protocol (MCP). Ed Bayes from OpenAI and Gui Seiz from Figma explain how the traditional handoff between designers and engineers is being replaced by a lossless, bi-directional loop. Using the Codex-Figma MCP, teams can pull live React components into Figma with perfect fidelity, preserving every CSS detail like border radius, padding, and shadows, and sync design changes back to the codebase automatically. This eliminates the need for redline specs and static handoff documents. A major theme is the Total Football approach to product development, where traditional roles like designer, engineer, and product manager become fluid and interchangeable. At OpenAI, designers often spend 70 to 80 percent of their time coding, and product managers are expected to build functional prototypes and ship pull requests (PRs) to production to stress-test their ideas. This shift is driven by the fact that high-fidelity, interactive prototypes are now as easy and inexpensive to create as paper sketches, removing the resource constraints that previously forced teams into a linear design process. The discussion also highlights the role of AI as an infinitely patient tutor. Proficiency in code is less important than curiosity; the tools allow non-technical team members to build functional iOS apps or complex prototypes from scratch. While the tools and mediums are changing, the fundamental mandate of the designer remains the same. They still uphold craft and act as the voice of the user. The evolution of these tools simply allows designers to work in the final medium of the product rather than a static representation of it. This new workflow is already being adopted by teams that were previously only AI curious, leading to a more integrated and technically empowered design culture that supports faster product-led growth (PLG) cycles.

Key Takeaways

  • The Codex-Figma MCP integration removes the friction of manual redlining by creating a live link between design artifacts and React code.
  • High-fidelity prototyping is now as fast as sketching, allowing teams to skip static wireframes and test functional ideas immediately.
  • The Total Football approach means designers and PMs are increasingly shipping code to production to validate their own concepts.

The AI PM Behavioral Interview Masterclass (Mock w/ Real Answers)

How this PM Used Claude Code to Support 20 People

The Claude Code Setup Nobody Shows You

Claude Code can be transformed into a comprehensive operating system for product managers by focusing on context management and agentic efficiency. A major challenge in using Claude Code is context consumption. For instance, a single web search can consume 10% of the available context, while system prompts and Model Context Protocol (MCP) connections can take up 10 to 16% before a single message is sent. To mitigate this, users should shift from MCPs to Command Line Interfaces (CLIs) like the GitHub or Vercel CLIs, which have zero context overhead. Another strategy involves using sub-agents to handle research tasks. By delegating a search to a sub-agent, the main session only receives a summary, reducing context cost from 10% to roughly 0.5%. This setup also benefits from skills that do not require complex code. For example, a front-end design skill can be created using specific rules that instruct Claude to avoid typical AI design patterns. More advanced skills can use tools like Puppeteer to screenshot outputs and self-correct layout issues before the user sees them. The organizational structure of this operating system relies on a specific file hierarchy. A Knowledge folder stores context about people and stakeholders, which compounds over time as meeting transcripts are added. A Projects folder isolates tasks to prevent context bleed, and a Tools folder houses custom scripts. A central CLAUDE.md file defines the agent's identity and operating rules. For data-heavy tasks, integrating Jupyter notebooks ensures transparency and trust by allowing users to trace the exact code and inputs used for any analysis. This approach moves beyond simple chat interactions toward a structured, reproducible workflow.
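The context figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below uses only the percentages from the summary; the function name is hypothetical, and it deliberately ignores every other source of context usage.

```python
# Figures from the summary, expressed as percent of the context window.
BASELINE_OVERHEAD = 16.0     # upper end of the 10 to 16% quoted for system prompt + MCPs
DIRECT_SEARCH_COST = 10.0    # one web search run in the main session
SUBAGENT_SEARCH_COST = 0.5   # same search delegated; only the summary comes back


def searches_that_fit(cost_per_search: float, overhead: float = BASELINE_OVERHEAD) -> int:
    """How many searches fit in the remaining context, ignoring all other usage."""
    remaining = 100.0 - overhead
    return int(remaining // cost_per_search)


print(searches_that_fit(DIRECT_SEARCH_COST))    # 8
print(searches_that_fit(SUBAGENT_SEARCH_COST))  # 168
```

Under these numbers, delegating research to sub-agents stretches the same session from a handful of searches to well over a hundred.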

Key Takeaways

  • Context is the primary constraint in agentic workflows. Shifting from MCPs to CLIs and using sub-agents are essential tactics to prevent context exhaustion during complex tasks.
  • A persistent Knowledge folder of people and stakeholder context creates a compounding advantage. Updating stakeholder dossiers after every meeting makes the AI's output increasingly personalized and accurate over time.
  • Self-correcting agentic loops represent the next level of AI utility. Using tools like Puppeteer to verify and fix visual outputs before they reach the user ensures a higher standard of reliability for complex deliverables.

Stop Applying to AI PM Jobs Until You Watch This

16+ Years of Product Strategy in 50 Minutes (Using AI)

Product strategy in 2026 requires a shift from static documents to interactive prototypes. AI tools like Bolt, Lovable, and v0 allow for rapid visualization, making visual strategy the new standard. While AI handles execution, human judgment remains essential for navigating stakeholder dynamics and customer insights. The core framework involves a seven-step sequence: objective, users, superpowers, vision, pillars, impact, and roadmap. A strong objective must be measurable, such as moving a specific retention metric by a set date. Limiting focus to three or fewer objectives is critical for success. The Jobs to be Done framework is highlighted as more effective than traditional demographics. For example, understanding that commuters used milkshakes as a sidekick led to product adjustments that increased sales. To secure resources, product leaders must translate their metrics into financial outcomes like ARR or EBITDA. Reducing build times at Epic Games resulted in 1.5 million dollars in annual savings. A snap strategy can be developed quickly and then validated over a month through customer interviews and user testing. A successful strategy is one that an engineer can explain in 30 seconds and use to prioritize their work effectively.

Key Takeaways

  • Visual prototyping is now a requirement for strategy because tools like v0 and Bolt make it nearly instant to show rather than tell.
  • Human strategic judgment is a competitive moat that AI cannot replicate, especially regarding stakeholder management and nuanced customer feedback.
  • Product metrics must be directly mapped to financial outcomes like ARR or EBITDA to successfully negotiate for headcount and resources.

Gemini Gems Masterclass with the Creator at Google: 3 Gems You Must Build

I Should Be Charging $999 for This AI Prototyping Masterclass

Claude Code + Analytics Masterclass: Automate Product Analytics (2026)

Frank Lee details a high-leverage workflow for product managers using Claude Code and the Model Context Protocol (MCP). The core approach involves connecting analytics tools directly to Claude to automate deep chart analysis, reducing tasks that typically take hours to about 90 seconds. By loading product context into a repository and using MCP for data access, PMs can generate automated dashboard reports and synthesize customer feedback from multiple channels like Zendesk, Gong, and Slack in a single pass. A key feature of this setup is the use of skills, which are named prompts with specific heuristics that prevent context bloat by only loading when relevant. These skills give the agent a repeatable workflow without overwhelming the model with unnecessary instructions. The workflow extends to documentation, where insights are automatically converted into PRDs within tools like Cursor or Claude Code and then pushed directly to Linear. Lee emphasizes that the biggest mistake users make is connecting too many MCP servers at once, which burns context and degrades performance. Instead, he suggests loading only what is relevant to the specific task. For tools that do not have a native MCP, like Granola, he recommends building custom scripts to dump meeting notes into a product repo for easy retrieval. This shift toward agent-driven workflows, or vibe PMing, allows PMs to focus on strategy while agents handle the manual labor of data scanning and spec writing. This methodology is particularly effective for weekly business reviews, where Claude can scan dashboards on Monday morning to surface the most urgent issues and top insights automatically. The masterclass highlights how tools like Amplitude and Pendo integrate into this ecosystem. By pointing Claude Code at these dashboards, PMs can bypass manual scanning entirely. The process involves dropping a chart URL into the interface and triggering an analysis skill that navigates the data taxonomy to find anomalies. This allows the agent to hypothesize why metrics changed based on the broader product context stored in the repository. The end goal is a fully automated pipeline where customer feedback is synthesized, insights are turned into specs, and code is generated or tasks are created in Linear with minimal manual intervention.
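The idea of skills that load only when relevant can be illustrated with a small registry. This is a rough sketch, not how Claude Code selects skills internally: the `Skill` class, the keyword triggers, and the prompt text are all hypothetical, and simple substring matching stands in for whatever heuristics a real skill declares.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    triggers: tuple[str, ...]  # hypothetical keywords standing in for the skill's heuristics
    prompt: str                # full instructions, loaded only when the skill is selected


SKILLS = [
    Skill("chart-analysis", ("chart", "dashboard"), "Walk the event taxonomy, then flag anomalies."),
    Skill("feedback-synthesis", ("feedback", "tickets"), "Group feedback by theme and cite sources."),
    Skill("prd-draft", ("prd", "spec"), "Turn the insight into a PRD with goals and non-goals."),
]


def build_context(task: str, skills: list[Skill] = SKILLS) -> list[str]:
    """Return prompts only for skills whose triggers appear in the task text."""
    text = task.lower()
    return [s.prompt for s in skills if any(t in text for t in s.triggers)]


print(build_context("Analyze this chart URL for anomalies"))
# ['Walk the event taxonomy, then flag anomalies.']
```

Only one of the three prompts enters the context for the chart task, which is the context-saving behavior the summary attributes to skills.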

Key Takeaways

  • MCP serves as a data access layer rather than a complex orchestration tool. Its primary value is connecting AI to external systems easily to pull specific data points into the reasoning loop.
  • Effective context management requires limiting the number of active MCP servers. Every tool description consumes context window space, so hiding or removing unused tools is essential for maintaining agent performance.
  • The transition to agent-driven workflows creates a significant competitive advantage for PMs. Automating the manual parts of the job like dashboard scanning and PRD drafting allows for faster iteration cycles and higher-level strategic focus.

I learned AI designing more in this 1 hr than any course ever

The Most Important New Skill for Product Managers in 2026: AI Evals Masterclass

The PLG Masterclass People Paid For—Get It Free

The traditional Slack and Dropbox playbook has evolved into a more sophisticated 7-layer framework for 2026. Modern leaders like Canva and Notion use product-led SEO and contextual billing gates rather than simple brand ads. A major shift is the move toward reverse trials, which give users premium features upfront before downgrading them to a lower plan. Data shows reverse trials can achieve an 18% conversion rate compared to just 4% for standard freemium models. Activation now focuses on building long-term habits through personalized onboarding forks rather than just reaching an initial aha moment. Retention strategies have also shifted from individual users to entire organizations, as seen with Figma's infiltration of multiple departments including marketing and sales. Monetization and expansion are increasingly driven by usage-based or hybrid pricing models instead of just seat counts. In companies like Apollo, usage-based metrics for buying contacts drive 70% to 80% of expansion revenue. The three main vectors of change in this new era are deep personalization across all layers, expansion across the whole organization, and the evolution of freemium through templates and reverse trials.
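To make the conversion gap concrete, the two rates quoted above can be applied to the same cohort. The cohort size of 1,000 signups is an arbitrary assumption; only the 18% and 4% figures come from the summary.

```python
def paid_conversions(signups: int, conversion_rate: float) -> int:
    """Expected number of paying users from a signup cohort."""
    return round(signups * conversion_rate)


signups = 1_000  # hypothetical cohort size
reverse_trial = paid_conversions(signups, 0.18)
freemium = paid_conversions(signups, 0.04)

print(reverse_trial, freemium)   # 180 40
print(reverse_trial / freemium)  # 4.5
```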

Key Takeaways

  • Reverse trials can far outperform traditional freemium: the cited 18% versus 4% conversion rates amount to a roughly 4.5x difference, attributed to letting users experience full value immediately.
  • Successful retention now requires moving beyond the individual user to embed the product across different departments like marketing, sales, and research to create organizational lock-in.
  • Expansion revenue is the most critical growth lever in modern SaaS, frequently providing the majority of total growth through value-based metrics and usage triggers rather than simple seat renewals.

This AI Expert's Method Will Change How You Do Customer Research

Caitlin Sullivan, a leading user research expert, outlines a specific workflow for using AI to analyze surveys and interviews without the risk of hallucinations. The core philosophy is to replicate the human research process by combing through data first before attempting synthesis. Instead of asking an AI for general themes immediately, researchers should use a multi-step prompting strategy. This begins with loading full context into the model, followed by per-participant analysis, and finally a verification stage. For survey analysis, the method requires inductive coding where labels are applied to every individual response before looking for patterns. This prevents the model from miscategorizing or oversimplifying results. Claude is highlighted as the preferred model for its ability to handle nuance and thoroughness, while Gemini is noted for its speed in surfacing high-frequency themes. The workflow also incorporates emotional intensity ratings to add depth to qualitative data. To ensure accuracy, the model must be forced to audit its own work to catch contradictions or exaggerated ratings. Advanced users can further optimize this by using Claude Code to run parallel agentic workflows, significantly cutting down analysis time while maintaining high factual precision. Converting raw transcripts into structured markdown files is recommended to improve model accuracy and manage token limits effectively.
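The code-first, count-second order described above can be sketched in a few lines. This is an illustration of the ordering only, not Sullivan's prompts: `label_response` is a keyword stub standing in for the per-response model call (in true inductive coding the model proposes labels from the data), and the sample responses are invented.

```python
from collections import Counter


def label_response(response: str) -> list[str]:
    """Stand-in for the per-response model call; the keyword rules are illustrative only."""
    rules = {
        "price": "pricing",
        "expensive": "pricing",
        "slow": "performance",
        "confusing": "usability",
    }
    text = response.lower()
    labels = sorted({label for keyword, label in rules.items() if keyword in text})
    return labels or ["uncoded"]


def code_then_count(responses: list[str]) -> Counter:
    """Label every response first, and only then aggregate into patterns."""
    coded = [(r, label_response(r)) for r in responses]                # step 1: code each response
    return Counter(label for _, labels in coded for label in labels)  # step 2: count themes


responses = [
    "Too expensive for what it does",
    "The app is slow and confusing",
    "Price is fine, but setup was confusing",
    "Love it",
]
print(code_then_count(responses))
```

Keeping the per-response labels in `coded` is what makes the later audit step possible, since each theme count can be traced back to specific responses.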

Key Takeaways

  • Avoid jumping straight to theme extraction because AI performs best when it mirrors the human process of detailed data combing before synthesis.
  • Apply inductive coding to every survey response individually before asking for aggregate patterns to ensure the results are reliable and not miscategorized.
  • Implement a self-correction loop by forcing the AI to audit its own analysis for contradictions and exaggerated emotional intensity ratings.
  • Use multi-step prompting to separate context loading, participant analysis, and final verification rather than cramming instructions into a single prompt.
  • Leverage Claude Code to parallelize interview and survey analysis through agentic workflows to reduce total processing time by half.

I Spent 6 Months Learning Replit. Here's What Actually Matters.

This guide breaks down how to use Replit effectively for building software in 2026. It moves past basic tutorials to focus on high-leverage features like Design Mode and Agent 3. The core message is about choosing the right tool for the specific task to save both time and money. For example, using Design Mode for landing pages or mockups is significantly cheaper and faster than building in App Mode. The guide also highlights Fast Mode as a way to reduce costs by 10x for simple edits compared to using the full Agent Mode. Integration is a major theme, specifically using direct Figma imports rather than screenshots to maintain design fidelity. For backend services, Connectors allow for one-time setup of tools like OpenAI, Stripe, and databases that sync across all projects. When it comes to building, the advice is to use Plan Mode for clarifying scope and then letting Agent 3 run autonomously for long stretches rather than micromanaging every step. Finally, the guide warns about deployment pitfalls, specifically how static deployment can break apps with backends, necessitating the use of auto-scale options for production-ready tools.

Key Takeaways

  • Strategic mode selection prevents over-engineering. Using Design Mode for static assets costs pennies and takes minutes, whereas App Mode adds unnecessary complexity for simple front-end tasks.
  • Cost management is a skill in AI-assisted development. Fast Mode offers a 90% discount over Agent Mode for minor tweaks like CSS or text changes, making it essential for iterative polishing.
  • High-fidelity prototyping requires direct data pipelines. Importing Figma frames via URL is far more accurate than using AI to interpret screenshots, closing the gap between design and code.
  • Agentic workflows require trust and patience. Letting Agent 3 run for 20 minutes without interruption is more efficient than constant course correction, which often leads to wasted credits and fragmented logic.

How to 10x your productivity as a PM with AI tools

Aakash Gupta outlines a comprehensive seven-layer stack designed to transform product management through AI integration. The framework begins with prompting fundamentals, emphasizing structured frameworks like RTF and XML formatting specifically for Claude. It moves into a three-part co-pilot system using Claude Projects for high context, Cluely for desktop context, and NotebookLM for deep document synthesis. For automation, Gupta suggests building an agent team using tools like Lindy, Relay, or n8n to handle repetitive workflows. Prototyping is identified as the most significant shift in PM work over the last 16 years, enabled by tools like Magic Patterns and Lovable that allow for rapid iteration from ideation to handoff. The discovery phase is enhanced by AI tools like Enterpret and Unwrap to analyze customer feedback at scale. Advanced layers include Evals and Observability using platforms like Arize and Braintrust to ensure model reliability. Finally, vibe experimentation via Chameleon allows PMs to ship experiments without heavy engineering involvement. The core operational strategy is the 20/60/20 rule where the PM handles the initial 20% of the work, AI generates the middle 60%, and the PM provides the final 20% of human editing and strategic oversight.

Key Takeaways

  • The 20/60/20 rule redefines the PM role from doer to editor and strategist, ensuring human intuition remains at the start and end of every workflow.
  • AI prototyping represents the biggest shift in PM methodology in nearly two decades, drastically reducing the time between discovery and functional handoff.
  • Moving beyond simple chat interfaces to a multi-layered stack including co-pilots, agents, and observability is required to reach elite PM status in an AI-first market.
  • Vibe experimentation allows PMs to bypass engineering bottlenecks for UI/UX testing, significantly increasing the volume of experiments a team can run.

The AI-Native PM Operating System [Live Demo]

Mike Bal, Head of Product at David's Bridal, demonstrates a shift from fragmented tool stacks to a unified AI-native operating system. The core strategy involves moving away from logging into dozens of separate UIs and instead using a central interface like Cursor or Claude Desktop. By leveraging the Model Context Protocol (MCP), product managers can connect data from JIRA, Figma, GitHub, and Confluence directly into their AI workspace. This setup allows for high-speed context retrieval, such as comparing a Confluence requirement document against a Figma design to spot discrepancies in under a minute. The workflow highlights the use of specialized agents for different tasks. Mike uses Manus for heavy research because it provides traceable sources and structured file outputs like CSVs and markdown summaries, which outperforms standard chat interfaces. He also shows how PMs can act as builders by using natural language in Cursor to create internal tools, such as e-commerce apps or CMS migration scripts, without deep coding knowledge. A key tactical insight is the importance of research hygiene: vetting AI-generated data in an external environment before bringing it into the core system to avoid anchoring the model on incorrect information or hallucinations.

Key Takeaways

  • Centralized operating systems built in Cursor or Claude Desktop eliminate the productivity tax of context switching between multiple SaaS tabs.
  • MCP serves as the essential glue that allows natural language queries to interact directly with technical tools like JIRA and GitHub.
  • The role of the PM is evolving into a builder role where natural language prompts can replace traditional engineering tickets for internal tool development.
  • Maintaining a separation between raw AI research and the internal context layer is vital to prevent the AI from adopting and repeating incorrect data.

AI Product Metrics Mock Interview (Meta/Google/OpenAI Case)

This breakdown covers a mock interview for an AI product manager role, specifically focusing on Descript's Underlord feature. The core of the discussion revolves around building a visual framework to track success. It emphasizes generating a broad metrics bank before narrowing down to a North Star. For Underlord, the selected North Star was the number of exports, evaluated through three vectors: sessions, completion rate, and exports per session. A critical part of the framework involves distinguishing between input metrics and output metrics like upgrades or renewals. The discussion also highlights the need for AI-specific guardrails, such as hallucination rates and support request volume, to ensure the AI actually delivers value without increasing friction. One nuanced point is how to handle time-based metrics. In AI products, time spent might actually increase because the tool allows users to perform more complex tasks. To account for this, the framework suggests controlling for complexity by comparing time spent on specific tool combinations. The session concludes with a tactical power move for candidates: following up after an interview with a refined dashboard mockup to demonstrate proactive thinking.
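The three vectors behind the North Star can be written as an operational equation. The multiplicative form below is an assumption about how the vectors combine (exports = sessions × completion rate × exports per completed session), and all input numbers are invented for illustration.

```python
def total_exports(sessions: int, completion_rate: float, exports_per_session: float) -> float:
    """North Star as a product of its three input vectors (assumed decomposition)."""
    return sessions * completion_rate * exports_per_session


baseline = total_exports(sessions=10_000, completion_rate=0.50, exports_per_session=1.5)
improved = total_exports(sessions=10_000, completion_rate=0.75, exports_per_session=1.5)

print(baseline)             # 7500.0
print(improved)             # 11250.0
print(improved / baseline)  # 1.5
```

Writing the metric this way shows which lever moved: here the entire 1.5x lift comes from completion rate, with sessions and exports per session held constant.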

Key Takeaways

  • Visual frameworks act as a necessary anchor in long interviews to keep the interviewer aligned with complex logic.
  • AI success isn't just about speed; time-to-task metrics must be adjusted for complexity since AI often encourages users to do more.
  • Guardrails for AI products must include hallucination rates and support volume to verify the model's reliability.
  • A North Star metric should be broken down into an operational equation to make it actionable for product teams.

These 3 AI Browsers make Chrome Feel useless

AI agent browsers like ChatGPT Atlas, Perplexity Comet, and Arc Dia are transforming how product managers and GTM teams handle web-based workflows. While Chrome remains the standard for general browsing, these tools offer specialized agentic capabilities that eliminate hours of manual work. ChatGPT Atlas excels at deep research and data extraction. It can scrape LinkedIn profiles for recruiters, fill out job applications, and build complex competitor comparison tables from multiple sources. It also features Gmail integration to streamline communication workflows. What used to take hours now takes minutes through a single prompt. Perplexity Comet focuses on speed and real-time information retrieval. It is the go-to tool for quick lookups involving stock prices, sports scores, or breaking news across platforms like Reddit and Twitter. Arc Dia serves as the automation powerhouse. It is designed for recurring tasks such as weekly competitor pricing monitoring, documenting product onboarding flows with screenshots, and generating recurring reports. Dia also offers YouTube video summarization and deep integration with Jira and Atlassian, making it a strong choice for technical product management and documentation. A major productivity unlock across these tools is tab context. This feature allows the AI to read and synthesize information across all open tabs simultaneously. Instead of copying and pasting data from five different competitor sites, a user can simply ask for a summary of common pricing strategies. This eliminates the friction between the web and the LLM. Despite their power, these browsers are currently slower than traditional search engines. However, the trade-off is worth it for batching high-effort manual tasks. Users should exercise caution by avoiding sensitive accounts like banking or private email within these AI-driven environments, keeping those activities in standard browsers.

Key Takeaways

  • Agentic browsing shifts the focus from finding information to executing tasks. The ability to scrape data and fill forms directly in the browser replaces manual spreadsheet work.
  • Tab context is the real killer feature for GTM research. Synthesizing insights across multiple open tabs removes the friction of manual data aggregation.
  • Browser choice depends on the specific workflow. Atlas is for extraction, Comet is for speed, and Dia is for recurring automation and documentation.

If you can’t AI prototype after this, nothing will help you

You'll be left Behind as an AI PM If You Don't Use ChatGPT Apps

ChatGPT apps represent a fundamental shift in how software is distributed and consumed, combining the Model Context Protocol (MCP) with interactive UI widgets. While Anthropic originally developed MCP to allow AI agents to call external tools, OpenAI has built a visual layer on top that enables embedded app experiences directly within the chat interface. This creates a massive distribution opportunity targeting 900 million weekly active users. Early data indicates that traffic coming through these AI interactions converts at a rate 26% higher than traditional search, positioning this as the next frontier for SEO. Building for this ecosystem requires a cross-platform mindset. Since MCP is an open standard, tools built for ChatGPT can also function within Claude, Cursor, and other agentic environments. The discovery mechanism relies heavily on tool descriptions and metadata rather than traditional keywords. This means product managers must treat their tool definitions as a new form of semantic SEO, ensuring that the LLM understands exactly when and how to trigger a specific app. Testing and quality assurance in this space involve three distinct categories of evaluations. Direct evals check if the app triggers when specifically named. Indirect evals test if the app appears when a user describes a relevant problem or outcome. Negative evals ensure the tool does not activate for irrelevant requests. For product managers, the workflow involves rapid prototyping using tools like Chippy to demonstrate value to stakeholders before engineering teams build production-ready versions. The most successful apps will likely be those that offer embedded collaboration, such as interactive spreadsheets or task lists, rather than simple information retrieval. With a public marketplace expected by early 2026, the window for early adoption is closing fast for both enterprise players and solo builders.
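The three eval categories described above map naturally onto a small test table. The sketch below is hypothetical throughout: "TaskBoard" is an imaginary app, the prompts are made up, and `naive_router` is a stand-in for the model's actual tool selection, which in practice would be exercised through the chat interface.

```python
from typing import Callable

# Hypothetical eval cases for an imaginary "TaskBoard" app.
CASES = [
    {"kind": "direct", "prompt": "Open TaskBoard and add a task", "should_trigger": True},
    {"kind": "indirect", "prompt": "Help me keep track of my team's to-do items", "should_trigger": True},
    {"kind": "negative", "prompt": "What's the capital of France?", "should_trigger": False},
]


def run_trigger_evals(triggers: Callable[[str], bool], cases: list[dict] = CASES) -> dict[str, float]:
    """Return the pass rate for each eval kind."""
    totals: dict[str, int] = {}
    passed: dict[str, int] = {}
    for case in cases:
        kind = case["kind"]
        totals[kind] = totals.get(kind, 0) + 1
        if triggers(case["prompt"]) == case["should_trigger"]:
            passed[kind] = passed.get(kind, 0) + 1
    return {kind: passed.get(kind, 0) / totals[kind] for kind in totals}


def naive_router(prompt: str) -> bool:
    """Stand-in for tool selection: fires only when the app is named."""
    return "taskboard" in prompt.lower()


print(run_trigger_evals(naive_router))
# {'direct': 1.0, 'indirect': 0.0, 'negative': 1.0}
```

The stand-in passes direct and negative evals but fails the indirect one, which is the failure mode that richer tool descriptions are meant to fix.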

Key Takeaways

  • The Model Context Protocol acts as the universal connector for the agentic web, making tool interoperability the new standard for SaaS distribution.
  • Discovery in AI marketplaces is driven by semantic metadata, requiring a shift from keyword-based SEO to intent-based tool description optimization.
  • Interactive widgets within the chat UI transform LLMs from simple text generators into functional operating systems for collaborative work.
  • The 26% conversion lift from AI traffic suggests that agentic discovery is significantly more efficient at matching user intent with specific software solutions.
  • Systematic evaluation of direct, indirect, and negative triggers is the only way to ensure reliable app performance in non-deterministic chat environments.

How To ACE AI Product Design Interviews (Anthropic PM Mock Interview)

How to Build AI Evals in 2026 (Step-by-Step, No Hype)

Systematic AI evaluation is essential for moving beyond simple demos to production-ready applications. While foundation models like Claude might have upstream evals, custom B2B SaaS applications require specific, rigorous testing. The process begins with observability, capturing traces of real user interactions. Tools like Braintrust, LangSmith, or Arize are useful, but logging to simple CSV or JSON files is often sufficient to start. The core of effective evaluation is error analysis, which consists of open coding and axial coding. Open coding involves manually reviewing around 100 traces and journaling observations without immediate root cause analysis. This reveals real-world issues that generic metrics miss, such as markdown rendering in SMS, incorrect tool calls, or failed human handoffs. Axial coding then categorizes these notes into five or six actionable themes. Counting these categorized issues allows teams to prioritize fixes based on frequency and impact rather than vibe checks. Product managers must lead this process because they possess the domain expertise and taste required to judge product quality, which engineers may lack. A common mistake is taking the prompt, which is written in plain English, out of the PM's hands. When building LLM judges, binary scores of true or false are superior to 1-5 scales because they are easier to align with human preferences and reflect actual business decisions. It is critical to measure the judge's performance using True Positive Rate (TPR) and True Negative Rate (TNR) rather than simple agreement, which can be misleading if failures are rare. Code-based evals should handle objective formatting issues, while LLM judges manage subjective calls like conversational flow. Starting with real production data or dogfooding is preferred over synthetic data generation.
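Why agreement misleads when failures are rare is easiest to see with numbers. The sketch below assumes that `True` means "pass" and treats pass as the positive class; the 95/5 split between passing and failing traces is invented for illustration.

```python
def judge_metrics(human: list[bool], judge: list[bool]) -> dict[str, float]:
    """Compare judge labels to human labels. True means 'pass' (the positive class here)."""
    tp = sum(h and j for h, j in zip(human, judge))              # both say pass
    tn = sum((not h) and (not j) for h, j in zip(human, judge))  # both say fail
    positives = sum(human)
    negatives = len(human) - positives
    return {
        "agreement": (tp + tn) / len(human),
        "tpr": tp / positives if positives else float("nan"),
        "tnr": tn / negatives if negatives else float("nan"),
    }


# 100 human-labeled traces where only 5 are failures.
human = [True] * 95 + [False] * 5
always_pass = [True] * 100  # a useless judge that never flags anything

print(judge_metrics(human, always_pass))
# {'agreement': 0.95, 'tpr': 1.0, 'tnr': 0.0}
```

The always-pass judge scores 95% agreement while catching none of the failures, which is exactly what a TNR of 0.0 exposes.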

Key Takeaways

  • Error analysis is the secret weapon for AI products. Manually reviewing 100 traces provides more actionable insight than any automated helpfulness score because it captures nuance like formatting errors or logic failures.
  • PMs should own the prompt and the evaluation process. Since prompts are written in natural language, the domain expert is best suited to iterate on them, and outsourcing this to engineers often leads to a loss of product taste.
  • Avoid agreement as a success metric for LLM judges. A judge can have high agreement simply by always predicting pass in a system with few errors. Use TPR and TNR to ensure the judge actually catches failures.
  • Binary scoring is more effective than Likert scales for LLMs. Business decisions are usually binary, and LLMs struggle with the nuance of numerical scales, making binary outputs easier to validate and align.
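
To see why agreement alone can mislead, here is a small Python sketch that scores a judge against human labels. It assumes that "positive" means a trace passed, so TPR is the share of human-approved traces the judge also passes and TNR is the share of human-rejected traces the judge also fails. The 95/5 split is a made-up dataset for illustration.

```python
def judge_metrics(human_labels, judge_labels):
    """Compare judge verdicts with human labels.

    Both inputs are lists of booleans where True means "pass".
    Returns agreement, TPR, and TNR (None when a class is empty).
    """
    pairs = list(zip(human_labels, judge_labels))
    true_pos = sum(h and j for h, j in pairs)
    true_neg = sum((not h) and (not j) for h, j in pairs)
    positives = sum(human_labels)
    negatives = len(human_labels) - positives
    return {
        "agreement": (true_pos + true_neg) / len(pairs),
        "tpr": true_pos / positives if positives else None,
        "tnr": true_neg / negatives if negatives else None,
    }


# Hypothetical dataset: 95 traces humans passed, 5 they failed.
human = [True] * 95 + [False] * 5
# A lazy judge that passes everything.
always_pass = [True] * 100

metrics = judge_metrics(human, always_pass)
print(metrics)
```

The lazy judge agrees with humans on 95% of traces yet has a TNR of zero, meaning it never catches a single failure.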

Claude Code Advanced Masterclass in Under 81 Mins

Claude Code reached $1B ARR in six months by prioritizing deep work for power users. The Model Context Protocol (MCP) is the core technology enabling this, allowing Claude to connect with Linear, Google Workspace, Slack, and GitHub. This integration shortens the product management cycle from a week to a single morning. A typical workflow involves analyzing surveys, generating a PRD, creating a presentation, and drafting 19 engineering tickets in Linear. Using Opus 4.5, these tasks happen without traditional templates. Instead, users build Skills, which are reusable automation patterns. While these do not always trigger automatically, they provide a massive efficiency boost when called explicitly. The presentation skill can generate 19 fully editable Google Slides in parallel, saving hours of manual formatting. The GitHub integration is particularly powerful, allowing Claude to function as a remote worker that processes issues and updates markdown files while the user is away. For production environments, structured workflows are often superior to autonomous agents. A Level 1 workflow uses roughly 5,000 tokens and runs in 40 seconds, whereas a Level 3 cognitive agent might use 90,000 tokens. Success in this new environment requires building reusable systems rather than one-off prompts. PMs who invest in building these workflows early gain a compounding advantage as models improve. Technical best practices include setting error workflows, adding retry logic, and pinning data during development to ensure consistency. The essential MCP stack for PMs prioritizes document tools first, followed by task management, communication, and data sources. The end goal is a system where the PM spends their entire day within the Claude environment, using it as a central operating system for all product tasks. This approach moves beyond simple chat interactions toward a comprehensive agentic infrastructure.

Key Takeaways

  • The shift from generic AI usage to deep, tool-integrated workflows creates a compounding advantage for product leaders.
  • MCP acts as a universal context layer, making the distinction between different software tools less relevant as Claude becomes the primary interface.
  • Prioritizing code-based workflows over pure cognitive agents improves reliability and reduces token costs for repetitive production tasks.
  • Building a library of reusable Skills and automated hooks transforms the role of a PM from a document creator to a system architect.
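
The retry logic mentioned among the best practices can be illustrated with a generic Python helper. This is a sketch of the general pattern only, not how any particular workflow tool implements it; `run_with_retries`, `flaky_step`, and the default of three attempts are assumptions made for the example.

```python
import time


def run_with_retries(step, max_attempts=3, delay_seconds=0.0):
    """Call step() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception as error:  # broad on purpose for this sketch
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error


# Hypothetical step that fails twice, then succeeds.
calls = {"count": 0}


def flaky_step():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = run_with_retries(flaky_step)
print(result, "after", calls["count"], "attempts")
```

Capping the number of attempts matters as much as retrying: an unbounded loop around a paid API call is the kind of failure an error workflow is meant to surface.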

Google AI PM Reveals the Tools 99% of Product Managers Don’t Use

Marily Nika, a veteran AI Product Manager at Google, outlines a streamlined six-tool stack designed to optimize the PM workflow. The core philosophy shifts from document-heavy processes to a prototype-first approach. Using Google AI Studio, PMs can build functional prototypes to demonstrate features before writing a single line of a Product Requirements Document (PRD). This method facilitates better engineering alignment and reduces the time spent on abstract documentation. For domain expertise and research, Nika utilizes NotebookLM to rapidly ingest long-form content. In one instance, she processed a four-hour investor relations video in fifteen minutes to prepare for an interview. She also uses Perplexity’s Reddit filter to uncover authentic user sentiment and discussions, bypassing traditional web search noise to understand what users actually want. The workflow also incorporates ChatGPT for generating PRDs, specifically using a custom GPT trained on personal writing styles and past documents to ensure consistency and speed. Nika emphasizes that AI tools should only be adopted if they meet four specific criteria: they must save 10x time, work across various contexts, function within company limitations, and compound in value over time. Regarding career growth, she suggests a "crab-like" movement. This involves transitioning into AI roles by leveraging existing domain expertise rather than starting from scratch, such as moving from a hearing aid background to an AirPods PM role. She highlights that the fundamental craft of product management (understanding the "why," the "who," and how to measure success) remains critical. AI changes how features are delivered, but not the underlying use cases. By 2026, the distinction between an AI PM and a standard PM is expected to vanish as AI integration becomes the industry standard. PMs who fail to adopt these tools risk falling behind as the productivity gap widens.

Key Takeaways

  • Prototype-First Workflow: Moving from PRDs to functional prototypes using tools like Google AI Studio allows PMs to align with engineers on actual functionality early, significantly reducing documentation cycles.
  • Strategic Tool Selection: Tools are only worth integrating if they provide a 10x efficiency boost, work across multiple contexts, and compound in value; otherwise, they contribute to tool-hopping fatigue.
  • The "Crab" Career Strategy: Successful transitions into AI PM roles involve moving laterally into adjacent spaces where existing domain knowledge provides a competitive edge over pure technical skills.
  • AI as a Research Accelerator: Tools like NotebookLM and Perplexity's Reddit filter enable PMs to gain deep domain expertise and authentic user insights in minutes rather than days.

Master 80% of n8n in 59 mins

n8n serves as a powerful alternative to Zapier and Make by combining standard automation with advanced AI agent capabilities. Pawel Huryn demonstrates how to build high-value systems like competitor monitoring for a fraction of the cost of enterprise tools. A key highlight is the ability to run these workflows on the free version of n8n using external APIs like Perplexity and OpenAI. The technical deep dive covers the difference between rigid workflows and flexible AI agents. While agents offer more adaptability, traditional workflows remain more reliable and token-efficient for predictable tasks. Practical tips include pinning data to cache API responses during testing, which prevents unnecessary credit burn. Huryn also emphasizes token management by compressing context. He suggests extracting only essential summaries and URLs from search results to cut token usage by up to 70 percent. For coding needs, the strategy is to use ChatGPT to generate n8n code blocks from data screenshots rather than writing them manually. The guide also details essential maintenance practices such as setting error workflows, limiting iterations to avoid infinite loops, and configuring three retry attempts for failed nodes. This approach transforms n8n from a simple connector into a sophisticated research and operations engine suitable for product managers and growth leads.

Key Takeaways

  • Hybrid automation models that mix fixed logic with LLM agents provide the best balance of reliability and intelligence. Use standard nodes for known steps to save tokens and switch to agents only when reasoning is required.
  • Significant cost advantages exist when using the free version of n8n combined with direct API calls. A competitor monitoring system that costs 500 dollars monthly in specialized SaaS can be replicated for a few dollars in API credits.
  • Context window optimization is a critical skill for agentic workflows. Programmatically stripping 70 percent of irrelevant metadata from LLM inputs directly improves response quality and reduces operational overhead.
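
The context-compression idea can be sketched in Python as a filter that keeps only a trimmed summary and the URL from each search result before anything reaches the model. The field names (`summary`, `url`, `html`, `fetched_at`) and the 200-character cap are assumptions for illustration, and character counts are only a rough proxy for tokens.

```python
import json


def compress_results(results, max_summary_chars=200):
    """Keep only a trimmed summary and the URL from each search result."""
    return [
        {
            "summary": item.get("summary", "")[:max_summary_chars],
            "url": item.get("url", ""),
        }
        for item in results
    ]


# Hypothetical raw search result with metadata the model does not need.
raw_results = [
    {
        "summary": "Competitor X launched a usage-based pricing tier.",
        "url": "https://example.com/news/pricing",
        "html": "<div class='article'>" + "lorem ipsum " * 50 + "</div>",
        "fetched_at": "2026-01-01T00:00:00Z",
        "ranking_signals": {"score": 0.91, "source_rank": 3},
    }
]

compact = compress_results(raw_results)
before = len(json.dumps(raw_results))
after = len(json.dumps(compact))
print(f"characters: {before} -> {after}")
```

The actual saving depends entirely on how much metadata the upstream API returns; the "up to 70 percent" figure is the episode's claim, not something this sketch measures.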

How AI PMs Ship Features Users Love (Descript CEO Explains)

Everything PMs Need to Know about ChatGPT’s New Codex (Masterclass)

ChatGPT Codex is OpenAI's command-line interface (CLI) tool designed to compete with Claude Code by providing an agentic experience that goes beyond simple chat. Unlike the browser version of ChatGPT, Codex can read entire folders, execute code, and connect to various APIs to perform complex tasks autonomously for extended periods. Installation is handled via a single terminal command using npm, and users can sign in with their existing ChatGPT Plus or Pro accounts to begin building within environments like Cursor or VS Code. A central feature of the Codex workflow is the AGENTS.md file. This acts as a persistent instruction set where PMs can define project rules, coding standards, and specific constraints. Codex reads this file before every task, effectively treating it as an onboarding document for a new digital team member. For rapid prototyping, the tool includes a YOLO mode activated by the --full-auto flag, which allows the AI to run operations without seeking individual permissions. This is particularly useful for building quick prototypes like TikTok recipe bots or design system components in Storybook. The tool allows for strategic model switching between GPT-5 and GPT-5-Codex. While the standard GPT-5 model is faster and better suited for document analysis or PRD generation, the Codex-specific version is optimized for following technical specifications with high precision. Compared to Claude Code, Codex is roughly 50% cheaper on tokens and faster, though Claude remains superior for complex refactoring and managing sub-agents. PMs can also use Socratic questioning frameworks and markdown-based mega-prompts to automate the creation of high-quality PRDs and technical implementation docs.

Key Takeaways

  • Codex CLI shifts the AI from a passive chat interface to an active agent that manages local file systems and executes scripts directly.
  • The AGENTS.md file serves as a context layer that eliminates prompt repetition by housing permanent project constraints and build commands.
  • YOLO mode provides a high-velocity path for technical prototyping but requires a sandboxed or safe environment to prevent unintended system changes.
  • Model switching allows PMs to optimize for either analytical speed or technical accuracy depending on whether they are writing docs or building features.
  • Codex acts as a bridge for non-technical PMs to perform 'vibe engineering' by providing detailed specs that the AI can then build autonomously.

What AI PMs REALLY Need to KNOW in 2026 (Agents, Discovery, EVERYTHING)

AI PM roles now make up 20% of all product listings and command a salary premium of 30% to 40% due to the scarcity of high-level skills. Todd Olson, CEO of Pendo, outlines a transition path from core PM to AI PM using a five-layer technical pyramid. The foundation involves mastering AI fundamentals, data pipelines, and prompt engineering. However, the more advanced layers require moving into observability, trace analysis, and cost optimization. Retrieval-Augmented Generation (RAG) has become the standard for building AI features, but PMs must learn that providing too much context can confuse a model just as it would a human. A significant point of friction exists in trace analysis, where PMs may clash with engineering managers over technical boundaries. Despite this, evaluation sets (Evals) are the undisputed domain of the PM. Because PMs understand the user and business goals best, they are uniquely qualified to define what a successful model output looks like. From a business perspective, AI features can drastically reduce gross margins, sometimes dropping them from the traditional 80% to below 15%. This makes cost and performance optimization a critical product strategy. Successful AI products must solve difficult workflow problems, such as Pendo's discovery agent, rather than simply wrapping an existing LLM with a new logo. PMs are encouraged to ruthlessly cut features that do not perform and to lead board-level conversations with a clear narrative on how AI bets drive shareholder value.

Key Takeaways

  • Production at scale is the primary hiring requirement. While many can build a prototype, hiring managers prioritize PMs who have successfully shipped AI features to thousands of paying B2B customers.
  • Evals are the new product requirement document. Engineers manage the infrastructure, but PMs must own the creation and management of evaluation sets to ensure the AI meets business and user needs.
  • The AI margin gap represents a major strategic risk. Traditional SaaS margins are threatened by high compute costs, meaning PMs must treat cost optimization as a core part of the product roadmap.
  • Focus on automating complex workflows rather than building shiny objects. Sustainable AI products solve hard problems like scheduling and prioritization instead of just acting as simple ChatGPT wrappers.
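
The margin compression described above is simple arithmetic: gross margin is revenue minus cost of goods sold, divided by revenue. The Python sketch below uses made-up per-seat numbers chosen only to reproduce the 80% and sub-15% figures; none of them come from the episode.

```python
def gross_margin(revenue, cost_of_goods_sold):
    """Gross margin as a fraction of revenue."""
    return (revenue - cost_of_goods_sold) / revenue


# Hypothetical monthly figures per seat, in dollars.
revenue_per_seat = 100.0
hosting_cost = 20.0      # traditional SaaS cost to serve
inference_cost = 66.0    # assumed LLM spend for a heavily used AI feature

margin_before = gross_margin(revenue_per_seat, hosting_cost)
margin_after = gross_margin(revenue_per_seat, hosting_cost + inference_cost)
print(f"gross margin: {margin_before:.0%} -> {margin_after:.0%}")
```

Because inference cost scales with usage rather than with seats, the most engaged customers can be the least profitable, which is why cost optimization belongs on the roadmap.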

I stole the AI product stack of the top 1% product managers for you (full tutorial)

The IC CPO model allows leaders to self-serve answers with AI and automate high-level workflows. This stack centers on Claude Code and Cursor to build custom agents for calendar analysis, email triage, and real-time data analytics. The calendar agent identifies delegation opportunities by reviewing meeting patterns and red flags like excessive context switching. The email agent drafts replies and flags missing information, such as a missing meeting link, while observing the user's behavior to prioritize tasks. A key technical highlight is connecting Claude Code to Snowflake using MCP servers. This setup allows natural language queries to execute SQL directly, enabling a leader to ask questions like how many sites a specific customer has without waiting for a data scientist. To scale these behaviors across an organization, Webflow uses Builder Days. These sessions move teams from 0% to 30% AI adoption by providing licenses and technical support to build prototypes outside their usual comfort zones. This cultural shift is reinforced by updating career ladders to make AI-native work a core expectation rather than a bonus skill. Product development follows the MVO (Minimum Viable Output) framework, which prioritizes perfecting the model's output through RAG and context engineering before building a traditional MVP. This ensures the core AI value proposition works before investing in the product shell. Rigorous evals using tools like Braintrust are treated as essential test cases to prevent model regressions, a lesson learned when a model update nearly broke a major launch. The strategy emphasizes building on existing company strengths, such as Webflow's CMS and hosting infrastructure, rather than chasing generic AI trends. This approach turns AI from a feature into a foundational layer of the product and the leadership workflow. By focusing on production-grade outputs from the start, teams avoid the trap of shipping shallow wrappers and instead build deeply integrated agentic features.

Key Takeaways

  • The IC CPO model shifts leadership from delegation-heavy management to high-leverage building. By using agentic tools to handle data analysis and scheduling, leaders reduce their dependency on internal teams and speed up decision cycles.
  • The MVO framework flips traditional SaaS development by focusing on the quality of the AI's response before designing the interface. If the model cannot consistently produce the desired output through prompt and context engineering, the product is not ready for an MVP phase.
  • Organizational AI adoption requires structured Builder Days to bridge the gap between early adopters and the late majority. Providing the right tools like Cursor and MCP servers in a low-stakes environment creates the necessary momentum for a cultural shift.
  • Evals are becoming a core competency for product managers. Treating model outputs like code that requires comprehensive test cases ensures reliability and prevents launch-day failures when underlying models are updated.
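
Treating evals as test cases can be sketched in Python as a fixed set of checks that is rerun whenever the underlying model changes. Everything here is hypothetical: the case names, the keyword checks, and the two stub "models" stand in for a real eval tool and real model calls.

```python
# Hypothetical eval cases: each output must contain every required term.
EVAL_CASES = [
    {
        "name": "hero_mentions_product",
        "prompt": "Write a hero headline for Acme CMS",
        "must_include": ["acme"],
    },
    {
        "name": "pricing_has_cta",
        "prompt": "Write pricing page copy with a call to action",
        "must_include": ["start free"],
    },
]


def run_evals(generate, cases):
    """Run each case through generate(prompt); return names of failed cases."""
    failures = []
    for case in cases:
        output = generate(case["prompt"]).lower()
        if not all(term in output for term in case["must_include"]):
            failures.append(case["name"])
    return failures


# Stub models standing in for two versions of an LLM-backed feature.
def model_v1(prompt):
    return "Acme CMS helps teams ship. Start free today."


def model_v2(prompt):
    return "Acme CMS helps teams ship faster."


print("v1 failures:", run_evals(model_v1, EVAL_CASES))
print("v2 failures:", run_evals(model_v2, EVAL_CASES))
```

Running the same cases against both versions surfaces the regression (the missing call to action) before launch rather than after it.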

$10M ARR in 60 days with context engineering

Xiankun Wu achieved $10M ARR in 60 days with Kuse.ai by prioritizing context engineering over traditional prompting. This approach treats AI like a long-term hire rather than a one-off tool. Instead of relying on single prompts that often fail due to lack of background, context engineering combines system prompts, user memory, and Retrieval-Augmented Generation (RAG) to build a persistent knowledge base. This creates a "Mom Analogy" effect where the AI understands user preferences and goals without needing detailed instructions every time. Interactions improve over time as the AI accumulates context, creating a positive feedback loop that standard chatbots lack. The growth strategy relied on a massive "intern army" managing hundreds of Threads accounts. By posting daily use cases in underserved markets like Taiwan and Hong Kong, they generated 3 million impressions monthly with zero ad spend. Wu argues that Threads is superior to X for user acquisition because it lacks a rigid creator hierarchy and offers more generous organic reach. This allowed them to farm traffic and drive hundreds of daily visits to the product without the need for VC funding or traditional advertising. A core product philosophy introduced is Minimal Viable Output (MVO) over the traditional Minimal Viable Product (MVP). For AI startups, the priority should be perfecting the model's output through RAG and fine-tuning before investing in productization or UI. If the output isn't valuable, the features don't matter. Kuse also utilizes visual context engineering, providing a 2D spatial interface where users can draw and organize information. This allows the AI to understand spatial relationships and compounds the value of stored data over time. The product originally launched as a design agent but pivoted after observing that users were primarily using it as a horizontal knowledge base, proving the value of listening to user behavior over initial assumptions.

Key Takeaways

  • Context engineering creates a compounding data moat. Unlike standard chatbots where context is ephemeral, a persistent knowledge layer allows AI to provide increasingly accurate results without repetitive instructions.
  • The MVO framework shifts the focus from features to quality of results. In AI-native development, the ability to generate the correct output is the primary value proposition, making traditional UI-first development secondary.
  • Threads represents a significant arbitrage opportunity for GTM. The lack of an established creator hierarchy and the presence of real users make it a more effective acquisition channel than the saturated and connection-heavy environment of X.
  • Spatial interfaces provide a superior context layer for AI. Moving beyond linear chat to 2D environments allows AI to grasp complex relationships between different data points, leading to more sophisticated reasoning.

AI PM is the Job Opportunity of the Decade (Crash Course)

AI product management is shifting from a hype-driven role to one requiring significant technical depth. To land high-compensation roles at companies like OpenAI or Anthropic, PMs must move beyond basic prompt engineering into context engineering and system architecture. This involves mastering the design of instructions by combining system prompts, user prompts, long-term memory, and Retrieval-Augmented Generation (RAG) to create personalized experiences. A core distinction for modern AI PMs is understanding when to use RAG versus fine-tuning. RAG is the preferred method for integrating frequently changing knowledge, while fine-tuning is better suited for teaching a model specific vocabularies or specialized response patterns. The technical stack for rapid prototyping has also evolved, with tools like Lovable for front-end development and n8n for workflow automation allowing PMs to build functional AI applications in under 30 minutes. The path to becoming an AI PM follows a five-step architecture: understanding LLM fundamentals, building applications, mastering prompt engineering, implementing RAG, and finally developing agentic systems. Success in this field requires a build-first mentality, where practitioners create at least ten projects to understand how AI solves real business problems. Strategic implementation follows a three-wave approach, starting with efficiency gains, moving to quality improvements, and eventually reaching novel capabilities. Real-world applications, such as those at Traversal AI, demonstrate the power of agentic systems in processing massive datasets for demand forecasting and inventory optimization.

Key Takeaways

  • Context engineering is the new prompt engineering. It involves a more sophisticated design of system instructions and memory layers to achieve true personalization rather than just simple text generation.
  • The technical barrier for PMs has moved. Understanding the architectural difference between RAG for knowledge retrieval and fine-tuning for style or vocabulary is now a baseline requirement for high-level roles.
  • Rapid prototyping tools like n8n and Lovable have compressed development cycles. PMs can now validate agentic workflows and front-end interfaces in minutes, shifting the focus from how to build to what to build for maximum business impact.
  • Strategic AI adoption follows a predictable three-wave maturity model. Companies should first solve for time-saving efficiency before attempting to improve output quality or create entirely new product categories.
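
The four ingredients of context engineering named above (system prompt, user prompt, long-term memory, and retrieved knowledge) can be shown being assembled into one model input. This Python sketch is illustrative only: the keyword-overlap `retrieve` function is a toy stand-in for a real RAG retriever, and the section labels and sample data are assumptions.

```python
def retrieve(query, documents, top_k=2):
    """Toy retrieval: rank documents by shared lowercase words with the query."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(doc.lower().split())), doc) for doc in documents
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_context(system_prompt, memory, documents, user_prompt):
    """Combine system prompt, memory, retrieved docs, and the user prompt."""
    retrieved = retrieve(user_prompt, documents)
    sections = [
        "SYSTEM:\n" + system_prompt,
        "MEMORY:\n" + "\n".join(f"- {item}" for item in memory),
        "RETRIEVED:\n" + "\n".join(f"- {doc}" for doc in retrieved),
        "USER:\n" + user_prompt,
    ]
    return "\n\n".join(sections)


# Hypothetical knowledge base and user memory.
knowledge_base = [
    "Refund policy: refunds are issued within 14 days.",
    "Shipping takes 3 to 5 business days.",
    "Office hours are 9 to 5.",
]
user_memory = ["Prefers short answers", "Is on the annual plan"]

context = build_context(
    "You are a support assistant.",
    user_memory,
    knowledge_base,
    "How long do refunds take?",
)
print(context)
```

Only the refund document is retrieved here, which reflects the point made elsewhere in this collection that too much context can confuse a model: the retriever's job is as much to exclude as to include.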

I Put Every AI Prototyping Tool to the Ultimate Test

Alex Danilowicz, CEO of Magic Patterns, demonstrates how AI prototyping tools are shifting the product development lifecycle from a write-then-build model to a prototype-then-validate approach. In a live head-to-head test, V0 narrowly beat Magic Patterns with a 3.7 GPA versus 3.6, while Replit, Lovable, and Bolt showed varying degrees of success in building a consumer-facing workflow builder. Magic Patterns focuses on visual fidelity and design system integration, whereas tools like Replit and V0 lean toward full-stack functionality and API connectivity. A critical feature of Magic Patterns is its Chrome extension that converts production HTML or Figma designs directly into Tailwind components for use in prompts. Prototyping reduces the 80% feature failure rate common in SaaS by validating usability and viability before engineering resources are committed. The suggested 4-step workflow involves defining an end goal, setting up a design system preset, gathering context from PRDs or screenshots, and iterating using specific select mode prompts. The discussion also covers context rot, the importance of understanding LLM context windows, and the shift toward PMs acting as technical orchestrators who use AI to bridge the gap between design and production code. This workflow helped Magic Patterns scale to $1M in revenue within six months by focusing on the component library angle of development.

Key Takeaways

  • Prototyping cuts feature failure rates from 80% to 50% by moving validation to the start of the cycle. This allows PMs to test every feature rather than just high-stakes bets.
  • Tool selection depends on the end goal of the prototype. Use Magic Patterns for high-fidelity user research and design system alignment; use V0 or Replit for functional backends and API integrations.
  • Design system presets are the moat for enterprise prototyping. Ingesting existing components via Chrome extensions ensures AI-generated UI matches brand standards immediately, preventing doom loops of styling corrections.
  • The PRD is evolving into a prompt. Instead of static documentation, high-performing PMs use PRDs and Jira tickets as context for AI tools to generate interactive solutions that replace dozens of alignment meetings.

The Product Delight Framework for AI PMs (How AI Products Like ChatGPT Win)

Product delight in AI goes beyond aesthetic flourishes like animations or confetti. It requires a transition from surface delight to deep delight, where functionality directly addresses a user's emotional state. Research indicates that emotionally connected users are twice as valuable as those who are merely satisfied, as they are more likely to stay, recommend, and purchase. The Delight Model provides a structured four-step process: identifying functional and emotional motivators, turning those motivators into product opportunities, building solutions, and validating the experience to prevent negative outcomes. A core component of this strategy is the 50/40/10 rule for roadmap allocation. Teams should spend 50% of their effort on low-delight core functionality, 40% on deep-delight features that differentiate the product, and 10% on surface delight that adds brand personality. This is mapped on a Delight Grid where features are evaluated based on how well they meet both functional and emotional needs. For example, Gmail Smart Compose provides deep delight because it solves a functional task while simultaneously reducing the user's stress. If a feature only solves a functional need, it remains low delight. If it only hits an emotional note without utility, it is merely surface level. Humanization is a critical technique for AI PMs. Instead of comparing a product to a direct competitor, teams should compare the AI experience to a high-quality human service. Google Meet used this by benchmarking against in-person meetings rather than other video tools. Dyson applied similar logic by positioning its robots as equivalent to hiring a human cleaner. However, AI products face unique risks with corner cases. Probabilistic outputs can lead to delight disasters, such as insensitive automated summaries of personal tragedies or broken AI logic in sensitive contexts. Success in the AI era, as seen with ChatGPT, often stems from the feeling of companionship and personalization rather than just raw accuracy. Validating these experiences requires a ten-point checklist covering business value, inclusivity, and measurability before any feature is shipped to users.

Key Takeaways

  • Deep delight occurs when a product solves a functional problem while simultaneously addressing an emotional pain point like stress or loneliness.
  • The 50/40/10 rule ensures that product teams do not neglect core stability while still investing heavily in the emotional differentiators that drive retention.
  • AI products should be benchmarked against human empathy and service levels rather than just technical specs or competitor feature lists.
  • Managing the 0.01% edge cases is more critical in AI than traditional software because the emotional stakes of automated hallucinations or insensitive summaries are much higher.
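
The Delight Grid and the 50/40/10 rule can be expressed as a small Python sketch. The classification follows the summary above (functional plus emotional is deep, functional only is low, emotional only is surface); the fourth "none" bucket and the 40-point sprint are assumptions added so the example is complete.

```python
def delight_category(meets_functional_need, meets_emotional_need):
    """Place a feature on the Delight Grid."""
    if meets_functional_need and meets_emotional_need:
        return "deep"
    if meets_functional_need:
        return "low"
    if meets_emotional_need:
        return "surface"
    return "none"  # assumption: a feature meeting neither need is not delight


# 50/40/10 roadmap split, in percent of total effort.
TARGET_PERCENT = {"low": 50, "deep": 40, "surface": 10}


def allocate(total_points):
    """Split a planning budget across the three delight categories."""
    return {
        category: total_points * percent / 100
        for category, percent in TARGET_PERCENT.items()
    }


# Smart Compose solves a task and reduces stress, so it lands in "deep".
print(delight_category(True, True))
print(allocate(40))
```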

How to Land a $700K+ AI PM Job (Full 66-Min Roadmap)

The AI product management market is undergoing a massive shift, with 20% of all PM roles now mentioning AI compared to just 2% in 2023. These roles command a 30% to 40% salary premium, with Group PMs earning between $360,000 and $600,000, and CPOs reaching over $2 million. Landing these positions requires a shift from generic applications to a high-signal strategy focused on impact, scope, and recognizability. Recruiters typically spend only seven seconds scanning a resume, making the top three lines the most critical real estate. A successful template highlights years of experience, specific expertise, recognizable company names, and quantified revenue or user growth. To build this content efficiently, candidates can use tools like Whisper to brain dump career history at 200 words per minute, creating a bullet vault that is then categorized into bundles like product development, leadership, and technical execution. Outreach is the most effective way to secure interviews, yielding a 10% to 15% callback rate compared to the 1% average for cold applications. The strategy involves identifying the hiring manager, recruiter, and senior PMs for a role using tools like ContactOut to find direct emails. Messages should be under 150 words, featuring one intro line, three impact-focused bullets, and a clear call to action. Persistence is key, with a recommended follow-up cadence on days two, three, and five. For interview preparation, the Hook-Principles-Action-Results-Learnings framework helps structure behavioral stories. AI can be used as a sparring partner to grade case interviews against six dimensions: structured thinking, user focus, product sense, prioritization, communication, and creativity. By treating the job search as a sales funnel and leveraging AI for resume tailoring and interview rubrics, PMs can realistically bridge the gap from mid-level compensation to high-tier AI roles.

Key Takeaways

  • The AI premium is driven by a supply-demand imbalance where 20% of roles now require AI literacy, allowing candidates to negotiate 30-40% higher compensation packages.
  • Effective resume tailoring focuses on extracting non-generic must-haves from job descriptions using AI to ensure the summary and top bullets align with the hiring manager's specific pain points.
  • A 10-15% callback rate is achievable by bypassing automated tracking systems and messaging hiring teams directly with a focus on solving their problems rather than asking for favors.
  • Interview success relies on a tiered preparation model: starting with written content for clarity, moving to spoken delivery for naturalness, and finally adding strict time constraints to mimic high-pressure FAANG environments.
  • Building a target list of 50-100 companies across public, late-stage, and early-stage categories provides the necessary leverage to maximize total compensation during the offer stage.

How To ACE AI Product Sense Interviews (OpenAI PM Mock Interview)

We Ranked Every AI Tool for Product Managers — So You Don’t Have To

Anshumani Ruddra, a Group PM at Google, evaluates over 70 AI tools to identify a high-performance stack for product managers. The evaluation prioritizes tools that integrate deeply into existing workflows rather than standalone apps. Claude Code emerges as the top recommendation, specifically for its ability to handle complex codebase tasks across multiple terminal windows. This allows PMs to move from conceptual ideas to functional code rapidly. For prototyping, Replit Agent is highlighted as a superior choice due to its long-running planning capabilities and deep IDE integration, while Bolt is noted for its structured approach to full-stack deployment. In the realm of productivity and communication, superwhisper is categorized as S-tier for its high-accuracy dictation, effectively replacing traditional typing for debugging and documentation. Granola is preferred for meetings because it uses historical context to generate intelligent talking points, distinguishing it from standard transcription services like Otter. For agentic workflows, Lindy.AI is favored for its natural language interface, enabling PMs to build custom assistants for email and research without technical overhead. The analysis also notes a shift in tool dominance. Perplexity is downgraded to C-tier as its core search functionality is absorbed by other integrated AI modes. Cursor maintains an A-tier ranking primarily due to its user interface, which places the AI agent in a side panel that aligns better with PM cognitive patterns. The overarching strategy for tool adoption focuses on identifying specific weekly time-sinks and selecting tools that directly address those bottlenecks rather than following general trends.

Key Takeaways

  • The transition from AI as a chat box to AI as a terminal agent like Claude Code marks a shift where PMs can directly manipulate codebases without deep technical mastery.
  • UX design is becoming a primary differentiator for AI adoption, as seen with Cursor's side-panel agent outperforming more powerful but less intuitive interfaces.
  • Dictation technology like superwhisper is reaching a tipping point where it can realistically replace typing for complex professional tasks, fundamentally changing how PMs document and communicate.
  • Effective AI tool selection requires a workflow-first audit rather than a tool-first approach, focusing on high-friction tasks like meeting prep and rapid prototyping.

I got a private Masterclass in AI PM from Google AI PM Director

Jaclyn Konzelmann, Director of AI Product at Google, outlines the evolving landscape of AI product management. She emphasizes that AI models like Imagen now possess world models, understanding regional nuances like weather patterns without explicit prompting. This shift requires PMs to move from managing deterministic logic to guiding probabilistic outputs. A core framework introduced is the Anatomy of an Agent, which consists of models, tools, and memory. PMs must define these components before starting technical development. Another critical concept is the User Interaction Spectrum, which distinguishes between "do it for me" autonomous agents and "do it with me" collaborative tools. Konzelmann advocates for a Think Big, Ship Fast approach using an inverted triangle framework. This involves maintaining a massive vision while ruthlessly cutting scope for MVPs and using beta labels to manage expectations. She highlights the importance of the Paradigm Shift question: is the product just a faster horse or a completely new mode of transport? For those looking to enter the field, she suggests running ten side projects simultaneously to build intuition rather than focusing on a single perfect launch. Google's hiring criteria for AI PMs focus on product taste, visionary leadership, and the ability to handle chaos. She warns that any single AI idea can be commoditized quickly, so the real skill lies in consistent idea generation and execution rather than protecting a specific concept.

Key Takeaways

  • The Future-Proofing Question is vital because rapid model improvements can instantly invalidate months of custom engineering work.
  • Building AI products requires a shift from process improvement to workflow transformation to avoid building temporary solutions.
  • The 10 side projects strategy builds the necessary AI intuition to predict how models will behave in edge cases.
  • Successful AI PMs must balance visionary leadership with full-spectrum execution to navigate the ambiguity of emerging tech.

How to Run a $100M Company with AI: v0 + Devin Tutorial from Gumroad CEO, Sahil Lavingia

Sahil Lavingia demonstrates how Gumroad operates as a high-revenue company with minimal headcount by leveraging AI agents like Devin and v0. The core philosophy shifts from human-centric coordination to an AI-first architecture. This approach uses a three-tier workflow categorized by task complexity. Small tasks move directly from Slack to Devin for production deployment. Medium tasks involve using GPT for requirements and v0 for prototyping. Large projects start with a brief that AI tools interpret to create a foundation for further refinement in Cursor. A major shift in this model is the move away from traditional, lengthy Product Requirement Documents. Instead of 20-page specs, Lavingia uses short prompts to let AI generate prototypes. These prototypes reveal gaps in logic or communication, making the AI an active partner in refining the product vision. This iterative process replaces the need for extensive upfront documentation and cross-functional alignment meetings. The technical architecture is intentionally simplified to be AI-friendly. Gumroad is migrating from thousands of lines of custom CSS to Tailwind. This change is strategic because Tailwind provides a standardized design system that AI can easily understand and manipulate without side effects across hundreds of files. By reducing the codebase complexity, the company ensures that AI agents can make precise changes with high confidence. Financially, the goal is to reach $10M EBITDA while maintaining a lean structure. This dictatorship model eliminates the friction of buy-in from multiple departments, allowing the founder to move from idea to production in minutes. The focus is on perfecting the software and maximizing dividends rather than scaling the human organization.

Key Takeaways

  • Architecture is the primary bottleneck for AI agents. Moving to industry standards like Tailwind creates a predictable environment where AI can code without breaking global dependencies.
  • Prototyping is the new specification. Using AI to build immediate, low-fidelity versions of an idea identifies missing requirements through observation rather than theoretical planning.
  • The dictatorship model provides a massive speed advantage over traditional corporate structures by removing the need for cross-functional consensus.
  • Parallelizing AI workflows solves the latency issue. Running multiple AI sessions simultaneously allows a single person to manage several complex workstreams without waiting for a single agent to finish.
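
The three-tier routing and the parallel-session idea can be sketched in a few lines of Python. This is an illustration only: the `route`, `run_session`, and `run_all` names, the size labels, and the simulated delay are assumptions, and no real Devin, v0, or Cursor API is called.

```python
import asyncio

# Tier -> tool chain, paraphrasing the workflow in the summary above.
PIPELINES = {
    "small": ["Slack", "Devin"],
    "medium": ["GPT (requirements)", "v0 (prototype)"],
    "large": ["brief", "AI-generated foundation", "Cursor (refinement)"],
}


def route(size: str) -> list[str]:
    """Return the hypothetical tool chain for a task of the given size."""
    return PIPELINES[size]


async def run_session(task: str, size: str) -> str:
    """Stand-in for one agent session; it sleeps instead of calling a real agent."""
    await asyncio.sleep(0.01)
    return f"{task}: " + " -> ".join(route(size))


async def run_all(tasks: list[tuple[str, str]]) -> list[str]:
    # gather() starts every session at once, so the total wait is roughly
    # the slowest session rather than the sum of all sessions.
    return await asyncio.gather(*(run_session(t, s) for t, s in tasks))


results = asyncio.run(run_all([("fix typo", "small"), ("new checkout", "large")]))
print(results)
```

The point of the sketch is the shape of the workflow: one person queues several workstreams and reviews results as they come back, instead of waiting on a single agent.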

Masterclass: How to Turn an AI Agent into a Real Product (No Code)

Tyler Fisk outlines a framework for moving beyond vibe coding to build reliable, production-grade AI agents. A central piece of this approach is Gigawatt, a meta-prompting agent with 72,000 characters of instructions designed specifically to build other agents. This meta-agent researches domains, writes initial instructions, and iteratively improves them based on self-evaluation, typically raising quality from 77% to over 86%. The architecture shifts away from single, all-purpose agents toward multi-agent systems that separate concerns. For example, a customer service setup might use a Core agent set to a temperature of 0 for precise fact-finding and an Echo agent set to 0.7 for creative email drafting. System instructions for these agents are extensive, often spanning 7,000 to 9,000 tokens, covering roles, business context, step-by-step processes, and guardrails. Reliability is maintained through a strict information hierarchy. Agents prioritize RAG databases containing company documents first, followed by system instructions, and finally web searches with chain-of-verification to prevent hallucinations. A complete production workflow involves multiple specialized agents, such as Cinnamon for sentiment analysis, and always includes a human-in-the-loop checkpoint, often via Slack, before any external communication is sent. From a business perspective, Fisk frames the value of these agents through labor cost reduction. By automating expert tasks while keeping a human reviewer, companies can see annual costs drop from roughly $138,000 to $46,000. He also notes that emotion prompting, or adding positive reinforcement like "Go get 'em slugger" to prompts, can improve LLM performance by about 15%.

Key Takeaways

  • Multi-agent systems outperform single agents by isolating logic from creativity. Separating a deterministic researcher from a creative writer prevents the common failure where a single agent tries to do too much and loses precision.
  • Human-in-the-loop is non-negotiable for production. Moving from 80% to 100% reliability requires a manual checkpoint, usually integrated into existing team tools like Slack, to ensure the AI doesn't go rogue.
  • Information hierarchy is the primary defense against hallucinations. By forcing the agent to check internal RAG data before searching the web or relying on its own training data, you create a verifiable ground truth for the system.
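
A minimal Python sketch of the two ideas above, under stated assumptions: the agent names and temperatures (Core at 0, Echo at 0.7) come from the summary, while `AgentConfig`, `answer_with_hierarchy`, and the toy lookup tables are hypothetical stand-ins for a real RAG store, system prompt, and web search.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AgentConfig:
    name: str
    temperature: float
    role: str  # illustrative label, not from the source


# Separate concerns: a deterministic fact-finder and a more creative drafter.
CORE = AgentConfig("Core", 0.0, "fact-finding")
ECHO = AgentConfig("Echo", 0.7, "email drafting")

Source = Callable[[str], Optional[str]]


def answer_with_hierarchy(question: str, sources: list[tuple[str, Source]]) -> tuple[str, str]:
    """Try each source in priority order and return (source_name, answer)."""
    for name, lookup in sources:
        found = lookup(question)
        if found is not None:
            return name, found
    # Nothing verifiable was found, so hand off to the human reviewer.
    return "none", "escalate to human reviewer"


# Hypothetical data standing in for company documents and system instructions.
rag_docs = {"refund window": "30 days"}
instructions = {"tone": "friendly and concise"}

SOURCES = [
    ("rag", rag_docs.get),                     # 1. company documents first
    ("system_instructions", instructions.get),  # 2. then system instructions
    ("web_search", lambda q: None),            # 3. web search last (stubbed out here)
]

print(answer_with_hierarchy("refund window", SOURCES))
```

On the cost framing, the quoted drop from $138,000 to $46,000 works out to a two-thirds reduction, since 46,000 is exactly one third of 138,000.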

I should be charging $999 for this Claude Code Tutorial

Claude Code shifts AI interaction from a browser chat to a terminal-resident agent that interacts directly with your local file system. It removes the friction of manual file uploads by living inside your project folders, where it can read code, execute terminal commands, and search the web natively. This setup enables a context engineering strategy where you organize your project into a specific folder structure containing business info, writing styles, and meeting transcripts. The agent references these files automatically, ensuring it always has the right background for any task. The system uses a CLAUDE file to maintain permanent project memory and governance. You can set rules here, like specific technical writing styles or commit protocols, and they stay active across every session without needing to be repeated in every prompt. Custom slash commands allow you to save your best prompts as shortcuts for recurring tasks like PRD reviews, meeting note formatting, or code audits. For high-stakes or complex tasks, Plan Mode lets the agent draft its full intent for your approval before it touches any code. This provides a critical safety layer that prevents the AI from making automated mistakes on sensitive files or complex architectures. A significant capability for productivity is multi-agent parallelization. You can spin up several specialized agents, such as a UXR researcher, a design critic, and a technical architect, to work on the same project simultaneously. This turns a week of manual analysis into an hour of oversight. The tool also provides real-time visibility into token costs and usage, helping you understand the actual economic spend of your AI workflows. By pairing Claude Code's research and writing capabilities with Cursor's coding strengths, you can build a high-performance AI stack that balances cost, speed, and technical depth. 
This transition from simple prompt engineering to structured context engineering is the primary differentiator for product managers working in AI-native environments.

Key Takeaways

  • Terminal-native agents solve the context bottleneck by removing manual uploads and letting the AI navigate directory trees on its own.
  • The CLAUDE.md file creates persistent project governance, so constraints and style guides stick without you having to re-prompt.
  • Multi-agent parallelization moves you from sequential work to concurrent workflows, letting one person manage a fleet of specialized AI personas.
  • Plan Mode acts as a safety layer for agentic workflows, giving you a checkpoint to catch errors before the AI executes complex changes.
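
One way to picture the context engineering setup is a small scaffolding script. The sketch below assumes the project memory file is the `CLAUDE.md` that Claude Code reads from the project root and that custom slash commands live as Markdown files under `.claude/commands/`; the `context/...` folder names, the rule text, and the `prd-review` command are hypothetical examples, not the layout used in the tutorial.

```python
import tempfile
from pathlib import Path

# Hypothetical context folders for business info, writing styles, and transcripts.
CONTEXT_DIRS = [
    "context/business",
    "context/writing-style",
    "context/meeting-transcripts",
]

# Example rules only; real projects would write their own.
MEMORY_RULES = """# Project rules
- Write in plain, direct technical prose.
- Propose a plan before editing more than one file.
"""

# Example prompt saved as a reusable slash command.
PRD_REVIEW_PROMPT = "Review the PRD in this folder and list missing requirements.\n"


def scaffold(root: Path) -> list[Path]:
    """Create the context folders, the memory file, and one slash command."""
    created = []
    for rel in CONTEXT_DIRS:
        folder = root / rel
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)

    memory = root / "CLAUDE.md"
    memory.write_text(MEMORY_RULES, encoding="utf-8")
    created.append(memory)

    command = root / ".claude" / "commands" / "prd-review.md"
    command.parent.mkdir(parents=True, exist_ok=True)
    command.write_text(PRD_REVIEW_PROMPT, encoding="utf-8")
    created.append(command)
    return created


project = Path(tempfile.mkdtemp())  # throwaway directory so the sketch is safe to run
paths = scaffold(project)
for p in paths:
    print(p.relative_to(project))
```

Keeping the rules and reusable prompts in files rather than in chat history is what lets them persist across sessions.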

Complete Course: AI Agent Products (with Warp.dev CEO Zach Lloyd)

AI Is the Biggest Cyber Threat — Only Okta’s AI Security Playbook can save you

Identity serves as the foundation for modern security because over 80% of breaches now target credentials rather than network vulnerabilities. Jack Hirsch from Okta highlights that the most pressing threats involve sophisticated social engineering and the rapid deployment of AI agents. One particularly alarming trend involves North Korean operatives who successfully navigate full interview processes to gain internal access at major firms. These individuals often have laptops shipped to device farms while they operate as insider threats from remote locations. AI agents represent a significant security blindspot because companies often deploy them without treating them as distinct identities that require formal access management. To address this, a new OAuth standard is emerging that allows these agents to inherit user permissions across enterprise applications without requiring individual authentication steps for every employee. This streamlines operations while maintaining a clear audit trail of what the agent is doing on behalf of the human user. Okta's security playbook focuses on a T-Shaped Identity Strategy. This involves deep security measures like phishing resistant authentication and lifecycle management combined with broad integration across all enterprise systems. A key shift in their methodology is moving away from point-in-time logins toward continuous session monitoring. Instead of repeatedly prompting users for MFA, the system looks for risk signals shared between different security vendors to detect if a session has been hijacked after the initial login. When building AI products, the core principle is to accelerate human workflows rather than abdicating responsibility to the machine. This means solving real problems before jumping into prototyping and maintaining a healthy skepticism toward the AI hype cycle. 
For personal security, the advice is practical and immediate: lock credit reports, use a password manager with unique credentials, enable passkeys, and set a carrier PIN to prevent SIM swapping. These steps form a baseline defense against the increasingly automated nature of modern identity theft.

Key Takeaways

  • Identity is the new perimeter. Since most attacks bypass firewalls by compromising users, security strategy must center on identity lifecycle and authentication strength rather than just network defense.
  • AI agents are currently treated as shadow identities. Organizations are creating a massive governance gap by letting agents access sensitive data without the same oversight and access management applied to human employees.
  • Continuous risk assessment is replacing static sessions. The future of security relies on real-time signal sharing between different software vendors to kill compromised sessions instantly rather than waiting for the next login prompt.
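
The move from point-in-time logins to continuous session monitoring can be illustrated with a toy risk scorer. Everything here is an assumption made for the sketch: the signal names, the integer weights, the threshold, and the `Session` and `ingest` names. It does not represent Okta's product or any vendor's signal-sharing format.

```python
from dataclasses import dataclass, field

# Hypothetical signal names and weights; real vendors define their own.
SIGNAL_WEIGHTS = {"new_device": 30, "impossible_travel": 50, "ip_reputation_bad": 40}
REVOKE_THRESHOLD = 60


@dataclass
class Session:
    user: str
    active: bool = True
    signals: list[str] = field(default_factory=list)

    def risk(self) -> int:
        """Sum the weights of every signal seen so far; unknown signals count as 0."""
        return sum(SIGNAL_WEIGHTS.get(s, 0) for s in self.signals)


def ingest(session: Session, signal: str) -> Session:
    """Record a shared risk signal and revoke the session once risk crosses the threshold."""
    session.signals.append(signal)
    if session.risk() >= REVOKE_THRESHOLD:
        session.active = False
    return session


s = Session("alice")
ingest(s, "new_device")         # risk 30, session stays active
ingest(s, "impossible_travel")  # risk 80, session is revoked mid-session
print(s.active, s.risk())
```

The session is evaluated every time a signal arrives, not only at login, which is the behavior the summary contrasts with repeated MFA prompts.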

AI Agents • Live Demo: The Game-Changer for 2025

Jacob Bank, founder of Relay.app and former Director of Product at Gmail, shares how AI agents are transforming productivity for product managers and go-to-market teams. Bank currently runs a marketing operation powered by 55 agents and manages a company with $2M ARR and only 10 employees. He notes a significant adoption gap where only 2% of PMs use agents, while sales and marketing teams are moving much faster. This lag represents a strategic risk for product leaders who usually consider themselves the primary drivers of innovation. The discussion outlines a 10-step framework for achieving 10x productivity. It begins with auditing calendars for repetitive tasks like weekly bug reports, monthly competitor analysis, or quarterly updates. Bank recommends choosing platforms based on technical skill, suggesting Relay.app, Zapier, or Lindy for non-technical users and n8n or Make for more advanced teams. He demonstrates several live builds, including a 12-agent executive assistant that handles calendar management, email filtering, meeting prep, and newsletter summaries. Other demos include a meeting briefing generator that pulls context before calls and a Reddit brand tracker that monitors social mentions and competitive intelligence. Strategic implementation involves starting with simple prompts and 1-3 examples rather than complex instructions. Bank emphasizes maintaining human-in-the-loop approval for high-stakes communications until confidence is built. He suggests building a "rhythm of business" through automated reports for stakeholder updates, metrics dashboards, and customer feedback synthesis. The ultimate goal is to move from individual task automation to sophisticated workflows where a single trigger, like a calendar event, initiates a sequence of social posts, email campaigns, and follow-up reports. Bank warns that the productivity gap between those who leverage these tools and those who do not will become irreversible within a year.

Key Takeaways

  • Small teams can achieve massive scale by treating agents as a context layer. Bank's $2M ARR with 10 people shows that agentic workflows are the new baseline for lean SaaS operations.
  • The productivity gap is a competitive threat. PMs who fail to automate repetitive product ops like release notes and feedback synthesis will lose influence to faster-moving GTM teams.
  • Success requires thinking in systems rather than tasks. Moving from a single follow-up agent to a multi-agent webinar workflow is where the true 10x gains happen.

FAANG PM Reveals How to Build AI Agents (and Get Paid $750K+)

How to Build AI Products in FinTech ($100B Robinhood VP Lessons)

Robinhood's trajectory to a $100B valuation provides a blueprint for building products in highly regulated environments. Their approach to AI, exemplified by the Cortex assistant, prioritizes solving existing customer pain points over implementing technology for its own sake. Instead of creating entirely new behaviors, they focus on answering fundamental user questions, such as why a specific stock price moved, and integrate these answers directly into the user's current workflow. To maintain product clarity and focus, Robinhood utilizes the swipies framework. This involves designing the mobile onboarding screens before any development begins. If a product's value proposition cannot be explained clearly in a single sentence on a mobile screen, the idea is refined or discarded. This mobile-first constraint forces simplicity and ensures that the core message resonates with users immediately. Navigating the regulatory landscape is treated as a strategic advantage rather than a hurdle. Robinhood intentionally hires former regulators and domain experts for their legal teams. These individuals are tasked with finding creative ways to enable great customer experiences within the bounds of the law, transforming legal from a traditional blocker into a product partner. This approach extends to their AI development, where they focus on curating high-quality data and providing information rather than direct investment recommendations to avoid regulatory friction. Growth is achieved through relentless experimentation rather than following industry standard tactics. Their successful referral program, for instance, was the result of over 60 iterations, eventually moving from simple cash rewards to offering variable stocks. This aligns the incentive with the company's core service. Organizationally, Robinhood moved from functional silos to a business unit GM structure. 
This change reduces cross-functional friction and increases product velocity, allowing the company to ship meaningful features consistently and maintain its reputation as an innovation leader in the fintech space.

Key Takeaways

  • Regulatory expertise as a product feature: Hiring former regulators allows a company to build smoother, legally compliant user flows that competitors might avoid.
  • The swipies filter: Designing the onboarding experience first ensures that a product has a clear, communicable value proposition before resources are committed to building it.
  • Iterative growth over tactical copying: Success in growth loops like referrals comes from dozens of internal experiments to find the right incentive, not just copying a competitor's model.
  • AI as a workflow enhancement: In regulated sectors, AI is most effective when it synthesizes information to solve existing user questions rather than attempting to provide automated advice.

AI Agents for PMs in 69 Minutes — Masterclass with IBM VP

AI Product Leadership Masterclass with the author of The Making of a Manager

AI for Product Managers: 10X Growth with Smart Experimentation

AI has fundamentally changed the experimentation landscape by eliminating the build bottleneck. In the past, testing product ideas often required weeks of engineering effort to create variations. With generative AI, product managers can now move from a simple English prompt or a rough sketch to a live, functional experiment in just a few minutes. This shift enables a vibe coding workflow where mockups are instantly converted into live elements like onboarding flows or layout changes, allowing teams to review actual variations instead of static specs. The framework for modern experimentation follows a four-step process: ideation, building, deployment, and analysis. AI streamlines the early stages by suggesting variations and automating the code generation for tests. During the deployment phase, teams can utilize advanced models like multi-armed bandits for rapid optimization of time-sensitive content, such as news headlines. For more personalized experiences, contextual bandits score user intent in real time, allowing for hyper-targeted interventions like showing discounts only to visitors who are likely to churn without them. Measuring the success of AI features requires a shift in metrics, particularly for Retrieval-Augmented Generation (RAG) systems. The discussion identifies three specific metrics essential for evaluating RAG performance. AI also enhances the analysis phase by automatically uncovering successful segments within failed experiments, such as identifying a specific lift in mobile conversions that was masked by overall neutral results. While AI handles the speed of execution, human product managers remain essential for providing business context, ensuring brand compliance, and setting strategic priorities. Companies like Booking.com demonstrate that a high-volume experimentation culture is a primary driver of revenue growth, a goal that AI now makes more attainable for all teams.

Key Takeaways

  • The build bottleneck is effectively solved by using AI to generate live variations from prompts, reducing the friction between having an idea and seeing it in production from weeks to minutes.
  • Intent-based personalization replaces broad A/B testing by scoring individual visitor intent to deliver the right experience to the right person at the right time rather than testing one version against another for all users.
  • AI salvages value from failed experiments by using automated segmentation to find hidden wins in data that would otherwise be discarded, allowing teams to iterate on specific high-performing user groups.
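
For readers unfamiliar with multi-armed bandits, here is a small sketch of one common variant, Thompson sampling with Beta posteriors, applied to a headline test. The source does not say which bandit algorithm is used, so this choice is an assumption, as are the function name, the click rates, and the round count.

```python
import random


def thompson_sampling(true_rates: list[float], rounds: int, seed: int = 0) -> list[int]:
    """Simulate a headline test and return how many times each arm was shown."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw a plausible click rate for each headline from its Beta posterior.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        # Simulate whether the visitor clicked, using the hidden true rate.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]


# Three hypothetical headlines with made-up click rates; the third is clearly best.
pulls = thompson_sampling([0.02, 0.05, 0.20], rounds=3000)
print(pulls)
```

Unlike a fixed 50/50 A/B split, the bandit shifts traffic toward the stronger headline while the test is still running, which is why it suits short-lived content.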

Google Product Manager Metrics Interview: GPT-5 Launch

How OpenAI Builds Products: The Framework You Need

Complete Course: AI Product Design

Elizabeth Laraki, a former design leader at Google and Facebook, outlines a strategic framework for building AI products that prioritize user needs over technology-first sprinkles. The core design process remains consistent regardless of the underlying tech. First, define the product by identifying specific user tasks and target audiences. Second, design the features, architecture, and user flows. Third, build the UI and brand identity. However, AI introduces a non-deterministic element where outputs are unpredictable, unlike traditional software where clicking a button always yields the same result. This shift requires new safeguards, such as auditing training data for bias and implementing human-in-the-loop systems to flag sensitive content. Laraki shares a cautionary tale of an AI image expander that added inappropriate details to a photo, highlighting why designers must build evals to catch these errors before they reach the user. A major pitfall in current AI development is the over-reliance on chat interfaces. While chat works for simple Q&A, it often fails for complex, multi-step workflows like document creation or visual editing because every change can trigger new hallucinations or regenerate the entire output. Instead, AI should be integrated into existing tools where it can support specific Jobs to Be Done. For example, Descript uses AI to automate painful parts of the video editing lifecycle, like removing filler words or editing from a transcript, rather than forcing users into a generic chat box. Context is equally critical to successful AI implementation. The design for a voice interface must change based on whether a user is driving a car or wearing smart glasses. Effective AI design also involves making the AI's work visible so users can scrutinize and edit the results. 
Laraki illustrates the importance of deep user research through the Google Maps India redesign, which shifted from street-name navigation to landmark-based directions to match local mental models. This principle applies directly to AI: the technology must map to how people actually think and work, providing specific utility rather than just a layer of generic intelligence. During a live design session for a hypothetical LinkedIn for AI, she demonstrates how to move from ambiguity to specific matchmaking features by identifying AI's unique ability to find personality patterns rather than just matching keywords.

Key Takeaways

  • AI's non-deterministic nature creates a risk of the unexpected that requires UI transparency. Unlike traditional software where inputs lead to fixed outputs, AI can produce unintentional results. Designers must build evals and A/B options to let users verify and correct AI outputs.
  • Chat is a supporting tool, not a universal interface. For complex tasks like itinerary planning or visual design, chat-only interfaces often lead to frustration due to constant regeneration and hallucinations. The most successful integrations embed intelligence directly into the canvas where the work happens.
  • Context-aware design is the next frontier for AI agents. A single AI model needs different UX treatments depending on the environment, such as prioritizing hands-free voice in a car versus proactive suggestions in smart glasses. Designing for the moment of use is more important than the raw power of the underlying LLM.

5 AI Agents Every PM Must Use in 2025 (Act Fast)

The 1 Skill PMs Need in 2025: AI Product Discovery Masterclass by World’s Leading Authority

These 7 AI Tools Made Me $1,000,000+ In The Last 12 Months. Here's How:

$1.25 billion Unicorn. Only 2 Product Managers. The Linear Method:

If This 81 Minute Video Doesn't Make You an AI PM, I'll Delete My Channel

The transition from traditional product management to AI-specific roles requires a shift from deterministic software building to managing probabilistic systems. AI PMs generally fall into two categories: core PMs focusing on infrastructure and models at companies like OpenAI or Google Cloud, and applied PMs who leverage existing models to build specific applications like Notion AI or Cursor. The latter represents the largest market opportunity. While the fundamentals of product management such as user empathy, stakeholder management, and problem solving remain constant, the AI Product Development Lifecycle introduces new complexities in validation and technical execution. Technical contextualization is a core responsibility for modern PMs. This involves choosing between prompt engineering for simple use cases, Retrieval-Augmented Generation (RAG) for real-time data access, and fine-tuning for specialized domain expertise. A critical shift in the workflow is the move from traditional QA to AI Evaluations. Because LLM outputs are non-deterministic, PMs must design systematic evaluation frameworks to check for hallucinations, bias, and formatting accuracy. This process often involves using a more capable judge model to audit the outputs of the production model. The Model Context Protocol (MCP) is highlighted as a significant advancement for agentic workflows, allowing LLMs to interact with external tools like Jira, Slack, or GitHub through a unified standard. For career growth, the focus should be on proof of work rather than passive learning. Building a portfolio through product teardowns, side projects using tools like Lovable or Bolt, and creating value-add proposals for target companies is more effective than traditional resume submissions. Compensation for these roles is significantly higher than standard PM positions, with top-tier AI PMs in the US and India commanding substantial premiums due to the high leverage they provide to engineering teams.

Key Takeaways

  • The Applied AI layer is where most PM value resides. While infrastructure is dominated by a few giants, the ability to integrate LLMs into specific business workflows creates the most job opportunities and user impact.
  • Evaluations are the new PRDs. In a probabilistic world, the PM's job shifts from defining exact features to defining the boundaries of acceptable output and building the systems to measure those boundaries at scale.
  • Contextualization is the primary product differentiator. Since many companies use the same underlying models, a product's competitive advantage comes from how effectively it integrates proprietary data via RAG or specialized fine-tuning.
  • The Model Context Protocol enables the transition from thinking AI to doing AI. By standardizing how models interact with third-party APIs, PMs can now build agentic systems that execute tasks rather than just generating text.
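
To make the "unified standard" concrete, the sketch below builds a tool-call request in the JSON-RPC 2.0 shape that MCP uses (a `tools/call` method with a tool `name` and `arguments`) and answers it with a toy in-process handler. The `create_ticket` tool, the `handle` function, and the error messages are hypothetical; a real integration would go through an MCP client and server and should follow the published specification.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking a server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical tool; a real server would wrap Jira, Slack, GitHub, and so on.
def create_ticket(title: str, project: str) -> str:
    return f"[{project}] {title}"


TOOLS = {"create_ticket": create_ticket}


def handle(request: dict) -> dict:
    """Toy stand-in for a server: look up the tool, run it, wrap the result."""
    base = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") != "tools/call":
        return {**base, "error": {"code": -32601, "message": "Method not found"}}
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return {**base, "error": {"code": -32602, "message": "Unknown tool"}}
    text = tool(**params.get("arguments", {}))
    return {**base, "result": {"content": [{"type": "text", "text": text}], "isError": False}}


request = make_tool_call(1, "create_ticket", {"title": "Fix login bug", "project": "WEB"})
response = handle(request)
print(json.dumps(response, indent=2))
```

Because every tool is reached through the same request shape, adding a new integration means registering another tool rather than teaching the model a new API.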

Crash Course: AI Agents for Coding (Linear, CodeGen, Cursor, Devin)

Nan Yu, Head of Product at Linear, demonstrates how AI agents like Codegen are transforming software development from a manual writing process into a high-level orchestration and review task. By integrating agents directly into Linear, teams can move beyond single-player AI tools like Cursor. This setup creates a transparent audit trail where managers and peers can watch the agent's progress, review its implementation plans, and verify its code within the existing project management workflow. A major shift highlighted is the ability to clear massive backlogs of low-priority bugs by assigning them to agents en masse, turning months of procrastination into a stack of ready-to-merge pull requests. The conversation explores how this changes the engineering role, moving the focus from writing code to validating and reading it. For product managers, the removal of task-completion constraints means they can finally focus on high-level strategy, customer understanding, and business impact. The demo specifically shows an agent implementing a manual theme toggle on a website, illustrating that while the tasks might be small, the cumulative effect on product quality and craft is significant. The discussion also touches on the social and organizational shifts required, such as defining who is responsible for agent-generated code and how to hire engineers who excel at code auditing rather than just code origination.

Key Takeaways

  • The collaborative audit trail is a major advantage of using Linear as an orchestration layer because it makes AI work visible to the whole team, unlike isolated IDE-based AI interactions.
  • AI agents enable resource elasticity by allowing teams to request multiple competing implementations of a single feature for a few dollars, a practice that would be socially and financially impossible with human engineers.
  • The engineering role is shifting toward a verification and validation model where the ability to read and spot errors in code becomes more valuable than the ability to write it from scratch.
  • Product management is moving toward a fantasy state where output is no longer the primary constraint, allowing PMs to focus entirely on prioritization and business impact.

The ONE AI Skill Every Product Manager NEEDS in 2026

AI evaluations, or evals, represent a systematic way to measure quality and inject human judgment into the AI development process. Moving past subjective vibe checks is critical for product managers who want to ensure their specific taste and customer requirements are reflected in the final output. The process involves creating a rubric of binary criteria (pass or fail) rather than using 1 to 5 scales. Numerical scales often lead to intellectual laziness and poor model alignment because they lack clear definitions for each score. Binary choices force teams to make definitive calls on what is ship-ready. Error analysis is the core skill required to build reliable AI features. It requires looking at data to identify specific failure modes and quantifying how often they occur. This creates a continuous flywheel for product improvement. Without this structured approach, teams often fall into a whack-a-mole trap where fixing one prompt error inadvertently creates new problems elsewhere. The three gulfs framework helps teams decide which technical lever to pull. The gulf of specification is solved through better prompting, while the gulf of generalization is addressed via RAG or fine-tuning. Most teams should prioritize prompting and RAG. Fine-tuning is often expensive, brittle, and unnecessary if the underlying instructions are flawed. The methodology for building these systems draws from social science research, specifically grounded theory. Techniques like open coding and axial coding allow teams to find patterns in unstructured text outputs and turn them into automated evaluators. This involves labeling data, identifying clusters of errors, and then using an LLM as a judge to scale that judgment. To keep the process efficient, teams should appoint a benevolent dictator to make final calls on rubrics. Ultimately, a robust eval pipeline is the only true moat for an AI product. It allows for model portability and ensures the system remains aligned with the company's unique vision even as underlying foundation models change.
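
As a rough illustration of the error-analysis step described above, the sketch below tallies failure modes across hand-labeled traces and computes a binary pass rate. The trace records, the failure-mode names, and the summarize helper are all hypothetical and are not taken from the episode.

```python
from collections import Counter

# Hypothetical hand-labeled traces: each carries a binary verdict and,
# for failures, the failure-mode tags assigned during manual review.
traces = [
    {"id": 1, "passed": True, "failure_modes": []},
    {"id": 2, "passed": False, "failure_modes": ["wrong_tone"]},
    {"id": 3, "passed": False, "failure_modes": ["missed_constraint", "wrong_tone"]},
    {"id": 4, "passed": True, "failure_modes": []},
    {"id": 5, "passed": False, "failure_modes": ["missed_constraint"]},
]


def summarize(traces):
    """Return the overall pass rate and failure modes ranked by frequency."""
    total = len(traces)
    pass_rate = sum(t["passed"] for t in traces) / total
    counts = Counter(mode for t in traces for mode in t["failure_modes"])
    # Share of all traces affected by each failure mode, most common first.
    ranked = [(mode, n / total) for mode, n in counts.most_common()]
    return pass_rate, ranked


pass_rate, ranked = summarize(traces)
print(f"pass rate: {pass_rate:.0%}")
for mode, share in ranked:
    print(f"{mode}: {share:.0%} of traces")
```

Ranking failure modes by how many traces they touch is one simple way to decide which fix to attempt first, and re-running the same tally after a prompt change shows whether the fix introduced new problems elsewhere.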

Key Takeaways

  • Evals are the only real moat. Since foundation models are trained on average taste, your specific evaluation pipeline is what allows your product to stand out and maintain quality as you switch models.
  • Binary criteria beat scales. Using pass/fail rubrics forces teams to make hard decisions about what is actually good enough to ship. Ratings on a 1 to 5 scale are harder for LLMs to apply consistently and for humans to calibrate.
  • Error analysis is the starting point. You cannot build a successful AI product without looking at your data. Identifying failure modes through axial coding provides the intuition needed to build automated judges.
  • Fine-tuning is a last resort. Most issues are specification problems that can be fixed with better prompts or RAG. Fine-tuning adds massive MLOps complexity and should only be used when prompting hits a hard ceiling.

Complete Course: AI Product Discovery

Tanguy Crusson, Head of Product for Jira Product Discovery (JPD), outlines a four-stage discovery system used at Atlassian: Wonder, Explore, Make, and Impact. This framework moves away from traditional PRDs and waterfall development toward a continuous loop of learning. In the Wonder stage, PMs focus on problem exploration by conducting deep user interviews and creating 10-minute video reels of raw customer pain. This approach prioritizes emotional resonance and urgency over static documents. The Explore stage involves rapid prototyping, often starting with simple slides or Figma mockups to validate core concepts before writing code. A key metric for success is real pull, where users demand the solution immediately after seeing a low-fidelity concept. During the Make stage, the team uses a safety funnel approach, releasing features to small groups of 10, then 100, then 1,000 users to ensure a high-quality experience before a full launch. This methodology helped JPD scale to 18,000 customers while maintaining a CSAT score above 85. Technical spikes are used to assess feasibility during exploration, and PMs maintain a live feature document that evolves weekly with engineers and designers. The process emphasizes staying close to the ground through tools like Dovetail for transcripts, Loom for video updates, and direct Slack channels with lighthouse customers. Crusson argues that the ability to learn faster than competitors is the ultimate unfair advantage in SaaS, regardless of AI integration. The system replaces sanitized reports with raw user exposure, ensuring the product team remains aligned with actual market needs rather than internal assumptions.

Key Takeaways

  • Video reels of raw customer pain are more effective for team alignment than traditional documents because they trigger emotional urgency in engineers and stakeholders.
  • The safety funnel method protects long-term growth by only letting users into a new feature once you are certain they will have a great experience.
  • High leverage product work comes from setting up automated feedback loops and tooling that make talking to customers feel effortless and routine.
  • Speed in discovery comes from testing core concepts with low-fidelity slides or Figma screenshots before committing any engineering resources to a build.
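
The safety funnel described above can be sketched as a simple gating rule. The cohort sizes follow the 10, 100, 1,000 progression from the summary; the rule that a cohort must reach a mean CSAT of 85 before the rollout expands is an assumption made for illustration, as is the next_cohort_size helper.

```python
# Cohort sizes follow the 10 -> 100 -> 1,000 progression from the summary.
STAGES = [10, 100, 1000]


def next_cohort_size(current_size, csat_scores, min_csat=85.0):
    """Return the cohort size to use next.

    Assumption: a stage passes when the mean CSAT of the current cohort
    meets min_csat; otherwise the rollout holds at the current size.
    """
    if not csat_scores:
        return current_size  # no feedback yet, so do not expand
    mean_csat = sum(csat_scores) / len(csat_scores)
    if mean_csat < min_csat:
        return current_size
    idx = STAGES.index(current_size)
    # Stay at the last stage once the funnel is complete.
    return STAGES[min(idx + 1, len(STAGES) - 1)]


print(next_cohort_size(10, [90, 88, 92]))   # cohort is happy, so expand
print(next_cohort_size(100, [70, 80, 85]))  # mean is below the bar, so hold
```

Holding at the current size when feedback is missing or weak is the conservative choice here: it keeps the blast radius small until the team is confident the experience is good.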

College Dropout Raised $20M Building AI Tools | Cluely, Roy Lee

Zoom Head of Product: How We Build Product

Zoom's rapid expansion during the pandemic saw revenue jump from $1 billion to $4.5 billion, necessitating a massive scale-up of the product team from 20 to over 200 PMs. To manage this demand shock, the company implemented a three-month feature freeze, shifting all resources toward security, stability, and infrastructure. This period was characterized by daily Tiger Team meetings led by CEO Eric Yuan to triage thousands of incoming requests and bugs. A central pillar of Zoom's product philosophy is the Problem, Root Cause, Solution framework. This approach requires PMs to look past surface-level symptoms and identify the specific reason a system or user expectation failed before proposing a fix. The company is currently pivoting from a pure video conferencing tool into a broader media and collaboration platform. This includes the launch of Zoom Docs, a collaborative canvas for real-time editing, and Zoom Clips for asynchronous video communication. On the AI front, the AI Companion is seeing 60% quarter-over-quarter growth, primarily driven by meeting summaries that leverage transcripts to provide immediate post-meeting value. While AI avatars are discussed as a way to alleviate Zoom fatigue, the focus remains on using AI to capture decisions and action items from the rich content of live discussions. Zoom maintains a nimble culture despite its size, using a quarterly business review process that emphasizes pre-reads and discussion over slide presentations. The product team prioritizes speed and practicality, often shipping features within days of a customer request. This responsiveness is balanced by a commitment to simplicity, where even minor features like Raise Hand undergo deep UX refinement to ensure the happy path remains intuitive for a global user base.

Key Takeaways

  • The three-month feature freeze demonstrates that strategic pauses are often necessary to preserve long-term platform integrity during extreme hypergrowth.
  • Zoom's Problem-Root Cause-Solution framework prevents feature bloat by forcing PMs to validate the underlying why before building any solution.
  • The shift toward Zoom Docs and Zoom Clips suggests a strategic move to capture the entire work lifecycle, moving beyond the live meeting window.
  • AI strategy at Zoom prioritizes utility, such as summaries and task automation, over novelty, with the aim of solving the triple-booked calendar problem.

How I make $18K/mo with a niche podcast (STEAL THIS)

Aakash Gupta breaks down the transition of his podcast from a time-intensive hobby to a profitable business generating $18,000 per month. He spent nearly half his time on the project for months without seeing financial returns, eventually reaching profitability around episode 50. By episode 80, the show's ad slots were completely sold out. The core of his growth strategy involved moving away from generic long-form conversations toward tactical content like screen shares and revenue breakdowns. This shift allowed him to provide more direct value to a specialized audience of product managers and growth experts. Monetization for niche creators relies heavily on direct sponsorships, which accounted for 95% of his early revenue, rather than YouTube AdSense. He highlights the importance of distribution through resonance rather than volume, noting that high-quality trailers and genuinely useful episodes drove more growth than daily short-form clips. He suggests that creators should build their own unique formats rather than copying what is popular. The video also covers practical AI applications for Product Managers. Gupta demonstrates workflows for transforming resumes and using AI for rapid prototyping to stay competitive in the 2025 job market. He shares insights on career growth, including a roadmap for breaking into US-based PM roles and strategies for achieving significant salary increases, such as the $525,000 jump he observed at LinkedIn. To maintain his paid newsletter and podcast audience, he employs a rigorous feedback loop. This includes analyzing every unsubscribe reason and interviewing former subscribers to identify content gaps. He emphasizes that productivity is not about working more hours but about protecting distraction-free blocks for deep work. By building the right systems and protecting his time, he turned a slow start into exponential growth.

Key Takeaways

  • Ditch standard formats. Gupta found growth by moving past the typical interview style to provide visual, tactical value through screen shares and data breakdowns.
  • Sponsorships are the primary lever. For niche audiences, direct brand partnerships are far more lucrative than platform ad revenue, making early monetization possible.
  • Quality beats frequency. Viral growth was driven by the utility of the content and high-production trailers rather than the sheer volume of social media clips.
  • Obsess over churn. Retention in the creator economy requires a fanatical approach to feedback, specifically interviewing canceled subscribers to understand why they left.

10 Years After the Lean Product Playbook: PM in the Age of AI

Dan Olsen, author of The Lean Product Playbook, argues that while AI has revolutionized the speed of execution, the core principles of product management remain unchanged. The most significant shift is the prototyping revolution. Tasks that previously required weeks of coordination between PMs, designers, and developers, moving from text to sketches to wireframes and finally to code, can now be completed in minutes using AI tools. This allows for much faster iteration cycles and more frequent user testing. Olsen highlights a critical hierarchy in user research where the level of human interaction should match the level of uncertainty. For new products or markets, in-person research is essential, whereas remote unmoderated testing suffices for usability checks on existing products. He introduces a three-bucket system for organizing user feedback: Feature Set, UX Design, and Messaging. By testing in waves of five to eight users and tracking percentages of issues found, teams can systematically improve their products. A major pitfall in the current landscape is the Jira Jockey trap. When the PM-to-developer ratio exceeds 1:8, PMs often spend all their time managing tickets rather than doing discovery. Olsen advocates for the 4 D's framework: Discover, Define, Design, and Develop. He warns that many teams are currently sprinkling AI on products without solving real customer pain points. AI should not be treated as a solution in search of a problem; its implementation must be preceded by a clear understanding of market needs. For tools, Olsen suggests starting with no-code prototyping platforms like Lovable or Bolt for quick validation before graduating to more technical tools like Cursor. He also notes that while AI has raised the floor for UX maturity across all teams, human designers are still necessary for creating truly differentiated experiences and breakthrough innovation.

Key Takeaways

  • The prototyping cycle has collapsed from weeks to minutes, making rapid validation the primary competitive advantage for modern product teams.
  • Usability is often mistaken for product-market fit; a product can be perfectly easy to use while failing to solve a problem anyone actually cares about.
  • AI tools have effectively raised the baseline for UX maturity, allowing teams without dedicated designers to produce professional-grade prototypes, which shifts the designer's role toward high-level innovation.
  • Effective discovery requires protecting time from the Jira Jockey trap, specifically maintaining a healthy PM-to-developer ratio to ensure the Discover and Define phases aren't skipped.

If you only have 2 hrs, this is how to become an AI PM

AI product management is shifting from traditional documentation toward a technical, evaluation-driven workflow. The core skill set now centers on five pillars: prototyping, observability, systematic evaluations (evals), technical architecture (RAG vs fine-tuning), and engineering collaboration. Prototyping tools like Cursor and Bolt allow PMs to build functional agents quickly, though Cursor is preferred for its deeper control over agent systems. Observability acts as the necessary telemetry, providing the traces required to run effective evaluations. Instead of static documents, AI PMs now define success through systematic evals, measuring model performance using language-based labels like 'friendly' or 'robotic' rather than arbitrary numeric scales. Most AI development should prioritize prompt engineering, which covers 95% of use cases, followed by RAG for external data integration, with fine-tuning reserved for specific cost or speed optimizations. Effective collaboration with AI engineers involves providing labeled datasets and success metrics rather than long-form requirements. The transition to AI PM requires a hands-on approach, where side projects and a deep understanding of model behavior replace traditional product management frameworks.

Key Takeaways

  • Observability is the foundation of the AI stack. You cannot effectively measure or improve an AI system without first capturing detailed traces of its performance to use as a baseline for evaluations.
  • The vibe check approach of manually checking a few outputs is insufficient for production. PMs must implement systematic evaluation frameworks to define and measure what quality actually looks like at scale.
  • Prompting and RAG are almost always superior starting points to fine-tuning. Fine-tuning is often an over-engineered solution for problems that are better solved through improved context or clearer instructions.
  • The role of the PM is shifting from writing requirements to curating data. AI engineers value high-quality labeled datasets and clear evaluation criteria over traditional, static Google Doc PRDs.
  • Language-based labels outperform numeric scales for LLM judges. Because models are trained on language rather than mathematics, they are more accurate when categorizing outputs into descriptive buckets like 'friendly' versus 'robotic'.
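
To make the language-based labeling idea concrete, here is a minimal sketch of how a judge's free-text reply might be normalized into descriptive buckets. The label set reuses the 'friendly' and 'robotic' examples from the summary; the parse_judge_label and label_shares helpers are hypothetical, and no real judge model is called.

```python
# Descriptive buckets taken from the examples in the summary.
ALLOWED_LABELS = ("friendly", "robotic")


def parse_judge_label(raw):
    """Normalize a judge reply to an allowed label, or None if off-rubric."""
    label = raw.strip().rstrip(".").lower()
    return label if label in ALLOWED_LABELS else None


def label_shares(replies):
    """Share of valid replies falling into each bucket."""
    valid = [l for l in map(parse_judge_label, replies) if l is not None]
    if not valid:
        return {}
    return {name: valid.count(name) / len(valid) for name in ALLOWED_LABELS}


# Hypothetical judge replies; "4/5" is off-rubric and gets dropped.
replies = ["Friendly.", "robotic", "friendly", "4/5"]
print(label_shares(replies))
```

Rejecting anything outside the allowed set, rather than coercing it into a bucket, keeps numeric or malformed replies from silently skewing the results.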

v0 Tutorial from the CPO: AI Prototype Like a Pro

He Built a $2M/Yr One-Person Business - Steal His Playbook

I can’t believe we built an AI employee in 62 mins (Cursor, ChatGPT, Gibson)

Harish Mukhami and Aakash Gupta demonstrate how to build a functional AI customer success agent in about an hour. The workflow moves beyond simple prototyping toward production-ready infrastructure. The tech stack centers on Cursor for development, using o3-mini for high-level planning and Claude Sonnet for the actual coding. A critical component is GibsonAI, which provides the framework for the agent to handle user research, product documentation, and customer success tasks end-to-end. The process involves connecting tools via the Model Context Protocol (MCP), allowing Cursor to interact directly with databases and schemas. This integration enables the agent to monitor user engagement patterns and usage metrics 24/7. Instead of just providing insights, the system is designed to move through a three-tier implementation: starting with a dashboard, moving to human-approved recommendations, and eventually reaching autonomous execution for low-risk tasks. The goal is proactive churn prevention rather than reactive win-back strategies. By automating roles that primarily ingest and output information, such as SDRs or executive assistants, teams can focus on high-touch strategic decisions. The demonstration emphasizes that starting with scalable database infrastructure from day one prevents the common death valley between a demo and a live product.

Key Takeaways

  • The shift from AI prototypes to AI employees requires production-grade infrastructure like scalable databases from the start to avoid rebuilding when scaling to thousands of users.
  • The Model Context Protocol (MCP) acts as a critical integration layer that allows development environments like Cursor to query databases and update schemas directly, eliminating manual tool switching.
  • A tiered autonomy approach—moving from dashboards to human-approved recommendations and finally autonomous action—serves as relationship insurance to prevent AI from sending damaging or incorrect customer communications.
  • Proactive churn prevention is significantly more effective than reactive win-back, and AI agents are uniquely suited to monitor engagement patterns 24/7 to flag risks before they escalate.
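
The tiered autonomy approach above can be sketched as a small routing rule. The three tiers come from the summary; the "low" and "high" risk labels, the route_action helper, and the idea of a team-configured maximum tier are assumptions made for illustration.

```python
from enum import Enum


class Tier(Enum):
    DASHBOARD = 1       # surface the insight only
    HUMAN_APPROVED = 2  # recommend an action and wait for approval
    AUTONOMOUS = 3      # execute without review


def route_action(risk, max_tier):
    """Decide how an agent-proposed action is handled.

    Assumptions: risk is a hypothetical label ("low" or "high") assigned
    upstream, and max_tier is the highest tier the team has enabled.
    Only low-risk actions may ever run autonomously.
    """
    if max_tier is Tier.DASHBOARD:
        return Tier.DASHBOARD
    if risk == "low" and max_tier is Tier.AUTONOMOUS:
        return Tier.AUTONOMOUS
    return Tier.HUMAN_APPROVED


print(route_action("low", Tier.AUTONOMOUS))   # low risk, full autonomy enabled
print(route_action("high", Tier.AUTONOMOUS))  # high risk still needs a human
```

Defaulting to human approval for anything that is not explicitly low-risk mirrors the relationship-insurance framing: the agent can only send customer-facing output on its own when both the risk label and the team's configuration allow it.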

What this $2.45B CPO knows that you Don’t!

The Last Video You Need to Watch on Strategy

Strategy is often misunderstood as a list of initiatives or a budget described in prose. Roger Martin defines true strategy as an integrated set of choices that positions a firm to win by compelling customers to act. A successful strategy answers five core questions: what is the winning aspiration, where will you play, how will you win, what capabilities are required, and what management systems are needed. These choices must form a reinforcing loop where the market segment and the competitive advantage are perfectly matched. For instance, Southwest Airlines succeeded by making integrated choices, such as using one plane type, avoiding hubs, and eliminating seat assignments, that all reinforced their low-cost and fast-turnaround strategy. A key strategic concept is exploiting the mixed motives of competitors. This occurs when a company takes a position that a competitor could technically replicate but chooses not to because doing so would hurt their existing business model. For example, Walmart initially avoided e-commerce to protect its massive investment in physical stores. This creates a window of opportunity for a focused player to gain ground without immediate retaliation. The most powerful strategic positions come from understanding these internal conflicts within your rivals. Strategy differs fundamentally from planning. Planning focuses on internal activities and resource allocation, while strategy focuses on external outcomes and customer behavior. Effective strategists use Bayesian updating, treating their strategy as a theory that must be constantly tested. Instead of assuming the future will mirror the past, leaders should ask what would have to be true for their strategy to succeed and pivot when those assumptions are proven wrong. This requires watching specific metrics that validate your theory of the market. Business schools often fail by teaching analytical tools as strategy itself. Tools like the Five Forces or resource-based views are merely inputs. Real strategy requires making hard choices about what not to do. For product managers, the common mistake is creating a roadmap of features without first determining the strategic foundation. A roadmap should be an output of strategy, not a substitute for it. By focusing on integrated choices rather than isolated initiatives, organizations can create a compelling reason for customers to choose them over any alternative.

Key Takeaways

  • Strategy is an external theory of winning that focuses on customer choice, whereas planning is an internal list of activities and costs.
  • The most defensible positions exploit mixed motives, where competitors are physically or financially unwilling to respond because it would damage their core business.
  • Strategic success requires Bayesian updating, which means identifying the core assumptions behind a plan and monitoring them as the primary way to manage risk.
  • A winning strategy requires a perfect match between the chosen market segment and the specific advantage used to serve it.

Complete Vibe Coding Tutorial: Build a Full Stack App with AI | Andy Carroll (Windsurf)

Vibe coding is a methodology that enables non-technical product managers to build full-stack applications by collaborating with AI agents instead of writing manual code. The process relies on a specific toolchain including Windsurf for the development environment and Lovable for frontend generation. To manage the backend and deployment, a stack consisting of GitHub for version control, Netlify for hosting, and Supabase for database management is recommended. A core principle of this approach is front-loading the planning phase. Using AI to generate architecture plans and strategy documents before any code is written helps prevent technical debt and avoids the need for major refactoring later. The development workflow involves navigating two distinct modes in tools like Windsurf. Chat mode is used for brainstorming and minor adjustments, while write mode gives the AI permission to modify the codebase directly. It is important to use write mode selectively, as AI agents can sometimes refactor large sections of code when only a small change was requested. When technical hurdles arise, switching between different LLMs such as Claude 3.7, DeepSeek, and GPT-4 is an effective strategy. Each model has different strengths, and a fresh perspective from a different model often resolves logic errors that stalled another. For product leaders, vibe coding transforms traditional PM deliverables. Roadmaps and status reports can be generated directly from the progress tracked in GitHub, turning them into living documents. This speed allows for rapid validation of ideas through functional MVPs. Instead of spending weeks on pixel-perfect designs or internal debates, teams can launch good enough products to gather real user data. This shift allows teams to automate up to 90% of implementation tasks and focus their energy on unique features and go-to-market strategy.

Key Takeaways

  • Vibe coding moves the product manager from a coordinator role to an active builder, enabling the creation of functional software without a traditional engineering team.
  • Setting up a deployment pipeline on day one prevents a massive accumulation of errors, as small issues are caught and fixed during frequent updates.
  • Model switching is a critical skill in AI development, as different LLMs provide unique logic and creative solutions when one model hits a performance ceiling.

How I Built This: 100M+ AI Startup

Never Search Alone - Phyl Wrote the Book

Job searching is often a lonely and high-anxiety process, but Phyl Terry argues it should be treated like a product launch. The core of the Never Search Alone methodology is the Job Search Council, a small group of four to five peers who meet weekly to provide accountability and emotional support. This structure shifts the search from a desperate effort to a disciplined, community-backed strategy. A critical step is achieving Candidate-Market Fit, which requires job seekers to treat themselves as the product. This involves narrowing focus to specific roles and industries rather than casting a wide net, which actually makes a candidate more memorable to recruiters and networks. The process includes a Listening Tour where candidates gather feedback from former colleagues to understand how their skills are perceived in the current market. This feedback helps refine role clarity and identifies where a candidate's strengths truly intersect with market demand. During the interview phase, Terry suggests creating a Job Mission with OKRs document. This draft outlines how the candidate would approach the role's objectives, demonstrating high initiative and clarifying expectations before an offer is even signed. When it comes to negotiation, the strategy prioritizes securing the resources needed for success, such as team training or budget for technical debt, before discussing salary. This approach signals a commitment to the company's goals and sets the stage for long-term performance.

Key Takeaways

  • Community-driven job hunting replaces isolation with a structured support system that manages the psychological toll of career transitions.
  • Narrowing your target market increases your value because specificity makes you easier to refer and more attractive to specialized hiring teams.
  • The Job Mission with OKRs document flips the interview power dynamic by showing exactly how you will deliver value rather than just answering questions.
  • Negotiating for success resources before compensation builds immediate credibility and ensures you are not walking into a set-up-to-fail situation.

How I Wrote My 3 Most Viral Posts

“Most Product Teams have no ROI” -ex Google, Bitly PM

Product teams typically cost a business around $1 million annually, yet many struggle to articulate their specific return on investment. In a climate of increased fiscal scrutiny, product managers must move beyond following Silicon Valley best practices and instead master the commercial reality of their specific organization. The low impact death spiral occurs when teams prioritize easy, low-stakes features to avoid scrutiny, which leads to cluttered products and complex dependencies that eventually block high-impact work. To break this cycle, teams should ensure their goals are no more than one mathematical operation away from core company objectives like revenue or customer lifetime value. For platform or internal teams that do not directly own revenue, impact is defined by how they accelerate or enhance the metrics of the teams they support. A commercially minded approach actually leads to higher job satisfaction because it allows PMs to stop fighting the business and focus on delivering measurable value. The ultimate mindset shift involves asking whether a CEO would fully fund the team today based on its current output. This proactive alignment with business success makes PMs the architects of their own scrutiny rather than passive subjects of it.

Key Takeaways

  • The One Step Away Rule: High-performing teams keep their goals no more than one mathematical operation or 'why' statement away from core company metrics like revenue or LTV to ensure clear business relevance.
  • The CEO Funding Test: Proactively asking if a CEO would fund your team today forces a shift from feature delivery to value creation and helps identify when a team should actually reshuffle resources for better impact.
  • Commercial Realism over Best Practices: PMs often burn out trying to force product-led frameworks from big tech into companies with different commercial pressures; success comes from aligning with the actual business model rather than idealized versions of it.
  • Architecting Scrutiny: By proactively defining and reporting on business impact, product leaders can manage executive expectations and avoid being viewed as a mere line-item expense during budget cuts.

How I Went From PM to VP Product - My Story

Complete Course: AI Product Management

AI product management requires a fundamental shift from deterministic logic to managing probabilistic systems. Instead of traditional requirements, PMs now define intent, behavior, and failure modes. Prompting is no longer a secondary task; it serves as the core UX layer for LLMs, where well-structured instructions determine the quality of the product experience. This shift means the PM's role is becoming more about teaching the model how to behave through clear, repeatable instructions rather than just specifying features. A common mistake in the field is jumping to fine-tuning too early. Most product needs can be met through sophisticated prompting or Retrieval-Augmented Generation (RAG). RAG is particularly effective for maintaining accuracy and reducing hallucinations by pulling in real-time context, such as product changelogs, without the need for constant model retraining. If a product updates frequently, RAG keeps the AI current without hardcoding information. Fine-tuning should be reserved for specific stylistic needs, as few-shot prompting often yields better results for data summarization and analysis while being faster, cheaper, and easier to maintain. The architecture of AI agents is moving toward structured pipelines that handle intent classification, tool selection, and error handling. This is where the Model Context Protocol (MCP) becomes essential, allowing models to interact with multiple tools effectively. Building these reliable 'thinking' systems creates a competitive moat by coordinating model behavior with reliable systems thinking. An agent is essentially a system that can think through execution logic and handle errors gracefully. Product Requirements Documents (PRDs) must evolve to include structured prompts, input/output examples, and definitions for acceptable variance. Traditional PRDs were built for deterministic systems where inputs led to fixed outputs. In the AI era, PMs are not just writing requirements; they are writing intent and planning for expected failure modes. This includes defining what acceptable variance looks like and planning for fallbacks, retries, and recovery UX. Ultimately, the PM is not just managing the model but collaborating with it to ensure the product remains aligned with user needs despite the inherent unpredictability of LLMs.
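
The structured agent pipeline described above (intent classification, tool selection, retries, and a fallback) can be sketched in a few lines. Everything here is hypothetical: the keyword classifier stands in for a model call, the two tools are stubs, and the retry count is an arbitrary choice.

```python
def classify_intent(message):
    """Toy keyword classifier standing in for a model call (hypothetical)."""
    text = message.lower()
    if "refund" in text:
        return "refund"
    if "changelog" in text or "new in" in text:
        return "changelog"
    return "unknown"


def lookup_changelog(message):
    # Stub: a real tool would retrieve current release notes at runtime.
    return "Latest release notes would be retrieved here."


def start_refund(message):
    # Stub: a real tool would call a billing system.
    return "Refund request opened."


# Tool selection: each known intent maps to exactly one tool.
TOOLS = {"changelog": lookup_changelog, "refund": start_refund}

FALLBACK = "Sorry, I could not complete that. Routing you to a human."


def run_agent(message, tools=TOOLS, max_retries=2):
    """Classify the intent, select a tool, retry on failure, then fall back."""
    tool = tools.get(classify_intent(message))
    if tool is None:
        return FALLBACK  # unknown intent: recover rather than guess
    for _ in range(max_retries + 1):
        try:
            return tool(message)
        except Exception:
            continue  # treat the failure as transient and try again
    return FALLBACK


print(run_agent("I want a refund"))
print(run_agent("Tell me a joke"))
```

Keeping the intent-to-tool mapping explicit, instead of letting the model pick freely, is one way to get the predictability the summary calls for, and the fallback string is where a recovery UX would plug in.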

Key Takeaways

  • Prompting is the primary UX layer. PMs should focus on clear, repeatable instructions to ensure consistent model behavior rather than treating prompts as an afterthought.
  • RAG is superior to fine-tuning for most SaaS applications. It keeps the AI current and reduces hallucinations by providing context at runtime, which is cheaper and more adaptable than retraining.
  • AI agents require a structured pipeline. Predictability comes from designing systems that handle intent and tool selection logic, rather than just relying on the model to figure it out.
  • PRDs must define acceptable variance. Since AI is non-deterministic, product specs need to include failure modes, retry logic, and recovery flows to maintain a high-quality user experience.

How Amplitude Became the #1 Product Analytics Tool | Spenser Skates, CEO & Founder

This PM was Laid Off - Now he has 125K followers

We Built an AI Agent to Automate PM (ZERO CODING)

"Most Product Managers are Bullsh*t Managers" -2x CPO

25 Product Designs That Will Make You Jealous

Kate Syuma, former Head of Growth Design at Miro, breaks down 25 high-performing product flows from companies like Dropbox, Linear, and Notion. The discussion covers five main areas: website experience, signup flows, onboarding, sharing/invitation, and upgrade triggers. On the website side, the focus is on delivering value through visceral delight. This behavioral psychology principle suggests users respond to visuals first. Dropbox uses interactive animations to engage users immediately. Rows.com allows users to try the product before registering. This increases activation rates even though it risks losing some lead data. Linear uses interactive use-case navigation to keep the UI clean while showing depth. Amplemarket combines roles and use cases in its positioning to speak directly to different personas like founders or sales leaders. Signup flows should follow Hick's Law by limiting options. Dropbox and Loom use progressive disclosure to ask profiling questions without overwhelming the user. Grammarly places its Chrome extension early because it is essential for the user to see value. Onboarding works best when it is personalized. FigJam uses contextual templates with hover previews. Miro experimented with human-led video onboarding featuring a customer success manager. While it was hard to scale then, AI now makes personalized video guides a viable strategy for complex B2B tools. Slack keeps onboarding simple with a four-step guided walkthrough, assuming the product is intuitive enough for users to explore. Sharing flows in multiplayer products benefit from the power of defaults. Linear defaults to a single invite link to reduce friction. Notion uses contextual prompts to upgrade guests to members. This educates users on roles during the sharing process. Upgrade triggers should be action-based. Riverside and Loom show the full value of paid plans instead of just one feature. Miro uses triggers for specific features like Spaces to drive collaborative purchase decisions within teams. Canva uses a scalable system of crown icons and popups to maintain consistency across its many use cases.

Key Takeaways

  • Visceral delight is a core behavioral principle where users respond to visuals before text. Interactive animations on landing pages are high-leverage engagement tools.
  • Progressive disclosure in signup flows reduces cognitive load by asking one question at a time. This significantly improves completion rates for complex B2B profiling.
  • The power of defaults in sharing flows reduces friction. Linear's single-link invite is more effective than offering too many permission options upfront.
  • Contextual upgrade triggers outperform generic banners. They appear exactly when a user hits a functional limit, making the value of a paid plan immediately obvious.
  • Personalization in onboarding is the most significant driver of activation. While difficult to scale manually, AI now allows for personalized guides across diverse use cases.

Model Context Protocol (MCP), clearly explained and why it matters

The $100k+ side hustle blueprint.

Decode and Conquer - He Wrote the Book

We built an AI prototype in 10 mins

Use Bolt to build a $32K/mo site in 68 mins (CEO Demo)

We built an AI Product Manager in 58 mins (Claude, ChatGPT, Loom, Notion AI)

This Ex Amazon VP Makes $950K In Retirement

He Went From $50K to $750K Per Year

Cracking the PM Interview - She Wrote The Book

11 Lessons From The World's Best PMs

Aakash Gupta reflects on 50 episodes of his podcast, distilling 11 core lessons from top product leaders while sharing the realities of running a content business. Marty Cagan emphasizes moving toward a product operating model where PMs focus on value and viability rather than just managing features. This requires winning the trust of designers and engineers by not overstepping into their domains. Melissa Perri highlights that strategy must be communicated repeatedly across multiple formats like memos and prototypes because context does not sink in immediately. Operational speed is another major theme. Chloe Shih notes how high-performing teams at companies like TikTok prioritize rapid execution over lengthy negotiations. Ravi Mehta argues that quality is a product's most important feature, citing Spotify's technical reliability as a key differentiator over competitors. For growth, Satyajeet Salgar suggests taking low-confidence bets with potentially exponential returns rather than playing it safe with incremental gains. On the career and business side, Anuj Rathi encourages PMs to adopt a general manager mindset by owning final outcomes across HR, brand, and org design. Anthony Pierri advises treating homepages as storefronts that entice users rather than exhaustive Wikipedia pages. Gupta also shares his personal struggles with podcast growth, noting that guest fame and packaging often outweigh content length or production quality. He discusses the challenge of churn in paid newsletters and the importance of doing discovery calls with canceled subscribers to refine the content calendar.

Key Takeaways

  • Quality acts as a primary feature and competitive moat. As seen with Spotify, technical reliability and seamless execution can beat a superior artist-led strategy if the core experience is flawless.
  • The most effective PMs operate with a General Manager mindset. Moving beyond the product silo to influence HR, brand, and organizational design is what separates senior leaders from feature-focused managers.
  • Packaging often trumps content depth for growth. In the creator economy and marketing, timely and searchable topics (the storefront) drive more acquisition than the actual length or quality of the underlying material.
  • Discovery and delivery should be parallel rather than sequential. High-impact teams use a Double Diamond approach where they iterate on discovery even after shipping to ensure the final result actually moves the needle.
  • Pricing is a neglected growth lever. A data-driven retrospective on past pricing changes often reveals that companies did not go far enough, and that the resulting revenue gains rarely came at the expense of core conversion metrics.

I Use Claude More than ChatGPT, Here's Why

This PM Built a Six-Figure ($100K+) AI Side Hustle

We prototyped 5 features in 84 mins (Bolt, Cursor, Lovable, Replit, v0)

Reforge Crash Course for $0

Godfather of Product Management: Do THIS!

“Product Management is going to be completely different in 5 years” - 2x CPO

How to Become an AI PM

“AI is driving us to the golden age of the feature factory!”

Day in the Life of a Senior Director of PM at Meta, Salesforce

How To Become an AI PM

How to Grow to 130K LinkedIn Followers

Pierre Herubel details his transition to an infographic-heavy LinkedIn strategy in mid-2023, which catalyzed a gain of 30,000 followers in just three months. The foundation of any successful B2B content motion is a solid business strategy, specifically nailing the ICP, positioning, and messaging before drafting a single post. While product strategy focuses strictly on the ICP, content strategy must broaden its scope to engage a wider audience to satisfy social algorithms and expand reach. Herubel utilizes an Authority First framework, recommending a mix of 70% authority-building content, 15% brand-focused content, and 15% direct offers. Authority content focuses on solving specific audience problems through unique insights, while brand content shares values and backstories. The offer category includes content upgrades or lead magnets designed to move users from social platforms into an owned content ecosystem like a newsletter. The technical execution of his infographics relies on design fundamentals such as margins, shapes, and color consistency. He advocates for a Z or F pattern to guide the reader's eye and suggests that the infographic itself should act as the primary hook. His repurposing system follows an expand process: starting with short-form insights on X or LinkedIn, testing for resonance, and then developing successful ideas into newsletters, YouTube scripts, or full products. Social selling is framed as a four-step process: positioning, publishing, engaging, and triggering DM conversations. Herubel warns against pitch slapping, which is sending unsolicited, automated sales pitches, and instead recommends warm outreach based on profile visits or post engagement. This approach focuses on starting natural conversations and offering high-value resources like case studies to build trust. 
His revenue model has evolved from pure consulting to a diversified creator business including courses, coaching, and a ghostwriting agency, all fueled by consistent, disciplined content production.

Key Takeaways

  • Content strategy needs a North Star document to map pillars to formats, ensuring every post reinforces core business positioning.
  • The 70/15/15 Authority First ratio prevents over-selling, prioritizing value-led expertise to build the trust required for B2B conversions.
  • Effective social selling uses warm outreach triggered by engagement rather than automated pitches, focusing on natural conversation and specific content upgrades.
  • Visual design in B2B is a functional tool for information density. Using F patterns and consistent color palettes creates a visual identity that strengthens the repetition effect.

Here's How to PM with AI in Startups

How to Make $1M/Month from a SaaS

Her Layoff Went Viral - Now She has 300K+ Subscribers

He Was Promoted 3x in 5 Years at Google

"This is Why Swiggy Won" - SVP

How Zoho Became a $6 Billion Company

Product Strategy Masterclass by Disney+Hotstar EVP PM

How Airbnb Became a $85B Company

“Scrum and Agile have an identity crisis!”

How to Drive GROWTH in B2B

B2B products frequently become bloated because product teams are incentivized to ship new features rather than monitor success or retire underperforming ones. This over-engineering clutters the user experience and drains engineering resources. Ben Williams argues that every company will eventually need to adopt product-led principles, specifically regarding retention, to avoid being replaced by competitors with more elegant, user-centric designs. Product-led growth is not a binary state but a spectrum across acquisition, retention, and monetization. In a product-led sales environment, zero-commission models can be more effective because they align sales incentives with product value and customer success. Instead of chasing quick closes, reps focus on accounts that already show meaningful engagement data. Effective growth modeling requires both qualitative loops and quantitative spreadsheets to act as a navigation system for the business. These models must be dynamic to account for shifting market conditions and to prevent metrics from losing their meaning once they become targets. Engagement should be viewed as a spectrum rather than a binary milestone. By bucketizing users into specific engagement states, companies can identify early warning signals for churn or high-propensity expansion opportunities. For example, Snyk utilized four engagement states to predict that users in the highest tier had an 80% chance of being retained a year later. Moving users between these states is often more impactful than focusing on resurrection. Acquisition can be scaled through programmatic SEO and sidecar products like the Snyk Advisor, which attracted a million developers monthly by solving high-frequency problems related to open-source package security. However, these tools only work if there is a strong bridge to the core product. The Feel, See, Hear framework emphasizes that experiencing product value firsthand is the most powerful conversion tool. 
Finally, product strategy is best communicated through low-fidelity visuals that provide a North Star without creating rigid, unchangeable roadmaps.

Key Takeaways

  • Retention is a lagging indicator that cannot be moved directly; teams must instead focus on the leading indicators of activation and engagement states to drive sustainable growth.
  • The consumerization of enterprise software means that even legacy giants must prioritize UX, as users now demand the same elegance in work tools that they experience in personal apps.
  • Programmatic SEO sidecar products only drive meaningful growth if the value proposition of the tool directly mirrors the core product's utility, creating a natural bridge for conversion.
  • Zero-commission sales structures for PLG companies transform the sales role into a success-oriented function that leverages existing usage data rather than traditional high-pressure tactics.
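
The engagement-state idea above can be sketched in a few lines. This is a minimal illustration, not Snyk's actual model: the state names, the 28-day window, and the thresholds are all hypothetical, since the summary only says that four engagement states were used.

```python
from collections import Counter

# Hypothetical thresholds: minimum active days (out of the last 28) per state.
# Listed from highest to lowest so the first match wins.
STATE_THRESHOLDS = [
    ("power", 15),
    ("core", 6),
    ("casual", 1),
    ("dormant", 0),
]


def engagement_state(active_days: int) -> str:
    """Map a user's active-day count to one of four assumed engagement states."""
    if active_days < 0:
        raise ValueError("active_days must be non-negative")
    for name, min_days in STATE_THRESHOLDS:
        if active_days >= min_days:
            return name
    raise AssertionError("unreachable: the last threshold is 0")


def transitions(previous: dict, current: dict) -> Counter:
    """Count (old_state, new_state) moves for users seen in both periods."""
    shared_users = previous.keys() & current.keys()
    return Counter(
        (engagement_state(previous[u]), engagement_state(current[u]))
        for u in shared_users
    )


if __name__ == "__main__":
    # Made-up sample data: user id -> active days in each 28-day period.
    last_period = {"u1": 0, "u2": 3, "u3": 10}
    this_period = {"u1": 2, "u2": 7, "u3": 4}
    for move, count in sorted(transitions(last_period, this_period).items()):
        print(move, count)
```

Tracking the transition counts period over period is one way to watch users move between states, both upward (expansion signals) and downward (early churn warnings), rather than waiting for them to lapse.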

Onboarding MASTERCLASS

How to SUCCEED as a Head of Growth

How to Get a Product Leadership Job (Dir, VP, CPO)

Everything You Need to Know About Product Strategy

How to make $10M from a SaaS

Thibault Louis-Lucas, a serial entrepreneur known as Tibo, shares the strategies that led to over $10 million in SaaS exits, primarily through the sale of TweetHunter and Taplio to Lempire. His core philosophy centers on a ship first, think later approach, where speed and market feedback take precedence over technical perfection or initial user satisfaction. This mindset is rooted in his experience of shipping 11 different products before achieving significant commercial hits. A major turning point in his strategy involved distribution. He famously gave away 25% equity in one of his ventures to a Twitter influencer, a move that prioritized reach and customer acquisition over ownership. This highlights his belief that distribution is the ultimate key to success in the crowded SaaS market. The conversation also dives into the technical and financial hurdles of building on third-party ecosystems. For instance, he details the transition from paying nothing for the Twitter API to facing a $42,000 monthly bill, illustrating the inherent risks of platform dependency. Tibo also discusses the practical application of AI for product managers and solo builders. He views AI as a tool to drastically cut development costs and accelerate the iteration cycle. His validation process is strictly revenue-based; if a product doesn't generate money quickly, it isn't worth pursuing. This revenue as validation model helps him decide which ideas to scale and which to abandon. Beyond his own builds, he has moved into acquiring projects like Feather and Revid, applying his growth frameworks to existing tools. The discussion touches on the nuances of building in public, managing time as a high-output builder, and why he believes many product managers are currently facing challenges in the evolving tech landscape.

Key Takeaways

  • Prioritizing distribution through high-equity influencer partnerships can be more effective for growth than traditional marketing or keeping full ownership.
  • Platform risk is a critical strategic threat because relying on third-party APIs can lead to sudden, massive overhead increases that force business model pivots.
  • The ship-first methodology treats revenue as the only meaningful validation, suggesting that technical debt is a secondary concern compared to market fit.
  • AI enables a new class of lean product development where solo founders can bypass traditional engineering bottlenecks to launch functional products quickly.

Dovetail CEO: How I Built a $970M SaaS

Product Discovery Masterclass

How to Go VIRAL Consistently

Day in Life of Microsoft and Facebook PM

He Turned $25K Debt Into a $4.2B Company

The Ultimate Guide To Growth Marketing

How to Build Things WAY Faster

21 Harsh Truths about Product Management in AI

The SwagWala PM Tells All in 3h Deep Dive

How to LAND a Product Management Job

Shobhit Chugh outlines a strategic framework for landing and excelling in product management roles by focusing on reputation and specific expertise. The core concept is the Angle of Mastery, which moves beyond generic domain knowledge to focus on specific product stages, company sizes, or technical applications. For example, a PM might specialize in scaling B2B SaaS products from $1M to $10M in revenue. This specificity makes a candidate more appealing than a generalist with similar years of experience. Chugh also introduces the Good Soldier to Good General transition, where PMs must stop trying to solve every problem and instead focus on the two or three levers that actually move the business forward. This requires letting go of the need to make everyone happy in the short term to maintain a clear roadmap. The Career Flywheel concept highlights that most career-defining decisions happen when the PM is not in the room. Success depends on building advocates who showcase your work as top-notch. On the tactical side of job hunting, Chugh argues that resumes often fail because they lack context or a unique story. He suggests focusing on the "why" behind difficult problems rather than just listing launches. For interviews, he prioritizes behavioral preparation over frameworks, advising candidates to use "drama" in their stories to elicit emotional responses and demonstrate command of the situation. He notes that the impression you leave is often more important than having the perfect answer to every technical question. Negotiation is another pillar, where Chugh recommends deferring salary discussions until the company is fully bought into the candidate's potential. He suggests that upleveling, such as moving from Senior PM to Director, is often the easiest way to meet high compensation requirements. 
Finally, he emphasizes extreme ownership, where a PM takes responsibility for everything within their circle of control while accepting that they cannot act on every issue simultaneously. This mindset shift helps PMs manage ambiguity and make tough calls, like killing a product that no longer serves the business goals.

Key Takeaways

  • Narrowing your specialty as you become more senior increases your market value and makes you indispensable to specific types of companies.
  • Effective leadership requires letting go of the need to make everyone happy in the short term to ensure long-term product clarity.
  • Reputation often carries more weight than the actual message; if you lack a strong reputation, you must leverage the credibility of others in the room.
  • Behavioral interviews are more about performance psychology and eliciting emotions than just checking off technical boxes.

KNOW Your Users as a Product Manager

CRUSH Your Growth Role with this 108 min tutorial

Maven PM Crash Course with the CEO

He Built a $28M ARR SaaS

Guillaume Moubeche, founder of Lempire and Lemlist, shares the journey of scaling his company from a $1,000 investment to over $28M ARR and a $150M+ valuation. After a failed first software attempt, Guillaume returned to his expertise in lead generation to build Lemlist. He identified a gap in the market where existing tools failed at personalization and deliverability. By launching an MVP focused on these specific pain points, he was able to gain immediate traction in a crowded market against competitors like Outreach and SalesLoft. A major turning point for the product was a radical UX overhaul. Guillaume noticed that users were signing up but not launching campaigns. He moved from an open discovery model to a rigid, step-by-step onboarding funnel. Although this caused initial churn and user complaints, it ultimately tripled the activation rate. This move highlighted his philosophy that product simplicity and forced progression are often better than flexibility for new users. Growth was significantly accelerated by an AppSumo campaign that generated $175,000 in two weeks. More importantly, it provided a community for feedback and a mindset shift. Guillaume credits AppSumo's leadership with pushing him to think beyond $1M ARR and model the infrastructure needed for $10M+. Today, Lempire is highly profitable with $10M in annual EBITDA, allowing Guillaume to shift his focus from daily operations to strategic vision, talent acquisition, and M&A. He remains critical of traditional product management, arguing that PMs often focus too much on long specifications and iterating on small problems rather than making bold, vision-led decisions that solve the actual underlying business issues.

Key Takeaways

  • Forcing users through a rigid, linear onboarding path is often more effective for activation than allowing open discovery, even if it causes temporary friction for power users.
  • The most valuable product managers are those who can reframe a user's stated complaint into a strategic problem, such as shifting focus from data quality to overall meeting rates.
  • Bootstrapping to high profitability provides a massive strategic advantage for M&A, as the company can acquire complementary tools using its own cash flow without VC interference.
  • Scaling from $1M to $10M+ requires a modeling exercise where you work backward from the target to determine the exact lead volume and team structure required to hit that goal.
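
The "work backward from the target" exercise in the last takeaway can be sketched as simple arithmetic. This is an illustrative sketch only: the average contract value and the leads-per-customer ratio are hypothetical inputs, not figures from the episode, and the model ignores churn and expansion revenue.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, so partial customers or leads round up."""
    return -(-a // b)


def leads_plan(
    target_arr: int,
    current_arr: int,
    avg_annual_contract_value: int,  # hypothetical: revenue per customer per year
    leads_per_customer: int,         # hypothetical: leads needed to win one customer
) -> dict:
    """Work backward from an ARR target to the customers and leads required."""
    new_arr = max(target_arr - current_arr, 0)
    new_customers = ceil_div(new_arr, avg_annual_contract_value)
    return {
        "new_arr": new_arr,
        "new_customers": new_customers,
        "leads": new_customers * leads_per_customer,
    }


if __name__ == "__main__":
    # Assumed inputs for illustration: $1M -> $10M ARR, $1,200 per customer
    # per year, and 50 leads per closed customer.
    plan = leads_plan(10_000_000, 1_000_000, 1_200, 50)
    for key, value in plan.items():
        print(f"{key}: {value:,}")
```

Once the required lead volume is explicit, it can be compared against what current channels and team size can realistically produce.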

Product-Led Growth: Review From the Author

AI PM @ Google: Crack the Job Search

Be EVIDENCE-GUIDED as a Product Manager

How SVPG Transforms Companies to the Product Model | Christian Idiodi

How to Build a Snap Product Strategy

Ed Biden argues that spending weeks on a strategy is a mistake. Instead, he advocates for a Snap Strategy, which is a one-page document that outlines your objective, qualitative and quantitative definitions of success, and key work pillars. This document acts as a day one answer that you iterate on as you find evidence. It is a tool for alignment, especially with founders who might have deep context but fail to share it. By writing it yourself and asking for corrections, you move faster than waiting for a perfect brief. He critiques product purists who follow frameworks like Marty Cagan's Inspired too rigidly. While discovery is vital, being a purist can slow down the business. A full-stack PM recognizes when to do deep discovery and when to just ship, especially for well-understood features like payment integrations. Success is measured by customer happiness and business value, not adherence to a process. Recognizing what phase you are in is the difference between being a helpful partner and being an obstruction to the company's goals. On the economics of product, Biden uses a rule of thumb that a product team costs about a million dollars a year. PMs need to understand this hurdle rate. If a feature doesn't have a clear path to generating that value, it might not be the right bet. He suggests impact modeling over impact sizing, using a hurdle rate to work backward and see if the required lift is realistic. This commercial awareness helps PMs justify their roadmap to the CFO and ensures they are working on bets that actually move the needle for the business. Transitioning to leadership means moving from working on the product to working on the product organization. This involves three pillars: strategy, execution, and people. Leadership requires setting high standards and clear expectations rather than just being nice. When hiring, Biden prioritizes intellectual horsepower and empathy over specific industry or functional experience. 
He believes these core traits are harder to teach than domain knowledge.

Key Takeaways

  • The Snap Strategy serves as a hypothesis-driven prototype for alignment. It forces stakeholders to react to a concrete plan, which is faster than starting from a blank page.
  • Commercial awareness is a missing link for many PMs. Viewing a team as a million-dollar investment changes how you prioritize features and justify your roadmap to a CFO.
  • Product leadership is distinct from senior product management. It requires a shift from direct output to indirect influence through organizational design and coaching.
  • The full-stack concept in product is about situational awareness. PMs must be able to switch between wartime shipping and peacetime discovery based on the company's runway and risk profile.
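
The hurdle-rate reasoning above can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the $1M team cost is the rule of thumb from the summary, while the baseline revenue and the expected lift are hypothetical inputs chosen for illustration.

```python
def required_lift(team_cost: float, baseline_revenue: float) -> float:
    """Fractional revenue lift a bet must deliver to pay back the team's cost."""
    if baseline_revenue <= 0:
        raise ValueError("baseline_revenue must be positive")
    return team_cost / baseline_revenue


def clears_hurdle(expected_lift: float, team_cost: float, baseline_revenue: float) -> bool:
    """True if the lift the team believes in meets or exceeds the hurdle."""
    return expected_lift >= required_lift(team_cost, baseline_revenue)


if __name__ == "__main__":
    team_cost = 1_000_000   # rule-of-thumb annual cost of one product team
    baseline = 20_000_000   # hypothetical annual revenue of the affected surface
    hurdle = required_lift(team_cost, baseline)
    print(f"Required lift: {hurdle:.1%}")
    print("8% expected lift clears hurdle:", clears_hurdle(0.08, team_cost, baseline))
    print("2% expected lift clears hurdle:", clears_hurdle(0.02, team_cost, baseline))
```

Working backward from the hurdle turns the question into "is this lift realistic for this surface?", which tends to be easier to debate than a bottom-up estimate of impact.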

Product-Led Growth (PLG) Masterclass

Master Go-to-Market Strategy in the Era of AI

She Founded the Largest Product Agency in the World

How to be an AI Product Manager

How to Write a Great Product Strategy: Meta Director

How to Build High-Performing Product Teams

How Meta Does Product Growth

Andre Nader, a nine-year veteran of Meta, explains the inner workings of the company's unique product growth function and how it differs from traditional product management. At Meta, a growth team's product is a metric rather than a specific feature or app section. This distinction allows growth teams to work across the entire ecosystem to drive topline metrics like monthly active users. The core framework used is Understand, Identify, Execute. This involves deeply analyzing data to find problem spaces, identifying every possible lever that could move a metric, and then executing with extreme precision on the most impactful levers. The role evolved from digital marketing into Product Growth to better reflect the technical, data-driven nature of the work and to align with market compensation for data science and product management. Nader emphasizes the value of glue activities, which involve filling gaps between engineering, data science, and product management to unblock teams. He advocates for simple, non-ratio metrics to ensure clear communication and alignment across large organizations. Beyond growth strategy, the discussion covers optimizing FAANG compensation and the path to Financial Independence, Retire Early (FIRE). Nader details the importance of understanding Restricted Stock Units (RSUs) and maximizing tax-advantaged strategies like the Mega Backdoor Roth. He introduces the concept of "enough," specifically calculating that a $5.6 million portfolio is required for a conservative 3% withdrawal rate to sustain a high-quality life in San Francisco. He suggests that once the income game is won at a FAANG company, professionals should shift their focus toward optimizing for optionality and lifetime enjoyment rather than just higher earnings.

Key Takeaways

  • Growth teams at Meta treat a specific metric as their product, giving them the mandate to operate across any part of the app stack to drive that number.
  • The Understand, Identify, Execute framework prioritizes deep data logging and correlation analysis before moving to causal A/B testing and aggressive optimization.
  • Job titles in big tech significantly impact compensation because companies benchmark pay against specific job categories like Digital Marketing versus Product Management.
  • Simple topline metrics are superior to complex ratios because they reduce the explanation tax in meetings and keep large, cross-functional teams aligned on the same goal.
  • Achieving financial independence in high-cost areas like San Francisco requires a clear "enough" number that accounts for fixed costs, healthcare, and future obligations like tuition.
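
The "enough" number follows from simple withdrawal-rate arithmetic. The sketch below uses only the two figures given in the summary ($5.6 million and 3%). The resulting annual spending figure is implied by that arithmetic rather than stated in the source, and the function names are illustrative.

```python
def annual_withdrawal(portfolio: float, withdrawal_rate: float) -> float:
    """Yearly spending a portfolio supports at a given withdrawal rate."""
    return portfolio * withdrawal_rate


def required_portfolio(annual_spending: float, withdrawal_rate: float) -> float:
    """Portfolio size needed to fund a yearly spending level at a given rate."""
    if withdrawal_rate <= 0:
        raise ValueError("withdrawal_rate must be positive")
    return annual_spending / withdrawal_rate


if __name__ == "__main__":
    spending = annual_withdrawal(5_600_000, 0.03)
    print(f"Implied annual spending: ${spending:,.0f}")
    print(f"Portfolio for that spending at 3%: ${required_portfolio(spending, 0.03):,.0f}")
    # For comparison only: the same spending at a 4% rate, an assumption
    # that does not come from the episode.
    print(f"Portfolio for that spending at 4%: ${required_portfolio(spending, 0.04):,.0f}")
```

A lower withdrawal rate is more conservative, which is why the 3% assumption produces a larger target than a higher rate would for the same spending.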

How to Develop Your Product Strategy - Google PM Director

The MESSY Reality of B2B Product Management

Master POSITIONING in 88 mins

How to Get a Product Management Job

Do THIS to get a $700K+/year PM Job

High compensation for product managers at big tech companies like Google, Meta, and Nvidia often reaches $400,000 for individual contributors and exceeds $1 million for senior leadership. This pay scale is driven by the massive distribution these companies command, where a single PM might oversee products used by billions of people. The ratio of engineers to PMs typically ranges from 20 to 1 up to 70 to 1, meaning each PM has significant leverage and impact. To land these roles, candidates must navigate several filters. The most critical is having recognizable brand names on a resume, which acts as a heuristic for recruiters. Geography also plays a major role, with most high-paying opportunities concentrated in hubs like San Francisco, Seattle, and New York. For those starting out, pursuing technical degrees in computer science or design is highly recommended, as is building personal projects to demonstrate entrepreneurial initiative. Mid-career professionals should focus on strategic job hopping every two to three years to upgrade their resume brand and increase their total compensation. Internal career switchers from marketing or sales should consider taking a temporary level drop to secure the product manager title at a prestigious company, which opens doors for future high-level roles elsewhere. For PMs at lesser-known companies, networking becomes the top priority. Reconnecting with former colleagues who moved to big tech and building specific technical expertise, such as shipping an iOS app to catch Apple's attention, are effective tactics. Success in these interviews requires a long-term mindset, often involving three months of consistent daily practice to internalize product design, strategy, and analytical frameworks. Candidates need to move beyond just watching videos to practicing with well-calibrated peers to handle the barrage of behavioral and technical questions.

Key Takeaways

  • Resume branding acts as a non-negotiable filter in big tech hiring, making it necessary to prioritize company prestige over internal promotions when aiming for top-tier compensation.
  • The trimodal distribution of PM salaries is a function of revenue per employee and user scale, where the impact of a single feature can justify seven-figure pay packages.
  • Career switchers should prioritize the PM title over their current seniority level, as the title combined with a top-tier company name is the primary currency for future mobility.
  • Networking is the only effective way to bypass the unknown company filter, requiring a shift from passive applications to active value-driven relationship building.

Product Manager Salaries in India Revealed: Largest Dataset Ever (2024)

Enterpret CEO: How I built a $25M+ Startup

How to Handle a Layoff: Interviews and Resume

Write Better PRDs Faster with ChatPRD

ChatPRD is an AI-powered tool designed to streamline the creation of product requirements documents (PRDs). Created by Claire Vo, a Chief Product Officer, the tool originated as a personal solution to save time while maintaining high-quality output for technical and platform-oriented specs. Vo built it to handle the heavy lifting of drafting while she managed large engineering organizations. Unlike generic LLMs that often produce formal or forced text, ChatPRD focuses on a practical, direct communication style that mimics a seasoned product manager. It has grown from a mega prompt in the GPT store to a standalone application with over 10,000 users, proving the market demand for specialized AI agents in the product space. A core differentiator is the tool's ability to avoid generic frameworks and fluff, focusing instead on realistic PM communication. Users can leverage features like Bring Your Own Template, which allows them to bypass the blank page problem by automatically scaffolding content into their company's specific document formats. This feature is particularly useful for PMs who are often handed rigid templates that require hours of manual work to populate. By automating the initial draft within a familiar structure, the tool allows PMs to focus on high-level strategy rather than formatting. The platform currently utilizes the GPT-4o model, which provides the speed and technical capability required for advanced features. Beyond simple chat interactions, the app includes a Doc Mode for direct content manipulation and customization options that help the AI learn about a user's specific role and product context. The development roadmap includes a knowledge base feature intended to ingest company-specific data such as OKRs, failed experiments, and customer research. This integration aims to provide the AI with the necessary context to ensure product work aligns with broader company goals and avoids redundant efforts. 
This evolution from a simple prompt to a context-aware infrastructure layer reflects the broader shift toward agentic tools in the SaaS ecosystem.

Key Takeaways

  • ChatPRD bridges the gap between generic AI outputs and professional-grade documentation by prioritizing a "no frameworks" approach that mirrors the directness of an experienced CPO.
  • The Bring Your Own Template feature solves a major friction point in B2B SaaS workflows where rigid corporate standards often slow down the initial drafting phase.
  • Moving from a GPT wrapper to a standalone app allows for deeper feature integration like Doc Mode and future knowledge bases, transforming the tool into a context-aware strategic assistant.

Crack the Technical Interview for PMs

Tell Me About Yourself - Structure a Strong Answer

Product Roadmaps: Advanced Tips & Tricks

Everything you need to know about SaaS Pricing

How to Become an IC CPO: Build Agentic COS with Claude Code

We Built an AI Agent to Automate PM in 73 mins (ZERO CODING)

Anthony Maggio, VP of Product at Airtable, explains how the platform is shifting from a spreadsheet-database hybrid to an AI-native application builder for non-developers. He breaks down Airtable AI into two main functions: Co-builder, which helps users design apps through natural language, and AI at runtime, which embeds agents and automations directly into business workflows. The conversation includes a live demo of Product Central, a dedicated app for PM teams. It shows how AI can ingest Gong or Zoom transcripts to automatically summarize interviews, tag themes, and link feedback to specific roadmap items. Maggio demonstrates a deep research assistant that queries internal data sets (like past research PDFs or PRDs) with full citations, similar to Perplexity but for private company data. Other automated workflows include AI-weighted backlog prioritization and one-click PRD generation that exports directly to Google Docs. Maggio also details Airtable's unique internal product culture. The company enforces a build on Airtable first rule for all procurement, leading them to replace legacy SaaS like Workday with custom Airtable apps for performance reviews. Notably, Airtable assigns revenue goals (ARR) directly to product leaders, forcing PMs to act as General Managers who own the full-stack success of their features, including sales enablement and marketing alignment. He argues this prevents finger-pointing between functions and ensures the product team builds things users actually value. The episode concludes with career advice for PMs. Maggio emphasizes learning velocity as the primary driver for career jumps. To move from Director to VP, he suggests shifting focus from executing a group strategy to uncovering entirely new market opportunities and pitching needle-moving ideas that aren't yet on the company's radar.

Key Takeaways

  • Horizontal platforms are challenging functional point solutions by allowing teams to connect data across silos, such as linking CRM data directly to product feedback, without custom code.
  • Assigning ARR targets to product pillars transforms PMs into GMs, ensuring that product development is strictly aligned with commercial viability and GTM success.
  • AI product development is inherently non-linear and requires a shaping phase where teams test the boundaries of model reliability before committing to a specific UX.
  • The Airtable on Airtable procurement rule serves as a continuous feedback loop, exposing platform gaps and forcing the product to mature enough to handle complex enterprise needs like multi-layered permissions.

AI Product Discovery: Complete Course - by Aakash Gupta

Tanguy Crusson, Head of Product for Jira Product Discovery, explains how Atlassian uses a four-stage framework called Wonder, Explore, Make, and Impact to de-risk development. He argues that discovery is not a one-time event but a continuous effort to validate value, usability, and feasibility. AI is currently a major time saver for prototyping with tools like V0 and Lovable, and for analyzing customer feedback. In the Wonder stage, PMs focus on problem exploration. They use ethnographic research and video snippets to capture customer pain points. Crusson suggests that PMs should get professional research training to avoid leading questions and learn to use silence effectively. The Explore phase involves testing lo-fi prototypes, sometimes just slides, to see if customers have an immediate aha moment. He also describes using painted-door experiments, such as placing ads for non-existent features on the Atlassian website to validate market demand before writing code. The Make phase uses a safety funnel approach. Instead of a massive launch, they roll out to 10, then 100, then 1,000 customers. This confirms the product is ready and keeps CSAT scores high before a full release. This method helped Jira Product Discovery reach 18,000 paying customers in two years. For stakeholder management, Crusson uses a high-speed-train strategy. By sending frequent, bite-sized updates, he builds enough trust that leaders feel they do not need to interfere. He also replaces traditional PRDs with live feature documents and direct collaboration in Figma. He believes PMs must understand the entire stack, from go-to-market strategy down to technical debt and code constraints.

Key Takeaways

  • The safety funnel prioritizes quality over raw speed. By limiting early access to small groups (10-100-1000), teams avoid the uphill battle of winning back users who had a bad first impression.
  • Frequent updates act as a shield. Sending weekly snippets of customer interviews and small wins creates a sense of momentum that keeps stakeholders from micromanaging or interrupting the team's flow.
  • Technical spikes and live documents replace PRDs. Bringing engineers into the discovery process early through spikes and Figma collaboration is more efficient than handing over a finished requirements document.
  • AI is best for rapid iteration. Using AI to turn designs into functional prototypes in minutes allows teams to test ideas with customers before writing any production code, significantly reducing the cost of failure.
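
The safety funnel described above can be sketched as a simple gate: access expands to the next cohort only while satisfaction stays above a bar. This is a rough illustration, not Atlassian's actual process. The cohort sizes (10, 100, 1000) come from the summary; the CSAT threshold of 4.0, the `measure_csat` hook, and the sample scores are assumptions made up for the example.

```python
from typing import Callable, Sequence

# Cohort sizes come from the summary above; everything else is illustrative.
COHORT_SIZES = (10, 100, 1000)


def run_safety_funnel(
    measure_csat: Callable[[int], float],
    csat_threshold: float = 4.0,  # assumed gate on a 1-5 scale, not from the source
    cohorts: Sequence[int] = COHORT_SIZES,
) -> dict:
    """Expand access cohort by cohort, stopping at the first one that misses the gate."""
    history = []
    for size in cohorts:
        csat = measure_csat(size)  # hypothetical hook: survey the current cohort
        history.append((size, csat))
        if csat < csat_threshold:
            return {"ready_for_full_release": False, "stopped_at": size, "history": history}
    return {"ready_for_full_release": True, "stopped_at": None, "history": history}


if __name__ == "__main__":
    # Made-up scores purely to exercise the function.
    fake_scores = {10: 4.6, 100: 4.3, 1000: 3.7}
    print(run_safety_funnel(lambda size: fake_scores[size]))
```

With these made-up scores the funnel stops at the 1000-customer cohort, which is the point of the approach: a quality problem surfaces in front of a limited audience rather than at full release.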

How to Land a $700K+ AI PM Job: Complete Guide 2025

AI PM roles are seeing a massive surge, now making up 20% of PM job listings and paying 30% to 40% more than traditional roles. High-tier positions like Group PMs at Google can reach $700K, while CPOs can exceed $2M. To land these, candidates must shift from a self-centered search to solving specific company problems. Recruiters typically spend only five to seven seconds per resume, looking for three core signals: impact, scope, and recognizability. The process begins with "The Work," a deep-dive extraction of career history using 32 specific questions. This raw data is fed into AI to create a "bullet vault" organized into six categories: product development, leadership, strategy, business, project management, and technical skills. Bullets must follow a strict format of Action Verb, Context, Result, and Metric. AI is then used to tailor this vault to specific job descriptions by extracting non-generic must-haves and rewriting the resume summary to hook recruiters. Outreach is equally critical, as cold applications often have a mere 1% callback rate. Using tools like ContactOut, candidates should find hiring managers and send short, three-bullet emails focused on how they can solve the team's specific problems. For interview prep, AI acts as a sparring partner. Candidates should use rubrics for behavioral, case, and execution interviews. The behavioral framework focuses on hooks, principles, actions, results, and learnings. For case interviews, AI evaluates responses based on structured thinking, user focus, and prioritization. The key is an iterative approach: start with written content, move to spoken delivery, and finally add time constraints to simulate the high-pressure environment of top-tier tech interviews.

Key Takeaways

  • Recognizability is a primary filter in high-end recruiting, meaning that without big-name brands like Google or Facebook on a resume, callback rates often stay at the 1% baseline regardless of individual skill.
  • Resume tailoring should focus on stack ranking existing bullets from a comprehensive vault rather than inventing new content, allowing for rapid customization that matches a hiring manager's specific quarterly goals.
  • Effective outreach relies on low-friction requests, such as asking a manager to simply forward a resume to their recruiting partner, which bypasses the need for time-consuming coffee chats while proving immediate value.
  • Interview success is driven by rubric-based AI coaching that provides objective feedback on logic and empathy before a candidate moves into high-pressure timed practice sessions.

AI PM Roadmap: Core PM to AI PM with Todd Olson

Todd Olson, CEO of Pendo, outlines the shift from traditional product management to AI-centric roles. AI PM job postings have doubled from 10% to 20% of the market in a single year, often commanding 30% to 40% higher salaries due to skill scarcity. Olson warns against AI washing on resumes, emphasizing that true AI PMs must understand the underlying technology stack, including token economics, data pipelines, and model trade-offs. The transition is structured as a pyramid. At the base are foundations like prompt engineering and RAG (Retrieval-Augmented Generation). The middle layers involve observability, trace analysis, and cost optimization, where PMs must balance innovation speed with gross margin health. The top layers focus on evaluation and strategy. A critical shift in AI product management is the move from measuring daily active users to focusing on outcome-based metrics and workflow completion. Olson introduces the concept of the Forward Deployed PM, a role where product managers sit directly with customers to fine-tune context windows and embeddings for specific use cases. Strategically, companies should avoid building generic wrappers and instead focus on solving hard, tedious problems using unique data assets. Pendo demonstrates this by automating discovery workflows, such as identifying specific user segments for interviews and synthesizing qualitative feedback from Gong, Zendesk, and Salesforce. The future of B2B SaaS will likely move away from rigid product silos toward cross-cutting workflows where humans and agents collaborate to complete complex tasks. The Model Context Protocol (MCP) is highlighted as a standard that will open new possibilities for data integration and agentic capabilities. PMs are also encouraged to be vigilant about killing underperforming AI features, as poor quality in one area can stifle adoption across the entire platform.

Key Takeaways

  • PMs must own the evaluation layer. While engineers build the harness, the PM is the expert on user needs and must define the quality metrics that determine if an LLM output is successful.
  • Gross margin management is a core PM skill in the AI era. Early-stage AI features often trade tokens for growth, but long-term viability requires optimizing compute costs through caching and smaller models.
  • The Forward Deployed PM model is essential for high-stakes B2B AI. Success requires PMs to get technical enough to tweak RAG parameters on-site with customers to ensure the AI handles specific domain context correctly.
  • Strategic roadmapping should prioritize cross-cutting workflows over individual features. AI allows products to break down traditional silos, meaning PMs must now design for end-to-end automation rather than just UI interactions.
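
The gross margin point can be made concrete with a back-of-the-envelope model of how caching and a smaller model change per-seat economics. This is a minimal sketch under stated assumptions: every number below (tokens per request, prices per million tokens, request volume, seat price, cache hit rate) is a placeholder rather than a figure from the episode, and cache hits are treated as free, which real systems only approximate.

```python
def cost_per_request(
    tokens: int,
    price_per_million: float,
    cache_hit_rate: float = 0.0,
) -> float:
    """Expected model cost of one request, treating cache hits as free (a simplification)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return (1.0 - cache_hit_rate) * tokens * price_per_million / 1_000_000


def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost) / revenue


if __name__ == "__main__":
    # All inputs are placeholders chosen for illustration only.
    tokens, requests_per_month, seat_revenue = 4_000, 500, 30.0
    scenarios = {
        "large model, no cache": cost_per_request(tokens, 10.0),
        "large model, 40% cache hits": cost_per_request(tokens, 10.0, 0.4),
        "small model, 40% cache hits": cost_per_request(tokens, 1.0, 0.4),
    }
    for name, unit_cost in scenarios.items():
        monthly_cost = unit_cost * requests_per_month
        print(f"{name}: margin {gross_margin(seat_revenue, monthly_cost):.0%}")
```

The shape of the result matters more than the specific values: the same feature moves from thin to healthy margins once repeated work is cached and routine requests go to a cheaper model.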

Building in Public: The 7 AI Tools I'm Using in My $1M+/Yr Business

Aakash Gupta outlines a stack of seven AI tools that enabled him to generate over $1 million in annual revenue while cutting $400,000 in operational costs. The workflow begins with Zapier and Claude for email management. By using Claude to categorize incoming mail and draft replies, he reduced the time spent on his inbox from an hour to ten minutes daily, replacing a virtual assistant and saving $14,000 per year. For product development, he utilizes v0 by Vercel for rapid front-end prototyping. He advocates for a parallel prototyping strategy, running the same prompt through tools like Lovable and Bolt to see which produces the best initial version. This approach bypasses the need for expensive Upwork developers for early-stage builds. Once a prototype is ready, he moves it into Cursor to handle the backend and production-ready code. This vibe coding process allows non-technical users to fix errors and deploy apps with payments and authentication by interacting with an AI agent in natural language, potentially saving nearly $100,000 annually on senior developer salaries. Marketing is handled through Google's Veo3 for high-quality video ads and Lindy for cross-platform content distribution. Lindy acts as a vibe marketing engine, automatically turning podcast episodes into blog posts and LinkedIn updates into X threads, which replaces the need for dedicated social media managers. The content production workflow relies on Riverside AI and Opus Clips. These tools automate podcast editing, audio enhancement, and the creation of viral social clips, saving over $5,500 monthly in engineering and clipping costs. Finally, he uses Claude Projects as a strategic co-pilot. By feeding the project specific context, guest lists, and performance data, he creates a specialized agent for guest vetting, research, and cold outreach. This stack demonstrates a shift from hiring human specialists to orchestrating AI agents that can build, market, and scale a business with minimal overhead.

Key Takeaways

  • Parallel prototyping using v0, Lovable, and Bolt simultaneously is faster and more effective than relying on a single AI builder or a human freelancer for initial versions.
  • The vibe coding workflow in Cursor allows founders to maintain and scale technical products without a full-time senior developer by treating the AI as an autonomous coding agent.
  • AI agents like Lindy and Zapier provide a massive ROI by replacing mid-level operational roles like VAs and social media managers with low-cost API credits.
  • Strategic growth can be automated by building a context layer in Claude Projects, turning a general LLM into a specialized business strategist that understands specific audience metrics.

The biggest model update this week wasn't GPT-5.1, it was Kimi K2: AI Update #3

Kimi K2 from Moonshot AI is a new reasoning model that uses interleaved reasoning to outperform frontier models at a lower cost. Instead of a simple think-then-act loop, K2 uses a five-step process: plan, act, verify, reflect, and refine. This allows the model to catch errors mid-stream and adjust its strategy, which is a major upgrade for building autonomous agents that need to make hundreds of tool calls. It runs on a Mixture of Experts architecture with 1 trillion parameters, but only 32 billion activate per token. This keeps it fast and cheap, with a standard mode costing much less than Claude. In a direct comparison for product management tasks, Kimi K2 showed better product sense than GPT-5.1. While GPT-5.1 gave abstract scores, K2 used real engagement data and estimated effort in months. It correctly identified that certain features had a clearer GTM path even if the math suggested otherwise. OpenAI's GPT-5.1 update focuses more on personality and warmth, adding eight presets like Professional or Cynical and improving instruction following. The model now dynamically adjusts its reasoning time based on the complexity of the prompt. The AI market is seeing massive growth and shifts. Cursor reached $1B ARR and raised $2.3B, while Anthropic is planning $50B in data center spending. Meta's Yann LeCun is also expected to leave to start his own AI company. New tools are focusing on multi-source learning and real-time translation, signaling a move toward more integrated AI workflows in business.

Key Takeaways

  • Interleaved reasoning is a game changer for agent reliability. By checking its work after every action, Kimi K2 avoids the common failure loops seen in older reasoning models.
  • The gap between frontier models and specialized reasoning models is gone. Kimi K2 proves that a model can be 10x cheaper while still offering better strategic judgment for specific roles like product management.
  • OpenAI is pivoting toward user experience over raw power. GPT-5.1's focus on warmth and personality presets shows they are trying to fix the robotic feel that drove users toward competitors.
  • Building autonomous GTM agents is becoming much more affordable. Kimi K2's architecture allows for high-volume tool orchestration without the massive costs usually associated with reasoning models.
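
The Mixture of Experts figures quoted above imply that only a small share of the model's weights participate in any one token. The sketch below simply works out that share from the two numbers in the summary. Treating the active share as a proxy for per-token compute is a rough assumption; it ignores routing overhead, attention cost, and the memory needed to hold all experts.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1 trillion parameters, as stated in the summary
ACTIVE_PARAMS = 32_000_000_000    # 32 billion active per token, as stated in the summary


def active_fraction(active: int, total: int) -> float:
    """Share of parameters used for a single token in a sparse MoE model."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active share per token: {share:.1%}")  # prints "Active share per token: 3.2%"
```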

AI Evals Masterclass with Hamel & Shreya - by Aakash Gupta

This masterclass covers the systematic process of building and refining AI evaluations using real production data. The core message is that evals are the most critical skill for AI product managers who want to move beyond vibe checks and demos to reliable production features. The process begins with observability, which involves capturing traces of LLM interactions. While tools like Braintrust or LangSmith are helpful, the fundamental requirement is simply logging traces to a readable format like CSV or JSON. The most overlooked but highest-leverage step is manual error analysis. This involves open coding, where a human reviews traces and takes notes on failures, and axial coding, where those notes are grouped into actionable themes like conversational flow issues or human handoff failures. The discussion highlights why PMs and domain experts must lead this process rather than outsourcing it to engineers. PMs possess the specific product taste and context needed to identify nuanced failures that an LLM or a general engineer might miss. When building automated LLM-as-a-judge systems, the experts recommend binary true-or-false scoring instead of 1-to-5 scales. Binary scores are easier to align with human preferences and map directly to business decisions. To confirm the judge is reliable, it should be measured using True Positive Rate (TPR) and True Negative Rate (TNR) rather than simple agreement percentages, which can be misleading in datasets with imbalanced error rates. The end goal is a suite of code-based and LLM-based evals that allow for rapid iteration on prompts and system architecture without regressing on quality.
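
The first two steps described above, logging traces to a readable file and tallying the themes that come out of manual review, need very little tooling. The following is a minimal sketch using only the Python standard library. The field names (`trace_id`, `open_code`, `axial_code`) and the sample traces are hypothetical; the two category labels are the examples named in the summary.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def log_trace(path: Path, trace: dict) -> None:
    """Append one LLM interaction as a JSON line so it stays human-readable."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(trace, ensure_ascii=False) + "\n")


def load_traces(path: Path) -> list[dict]:
    """Read every non-empty JSON line back into a dictionary."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def tally_axial_codes(traces: list[dict]) -> Counter:
    """Count reviewed traces per failure theme; traces without a theme are skipped."""
    return Counter(t["axial_code"] for t in traces if t.get("axial_code"))


if __name__ == "__main__":
    # Made-up traces: open_code is the reviewer's free-form note,
    # axial_code is the theme that note was later grouped under.
    samples = [
        {"trace_id": "t1", "open_code": "ignored request for a person",
         "axial_code": "human handoff failure"},
        {"trace_id": "t2", "open_code": "asked the same question twice",
         "axial_code": "conversational flow issue"},
        {"trace_id": "t3", "open_code": "", "axial_code": ""},
        {"trace_id": "t4", "open_code": "promised a callback it cannot make",
         "axial_code": "human handoff failure"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "traces.jsonl"
        for sample in samples:
            log_trace(path, sample)
        print(tally_axial_codes(load_traces(path)).most_common())
```

The tally is what turns a pile of notes into a prioritized list: the most frequent theme is usually the first candidate for an automated eval.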

Key Takeaways

  • Manual error analysis is the foundation of a defensible AI product and cannot be outsourced to developers or off-the-shelf metrics.
  • Binary scoring is more effective than Likert scales because it simplifies alignment and mirrors actual business decision-making.
  • Simple agreement metrics are a trap; use True Positive and True Negative rates to verify if your LLM judge actually catches errors in imbalanced data.
  • The prompt is essentially English code, making it a tragedy to separate the PM from the prompt engineering process.
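
A small worked example shows why raw agreement misleads on imbalanced data. This is an illustrative sketch, not code from the masterclass: the function name and the synthetic labels are invented, and it assumes `True` means "this trace contains the failure", so TPR measures how many real failures the judge catches.

```python
def judge_metrics(human: list[bool], judge: list[bool]) -> dict:
    """Compare judge verdicts to human labels. True means 'this trace has the failure'."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    pairs = list(zip(human, judge))
    tp = sum(1 for h, j in pairs if h and j)
    tn = sum(1 for h, j in pairs if not h and not j)
    fp = sum(1 for h, j in pairs if not h and j)
    fn = sum(1 for h, j in pairs if h and not j)
    return {
        "agreement": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
        "tnr": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Synthetic data: 10 real failures among 100 traces,
    # and a lazy judge that never flags anything.
    human_labels = [True] * 10 + [False] * 90
    lazy_judge = [False] * 100
    print(judge_metrics(human_labels, lazy_judge))
    # agreement is 0.9, yet tpr is 0.0: the judge misses every real failure
```

A judge that flags nothing still agrees with the human reviewer on 90% of traces here, which is why the two rates need to be reported separately.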

The Marty Cagan Episode: Product Management Crash Course in 61 Minutes

Marty Cagan explains the fundamental shift from an output-based feature team model to an outcome-driven product operating model. The core of this transition involves three dimensions: changing how companies decide which problems to solve through product strategy, how they solve them via product discovery, and how they build and deploy solutions using continuous delivery. Cagan emphasizes that while output is easy to measure, achieving outcomes is significantly harder and requires a higher level of skill. Most companies currently operate with feature teams where stakeholders define roadmaps and teams simply execute. In contrast, empowered product teams are given specific problems to solve and are held accountable for the business results. This model requires four specific competencies: real product managers who own value and viability, product designers focused on the full customer experience, tech leads who care about what is built, and product leaders who provide strategic context and coaching. Transformation typically takes one to three years and depends heavily on the alignment between the Head of Product and Head of Engineering. Cagan warns that many PMs in feature teams are actually performing project management tasks, making them vulnerable during market downturns. He advocates for PMs to take agency by deeply learning their customers, data, and business constraints. Effective coaching is also highlighted as a major bottleneck, noting that many agile coaches lack the actual product experience needed to guide a strategic transformation. The discussion also touches on the durability of product principles versus the rapid evolution of tools and methods.

Key Takeaways

  • The shift from output to outcomes is the defining characteristic of high-performing product organizations.
  • PMs in feature teams are often performing project management and are at high risk during layoffs because they do not directly own business value or viability.
  • Successful transformation is impossible without product leaders who can provide strategic context and active coaching rather than just managing people.
  • Real product coaching requires practitioners with actual experience in the model, not just process facilitators or agile coaches.
  • Product principles are durable and universal across hardware and software, while tools and processes should be adapted to the specific context.

AI Prototyping for PMs | Wix Co-Founder Masterclass

Nadav Abrahami, co-founder of Wix, breaks down how AI prototyping is changing the game for product managers. He shows off Dazzle, a tool that lets PMs build real, functional apps from just a prompt or a screenshot. This gives PMs a way to move fast without waiting on engineering for every little experiment. The main point is that functional prototypes beat static designs because you can actually test them with real users and real data. Nadav suggests a workflow where you start with a template, try out a few different versions, and then polish the best one. He also has a fresh take on PRDs: use the prototype for the main 90% of the experience and save the PRD for the tricky edge cases. He thinks PMs should start getting more technical by using AI to read code and even pushing small changes themselves to stay ahead.

Key Takeaways

  • AI prototyping acts as a force multiplier for PMs, enabling them to validate features and build functional tools without immediate engineering support.
  • Functional prototypes provide higher quality user feedback than static designs because they allow users to interact with real data and logic.
  • The PRD is evolving into a companion document that focuses on edge cases, while the prototype becomes the primary tool for selling ideas and defining core flows.
  • Future PMs will likely blur the lines with engineering by using AI to understand codebases and pushing small, non-critical updates themselves.

The Ultimate Guide to ChatGPT Codex: OpenAI's Claude Code Killer

ChatGPT Codex is OpenAI's command line interface (CLI) tool designed for advanced workflows that go beyond the standard browser experience. Unlike the web interface, Codex allows users to interact directly with local file systems, execute code, and connect to external APIs. For product managers, this means more efficient prototyping and the ability to manage complex documentation within a single integrated environment. Setting up Codex requires a terminal and Node.js. Once installed, users can toggle between standard models like GPT-5 and specialized versions like GPT-5 Codex, which is optimized for technical tasks. Key features include a compact command to manage context windows and a YOLO mode that bypasses repetitive permission prompts for file edits and web searches. The primary advantage of the CLI approach is the ability to provide massive context by pointing the tool at entire folders containing PRDs, meeting notes, and user interviews. This enables high-fidelity synthesis and document creation that follows specific company templates. For instance, a PM can use a Socratic questioning prompt to refine a feature idea before the AI generates a final PRD based on existing high-quality examples in the directory. While Claude Code currently leads in agentic features like parallel task execution and automated planning, Codex offers a more engineer-first experience with granular control. It can be made agentic through workarounds like running multiple terminal instances or writing scripts that trigger parallel processes. Ultimately, Codex serves as a powerful thought partner and prototyping engine for PMs who want to move faster than the browser allows.

Key Takeaways

  • The CLI environment transforms the LLM from a passive chat interface into an active file system operator that can read, write, and execute code within a local project context.
  • Using a Socratic questioning layer before document generation significantly improves output quality by forcing the user to validate assumptions and edge cases before the AI drafts a final version.
  • Codex YOLO mode is essential for fluid workflows because it removes the friction of manual approvals for every file change, though it works best within a contained directory to minimize risk.
  • While Claude Code is more polished for parallel agentic tasks, Codex is often preferred by those who want exact adherence to specific rules and a more transparent interaction model.

AI PM Masterclass + Tutorial on Google's AI | Google AI PM

Jaclyn Konzelmann, Director of AI Product at Google, provides a deep dive into the evolving landscape of AI Product Management. She confirms that the AI PM role is a distinct, high-leverage position that requires a shift in how products are built and conceived. A core theme is the transition from first order thinking to second order thinking. Instead of building single-use apps, like a tool specifically for children's storybooks, Konzelmann advocates for building platforms like Opal that allow users to chain together complex workflows for any purpose. This approach future-proofs products against rapidly improving models. She also discusses the psychological aspect of innovation, noting that true 10x thinking is inherently uncomfortable. PMs must learn to distinguish this productive discomfort from actual failure. The conversation covers Google's latest tools, including Nano Banana for image and video generation, and how they enable powerful new use cases. For hiring, Konzelmann looks for six specific traits: product taste, visionary leadership, clarity in chaos, storytelling, full spectrum execution, and deep AI intuition. She emphasizes that as AI commoditizes basic tasks, the ability to consistently generate high-quality, creative ideas becomes the primary differentiator for PMs. The interview process for these roles is rigorous, often involving written questions and multiple rounds focused on design, analytics, and strategy. To break into the field, she recommends building in public, contributing to side projects, and developing a 'golden eval set' of prompts to validate model performance.

Key Takeaways

  • Discomfort is a signal of 10x innovation. Konzelmann suggests that if a project feels easy or comfortable, it likely isn't pushing boundaries enough. PMs should embrace the 'gray cloud' phase of zero to one building as a sign they are onto something significant.
  • Platform thinking is the ultimate future-proofing strategy. By building infrastructure and workflow tools rather than narrow applications, PMs can ensure their products gain value as underlying models like Gemini or Nano Banana improve, rather than being replaced by them.
  • The blending of role profiles requires full spectrum ownership. The lines between PMs, engineers, and researchers are blurring in AI-native development. Successful AI PMs must be comfortable jumping into execution and taking total ownership of the product's success regardless of their formal title.
  • Product taste is the hardest skill to hire for but the most essential. In an era where execution is faster than ever, the ability to intuitively understand what makes a product 'good' and why users will care is the most valuable asset an AI PM can possess.

DeepSeek is Back with Gemini-3 Performance, But Cheaper: AI Update #6

DeepSeek V3.2 has launched, matching Gemini-3.0-Pro performance while reducing costs by 70 percent. This new model introduces an attention mechanism that brings inference costs down to 0.70 dollars per million tokens. A critical technical update is the model's ability to maintain reasoning capabilities during tool calls, which solves a major friction point for complex agent workflows where models previously lost context. The Speciale variant has also shown elite performance in math and coding benchmarks, outperforming GPT-5-High in specific agent-based coding tasks like Terminal Bench 2.0. In the broader market, Anthropic is targeting a 350 billion dollar IPO and has acquired the JavaScript toolkit Bun, while OpenAI is reportedly delaying ads to focus on competitive threats from Google. For product leaders, the focus is shifting from simple implementation to profitability. Most AI features fail not because of pricing, but because they do not move retention metrics. The recommended strategy is a hill climbing approach: first use high-end models to hit quality benchmarks, then optimize for cost by swapping to cheaper models or open-source alternatives once product-market fit is validated. Pricing models are also evolving toward hybrid structures that combine a base seat fee with usage-based credits to protect margins from power users who consume high levels of compute.

Key Takeaways

  • DeepSeek is shifting the AI competition from pure performance to frontier-level capabilities at commodity prices, making agent-heavy applications economically viable for the first time.
  • The ability to preserve reasoning across multiple tool calls in V3.2 is a massive unlock for developers building autonomous agents that need to interact with external APIs without losing their train of thought.
  • AI feature success is primarily a retention problem. If a tool like an AI email writer does not become a daily habit for the user, no amount of pricing optimization will save the product.
  • Hybrid pricing models using credits are becoming the industry standard to manage the high cost of power users while maintaining the stability of per-seat revenue.
  • The hill climbing methodology suggests that PMs should prioritize quality over margins in the early stages of AI development to ensure the use case actually solves a high-value problem.
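
The hybrid pricing structure described above (a base seat fee plus usage-based credits) can be sketched in a few lines to show how it protects margin from heavy users. All prices, credit allowances, and the compute cost per credit below are invented placeholders; only the structure comes from the summary.

```python
def hybrid_invoice(
    seats: int,
    seat_fee: float,
    credits_used: int,
    included_credits_per_seat: int,
    overage_price_per_credit: float,
) -> float:
    """Monthly bill: flat seat fees plus a charge for credits beyond the included pool."""
    included = seats * included_credits_per_seat
    overage = max(0, credits_used - included)
    return seats * seat_fee + overage * overage_price_per_credit


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    seat_fee, included, overage_price = 30.0, 1_000, 0.02
    compute_cost_per_credit = 0.01  # assumed provider cost behind each credit

    for label, used in (("light user", 200), ("power user", 5_000)):
        cost = used * compute_cost_per_credit
        flat_profit = seat_fee - cost
        hybrid_profit = hybrid_invoice(1, seat_fee, used, included, overage_price) - cost
        print(f"{label}: seat-only profit {flat_profit:.2f}, hybrid profit {hybrid_profit:.2f}")
```

Under these assumptions the light user looks the same in both models, while the power user is unprofitable on a seat-only plan and profitable once overage credits are billed.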

Here's my brutally honest ranking of the top 70 AI PM Tools, with Google Product Leader Anshumani Ruddra

Product leaders Aakash Gupta and Anshumani Ruddra break down the top AI tools for 2025. The discussion centers on a shift from simple chat interfaces to vibe coding and agentic execution. Claude Code is the top overall tool because it handles deep context and executes tasks directly in the terminal. For PMs building prototypes, Replit and V0 are the strongest options. Lindy stands out for building custom AI agents without needing a technical background. The ranking also covers data insights where Chameleon and Amplitude are automating experimentation. In the research category, Interpret is the winner for linking customer feedback to revenue impact. The main takeaway is that the best tools are now true multipliers that turn PMs into builders.

Key Takeaways

  • PMs are becoming builders by using terminal-based tools like Claude Code to manage PRDs and code at the same time.
  • The market is splitting between tools that just answer questions and tools that actually do work. Execution-focused apps like Replit and Lindy are winning.
  • Dictation and meeting assistants like Super Whisper and Granola are now vital for keeping context across busy schedules.
  • Legacy platforms like Jira and Pendo are falling behind specialized AI-first tools that offer higher productivity gains.

Gemini 3 isn't just the top model, it's rewriting AI infrastructure: AI Update #4

Google released Gemini 3 and Nano Banana Pro, marking a shift where Google now holds the top model spot on leaderboards like LMArena. Beyond benchmarks, the core story is Google's reliance on its own custom TPU infrastructure rather than Nvidia's supply chain. Gemini 3 was trained entirely on TPUs, which Google is now lending to companies like Anthropic and Midjourney. This vertical integration challenges the Nvidia moat narrative. Gemini 3 shows high performance in multimodal tasks and reasoning, scoring 91.9% on the GPQA Diamond science benchmark. The Nano Banana Pro model specifically excels at rendering text within images. Google also introduced Antigravity, an agentic IDE born from the $2.4B acquisition of Windsurf, aiming to compete in the market currently led by Cursor. Other major industry moves include Jeff Bezos returning as CEO for Project Prometheus with $6.2B in funding, OpenAI launching GPT-5.1-Codex-Max, and Elon Musk's Grok-4.1 release. The broader trend suggests that while Nvidia sells the hardware, Google is building a self-sufficient ecosystem that could redefine the economics of AI at scale.

Key Takeaways

  • Vertical integration through proprietary TPUs allows Google to bypass the Nvidia supply chain bottleneck and improve AI economics.
  • The battle for the frontier has moved from just model size to specialized architecture and training efficiency.
  • Agentic IDEs like Antigravity and Cursor represent the next major distribution channel for LLMs in professional workflows.

AI Agent Browsers: Should you use one? | ChatGPT Atlas vs Perplexity Comet vs Arc Dia

The discussion evaluates three prominent agentic browsers, ChatGPT Atlas, Perplexity Comet, and Arc Dia, focusing on their utility for product managers and power users. ChatGPT Atlas stands out as the most agentic tool, capable of performing complex browser-based operations like filling out job applications from a resume or scraping LinkedIn profiles for contact information. While Atlas is slower because it visually processes the screen to interact with elements, its ability to bypass traditional scraping hurdles makes it a powerful utility for outreach and automation. It can even handle multi-step forms and generate personalized responses based on the context of the page. Perplexity Comet is positioned as the premier research tool, excelling at real-time data aggregation and comparison. It features deep integration with Google Sheets, allowing users to populate spreadsheets with live pricing data or research findings. Comet leverages browser extensions like Honey to access historical price data, making it ideal for competitive analysis and procurement tasks. It is particularly efficient at consolidating information from multiple open tabs into a single, structured output. Arc Dia offers the most polished user experience and specializes in tab context, effectively identifying relevant information across numerous open tabs without requiring the user to manually point to specific sources. Dia is particularly valuable for teams within the Atlassian ecosystem, as it integrates directly with Jira to automate ticket creation from GitHub repositories or Loom bug reports. This allows a PM to turn a recorded bug walkthrough into a fully populated ticket in seconds. Despite their strengths, these browsers face challenges with dark patterns in web design, such as complex cancellation flows, and niche captchas involving rotation or dragging. 
Privacy remains a significant consideration, as these tools require high-level access to screen content and login credentials to function effectively. Currently, these agentic features are often available for free or included in standard pro plans as companies compete for user adoption, giving early adopters an inexpensive way to automate hours of manual research and data entry.

Key Takeaways

  • The primary differentiator between these browsers is the tab context capability, where the AI treats all open tabs as a unified operating system to synthesize information without manual copying.
  • Agentic browsers are currently outperforming traditional scrapers by using visual reasoning to handle interactive elements that typically break automated bots, although dark patterns and niche captchas still trip them up.
  • The integration of AI agents with existing SaaS ecosystems like Atlassian or Google Workspace transforms the browser from a viewing portal into an active participant in the workflow.
  • Early adoption provides a significant competitive advantage in token-intensive tasks like deep research and lead generation, which are currently subsidized by providers to drive user growth.

“Product Management isn’t going to exist in 5 years” - 2x CPO

Abishek Viswanathan, former CPO at Apollo.io and Qualtrics, argues that traditional product management will be obsolete within five years. The role is shifting toward a hybrid model called product engineering, where the gap between customer insights and code is bridged by AI. Tools like v0 by Vercel and Cursor allow PMs to turn specifications into functional, deployable prototypes in minutes rather than weeks. This shift eliminates the three degrees of separation that traditionally existed between engineers and customer feedback. Engineers are now expected to spend a third of their time directly with customers, while PMs must become technical enough to build and deploy their own hypotheses. The technical bar for PMs is rising significantly. Viswanathan emphasizes that understanding relational databases, SQL, and modern architecture like Kafka or GraphQL is no longer optional. This technical depth prevents PMs from proposing utopian, unfeasible solutions or overly simplistic ones that fail to leverage technology. He introduces the Expected Outcome framework from his time at Zynga, which requires builders to predict specific metrics like daily revenue and active user engagement before a single line of code is greenlit. This approach ensures every engineering hour is maximized for ROI. Beyond product teams, go-to-market functions are also transforming. Sales is moving away from order-taking toward consultative, technical problem-solving where reps might walk into meetings with custom-built prototypes. Customer Success is evolving into value realization, focusing on deep implementation and ensuring customers hit their specific business goals. The rise of founder mode, supported by rapid AI iteration, allows leaders to quickly separate losing ideas from winning instincts. Ultimately, the future organization will be leaner, with fewer specialized roles and more high-leverage builders who understand the entire stack from customer pain to platform architecture.

Key Takeaways

  • The traditional PM role is merging with application engineering because AI tools have lowered the barrier to coding and prototyping for non-engineers.
  • Non-technical PMs face a high risk of obsolescence unless they master SQL and relational database logic to perform their own feasibility and impact analysis.
  • Designers are shifting away from individual feature wireframes toward building horizontal platform systems that reduce user cognitive load across the entire product.
  • The Expected Outcome framework is critical for avoiding the feature factory trap by forcing teams to kill ideas that do not meet pre-defined ROI thresholds.
  • GTM teams must become technical consultants who can build functional demos and manage complex change management within customer organizations.

Build a team OS with Claude Code - by Aakash Gupta

Hannah Stulberg, a PM at DoorDash, outlines a framework for creating a Team OS using Claude Code. This system treats a team's shared knowledge as a code repository, allowing AI agents to navigate and act on product context efficiently. The core of the system is the CLAUDE.md file, which acts as a guiding index. Stulberg emphasizes a nested structure where each folder has its own index. This approach minimizes context consumption, leaving the AI room for complex reasoning rather than filling the context window with irrelevant data. The repository includes folders for product strategy, analytics (SQL schemas and metrics), and engineering (RFCs and bug reports). A key part of the workflow involves non-technical partners using GitHub pull requests to update the repository. This ensures the AI always works from the latest context. Stulberg also details an advanced planning process using Claude's Plan Mode. This involves setting up multi-phase tasks, parallelizing research agents, and requiring verification steps before the AI produces a final document. The goal is to move away from simple prompting toward a more structured, agentic workflow that scales a single PM's impact across larger engineering teams.
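
The nested-index idea can be sketched in a few lines of Python. This is an illustration only: the folder names, the per-folder file name `INDEX.md`, and both helper functions are assumptions made for the example, not details from the episode. Each index lists only the immediate children of its folder, so an agent can read the root index and descend one level at a time instead of loading the whole repository.

```python
from pathlib import Path
import tempfile

INDEX_NAME = "INDEX.md"  # hypothetical per-folder index name, not from the source


def write_indexes(root: Path) -> list[Path]:
    """Write one short index per folder listing only its immediate children."""
    written = []
    # Collect every folder up front so files created below do not affect the walk.
    folders = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    for folder in folders:
        entries = sorted(
            child.name + ("/" if child.is_dir() else "")
            for child in folder.iterdir()
            if child.name != INDEX_NAME
        )
        index = folder / INDEX_NAME
        index.write_text("\n".join(f"- {name}" for name in entries) + "\n")
        written.append(index)
    return written


def build_demo_repo(root: Path) -> None:
    """Create a tiny, made-up repository layout for illustration."""
    for rel in [
        "product-strategy/vision.md",
        "analytics/schemas/orders.sql",
        "engineering/rfcs/rfc-001.md",
    ]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("placeholder\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_root = Path(tmp)
        build_demo_repo(demo_root)
        write_indexes(demo_root)
        print((demo_root / INDEX_NAME).read_text())
```

Keeping each index to one level is the design choice that matters here: the root file stays short no matter how large the repository grows.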

Key Takeaways

  • Context management is the primary skill for AI-native teams. By keeping CLAUDE.md files lean and using nested indexes, teams prevent context rot and ensure the model has enough room to reason effectively.
  • The PM role is evolving into a context architect. Instead of just writing docs, PMs now ensure that analytics schemas, customer feedback, and strategy are structured in a way that AI agents can autonomously query and synthesize.
  • GitHub is no longer just for code. It serves as the version-controlled source of truth for team context, where non-technical partners like designers and ops leads contribute via pull requests to keep the AI knowledge base fresh.
  • High-quality AI output requires a shift from prompting to planning. Using Claude's Plan Mode to parallelize research and set verification checkpoints prevents the model from rushing into low-fidelity execution.

FAANG PM Reveals How to Build AI Agents (and Get Paid $750K+)

Mahesh Yadav, a product leader with experience at Meta, Google, Amazon, and Microsoft, explains how to build AI agents and secure high-paying PM roles. He argues that the future of product development is agentic, with top positions now commanding over $750,000. The technical demonstration uses Langflow for backend logic and v0 for the frontend to build a competitive analysis tool. This approach uses Tavily for AI-optimized search and OpenAI for processing, showing that PMs can build functional prototypes without deep coding knowledge. The process involves defining inputs like competitor names, setting up a search tool, and writing a structured system prompt that includes roles, instructions, and guardrails. The cart-before-the-horse development model is a central theme. Instead of traditional six-month research cycles, Yadav advocates for prototyping for a few weeks first. This lets PMs show customers what is possible and iterate based on real usage before writing a PRD. Vibe coding in interviews is about demonstrating product taste and the ability to iterate rather than just writing code. Interviewers look for builders who have handled data responsibly and understand model limitations. The 18-month career roadmap begins with mastering agent concepts like memory and guardrails. PMs should then build side projects with real users and eventually contribute to open-source communities. FAANG cultures differ significantly, with Microsoft focusing on innovation, Amazon on business results and P&L, Meta on speed in experimentation, and Google on UX magic. Every PM should build agents for customer feedback analysis, A/B test simulations, and document reviews to stay ahead. AI is moving from simple chatbots to multi-agent and multimodal systems that can autonomously change the state of the world through API calls.
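
A structured system prompt of the kind described above (role, instructions, guardrails, plus competitor names as input) might be assembled as follows. This is a sketch under stated assumptions: the `SystemPrompt` class, the section labels, and `build_user_message` are invented for the example and are not part of Langflow, v0, or the demo shown in the episode.

```python
from dataclasses import dataclass, field


@dataclass
class SystemPrompt:
    """Hypothetical container for the three parts named in the summary."""
    role: str
    instructions: list[str]
    guardrails: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the parts into one prompt string with labeled sections."""
        sections = [
            "ROLE\n" + self.role,
            "INSTRUCTIONS\n"
            + "\n".join(f"{i}. {step}" for i, step in enumerate(self.instructions, 1)),
        ]
        if self.guardrails:
            sections.append(
                "GUARDRAILS\n" + "\n".join(f"- {rule}" for rule in self.guardrails)
            )
        return "\n\n".join(sections)


def build_user_message(competitors: list[str]) -> str:
    """Turn the input (competitor names) into the user-facing request."""
    return "Compare these competitors: " + ", ".join(competitors)


if __name__ == "__main__":
    prompt = SystemPrompt(
        role="You are a competitive analyst for a product team.",
        instructions=["Search for recent news on each competitor.", "Summarize pricing."],
        guardrails=["Cite a source for every claim.", "Say so when data is missing."],
    )
    print(prompt.render())
    print(build_user_message(["Acme", "Globex"]))
```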

Key Takeaways

  • The cart-before-the-horse method is a strategic pivot for AI GTM where prototyping cost is low but customer expectations are undefined. It replaces long research phases with rapid cycles that use the prototype as the primary tool for requirement gathering.
  • PM technical skills are shifting toward prompt architecture and evaluation. Being able to structure roles, goals, and guardrails in a system prompt is more valuable than traditional engineering skills.
  • Multi-agent systems are the next big wave for 2025. Moving beyond simple Q&A to agents that can actually change the state of the world through API calls is where the real value lies.
  • Building credibility in AI requires public work. Running side-by-side evaluations of models like Llama or GPT and sharing those insights is the fastest way to get noticed by top companies.

Claude Code Masterclass for PMs with Carl Vellotti

Claude Code hit a billion-dollar ARR milestone in record time because it focuses on deep work and coding rather than general chat. Carl Vellotti and Aakash Gupta show how product managers can use this tool to move from manual document management to building agentic systems. The core of this transition is the Model Context Protocol (MCP). MCP is a new standard that lets LLMs connect directly to services like Linear, Google Workspace, and Notion. By setting up these servers, a PM can stay in the terminal to pull data in and push work out without constant context switching. The masterclass walks through an end-to-end feature launch. It starts with drafting a user research survey and pushing it to Google Docs. After collecting responses, Claude analyzes the data and writes a PRD. A major advantage here is parallelization. Claude can spin up sub-agents to build a multi-slide Google presentation and 19 engineering tickets in Linear at the same time. This is much faster than standard chat interfaces because the CLI handles multiple work streams simultaneously. The session also covers skills and hooks. Skills are reusable prompt templates that trigger automatically for specific tasks like research synthesis. Hooks allow for automated actions, such as sending a Mac notification when a long task finishes. Finally, they look at GitHub integration. By adding Claude as a GitHub app, PMs can manage their codebase and documentation remotely by tagging the AI in issue comments. This shift means the next generation of PMs will be AI-native builders who have direct access to repositories and use agentic workflows to drive product growth.
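
The parallelization point can be illustrated with a generic fan-out pattern. The sketch below is not how Claude Code implements sub-agents; `run_subagent` is a hypothetical stand-in that only returns a string, and the standard-library thread pool is used purely to show independent work streams running side by side with results collected in order.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Stand-in for a sub-agent; a real one would call a model or an external tool."""
    return f"done: {task}"


def run_in_parallel(tasks: list[str], max_workers: int = 4) -> list[str]:
    """Run every task in its own worker and return results in the original task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_subagent, tasks))


if __name__ == "__main__":
    for line in run_in_parallel(["build slide deck", "create tickets"]):
        print(line)
```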

Key Takeaways

  • MCP servers are replacing traditional API integrations for AI workflows because they provide a standardized set of tools that LLMs understand out of the box.
  • The parallelization capabilities of Claude Code allow it to act as a project manager that delegates tasks to sub-agents, making complex deliverables like slide decks or ticket backlogs possible in minutes.
  • Product managers are evolving into technical builders who use GitHub as a coordination layer for AI agents rather than just a place for engineers to store code.

How PMs Build Composable AI Operating Systems

Mike Bal, Head of Product at David’s Bridal, outlines a framework for an AI-native PM operating system centered on composable architecture and the Model Context Protocol (MCP). The core shift involves moving away from logging into dozens of separate UIs and instead using a centralized context layer like Claude Desktop or Cursor to interact with tools like Jira, GitHub, Confluence, and Figma. By using MCP, PMs can perform complex tasks such as comparing a PRD in Confluence against a live Figma design to identify gaps without leaving their primary workspace. Bal emphasizes that Claude has become the preferred tool over ChatGPT due to its superior writing capabilities and more reliable model performance, particularly for deep product work. The workflow extends into prototyping and 'vibe coding' using Google AI Studio and Reforge Build. These tools allow PMs to move from a conceptual image or prompt to a functional, hosted app in minutes, which can then be pulled into Cursor for further refinement. This technical agility enables PMs to communicate edge cases and functional requirements to designers and developers more effectively. Bal also discusses the 'Manus' tool for independent research agents that provide a transparent trace of their thinking, which is more effective for context gathering than standard LLM research modes. To manage the high cost of various AI licenses, he suggests a usage-based approach using API keys and wholesale rates rather than blanket enterprise subscriptions. Ultimately, the success of this operating system relies on maintaining high product taste and skepticism, ensuring that AI remains an extension of the PM's intuition rather than a replacement for intentionality.
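
The usage-based licensing argument comes down to simple arithmetic, sketched below. Every number in the example (token volumes, per-million-token prices, the seat price) is made up for illustration and does not reflect any vendor's actual pricing; the two function names are likewise hypothetical.

```python
def api_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Monthly cost when paying per token, with prices quoted per million tokens."""
    return (
        (input_tokens / 1_000_000) * price_in_per_million
        + (output_tokens / 1_000_000) * price_out_per_million
    )


def cheaper_option(usage_cost: float, seat_price: float) -> str:
    """Name the cheaper of pay-per-use and a flat per-seat subscription."""
    return "usage-based" if usage_cost < seat_price else "subscription"


if __name__ == "__main__":
    # Illustrative figures only: a light user versus a flat monthly seat.
    light = api_cost(2_000_000, 500_000, 3.0, 15.0)
    print(light, cheaper_option(light, seat_price=20.0))
```

The comparison flips for heavy users, which is why the approach suits occasional or one-off tool use rather than every seat in a company.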

Key Takeaways

  • The transition from UI hopping to a centralized context layer via MCP allows PMs to maintain a flow state while querying disparate data sources like Jira and GitHub.
  • Vibe coding and rapid prototyping in Google AI Studio enable PMs to demonstrate functional intent to engineering teams, reducing the friction of traditional handoffs.
  • Product taste remains the ultimate differentiator as AI lowers the barrier to output, making the PM's ability to red team and defend their logic more critical than ever.
  • A composable tool stack allows for modularity where PMs can hotwire specific APIs for one-off projects without committing to long-term enterprise software bloat.

The Claude Code Tutorial for AI PMs: Why You Need to Use It + How

This tutorial explores Claude Code, a terminal-based tool from Anthropic that shifts the focus from prompt engineering to context engineering. Unlike standard chat interfaces, Claude Code operates directly within a local file system, allowing it to read, search, and edit files while maintaining a deep understanding of project structure. Carl Vellotti demonstrates how product managers can use this tool to automate high-friction tasks like writing PRDs, summarizing meeting transcripts, and performing competitive research. A core feature is the initialization process, which creates a persistent memory file to store business context, writing styles, and specific project rules. This ensures the AI consistently follows brand guidelines and technical constraints without needing repetitive prompting. The discussion covers advanced agentic capabilities, including Plan Mode, where the AI generates a checklist for user approval before executing complex tasks. This prevents the AI from making unauthorized changes or getting stuck in loops. Vellotti also highlights the ability to run parallel sub-agents, such as spinning up multiple instances to analyze different customer interviews simultaneously. The tool integrates with the Model Context Protocol (MCP) to pull data from external sources like Reddit, Slack, and Google Drive. Beyond documentation, the tutorial touches on vibe coding, showing how PMs can use Claude Code to build functional prototypes from simple specs. Vellotti also shares his personal workflow for Meme Mage, an AI-powered tool he built to maintain consistency for his large Instagram following by using LLMs to match video templates with specific PM personas.
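
The approve-then-execute pattern behind Plan Mode can be shown in miniature. The sketch below is not Claude Code's API; `Step`, `execute_plan`, and the `approve` callback are hypothetical names used only to show that nothing runs until a human has accepted the full checklist.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One planned action: a description for review plus the work to run if approved."""
    description: str
    action: Callable[[], str]


def execute_plan(steps: list[Step], approve: Callable[[list[str]], bool]) -> list[str]:
    """Show the checklist to the reviewer first; run the steps only if it is approved."""
    checklist = [step.description for step in steps]
    if not approve(checklist):
        return []  # rejected plans change nothing
    return [step.action() for step in steps]


if __name__ == "__main__":
    plan = [
        Step("Summarize meeting transcript", lambda: "summary written"),
        Step("Draft PRD from summary", lambda: "PRD drafted"),
    ]
    print(execute_plan(plan, approve=lambda checklist: len(checklist) <= 5))
```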

Key Takeaways

  • Claude Code represents a shift toward context engineering where the AI has persistent access to a local file system and project-specific rules.
  • The Plan Mode feature is essential for complex workflows because it forces the AI to propose a sequence of actions for human review before execution.
  • Product managers can achieve higher quality outputs by creating dedicated markdown files for business info and writing styles that the AI references automatically.
  • The tool supports parallel processing through sub-agents, allowing users to delegate multiple distinct research or analysis tasks at once.
  • A strategic distinction exists between tactical agents like Claude Code for one-off tasks and automation platforms like n8n for recurring workflows.

AI-Powered Discovery Guide with Caitlin Sullivan

Caitlin Sullivan details a high-precision workflow for using AI in user discovery, emphasizing that the tool is only as effective as the human methodology it replicates. The most common failure in AI-driven research is jumping directly to synthesis. To avoid this, Sullivan advocates for a multi-step process: loading context, performing granular analysis, conducting a verification audit, and finally synthesizing the findings. She prefers Claude over other models like GPT-4 or Gemini because it consistently provides more nuanced and thorough qualitative analysis. In the interview analysis phase, the workflow focuses on identifying value anchors, which are the specific features or outcomes keeping a user subscribed, and fragile points, which represent frustrations that exist despite a current subscription. These elements are used to generate a stability rating that predicts churn risk. Sullivan uses Markdown files for transcripts to provide better structure for the LLM and to bypass token limits often encountered with raw text or PDFs. For survey analysis, the guide highlights the importance of inductive coding. This involves letting themes emerge naturally from the customer feedback rather than forcing responses into a pre-defined taxonomy. This open coding approach ensures that the AI does not force-fit data into incorrect categories, which often happens when users provide a list of tags upfront. Sullivan also includes a specific instruction to have the AI use code for any mathematical calculations or frequency counts to ensure accuracy. A standout feature of this methodology is the audit or verification step. This involves forcing the model to perform a second pass to identify contradictions in user statements or potential biases in its own initial analysis. This step is designed to prevent cherry-picking the most convenient stories and ensures the final insights are bulletproof. 
Finally, for advanced practitioners, Sullivan demonstrates how to use Claude Code in the terminal to parallelize these workflows. By using agentic Markdown files and system prompts, a researcher can analyze multiple data streams simultaneously, significantly reducing the time to insight without sacrificing the rigor of the discovery process.
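
The instruction to have the AI use code for frequency counts can be pictured with a short example of the kind of script a model might write. The theme labels and responses below are invented, and `theme_frequencies` is a hypothetical helper; the point is that counts and percentages come from deterministic code rather than from the model's estimate.

```python
from collections import Counter


def theme_frequencies(coded_responses: list[list[str]]) -> list[tuple[str, int, float]]:
    """Count how many responses mention each theme, with the share of all responses.

    Each response is a list of theme codes; a theme repeated within one
    response is counted once for that response.
    """
    counts = Counter(theme for themes in coded_responses for theme in set(themes))
    total = len(coded_responses)
    # Sort by count (descending), then by name so ties come out in a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(theme, n, round(n / total, 2)) for theme, n in ordered]


if __name__ == "__main__":
    coded = [["pricing", "onboarding"], ["pricing"], ["support", "pricing"], ["onboarding"]]
    for theme, n, share in theme_frequencies(coded):
        print(f"{theme}: {n} responses ({share:.0%})")
```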

Key Takeaways

  • AI research works best when it mirrors manual, high-quality research steps. Skipping straight to the answer leads to hallucinations and missed nuance.
  • The audit step makes insights bulletproof. Forcing the AI to find its own mistakes or user contradictions prevents you from presenting flawed data to stakeholders.
  • Inductive coding is the best way to find new opportunities. Preset categories often blind teams to the real reasons customers are churning or staying.
  • Moving to terminal-based agents like Claude Code allows for parallel processing. You can triangulate survey and interview data simultaneously rather than sequentially.

AI Evals Explained Simply by Ankit Shukla - by Aakash Gupta

AI features often fail because teams don't evaluate them properly. Unlike traditional software, LLMs are stochastic: the same input can produce different answers. Ankit Shukla argues that the AI PM must act as a ringmaster to tame this behavior. Prototypes usually fail to scale due to data drift, high costs, engineering limits, lack of guardrails, or poor collaboration. Evals address these issues by creating a feedback loop that allows PMs to verify quality and potentially switch to cheaper models without losing performance. The framework starts with defining success criteria and expected behavior. PMs then build a base product with system prompts and orchestration layers. The most critical step is creating a high-quality dataset using past logs, expert input, and synthetic data. Evals are categorized into offline and online. Offline evals are run before launch to ensure the product meets the PRD requirements. Online evals monitor production data for drift. Metrics include code-based checks for length, traditional NLP scores like BLEU or ROUGE, and modern LLM-as-a-judge prompts. In a financial case study, evals ensure an AI analyst doesn't cross legal lines by giving direct investment advice. Ultimately, evals are not just fancy QA: they are a transformational tool for PMs to guide engineering teams through hill climbing toward acceptable performance thresholds.
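
A code-based offline eval of the kind mentioned above might look like the following sketch. The regular expressions, the 120-word limit, the 0.9 threshold, and the `fake_analyst` stub are all assumptions chosen for illustration; a real suite would draw its prompts from a curated dataset and call the actual product instead of a stub.

```python
import re
from typing import Callable

# Hypothetical phrasings that would count as direct investment advice.
ADVICE_PATTERNS = [
    re.compile(r"\byou should (buy|sell)\b", re.IGNORECASE),
    re.compile(r"\bI recommend (buying|selling)\b", re.IGNORECASE),
]


def within_length(output: str, max_words: int = 120) -> bool:
    """Code-based check: the answer stays under a word limit."""
    return len(output.split()) <= max_words


def avoids_direct_advice(output: str) -> bool:
    """Guardrail check: the answer contains none of the flagged phrasings."""
    return not any(pattern.search(output) for pattern in ADVICE_PATTERNS)


def run_offline_eval(
    prompts: list[str], generate: Callable[[str], str], threshold: float = 0.9
) -> dict:
    """Run every prompt through the product and report the share that pass all checks."""
    results = []
    for prompt in prompts:
        output = generate(prompt)
        results.append(within_length(output) and avoids_direct_advice(output))
    pass_rate = sum(results) / len(results)
    return {"pass_rate": pass_rate, "ready": pass_rate >= threshold}


def fake_analyst(prompt: str) -> str:
    """Stub standing in for the AI analyst under test."""
    if "buy" in prompt.lower():
        return "You should buy this stock today."
    return "Revenue grew 12% year over year; margins were flat."


if __name__ == "__main__":
    print(run_offline_eval(["Summarize Q3 results", "Should I buy ACME?"], fake_analyst))
```

Pattern matching only catches phrasings someone thought to list, which is one reason the summary pairs code-based checks with LLM-as-a-judge prompts.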

Key Takeaways

  • Evals function as the modern PRD for AI products. Instead of just listing features, PMs define the evaluation sets that engineers must optimize against to reach production readiness.
  • Cost optimization is impossible without automated evals. You can only safely move from a high-cost model like GPT-4o to a cheaper small language model if you have a robust eval suite to prove quality hasn't dropped.
  • The Golden Dataset is your most valuable intellectual property. Collecting real-world edge cases and expert-verified correct answers is more important for long-term success than the specific model you use today.
  • Offline evals are for development while online evals are for production. You need both to catch data drift and ensure that user expectations haven't shifted since the initial launch.
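
The cost-optimization takeaway implies a simple decision rule: run both models over the same eval set and switch only if quality holds. The sketch below assumes per-example scores between 0 and 1 and a tolerated drop of 0.02; both the scoring scheme and the tolerance are illustrative choices, not figures from the episode.

```python
from statistics import mean


def can_switch(
    baseline_scores: list[float], candidate_scores: list[float], max_drop: float = 0.02
) -> bool:
    """Allow a swap to the cheaper model only if its mean eval score drops by at most max_drop."""
    if len(baseline_scores) != len(candidate_scores):
        raise ValueError("both models must be scored on the same eval set")
    return mean(baseline_scores) - mean(candidate_scores) <= max_drop


if __name__ == "__main__":
    expensive = [1, 1, 1, 0]  # per-example pass/fail for the current model
    cheap = [1, 1, 0, 1]      # same examples, cheaper model
    print(can_switch(expensive, cheap))
```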

We Built an AI Product Manager in 58 mins (Claude, ChatGPT, Loom + Notion AI)

Tal Raviv and Aakash Gupta demonstrate how product managers can build a personalized AI copilot to automate high-effort tasks and improve strategic decision-making. The core approach relies on using Claude Projects or ChatGPT GPTs to house deep context, including company vision, target customer profiles, org charts, and personal performance reviews. This setup allows the AI to move beyond generic responses and provide highly specific, relevant advice. One primary use case is qualitative data analysis, where Raviv suggests uploading CSVs of app store reviews or survey results. Instead of simple summaries, he recommends prompting the AI for specific patterns illustrated by exact quotes and outliers to maintain a sharp, realistic view of user sentiment. The discussion covers several practical workflows for the modern PM. For time management, Raviv shows how uploading screenshots of a calendar to an LLM can generate a time audit, identifying gaps in deep work and visualizing meeting loads via pie charts. For technical growth, PMs can use AI as an interactive tutor for concepts like Retrieval-Augmented Generation (RAG), adapting the explanation to their specific company context. The conversation also explores using AI to simulate difficult stakeholder conversations, allowing PMs to practice their responses and see the impact of their phrasing from the other person's perspective. Beyond AI tools, the episode highlights productivity 'hacks' for managing the 'unfair' PM role. These include 'product scrapbooking' in Notion to continuously collect discovery artifacts, using Loom for sub-one-minute context sharing to replace meetings, and optimizing Slack through custom sections like 'Critical Initiatives' and 'Dopamine.' The goal is to move from reactive work to a proactive, builder-oriented mindset by leveraging automation and structured knowledge management.

Key Takeaways

  • Context is the primary lever for AI utility. Moving from generic outputs to high-value assistance requires feeding the LLM internal company documents, personal feedback, and specific team org charts.
  • Avoid the 'CEO summary' trap in data analysis. Forcing the AI to provide exact quotes and identify outliers prevents the loss of emotional nuance and specific pain points often buried in aggregate summaries.
  • Asynchronous context sharing via ultra-short Looms is a superior alternative to long meetings. Keeping videos under 60 seconds increases the likelihood that stakeholders will actually watch them before a sync.
  • Continuous discovery scrapbooking solves the 'cold start' problem. By maintaining a messy but searchable repository of feedback in Notion, PMs can provide immediate evidence and context the moment a new initiative is prioritized.

Most People are Building AI Products Wrong - Here's How to do it Right

Most product teams are stuck in AI theater, bolting assistants onto sidebars as party tricks that users rarely touch twice. The real opportunity is building AI-native products where intelligence is invisible and integrated into the core workflow. Attio, a CRM startup taking market share from giants like Salesforce and HubSpot, serves as the primary case study. Attio has achieved significant scale with over 200 million customer records and a $64M Series A by rethinking the CRM for the AI era. They focus on invisible intelligence, where features like automatic workflow naming or meeting sequence adjustments happen without flashing badges or new interaction patterns. The framework for building these products consists of three facets. First, democratize experimentation across the entire organization. At Attio, engineers and non-technical staff have access to enterprise AI accounts and are encouraged to run casual experiments. This bottom-up approach often uncovers more value than top-down roadmaps, with engineers frequently building prototypes over weekends. Second, bake AI into the cake rather than just adding sprinkles. This means intelligence is part of the product architecture, solving specific user problems like generating real-time insights from call recordings rather than just providing a raw transcript. They prioritize the data and insights over the transcription technology itself. Third, build invisible infrastructure. This involves creating a flexible framework that handles model selection, prompt engineering, and cost optimization behind the scenes. This allows the product team to focus on solving customer needs rather than wrestling with LLM technicalities. Extensive dogfooding is also critical. For instance, the team refined their call intelligence summaries from twenty lines down to three after using the feature in their own standups. Finally, while quantitative evaluations are useful, human judgment remains essential for qualitative features. 
Success is found when technology anticipates needs and solves problems before the user even asks.
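
The "invisible infrastructure" facet, where model selection and cost optimization happen behind the scenes, can be sketched as a small routing function. The model names, prices, and quality scores are invented for the example and are not Attio's setup; the sketch only shows the idea of product code asking for a quality bar and letting the infrastructure choose the cheapest model that meets it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    """Hypothetical catalog entry: a model with its price and measured eval score."""
    name: str
    cost_per_1k_tokens: float
    quality: float  # for example, pass rate on an internal eval set


def pick_model(options: list[ModelOption], min_quality: float) -> ModelOption:
    """Return the cheapest model whose quality meets the feature's requirement."""
    eligible = [option for option in options if option.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the required quality")
    return min(eligible, key=lambda option: option.cost_per_1k_tokens)


if __name__ == "__main__":
    catalog = [
        ModelOption("large", 0.01, 0.95),
        ModelOption("medium", 0.003, 0.90),
        ModelOption("small", 0.0005, 0.70),
    ]
    print(pick_model(catalog, min_quality=0.85).name)
```

Because features state a requirement rather than a model name, the catalog can change as prices and capabilities shift without touching product code.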

Key Takeaways

  • Invisible intelligence wins by reducing cognitive load and making AI features feel like natural extensions rather than optional tools.
  • The distinction between AI cake and AI sprinkles defines the difference between long-term retention and short-term marketing buzz.
  • A distributed experimentation culture acts as a technical moat by uncovering high-value use cases that siloed teams would likely miss.
  • Building flexible infrastructure that abstracts model complexity allows teams to pivot quickly as LLM capabilities and costs evolve.

Tutorial of Top 5 AI Prototyping Tools: Bolt, Lovable, v0, Replit, and Cursor

AI prototyping enables product managers to validate solutions in hours or days instead of months. This shift empowers PMs to build features on top of existing products, gather direct customer feedback, and align internal teams using interactive prototypes rather than static mocks. The workflow often begins with a screenshot of an existing UI, which tools like Bolt can use to generate a PRD and initial code structure. Effective use of these tools requires an iterative approach. Instead of requesting a full application in one prompt, users should break projects into small components. Techniques like reflection, which involves asking the AI to review its own work, and planning, which requests a plan before code, significantly improve output quality. Among the tools, Bolt is highly customizable and excels at UI replication, while Cursor provides an AI-powered IDE environment for deeper debugging and model switching between Claude 3.5 Sonnet and o1. Tool selection depends on the project's technical requirements. Bolt, Lovable, and v0 primarily handle client-side applications. Lovable is particularly strong for Figma-to-code conversions, using metadata to create near-identical replicas of designs. For full-stack needs involving servers, databases, and authentication, Replit is the preferred choice. It features an Agent mode that can automate complex tasks like setting up PostgreSQL databases and running SQL commands to verify data integrity. Ultimately, these tools serve as communication assets. They allow PMs to test the experience of a product, especially for AI-native features where user interaction with an LLM is central. By integrating prototypes into the Opportunity Solution Tree framework, teams can test hypotheses faster and have deeper discovery conversations with customers. While technical skills enhance the process, natural language remains the primary interface.
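
The planning and reflection techniques can be expressed as a prompt sequence. In the sketch below, `model` is any function that takes a prompt and returns text; the prompt wording and the `build_with_reflection` helper are assumptions for illustration and do not correspond to a feature of Bolt, Cursor, or any other tool named here.

```python
from typing import Callable


def build_with_reflection(task: str, model: Callable[[str], str], rounds: int = 1) -> str:
    """Ask for a plan first, build from it, then alternate critique and revision."""
    plan = model(f"Write a short plan before any code for: {task}")
    draft = model(f"Follow this plan and build the component:\n{plan}")
    for _ in range(rounds):
        critique = model(f"Review this work and list problems:\n{draft}")
        draft = model(
            f"Revise the work to fix these problems:\n{critique}\n\nWork:\n{draft}"
        )
    return draft


if __name__ == "__main__":
    # Echo stub so the sketch runs without any model access.
    def echo_model(prompt: str) -> str:
        return f"[response to: {prompt.splitlines()[0]}]"

    print(build_with_reflection("pricing page toggle", echo_model))
```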

Key Takeaways

  • AI prototyping shifts the PM's role from writing requirements to demonstrating experiences. This reduces the gap where engineers only read part of a document by providing a clickable, interactive reference.
  • The Screenshot to PRD workflow in Bolt creates a living document that maintains context better than a standard chat history. This helps the AI stay aligned with project goals during long-term iterations.
  • Context management is the primary bottleneck for complex AI builds. Tools like Cursor that score file relevance are better for large codebases, while web-based tools like Bolt are better for isolated feature experiments.

Complete Course: AI Product Management

This technical deep dive covers the essential skills required for modern AI product management, focusing on the transition from traditional PM roles to AI-specialized positions. The discussion highlights why AI PMs command higher salaries, primarily due to their ability to bridge business strategy with technical execution in emerging technologies. Key technical areas explored include advanced prompt engineering, fine-tuning, Retrieval-Augmented Generation (RAG), and the Model Context Protocol (MCP). In the realm of prompting, the focus is on moving beyond basic queries to high-level techniques like providing deep context, assigning specific personas (PM, Designer, Engineer), and using step-by-step reasoning. A unique tactic mentioned involves offering the AI a hypothetical financial reward for high-quality output, which often results in more detailed responses. For product documentation, the AI PRD is introduced as a tool for organizational alignment, emphasizing market opportunity, strategic fit, and AI-specific metrics like evaluation audits and guardrails. The technical walkthroughs demonstrate practical implementations using tools like n8n for workflow automation, Pinecone for vector databases, and Lovable for frontend development. Fine-tuning is presented as a method to internalize brand voice and reduce costs by using smaller models like GPT-4o mini. RAG is showcased as the solution for querying massive datasets that exceed context windows. Finally, the session covers the Model Context Protocol (MCP) as a standard for agentic interoperability, demonstrating how an AI can automatically generate Jira epics and stories directly from Figma design files. The discussion concludes with a look at autonomous agents that can perform deep market research, planning and executing multi-step tasks independently.
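
The retrieval step of RAG can be sketched without any external service. The example below scores chunks against the question by word-overlap cosine similarity and puts only the best matches into the prompt; this simple scoring is a stand-in chosen so the code runs on its own, whereas production systems such as the Pinecone setup mentioned above typically use learned embeddings and a vector database. All function names and sample texts are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vector = Counter(tokenize(query))
    ranked = sorted(
        chunks, key=lambda chunk: cosine(query_vector, Counter(tokenize(chunk))), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Place only the retrieved chunks in the prompt instead of the whole dataset."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "Refunds are processed within five business days.",
        "The mobile app supports offline mode.",
        "Enterprise plans include single sign-on.",
    ]
    print(build_prompt("How long do refunds take to be processed?", docs, k=1))
```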

Key Takeaways

  • AI PMs are increasingly valued for their ability to manage technical trade-offs between model performance, latency, and cost, specifically through fine-tuning and RAG architectures.
  • Fine-tuning smaller models like GPT-4o mini on specific datasets is a superior strategy for maintaining brand voice and reducing token costs compared to long-context prompting.
  • The Model Context Protocol (MCP) is a critical standard for agentic workflows, allowing LLMs to interact with enterprise tools like Jira and Figma without custom API integrations for every task.
  • Effective AI product discovery requires a 'product trio' perspective where the AI evaluates ideas based on value, usability, viability, and feasibility from different expert personas.
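To make the MCP takeaway concrete, here is a rough sketch of the message an MCP client sends to invoke a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The tool name `create_epic` and its arguments are hypothetical; a real server advertises its own tools and schemas. Nothing here is specific to the Jira or Figma demo in the session.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a unique id per request

def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape MCP uses to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool name and arguments, for illustration only.
request = tool_call_request(
    "create_epic",
    {"project": "DEMO", "summary": "Checkout redesign", "source": "figma-frame-placeholder"},
)
print(json.dumps(request, indent=2))
```

Because every tool is called through the same envelope, the agent needs one client implementation rather than a custom integration per tool.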

How to Vibe PM with Claude Code and Your Analytics Data

Frank Lee, Principal PM at Amplitude, outlines a high-leverage workflow called Vibe PMing that uses Claude Code, Cursor, and the Model Context Protocol (MCP) to automate the product management lifecycle. The core of this approach involves connecting AI agents directly to product data and external tools to handle manual tasks like data synthesis and spec writing. Lee uses Claude Code as a terminal-based agent and Cursor as his primary IDE, linking them to Amplitude for analytics, Linear for task management, and Granola for meeting notes. A central technical concept is the use of Skills in Claude Code, which are metadata-rich prompts that teach the model specific heuristics for tasks like identifying anomalies in charts or looking for seasonality in data. The workflow covers five primary use cases: deep chart analysis to explain metric spikes, automating weekly business reviews (WBRs), synthesizing qualitative feedback from sources like Zendesk and Slack, converting those insights into PRDs using markdown templates, and either prototyping the solution in code or routing it to engineering. Lee also addresses context management, recommending the use of markdown files to transfer state between sessions and the compaction feature in Claude Code to handle long conversations. Looking forward, Amplitude is launching a suite of embedded agents and specialized subagents for session replays and website optimization, aiming to make data navigation accessible through natural language rather than complex UI menus. This shift allows PMs to move from being data gatherers to orchestrators who focus on strategy and solutions rather than manual reporting.

Key Takeaways

  • MCP transforms the PM role from manual data extraction to high-level orchestration by allowing agents to navigate complex data taxonomies that previously required expert knowledge.
  • The Vibe PMing stack creates a local-first product operating system where specs, meeting notes, and prototypes live in a version-controlled GitHub repo for seamless context retrieval.
  • Skills in Claude Code represent a new layer of prompt-as-tool, where specific business heuristics are encoded into the agent's behavior to ensure consistent analysis quality.
  • Context window limitations are effectively managed through state transfer via markdown files and compaction, though this manual overhead will likely decrease as model windows expand.
  • The future of product work involves agents everywhere, including Slack and IDEs, where PMs can trigger complex data analysis or feature flag edits through simple natural language commands.
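The markdown-based state transfer mentioned above can be sketched as a small handoff file that one session writes and the next one reads. The section names, file name, and example content are assumptions made for illustration. This is not Lee's actual template or a Claude Code feature.

```python
import tempfile
from pathlib import Path

def save_state(path, goal, findings, next_steps):
    """Write a short markdown handoff file at the end of a session."""
    lines = ["# Session handoff", "", "## Goal", goal, "", "## Findings"]
    lines += [f"- {item}" for item in findings]
    lines += ["", "## Next steps"]
    lines += [f"- {item}" for item in next_steps]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")

def load_state(path):
    """Parse the handoff file back into {section heading: [lines]}."""
    sections, current = {}, None
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        if raw.startswith("## "):
            current = raw[3:]
            sections[current] = []
        elif current and raw.strip():
            sections[current].append(raw[2:] if raw.startswith("- ") else raw)
    return sections

# Hypothetical session content, invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    handoff = Path(tmp) / "handoff.md"
    save_state(
        handoff,
        "Explain the signup spike",
        ["Spike began after the pricing page change"],
        ["Segment by acquisition channel"],
    )
    state = load_state(handoff)

print(state["Next steps"])
```

In practice the agent would simply be asked to read the file at the start of the next session. The parser only shows that the format is easy to consume programmatically as well.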

How to Write a Product Strategy in 1 Day / 1 Week / 1 Month

Product strategy is the rationale for what a team works on and, more importantly, what they choose to ignore. It serves as the link between daily tasks and the broader company vision. Rather than a static document created through months of research, strategy should be treated as a continuous activity. The framework suggests starting with a Stupid Wild-Ass Guess (SWAG) within the first two weeks of a role and refining it over time. This iterative approach allows for faster feedback, identifies evidence gaps, and provides a working hypothesis for immediate action. This framework breaks down into seven core sections. First, the Objective combines a mission statement with a specific metric, such as moving ARR from $125m to $150m. Second, understanding Users involves Jobs To Be Done (JTBD) to identify what progress customers want to make and Customer Journey Mapping to visualize their experience. Third, Superpowers identify unique, hard-to-copy advantages like network effects, scale economies, or switching costs. Fourth, the Vision creates an inspiring future state, often through a visiontype or interactive prototype. Fifth, Pillars are the two to four main themes of work that bridge the gap between the current state and the vision. Sixth, Impact involves modeling how these pillars drive business value using driver trees. Finally, the Roadmap provides visibility on execution, moving from high-confidence near-term tasks to loosely defined future ideas. The depth of each section scales with time. A Snap Strategy (1 day) relies on intuition and existing docs. A Working Hypothesis (1 week) incorporates stakeholder feedback and initial user interviews. A full Product Strategy (1 month) uses quantified data, surveys, and evidence from shipped features to validate the direction. This approach balances the need for speed with the necessity of reducing uncertainty.

Key Takeaways

  • Strategy as a Living Hypothesis: Moving from a one-off planning mindset to a continuous refinement cycle prevents analysis paralysis and allows for real-world testing of GTM assumptions.
  • The Power of No: A successful strategy is primarily a tool for prioritization that provides a clear rationale for rejecting requests that do not align with high-impact pillars.
  • Superpowers as Moats: Identifying hard-to-copy advantages early is critical for maintaining margins and preventing the product from becoming a commodity in competitive SaaS markets.
  • Visualizing the Vision: Using visiontypes or prototypes instead of just text helps align stakeholders and teams on the intended user experience more effectively than static documents.
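The Impact step can be sketched as a tiny driver tree that connects pillars to the objective metric. The $125m and $150m figures come from the example above. The drivers, pillar names, and percentage lifts are hypothetical numbers chosen only to make the arithmetic visible.

```python
TARGET_ARR = 150_000_000  # objective from the example: move ARR from $125m to $150m

# Hypothetical baseline drivers, chosen so that ARR starts at $125m.
BASELINE = {"customers": 2_500, "revenue_per_customer": 50_000}

# Hypothetical pillars, each mapped to the whole-percent lift it is assumed to give a driver.
PILLARS = {
    "Self-serve onboarding": {"customers": 10},
    "Usage-based add-ons": {"revenue_per_customer": 8},
}

def arr(drivers):
    """Root of the tree: ARR = customers x revenue per customer."""
    return drivers["customers"] * drivers["revenue_per_customer"]

def apply_pillar(drivers, lifts):
    """Apply whole-percent lifts to the named drivers (integer math, rounds down)."""
    return {k: v * (100 + lifts.get(k, 0)) // 100 for k, v in drivers.items()}

def project(drivers, pillars):
    """Apply every pillar's lifts in turn and return the projected drivers."""
    for lifts in pillars.values():
        drivers = apply_pillar(drivers, lifts)
    return drivers

projected = project(BASELINE, PILLARS)
gap = TARGET_ARR - arr(projected)
print(f"baseline ${arr(BASELINE):,}  projected ${arr(projected):,}  gap ${gap:,}")
```

With these made-up inputs the two pillars reach $148.5m, leaving a $1.5m gap to the target. A gap like this is the kind of signal that would prompt either a third pillar or a revised objective.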

Free PM MBA: Roadmap to PM & AI Mastery - by Aakash Gupta

Aakash Gupta organizes 90 podcast episodes into a structured curriculum designed to bridge the gap between traditional product management and the emerging AI landscape. The content is split into two primary tracks: a Complete PM MBA and an AI PM Blueprint. The PM MBA track covers foundational concepts from industry leaders like Marty Cagan and Melissa Perri, moving through advanced strategy, growth systems, and executive leadership. It includes specialized modules for B2B product management, SaaS pricing, and go-to-market strategy. The AI PM Blueprint provides a tiered progression for mastering AI product roles. It starts with foundations and moves into technical implementation, covering AI engineering for PMs, prototyping tools, and evaluation frameworks. Notable sections focus on AI workflows and automation, featuring topics like the Model Context Protocol (MCP) and vibe coding. The curriculum also addresses the business side of AI, including building AI startups and navigating the AI PM job market. The resource emphasizes practical insights from real builders rather than just high-profile executives. It covers specific methodologies such as the Lean Product Playbook, Product-Led Growth mastery, and the Linear Method. For career growth, it includes guidance on moving from individual contributor roles to VP and CPO levels, as well as navigating the fractional CPO market. The collection serves as a comprehensive knowledge base for anyone looking to integrate agentic infrastructure and AI automation into their product management workflow.

Key Takeaways

  • The product management role is rapidly evolving into a technical hybrid position where understanding AI engineering, evals, and prototyping is essential for senior leadership.
  • Modern growth mastery requires a deep integration of product-led growth principles with AI-driven automation and sophisticated data strategy.
  • Career progression in the current market favors PMs who can demonstrate builder skills like vibe coding and the ability to deploy AI employees or automated workflows.
  • The roadmap highlights a shift toward fractional leadership and specialized B2B positioning as viable paths for experienced product executives.

Frequently Asked Questions

  • Given that Sahil Lavingia and Nadav Abrahami advocate for replacing PRDs with rapid AI prototypes using tools like v0 and Dazzle, how should PMs reconcile this 'ship first' velocity with Rachel Wolan and Xiankun Wu's strict requirement for 'MVO (Minimal Viable Output) before MVP,' which demands perfecting RAG and context engineering before any UI is built?
  • Ankur Goyal and Hamel Husain argue that 'evals are the new PRD' and require rigorous 'axial coding' and binary scoring of traces, but PMs are increasingly relying on 'vibe coding' and functional prototypes to bypass traditional documentation. In light of this, how can teams enforce strict evaluation rubrics without bottlenecking the rapid iteration cycles enabled by tools like Cursor and Replit?
  • Carl Vellotti and Hannah Stulberg recommend building a comprehensive 'Team OS' using CLAUDE.md files and multiple MCP servers to give agents deep organizational knowledge. However, Frank Lee and Todd Olson warn that too much context causes 'context bloat' and degrades LLM performance. How should PMs architect their 'context engineering' to provide sufficient background without crippling the model's reasoning capabilities?
  • Jack Hirsch from Okta identifies AI agents as the biggest enterprise security blind spot, advocating for a 'T-Shaped Identity Strategy' and continuous session monitoring. As PMs build autonomous, multi-agent workflows using tools like n8n or Gigawatt, how can they implement these strict cross-app access standards without introducing friction that defeats the purpose of agentic automation?
  • Kyle Poyar notes that AI features can destroy gross margins (dropping them from 80% to 15%) and recommends hybrid pricing models, yet he also stresses that features fail if they don't drive retention through 'hill climbing' quality improvements. How should PMs balance the need to subsidize expensive model calls to achieve 'deep delight' (as described by Nesrine Changuel) with the commercial reality of maintaining profitable unit economics?
  • Harish Mukhami demonstrates using o3-mini for planning and Claude Sonnet for coding, while recent updates highlight Kimi K2's superiority in 'interleaved reasoning' and DeepSeek V3.2's 70% cost reduction. Given this fragmentation, how should PMs design their 'Agent Architecture' to dynamically route tasks to the most efficient model rather than defaulting to a single frontier model like Gemini 3?
  • Rachel Wolan champions the 'IC CPO' model where product leaders use Claude Code and Snowflake MCPs to self-serve analytics, while Dan Olsen warns against the 'Jira Jockey' trap where PMs get bogged down in execution. How can PMs leverage these advanced agentic workflows to increase their technical leverage without accidentally transforming their role into full-time agent orchestration and technical debugging?
  • Elizabeth Laraki emphasizes that AI introduces non-deterministic risks (like the Google image expander disaster) requiring visible AI safeguards, whereas Anthony Pierri argues for strict, 'first-order benefit' positioning that promises immediate, tangible outcomes. How can PMs market the specific, guaranteed benefits of their product while designing UX flows that must inherently accommodate unpredictable, probabilistic AI outputs?