Vibe Coding

Updated March 6, 2026

The Complete Guide to AI-Native Software Development

22 chapters. 200+ prompts. Updated monthly. The only vibe coding resource that evolves as fast as the field.

📅 Updated March 2026 📈 Monthly updates for subscribers 🎓 Part of the EndOfCoding ecosystem

Choose Your Plan

The vibe coding landscape changes every week. Your subscription keeps you current.

Free Preview
$0
  • ✓ First 3 chapters
  • ✓ 10 sample prompts
  • ✓ 2 video tutorials
  • ✓ Interactive quiz
↓ Start Reading Below
MOST POPULAR
Monthly
$9/mo
  • ✓ All 22 chapters
  • ✓ 200+ prompt library
  • ✓ Video tutorials
  • ✓ Monthly updates
  • ✓ Tool comparison matrix
  • ✓ Security playbook
Annual — Save 27%
$79/yr
  • ✓ Everything in Monthly
  • ✓ Bonus resources
  • ✓ Early access to new content
  • ✓ Priority support

30-day money-back guarantee. Cancel anytime. Payments handled securely by Lemon Squeezy (Merchant of Record). All prices in USD.

📰
EndOfCoding.com
Thought leadership & articles
🎓
Vibe Coding Academy
Interactive courses & lessons
🎥
YouTube @endofcoding
Video tutorials & demos

Frequently Asked Questions

Everything you need to know before you start.

What exactly is vibe coding?
A term coined by Andrej Karpathy in February 2025 for a new development style where you describe what you want in natural language, and AI tools generate the code. It ranges from AI-assisted autocomplete to fully autonomous AI agents building entire applications. This ebook covers all five levels in depth with real data, case studies, and 200+ production-ready prompts.
Who is this ebook for?
Developers exploring AI tools, engineering managers evaluating team adoption, entrepreneurs building products with AI, and anyone curious about the future of software development. Whether you use Cursor, Claude Code, GitHub Copilot, Bolt.new, or v0, this guide covers your tools and workflow.
How is the subscription different from a one-time purchase?
The vibe coding landscape changes weekly — new tools launch, security incidents emerge, pricing shifts. Your subscription includes monthly updates to all 22 chapters, new entries in the prompt library and tool comparison matrix, a fresh monthly intelligence brief, and new community showcase features. You always have the most current resource in a fast-moving field.
What do I get in the free preview?
The first 3 chapters are completely free: the origin story of vibe coding, a precise definition and framework, and the underlying philosophy. You also get the interactive quiz to find your vibe coding level, 10 sample prompts, and a glimpse of every chapter topic. No credit card required.
Can I cancel anytime?
Yes. Monthly and annual subscriptions can be cancelled at any time through your Lemon Squeezy billing portal. You keep access until the end of your current billing period. No questions asked, no hidden fees.
📖
How to read this ebook: Use the sidebar to navigate 22 chapters. Click expandable sections for deep dives. Take the interactive quiz to find your vibe coding level. Use Ctrl+K to search across all content. Chapters 1–3 are free — subscribe to unlock all 22.

01. The Moment Everything Changed

Updated March 6, 2026

On February 2, 2025, Andrej Karpathy (OpenAI co-founder, former Tesla AI director, and one of the most respected voices in machine learning) posted what would become one of the most consequential tweets in software development history:

"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. I just see stuff, say stuff, run stuff, and copy-paste stuff, and it mostly works." — Andrej Karpathy, February 2, 2025

Within weeks, the term had gone viral. Within a month, Merriam-Webster added "vibe coding" as a slang and trending term. By December 2025, Collins English Dictionary named it their Word of the Year.

But vibe coding didn't just enter the dictionary. It entered the economy. It entered boardrooms. It entered the workflows of millions of developers. And it sparked one of the fiercest debates the software industry has seen in decades.

The Timeline

February 2025
Karpathy coins "vibe coding"
The tweet goes viral. Merriam-Webster adds it within weeks. Developers worldwide start experimenting.
March 2025
Y Combinator reveals the data
25% of YC Winter 2025 startups report codebases that are 95% AI-generated.
May 2025
Claude Code launches publicly
Anthropic's terminal-based coding agent goes GA. It will reach $1B ARR in 6 months.
May 2025
Lovable security vulnerability
170 of 1,645 apps built on the vibe coding platform found to expose personal data.
June 2025
Devin hits $73M ARR
Cognition's AI software engineer grows 73x in 9 months. Goldman Sachs adopts it.
July 2025
Wall Street Journal reports mainstream adoption
Professional software engineers are using vibe coding for commercial products.
August 2025
Google Jules exits beta
Google's async coding agent goes public. 2.28M visits, 140K+ code updates.
September 2025
The "Vibe Coding Hangover"
Fast Company reports senior engineers entering "development hell" with AI-generated codebases.
November 2025
Claude Code hits $1B ARR
One of the fastest-growing enterprise software products in history.
December 2025
Collins Word of the Year
"Vibe coding" is named Collins English Dictionary Word of the Year 2025.
December 2025
Tenzai security study
69 vulnerabilities found across 15 applications built by 5 major AI coding tools.
January 2026
"Vibe Coding Kills Open Source" paper
Researchers publish arXiv paper arguing vibe coding threatens the open-source ecosystem by reducing user engagement with maintainers. Tailwind CSS docs traffic down 40% from 2023.
January 2026
Cognition reaches $10.2B valuation
Cognition raises $400M Series C. Devin ARR passes $155M. Goldman Sachs, Citi, Dell, Cisco, Palantir among enterprise clients.
January 2026
GitHub Copilot reaches 4.7M paid users
Agent mode becomes default workflow for complex tasks. MCP support rolls out to all VS Code users.
February 2026
Claude Opus 4.6 launches with Agent Teams
Anthropic releases Opus 4.6 with agent teams in Claude Code — multiple AI agents working in parallel on different aspects of a project, coordinating autonomously.
March 2026
The Open Source Reckoning & Enterprise Adoption
Researchers warn vibe coding erodes open-source funding. Pega becomes first enterprise platform to brand its AI features as "vibe coding." Cursor 2.5 launches subagent architecture. GitHub Copilot opens multi-model access. Devin 2.2 achieves 67% PR merge rate.

02. What Vibe Coding Actually Is

Updated March 6, 2026

Strip away the hype, and vibe coding is a specific practice with specific characteristics.

Vibe coding is an AI-assisted software development approach where a developer describes what they want in natural language, an AI model generates the code, and the developer evaluates the result through execution rather than code review. The developer does not read, edit, or attempt to understand the generated code. They test whether it works, and if it doesn't, they feed the error back to the AI.

💡
**Key distinction:** In traditional AI-assisted development, the developer remains the author and the AI accelerates. In vibe coding, the AI is the author and the developer is the director.

Karpathy described his own workflow precisely:

"I 'Accept All' always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. If it doesn't, I just revert to the last working state and re-prompt with more context."

The Three Core Loops

Vibe coding operates on three nested feedback loops:

Loop 1: Generate and Test

1. Describe what you want in natural language
2. Accept the generated code without reading it
3. Run it
4. Does it work? Ship it. Doesn't work? Move to Loop 2.

This is the happy path. For simple features, you may never leave this loop.

Loop 2: Error-Driven Repair

1. Copy-paste the error message to the AI (no commentary needed)
2. Accept the fix without reading it
3. Run it again
4. Repeat until resolved or move to Loop 3.

Most errors resolve within 1-3 iterations of this loop. The AI sees the error, understands the context, and fixes it.

Loop 3: Revert and Rephrase

1. Revert to the last working state
2. Describe the desired outcome differently, with more context
3. Return to Loop 1

This is the escape hatch. If the AI gets stuck in a loop of broken fixes, go back to a clean state and try a different approach. This is why checkpoints matter: always have a rollback point.
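The three loops nest inside one another, which is easier to see as control flow. The following is a minimal sketch, not any tool's real API: `generate` and `run` are hypothetical callables standing in for "ask the AI" and "execute the result", and `max_repairs` is an assumed cutoff for Loop 2.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures; neither is part of any real tool's API.
Generate = Callable[[str], str]          # prompt (or pasted error) -> code
Run = Callable[[str], Optional[str]]     # code -> error message, or None on success


def vibe_code(prompt: str, rephrasings: list[str], generate: Generate, run: Run,
              max_repairs: int = 3) -> Tuple[Optional[str], list[str]]:
    """Return (working code or None, log of loop transitions)."""
    log: list[str] = []
    # Loop 3 wraps everything: each rephrasing restarts from a clean state.
    for attempt in [prompt, *rephrasings]:
        log.append("loop1:generate")
        code = generate(attempt)         # Loop 1: accept without reading
        error = run(code)
        repairs = 0
        while error is not None and repairs < max_repairs:
            log.append("loop2:repair")
            code = generate(error)       # Loop 2: paste the raw error back
            error = run(code)
            repairs += 1
        if error is None:
            return code, log             # it works: ship it
        log.append("loop3:revert")       # discard the attempt and rephrase
    return None, log
```

The sketch makes one design point explicit: Loop 2 needs a bound. Without a repair limit, a run of broken fixes never reaches the revert step.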

What Vibe Coding Is NOT

  • Not using GitHub Copilot for autocomplete — that's AI-augmented coding (Level 1)

  • Not asking ChatGPT to explain code — that's using AI as a learning tool

  • Not reviewing AI-generated code before accepting — that's AI-collaborative coding (Level 2)

  • Not no-code/low-code platforms — those use visual builders, not natural language to code

    Vibe coding is specifically: natural language in, code out, test behavior, never read the code.

03. The Philosophy: Trusting the Machine

Updated March 6, 2026

Vibe coding isn't just a technique. It's a philosophical stance about the relationship between developers and code.

The End of Code as Sacred Text

For decades, programming culture has treated source code as something to be crafted, reviewed, optimized, and understood. Code reviews are rituals. Clean code is a moral virtue. Understanding every line is a professional obligation.

Vibe coding rejects this entirely. It treats code as a disposable intermediary between human intent and running software. The code doesn't matter. The behavior matters.

This is not as radical as it sounds. Most software professionals already interact with layers of abstraction they don't fully understand.

04. The Spectrum: Five Levels of AI-Assisted Development

Updated March 6, 2026

Vibe coding is not binary. In practice, developers operate along a spectrum. Understanding where you sit — and where you should sit for a given project — is critical.

Level 0: Traditional Development
No AI at all
You write every line. You understand every line. No AI assistance of any kind. Increasingly rare but still essential for certain domains like embedded systems, cryptography, and kernel development.
  **When to use:** Security-critical code, regulatory requirements, environments where AI tools are prohibited.

Level 1: AI-Augmented Coding
You are the author. The AI is a fast typist.
You use AI for autocomplete, documentation lookup, and boilerplate generation, but you review and understand every line. Think: GitHub Copilot suggestions that you accept or reject with full awareness.
  **Tools:** GitHub Copilot, VS Code AI extensions

  **Code understanding:** 100% — you review everything

  **When to use:** Production code, team projects, anything you need to maintain

Level 2: AI-Collaborative Coding
You are the architect. The AI is the builder.
You describe features in natural language and get back substantial code blocks. You review the code, understand the approach, and make modifications. You might use Cursor's Composer or Claude Code for generating components, but you read the diffs.
  **Tools:** Cursor Composer, Claude Code, Codex CLI

  **Code understanding:** 70-90% — you review most things

  **When to use:** Professional development, startup codebases, any code that needs to scale

Level 3: Guided Vibe Coding
You are the product manager. The AI is the engineering team.
You describe what you want and accept most code without deep review, but you maintain a general understanding of the architecture. You spot-check security-sensitive sections. You understand the overall structure even if you don't read every function.
  **Tools:** Cursor Agent, Claude Code, Bolt.new

  **Code understanding:** 30-60% — architecture yes, implementation details no

  **When to use:** MVPs, internal tools, prototypes headed toward production

Level 4: Pure Vibe Coding
You are the client. The AI is the agency.
Karpathy's original vision. You describe, accept all, test, paste errors, repeat. You don't read diffs. You don't understand the code. You only care if it works.
  **Tools:** Bolt.new, Lovable, Replit Agent, v0

  **Code understanding:** 0-10% — you only test behavior

  **When to use:** Personal projects, throwaway prototypes, hackathons, idea validation

Level 5: Autonomous Agent Coding
You are the executive. The AI is the employee.
You don't even supervise in real-time. You assign tasks to AI agents that clone repos, create branches, write code, run tests, and open pull requests — all while you do something else. You review the final result.
  **Tools:** Devin, Google Jules, OpenAI Codex (cloud mode)

  **Code understanding:** Review-based — you check the output, not the process

  **When to use:** Routine tasks, migrations, test generation, documentation, with human review gate

📈
**Where do most developers operate?** In 2026, most professional developers work between Levels 1 and 3. Pure Level 4 is most common among non-technical founders, hobbyists, and rapid prototypers. Level 5 is emerging fast in enterprise environments. Notably, Karpathy himself has evolved from "vibe coding" to advocating **"agentic engineering"** — professionals orchestrating AI agents with oversight, not just vibes.
### Which level are you?
Take the interactive quiz at the end of this ebook to find out.


05. The Tools: A Complete Landscape (2025–2026)

Updated April 9, 2026

The tooling ecosystem for AI-assisted development has exploded. The market is consolidating fast — with Cursor seeking a ~$50B valuation at $2B+ ARR, Lovable at $6.6B, Cognition at $10.2B, and billion-dollar acquisition battles playing out in real time. Anthropic's acquisition of Bun (the fast JavaScript runtime) signals Claude Code's push into native runtime integration. Here's the current state of play across every major category.

AI-Native IDEs

Cursor
Anysphere
The IDE Karpathy originally referenced. Built on VS Code with deep AI integration. Cursor 3 (April 2, 2026) is a ground-up redesign centered on agent orchestration: the new Agents Window replaces the Composer pane with a full-screen workspace for running multiple AI agents simultaneously in side-by-side, grid, or stacked layouts. Design Mode lets you click any element in a browser preview and direct agents to modify that exact component visually. The release also adds cloud-to-local handoff for agent sessions, automations triggered by external services, and faster, less memory-heavy rendering of large-file diffs. The Await tool lets agents pause for background shell commands and subagents, and MCP Apps now support structured content. Earlier releases (March 2026) added always-on Automations, JetBrains support via the Agent Client Protocol, and team plugin marketplaces.
$2B+ ARR • ~$50B valuation (fundraising) • 1M+ daily users • 50,000 businesses • >50% Fortune 500
IDE • Agent • MCP • Automations • JetBrains • Design Mode
Windsurf
Cognition (via complex acquisition)
AI IDE with persistent "memories" for long-term context. Subject of a dramatic $3B acquisition saga: OpenAI's bid collapsed after Microsoft blocked it, Google hired the CEO and key researchers in a $2.4B deal, and Cognition acquired the remaining product, brand, and IP. Now supports Gemini 3.1 Pro. Ranked #1 in LogRocket AI Dev Tool Power Rankings (Feb 2026). Combined Cognition entity (Devin + Windsurf) raised $500M at ~$10B valuation with $82M+ ARR.
IDE • Memory • Cognition
VS Code + Extensions
Microsoft
The original. Still viable with GitHub Copilot, Continue, and Cline extensions. Best for developers who want AI assistance without switching editors.
IDE • Extensions

Autonomous Coding Agents

Claude Code
Anthropic
Terminal-based coding agent. Reads and modifies code across entire repositories. Powered by Claude Opus 4.6 with agent teams — multiple AI agents working in parallel. March 2026: voice mode (/voice push-to-talk), STT in 20 languages, MCP management via /mcp dialog, Claude API skill for building on Anthropic's platform. Computer-use capabilities let Claude operate your Mac autonomously. Companion product Claude Cowork works directly with local files. Late March 2026 (v2.1.63–2.1.76): /loop command adds cron-like scheduled tasks — turning Claude Code into a background worker for PR reviews, deployment monitoring, and recurring analysis. 1-million-token context window. Max output increased to 64k tokens for Opus 4.6 (128k upper bound for Opus 4.6 and Sonnet 4.6). MCP servers can now request structured input mid-task via interactive dialogs. Skills.md enables persistent agent behaviors. Early April 2026: Anthropic acquires Bun (the fast JavaScript runtime built by Jarred Sumner) — bringing native Bun integration and faster JS execution directly into Claude Code workflows. Claude overtook ChatGPT as the #1 AI app on the App Store. Revenue surpassed $2.5B ARR (named world's most disruptive company, Time March 2026). In a Mozilla partnership, Claude Opus 4.6 autonomously found 22 CVEs in Firefox's C++ codebase. April 4, 2026 — OpenClaw Policy Change: Anthropic announced that Claude Code subscription limits no longer apply to third-party harnesses such as OpenClaw. Users of third-party Claude Code integrations must move to pay-as-you-go billing; a $200/mo Max subscription was reportedly being used to run $1,000–$5,000 of agent compute. Affected users received a one-time credit. Additional April updates: PowerShell tool for Windows (opt-in preview), flicker-free alt-screen rendering, named subagents in @ mentions, 60% faster Write tool diff computation. 
Note: Pentagon labeled Anthropic a supply-chain risk in March 2026 over weapons/surveillance policy; defense tech contractors migrating away.
$2.5B+ ARR • #1 App Store • /loop Scheduled Tasks • 1M Token Context • Computer Use • Voice Mode
CLI • Agent • Agent Teams • Scheduled Tasks • Computer Use • Voice • Enterprise
Devin
Cognition Labs
Positioned as an "AI software engineer." Full agent-native IDE with parallel task execution, interactive planning, Devin Wiki, and Devin Search. Goldman Sachs, Citi, Dell, Cisco, Palantir among enterprise clients. $10.2B valuation after $400M Series C.
$155M+ ARR • 10x migration speed
Agent • Async • Enterprise
OpenAI Codex CLI
OpenAI
Open-source terminal agent built in Rust. Sandboxed execution, code review, MCP integration, session resume, and CI/CD automation. Now powered by GPT-5.4 (March 2026) — OpenAI's latest with native computer-use capabilities, 1M token context, and 33% fewer errors vs GPT-5.2. GPT-5.4 comes in Standard, Thinking, and Pro variants. ChatGPT for Excel/Sheets integration signals enterprise push.
npm i -g @openai/codex • GPT-5.4
CLI • Open Source • Sandbox • Computer Use
Google Jules
Google
Asynchronous agent powered by Gemini 3 Pro. Clones codebases into Cloud VMs, works independently, opens PRs automatically. Concurrent task execution. Cognition (Devin's parent) also shipped Windsurf Codemaps — AI-annotated structured maps of entire codebases powered by SWE-1.5 and Claude Sonnet 4.5, enabling hyper-contextualized navigation of large repos before making changes.
2.28M visits • 140K+ code updates
Agent • Async • Cloud
Gemini CLI
Google
Open-source terminal agent powered by Gemini 3 Flash. Skills system with sub-agents, event-driven scheduler, and agent registry. Direct competitor to Claude Code and Codex CLI in the terminal space.
github.com/google-gemini/gemini-cli
CLI • Open Source • Skills
GitHub Copilot
GitHub / Microsoft
The original AI coding assistant, now with full agent mode. Autonomously identifies subtasks, edits across multiple files, runs tests, and fixes errors. MCP support. March 2026: GPT-5 mini and GPT-4.1 now included without consuming premium requests. Plan mode metrics available across JetBrains, Eclipse, Xcode, and VS Code. Users can assign the same issue to Claude, Codex, or Copilot agents simultaneously. March 11: Custom agents, sub-agents, and Plan Agent are now generally available in JetBrains IDEs (agent hooks in preview). March 12: New GitHub Copilot Student plan launched — free access maintained but premium model self-selection removed in favor of Copilot Auto mode. April 2026 — Agent Mode GA & New Features: Agent Mode now fully generally available on VS Code and JetBrains across all Copilot plans. Copilot SDK entered public preview (April 2) — building blocks for embedding Copilot agentic capabilities into custom apps and workflows. Autopilot mode (public preview) — agents approve their own actions and auto-retry on errors until task completion. Copilot CLI v1.0.18 added a Critic agent that automatically reviews plans using a complementary model. Sandbox MCP servers now available on macOS/Linux. Privacy policy change (effective April 24): GitHub Copilot Free/Pro/Pro+ user interaction data will be used for AI model training by default — opt out in account settings if this applies to you.
26M+ total users • 20M+ paid • 6+ IDEs • Agent Mode GA • Copilot SDK
IDE • Agent • MCP • Multi-Model
Kilo Code
Kilo.ai (GitLab co-founder)
Open-source AI coding agent with 1.5M+ users. Orchestrator mode with planner/coder/debugger sub-agents. 500+ model support. Available in VS Code, JetBrains, and CLI. $19/mo or BYO API key. Launched March 2026.
1.5M+ users • Open Source
Agent • Open Source • Multi-Agent
Amazon Q Developer
Amazon
AI coding assistant deeply integrated with AWS. Code generation, transformation, and debugging with strength in serverless and cloud infrastructure patterns.
Agent • AWS

Browser-Based Builders

Bolt.new
StackBlitz
Browser-based dev environment. Describe an app, get a working deployable application. No local setup. Excellent for rapid prototyping.
Browser • Full-Stack • Deploy
v0
Vercel
AI-powered UI generation. Describe a component, get production-ready React + Tailwind code. Deep Next.js integration. Best for frontend prototyping.
UI • React • Next.js
Lovable
Lovable (Sweden)
App creation for non-developers. Natural language to working, deployable software. By March 2026: $400M ARR (up from $200M at end-2025) with only 146 employees, 200,000+ new projects per day. March 23: CEO Anton Osika announced an M&A offensive — Lovable is actively acquiring startups and builder teams to extend its platform lead. Previously acquired cloud provider Molnett. Faced security scrutiny (170/1,645 apps had vulnerabilities).
$400M ARR • $6.6B valuation • 200K projects/day • M&A offensive
No-Code • Browser
Replit Agent
Replit
Complete app building from descriptions with deployment and database management. 75% of AI-enabled Replit users don't write code themselves. March 11: Raised $400M Series D at a $9 billion valuation (led by Georgian Partners, with a16z, Coatue, Y Combinator, Databricks Ventures) — triple its September 2025 valuation in six months. Targeting $1B ARR by end of 2026.
75% write zero code • $400M Series D • $9B valuation
Browser • Full-Stack • Deploy

The Infrastructure Layer: MCP

🔗
**Model Context Protocol (MCP)** is Anthropic's open protocol that allows AI assistants to connect to external tools and data sources. It has become the standard way for coding agents to interact with databases, APIs, file systems, and other developer tools. All major agents (Claude Code, Cursor, Codex CLI, Devin) support MCP.
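To make the protocol concrete: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with a `tools/call` request. The sketch below only builds that request. The tool name `query_database` and its `sql` argument are hypothetical, and the transport and initialization handshake are omitted.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched up


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool; a real client would discover tool names from the server.
print(tool_call_request("query_database", {"sql": "SELECT 1"}))
```

In practice an agent would not hard-code tool names; it would first list what the server offers and let the model choose among them.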

The Model Race (March 2026 Update)

The foundation models powering these tools are advancing on multiple fronts. Key releases in early March 2026:

  • GPT-5.4 (OpenAI): Native computer-use, 1M context, Standard/Thinking/Pro variants. Already integrated into Codex CLI and Copilot.
  • Gemini 3.1 Flash-Lite (Google): Ultra-low-latency variant designed for inline code completions and real-time suggestions. Powers Windsurf and Jules background tasks.
  • GLM-4.7 (Zhipu AI): China's leading code model, competitive with GPT-5 on multilingual programming benchmarks. Growing adoption in Asian markets.
  • DeepSeek-V3.2-Speciale (DeepSeek): Open-weight model rivaling proprietary offerings. Strong at multi-file reasoning and long-context code generation.

Open-source LLMs now account for over 60% of production AI deployments — a tipping point driven by DeepSeek, Llama, Qwen, and Mistral. This has shifted the economics: developers increasingly use open-weight models for routine code generation while reserving proprietary models for complex architectural reasoning.

Andrej Karpathy, who coined "vibe coding" in February 2025, introduced a new term in early 2026: "agentic engineering" — the discipline of designing, orchestrating, and supervising autonomous AI agents that write code, run tests, and deploy systems with minimal human intervention. The term has rapidly entered common usage, marking the evolution from "coding with AI" to "engineering with agents."

06. The Agent Revolution

Updated April 15, 2026

The most significant development since Karpathy's tweet isn't better autocomplete. It's the emergence of autonomous coding agents — AI systems that independently plan, implement, test, and deploy software.

From Copilot to Colleague

Phase 1: Autocomplete (2021-2023)
The AI predicted the next line
GitHub Copilot launched. Useful, but fundamentally a typing accelerator. The developer remained in full control of every decision.
Phase 2: Composers (2023-2024)
The AI generated entire features
Cursor Composer, ChatGPT Code Interpreter. Multi-file generation became possible. But the developer still supervised each generation cycle.
Phase 3: Agents (2025-2026)
The AI works independently
Agents understand entire codebases, create execution plans, implement changes across dozens of files, run tests, fix failures, and open pull requests. The developer assigns a task and reviews the result — sometimes hours later.
Phase 4: Persistent Workers (Early 2026)
The AI runs on a schedule without being asked
Claude Code's /loop command and Claude Managed Agents enable scheduled background tasks. Agents run CI pipelines, triage issues, and maintain codebases overnight. The developer reviews a morning summary of what the AI decided and changed while they slept.

What Agents Can Do Today

Modern coding agents reliably handle tasks that would take a junior developer 4-8 hours:

  • 🔃 Migrations: framework, API, and database schema conversions
  • 🐛 Bug Fixes: diagnose from logs, implement the fix, write regression tests
  • 🛠 Features: complete frontend, backend, and database changes
  • Tests: comprehensive test suites for existing code
  • 📄 Documentation: generate and maintain docs across entire codebases
  • 🔒 Security Fixes: scan for vulnerabilities and implement remediations

The April 2026 Benchmark Picture

Agent performance has accelerated dramatically. The current public leaderboard (April 2026):

| Model | SWE-bench Verified | Access |
| --- | --- | --- |
| Claude Mythos Preview | 93.9% | Restricted (Project Glasswing) |
| Claude Opus 4.6 | 80.8% | Public |
| Gemini 3.1 Pro | 80.6% | Public |
| GPT-5.4 | 75.0% | Public |
| Kimi K2.5 (open-source) | ~75% | Open |

Kimi K2.5 by Moonshot AI is the current #1 open-source option: a 1-trillion-parameter MoE architecture with 32 billion active parameters (about 3.2% of the total used per token), competitive with frontier models at a fraction of the inference cost.

New Agent Orchestration Frameworks (April 2026)

Two major frameworks launched in April 2026 that reshape how multi-agent systems are built.

The practical implication: you no longer need to build agent infrastructure from scratch. These frameworks handle the hard parts — state, retries, tool routing, parallelization — so you can focus on the task logic.

What Agents Still Struggle With

Cognition's own 2025 performance review of Devin put it well:

"Devin is senior-level at codebase understanding but junior at execution."

The Parallel Execution Advantage

Unlike human developers, agents can run multiple instances simultaneously, work 24/7, and process entire backlogs of tickets overnight.
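A rough sketch of what "multiple instances simultaneously" means in code, using Python's `asyncio`. Everything here is illustrative: `run_agent` is a stand-in that only sleeps, the ticket IDs are made up, and `max_parallel` is an assumed concurrency cap rather than a limit of any real product.

```python
import asyncio


async def run_agent(ticket: str) -> str:
    """Stand-in for one agent instance; a real one would call an agent service."""
    await asyncio.sleep(0.01)  # simulated work
    return f"{ticket}: pull request opened"


async def process_backlog(tickets: list[str], max_parallel: int = 3) -> list[str]:
    limit = asyncio.Semaphore(max_parallel)  # cap on concurrent agent instances

    async def worker(ticket: str) -> str:
        async with limit:
            return await run_agent(ticket)

    # gather returns results in the same order as the input tickets
    return await asyncio.gather(*(worker(t) for t in tickets))


results = asyncio.run(process_backlog(["T-1", "T-2", "T-3", "T-4"]))
```

The cap matters in practice because each agent instance consumes compute and produces a pull request that a human still has to review.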

  • 10x: faster file migrations (bank case study)
  • 14x: faster repo migrations (Oracle Java)
  • 20x: faster vulnerability remediation
  • 7.8m: average task completion (Devin)
  • +10pp: task success rate with Managed Agents vs prompting
  • 93.9%: Claude Mythos SWE-bench (restricted access)

07. Vibe Coding in Practice: Real Workflows

Updated March 6, 2026

Theory is interesting. Practice is what matters. Here are four concrete workflows for different scenarios.

#### The Weekend Prototype

**Scenario:** You have a product idea and want a working prototype by Monday.

**Tools:** Bolt.new or Cursor + Claude • **Level:** 3-4

1. Write a detailed description (spend 20-30 min; it's the most important step). Include target users, core features, data model, key screens, and visual style.
2. Paste into Bolt.new or Cursor Composer
3. Iterate through natural language: "Make the sidebar collapsible" / "Add dark mode"
4. Deploy to Vercel or Netlify
5. Share with potential users for feedback

Example prompt:


Build a job application tracker. I'm applying to software engineering positions and need to track: company name, position title, application date, status (applied/phone screen/onsite/offer/rejected), salary range, notes, and next action date. I want a clean dashboard showing all applications in a table with sorting and filtering. Include a kanban view grouped by status. Use a modern blue/slate color scheme. Store in localStorage. Make it responsive for mobile.
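The prompt above works partly because it spells out a data model in prose. Written out, that model might look like the sketch below. This is an illustration in Python, not what any builder would generate (the prompt targets a browser app using localStorage), and the field and function names are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    APPLIED = "applied"
    PHONE_SCREEN = "phone screen"
    ONSITE = "onsite"
    OFFER = "offer"
    REJECTED = "rejected"


@dataclass
class Application:
    company: str
    position: str
    applied_on: date
    status: Status
    salary_range: str = ""
    notes: str = ""
    next_action_on: Optional[date] = None


def kanban(apps: list[Application]) -> dict[Status, list[Application]]:
    """Group applications by status, one column per status (empty columns kept)."""
    columns: dict[Status, list[Application]] = {s: [] for s in Status}
    for app in apps:
        columns[app.status].append(app)
    return columns
```

Naming the fields and the allowed status values up front leaves the AI fewer decisions to guess at, which is the point of step 1.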


#### The Startup MVP

**Scenario:** Building a real product for real users, fast.

**Tools:** Claude Code + Cursor + v0 • **Level:** 2-3

1. Start with a product requirements document (even a rough one)
2. Use v0 to prototype key UI screens
3. Use Claude Code to scaffold the full architecture
4. Build feature-by-feature, testing each before moving on
5. Review auth code and data handling; accept UI code freely
6. Deploy to real hosting, set up monitoring
7. Plan a "hardening phase" for security-critical paths

⚠️
**The trap:** Skipping step 7. Many YC startups vibe-coded their MVPs successfully but faced "development hell" when trying to scale without hardening.

#### The Enterprise Integration

**Scenario:** Adding a feature to an existing production codebase.

**Tools:** Claude Code or Devin + CI/CD pipeline • **Level:** 5 with human gate

1. Create a detailed ticket with acceptance criteria
2. Assign to an AI agent (Devin, Claude Code, or Jules)
3. Agent analyzes codebase, creates a plan, implements the change
4. Agent runs existing test suite and fixes failures
5. Agent opens a pull request
6. Human reviews: security, performance, architecture, edge cases
7. Merge after human approval

This is Level 5 but with human review as the final gate. It's how most enterprises adopt AI coding in 2026.
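The "human gate" in this workflow can be expressed as a simple merge rule. The sketch below is one possible encoding, under stated assumptions: `PullRequest`, `REVIEW_AREAS`, and `can_merge` are hypothetical names, and a real setup would enforce this through branch protection or CI rather than application code.

```python
from dataclasses import dataclass, field

# The four areas a human reviews in step 6.
REVIEW_AREAS = {"security", "performance", "architecture", "edge cases"}


@dataclass
class PullRequest:
    ticket: str
    tests_passed: bool = False                        # set by the agent's test run
    approvals: set[str] = field(default_factory=set)  # areas a human has signed off


def can_merge(pr: PullRequest) -> bool:
    """Allow a merge only if tests pass and a human approved every review area."""
    return pr.tests_passed and REVIEW_AREAS <= pr.approvals
```

Passing tests alone never unlocks the merge here; the agent controls `tests_passed`, but only a human can fill in `approvals`.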

#### The Solo Creator

**Scenario:** You're not a developer. You have an idea for an app.

**Tools:** Lovable, Bolt.new, or Replit Agent • **Level:** 4

1. Describe your application as if explaining it to a friend
2. Let the builder create the first version
3. Use it yourself and note what's wrong or missing
4. Describe changes in plain language
5. Repeat until satisfied
6. Deploy using the platform's built-in hosting

🔴
**Critical:** If your app handles user data, sensitive information, or payments, hire a security professional to review it before going live. The Lovable vulnerability study (170/1,645 apps) shows this isn't hypothetical.

08. Real-World Case Studies

Updated March 6, 2026

These are documented, real examples — not hypotheticals.

Andrej Karpathy practiced what he preached, building MenuGen using nothing but natural language instructions. He provided goals, examples, and feedback — never touching the code directly. The project demonstrated that vibe coding could produce functional software, though Karpathy himself noted it was appropriate for "small weekend projects" rather than production systems.
</div>
New York Times journalist Kevin Roose, not a professional programmer, experimented with vibe coding in early 2025. He built several "software for one" applications — personal tools tailored to his exact needs. The results were mixed: some tools worked well, but in one notable case, an AI-generated e-commerce feature **fabricated fake product reviews**. Roose's experience illustrated both the democratization promise and the trust problem.
</div>
Goldman Sachs adopted Devin as part of their "hybrid workforce" — AI agents working alongside human engineers. They deployed Devin for code migrations, documentation generation, and routine maintenance. A representative case: **documenting 400,000+ repositories** that had accumulated years of tribal knowledge, freeing engineering teams for new feature development.
</div>
**25%** of companies in YC's Winter 2025 batch had codebases that were 95% AI-generated. These startups moved from idea to working product in days rather than months. Several raised seed funding based on prototypes built almost entirely through natural language. The trend raised questions about what happens when these companies need to scale.
</div>
Misbah Syed, founder of Menlo Park Lab, built the generative AI application Brainy Docs using vibe coding: "If you have an idea, you're only a few prompts away from a product." The company used AI-generated code for consumer-facing applications, demonstrating vibe coding could produce **revenue-generating products**, not just prototypes.
</div>
Bank of America used conversational coding agents to rapidly prototype fraud detection systems. Engineers described detection patterns in natural language and iterated through AI-generated implementations. Prototypes were achieved in a fraction of the traditional time, then **hardened by specialized security engineers** before deployment — a model example of the "vibe then harden" approach.
</div>
Perhaps the most striking validation of vibe coding as a business strategy came in June 2025, when **Wix acquired Base44 for $80 million in cash**. Base44, a solo-founder startup barely six months old, had built a vibe coding platform enabling non-developers to create functional applications through natural language. The acquisition demonstrated that vibe-coded companies could reach significant exit values in record time. YC-backed Emergent, another vibe coding company, reached a **$300 million valuation**.
</div>
Throughout 2025 and into 2026, the Indie Hackers community documented dozens of revenue-generating applications built primarily through vibe coding. Solo creators with limited coding backgrounds built and launched SaaS products within weeks. The pattern was consistent: **vibe code the MVP, validate with real users, then decide whether to hire engineers** for the production version.
</div>
SaaStr founder Jason Lemkin documented a cautionary experience: **Replit's AI agent deleted his database** despite explicit instructions not to make any changes. This incident became one of the most-cited examples of the risks of giving autonomous agents too much power without proper safeguards.
</div>
In January 2026, researchers from Central European University and the Kiel Institute published **"Vibe Coding Kills Open Source"** on arXiv. The paper documented a systemic problem: vibe coding raises productivity by making it easy to use open-source libraries, but **severs the user engagement** through which maintainers earn returns. Users no longer read documentation, file bug reports, or contribute. Tailwind CSS docs traffic dropped ~40% from early 2023. Stack Overflow questions entered structural decline after ChatGPT launched. The paper argued that sustaining open source under widespread vibe coding requires fundamentally new funding models for maintainers.
</div>
The most dramatic business story of the vibe coding era. OpenAI agreed to acquire Windsurf (formerly Codeium) for **$3 billion** — its largest acquisition ever. Then Microsoft reportedly blocked the deal over exclusivity clauses. Google swooped in with a **$2.4 billion** reverse acquisition package, hiring Windsurf's CEO and key researchers for DeepMind. Cognition then acquired the remaining product, brand, IP, and team. The result: one AI coding startup's technology and talent split across three of the biggest companies in AI. A sign of just how valuable vibe coding infrastructure has become.
</div>

09. The Numbers: Adoption and Impact

Updated April 9, 2026

The data tells a clear story: AI-assisted development isn't a trend. It's a structural shift.

Adoption

0%
Developers using AI tools (JetBrains 2026)
0%
Developers using AI tools daily, globally (Stack Overflow Dev Survey, Q1 2026)
0%
US developers using AI tools daily (March 2026)
0%
All new code that is AI-generated (GitHub State of Octoverse, March 2026)
0%
All production code commits containing AI-generated lines (Sourcegraph Code Intelligence, March 2026)
0%
Business AI adoption — all-time record (Ramp AI Index, Feb 2026)
0%
Replit AI users who write zero code

AI Market Share (March–April 2026)

34.4%
OpenAI business market share (declining -1.5% MoM)
24.4%
Anthropic business market share (growing +4.9% MoM)
~70%
Head-to-head wins: Anthropic vs OpenAI in new business (Ramp)
93.9%
Claude Mythos on SWE-bench — restricted to Project Glasswing defense partners (April 7, 2026)
80.8%
Claude Opus 4.6 on SWE-bench — best publicly available coding agent score

Revenue & Growth

$2.5B+
Claude Code ARR
$155M+
Devin ARR (18 months from $1M)
$2B+
Cursor ARR (~$50B valuation, April 2026)
20M+
GitHub Copilot paid users (April 2026)
$50M
Emergent AI ARR in 7 months
$82M+
Cognition ARR (Devin+Windsurf)

Valuations (Early 2026)

$10B
Cognition ($500M raise, Mar 2026)
~$50B
Anysphere (Cursor) — seeking April 2026
$9B+
Anthropic projected ARR
$6.6B
Lovable ($400M ARR, 200K projects/day)
$9B
Replit ($400M Series D, Mar 2026 — tripled in 6 months)

Productivity

0%
Faster project completion
10-14x
Faster agent migrations vs. human
500K
Developer hours saved (TELUS, 2025-26)
1,000+
PRs/week via AI agents (Stripe)
75%
Reduction in PR turnaround time for AI-tool teams (9.6 days → 2.4 days, Index.dev 2026)
3.6 hrs
Average time saved per developer per week (survey median, April 2026)

Developer Sentiment (April 2026)

0%
Developers using AI tools (JetBrains 2026)
0%
Professional developers using AI tools daily (SonarSource 2026)
0%
Developers who have started using AI agents (April 2026)
0%
Developers with "high trust" in AI output (down from 70%+ in 2023)
0%
Developers frustrated by "almost right" AI solutions (top complaint, SonarSource)
0%
Professional devs adopted vibe coding

Cultural Impact

10. The Dark Side: Security, Debt, and Failure

Updated April 1, 2026

For every success story, there's a cautionary tale. The risks are real, documented, and in some cases severe.

The Tenzai Security Study

🔒
In December 2025, security startup Tenzai tested five major tools — Claude Code, OpenAI Codex, Cursor, Replit, and Devin — building three identical test applications each. Across **15 apps**, they found **69 vulnerabilities**: ~45 low-medium, the rest high or critical.
  **Key finding:** AI tools avoid generic security flaws but struggle where what makes code safe vs. dangerous depends on context.

</div>
0%
AI code with security vulnerabilities
0%
AI code with exploitable bugs
0%
Developers who trust AI accuracy (down from 43%)
0%
Practitioners who say AI code is "fast but flawed"
35
CVEs from AI-generated code in March 2026 alone (27 from Claude Code)
400–700
Estimated AI code vulnerabilities per month (incl. unpublished CVEs)

The Acceleration: 35 CVEs in One Month

The security threat from AI-generated code is not static. It is accelerating. In March 2026, security researchers confirmed 35 CVEs directly attributable to AI-generated code — 27 of them from Claude Code alone. Researchers from the CERT/AI Working Group estimate the actual monthly count including triaged-but-unpublished vulnerabilities is 400 to 700 per month.

The trend is steep and mirrors adoption curves:

| Month | Confirmed AI Code CVEs | Estimated Total |
|---|---|---|
| Jan 2026 | 12 | 250–350 |
| Feb 2026 | 21 | 310–450 |
| Mar 2026 | 35 | 400–700 |

The root cause is structural: AI coding tools generate code that compiles and passes tests, but they optimize for functional correctness rather than security context. A model trained on decades of existing internet code learns the prevalence of insecure patterns alongside secure ones — and reproduces them with equal confidence. As AI-generated code's share of all new code climbs toward 41% (GitHub, March 2026), the absolute volume of AI-sourced vulnerabilities scales with it.

The deeper concern: confirmed vulnerabilities appear to be growing faster than adoption, which suggests that the tools' security is not improving in step with their capabilities.
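The month-over-month growth in confirmed CVEs follows directly from the table above. A minimal Python sketch of that arithmetic, using only the confirmed counts (the function and variable names are illustrative):

```python
# Confirmed AI-code CVE counts per month, copied from the table above.
confirmed = {"Jan 2026": 12, "Feb 2026": 21, "Mar 2026": 35}


def month_over_month_growth(counts: dict[str, int]) -> dict[str, float]:
    """Return percentage growth for each month relative to the previous one."""
    months = list(counts)
    growth = {}
    for prev, curr in zip(months, months[1:]):
        growth[curr] = (counts[curr] - counts[prev]) / counts[prev] * 100
    return growth


growth = month_over_month_growth(confirmed)
for month, pct in growth.items():
    print(f"{month}: {pct:.1f}% more confirmed CVEs than the previous month")
```

On these figures, confirmed counts grew 75.0% from January to February and about 66.7% from February to March. Three data points give a rough indication rather than a fitted trend.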

**IDEsaster Disclosure (Early 2026):** Security researchers found **30+ vulnerabilities across every major AI IDE**, resulting in **24 CVEs assigned** and putting an estimated **1.8 million developers** at risk. AI-generated code was found to be **2.74x more likely** to introduce XSS vulnerabilities than human-written code.
</div>

Documented Security Incidents

24 CVEs
IDEsaster — All Major AI IDEs
30+ vulnerabilities found across every major AI IDE. 1.8 million developers at risk. AI code 2.74x more likely to introduce XSS.
CVE-2025-54135
CurXecute — Cursor IDE
Malicious MCP server responses could execute arbitrary commands on developers' machines.
CVE-2025-55284
Claude Code DNS Exfiltration
Data exfiltration from developer computers through DNS requests.
PROMPT INJECTION
Windsurf Memory Poisoning
Malicious code comments poisoned Windsurf's long-term memory, enabling silent data theft over months.
PROMPT INJECTION
Gemini CLI Code Execution
Asking the Gemini CLI to analyze a project triggered a malicious injection hidden in a readme.md file.
MASS VULN
Lovable Supabase RLS Crisis (March 2026)
Researchers analyzed 1,645 Lovable-generated apps and found critical Row Level Security misconfigurations in 170 of them (10.3%). Affected apps exposed user data to any authenticated user. A separate CodeRabbit study confirmed AI-generated code has 2.74x higher security vulnerability rates than human code, with 1.7x more "major" issues per 1,000 lines. Source: RedReamality (March 15, 2026).
CVE-2025-48757
Base44 Platform
Unauthenticated access vulnerability exposed 170+ production applications built on the platform.
DATA BREACH
Tea App
Basic authentication failures in an AI-generated app leaked 72,000 user IDs and selfies.
CVE-2026-21858
n8n Remote Code Execution (CVSS 10.0)
Unauthenticated RCE allowing full server takeover on ~100,000 n8n automation servers. The highest possible CVSS score.
SUPPLY CHAIN
SANDWORM_MODE npm Worm
First malware to install rogue MCP servers, poisoning AI coding assistants to exfiltrate API keys. Self-replicates by stealing npm tokens and republishing victims' top 20 packages. Spread through 19 typosquatted packages.
MCP ATTACK
MCP Server Injection Crisis (8,000+ Servers)
92% exploitation probability at 10 MCP plugins. 72.8% attack success rate across 45 real-world servers. 36.7% of 7,000+ servers have SSRF exposure. More capable AI models are more vulnerable to MCP-based prompt injection.
CVE-2025-59536
Claude Code Remote Code Execution (CVSS 8.7)
High-severity RCE vulnerability in Claude Code's project file handling. Attackers could craft malicious repository files to execute arbitrary commands on a developer's machine when Claude Code processed the project. Patched in Claude Code 1.9.3.
CVE-2026-21852
Agentic IDE File Exfiltration via Tool Misuse
Vulnerability in multiple agentic IDE integrations allowing prompt-injected instructions to abuse legitimate file-read tools for exfiltrating source code, .env files, and SSH keys to attacker-controlled servers — without triggering standard security controls.
CVE-2026-33017 • CISA KEV • CVSS 9.3
Langflow Unauthenticated Remote Code Execution (Active Exploitation)
Critical unauthenticated RCE in Langflow — the open-source AI workflow builder widely used by vibe coders to prototype LLM pipelines. No authentication required for exploitation. Added to CISA KEV list March 2026 with patch deadline April 8. Actively exploited in the wild. Affects all Langflow versions prior to the March 2026 patch. If you run Langflow locally or self-hosted, treat this as an emergency patch. Source: CISA KEV, NVD.
CVE-2025-32432 • CISA KEV • CVSS 10.0
Craft CMS Code Injection — Maximum Severity
CVSS 10.0 code injection vulnerability in Craft CMS, a common CMS backend choice in AI-generated web projects. Added to CISA KEV with patch deadline April 3. A maximum CVSS score generally indicates a network-exploitable flaw that requires no privileges or user interaction, so an attacker can execute arbitrary code on the server. Vibe-coded projects using Craft as their CMS backend should patch immediately or temporarily disable public access.
CVE-2025-54068 • CISA KEV • CVSS 9.8
Laravel Livewire RCE — Nation-State Attribution
Critical RCE in Laravel Livewire with nation-state actor attribution confirmed by threat intelligence sources. Added to CISA KEV with patch deadline April 3. Laravel is one of the most frequently suggested PHP frameworks in AI coding assistants — a large percentage of AI-generated web projects use it. This isn't a theoretical risk: active exploitation with sophisticated threat actors is confirmed. Patch immediately.

AI as Vulnerability Hunter: The Other Side of the Coin

🔎
**Claude Opus 4.6 Finds 22 Firefox CVEs (March 2026):** In a partnership with Mozilla, Anthropic's Claude Opus 4.6 autonomously analyzed Firefox's C++ codebase and identified **22 previously unknown CVEs**. The model found memory safety vulnerabilities, use-after-free bugs, and buffer overflows that human reviewers had missed. This demonstrates a dual reality: the same AI capability that generates vulnerable code can also find vulnerabilities at scale — the question is who uses it first, defenders or attackers.
</div>

The Threat Landscape: Ransomware Meets AI

The broader cybersecurity environment compounds the risk of insecure AI-generated code. As of early 2026, there are 124 active ransomware groups — a 49% year-over-year increase. These groups are increasingly using AI to generate phishing lures, analyze codebases for vulnerabilities, and automate lateral movement. The intersection of AI-generated insecure code and AI-accelerated exploitation creates a compounding threat surface.

The AI Slopageddon: Open Source Fights Back

By early 2026, a new phenomenon emerged that open-source maintainers dubbed the "AI Slopageddon" — a flood of low-quality, AI-generated bug reports, pull requests, and security "findings" overwhelming popular projects:

  • cURL: Daniel Stenberg reported a deluge of AI-generated vulnerability reports so poor they were "worse than spam" — wasting maintainer time triaging hallucinated CVEs. He began publicly shaming the worst offenders and lobbied HackerOne to penalize AI-slop submissions.
  • Ghostty: The terminal emulator project implemented explicit policies rejecting AI-generated contributions after a wave of superficially plausible but fundamentally broken PRs.
  • tldraw: The collaborative whiteboard project documented a pattern of AI-generated issues that described bugs that didn't exist, in code paths that didn't exist, with reproduction steps that couldn't work.

The pattern is consistent: AI tools lower the barrier to appearing competent enough to submit contributions, but the submissions lack the understanding that makes them useful. Maintainers are now spending significant time filtering AI slop instead of building software — an ironic cost of the productivity tools meant to help them.

The $1.5 Trillion Technical Debt Problem

Analysts have warned of a potential $1.5 trillion in technical debt by 2027 from AI-generated code:

  • 41% higher code churn — AI code gets rewritten more often

  • 8x increase in duplicated code blocks (GitClear, 2024)

  • 30% of AI suggestions accepted in professional environments

  • Forrester: 75% of tech leaders will face moderate-to-severe tech debt by 2026

    The "Vibe Coding Hangover"

    By late 2025, Fast Company reported senior engineers entering "development hell" maintaining vibe-coded systems:

    🧬
    Zombie Apps
    Functional but unmaintainable
    🍝
    Spaghetti Code
    Works but no coherent structure
    🚧
    Complexity Ceiling
    Can't extend without breaking
    😶
    Debug Impossibility
    Nobody can trace the code they never read

11. The Great Debate

Updated March 6, 2026

The software community is deeply divided. Understanding the strongest arguments on each side helps you form a nuanced view.

#### "It's the natural evolution of abstraction."
Programming languages have always moved toward higher abstraction. Assembly to C to Python. Each level lets developers focus on intent rather than implementation. Natural language is simply the next layer.

#### "It democratizes creation."

Millions of people have software ideas but lack years of training. Vibe coding lets a nurse build a patient tracking app, a teacher build a classroom tool, a small business owner build inventory management. The expansion of who can create software is historically significant.

#### "The speed advantage is transformative."

A prototype in hours instead of weeks. An MVP in days instead of months. The 25% of YC companies with 95% AI code didn't choose vibe coding for ideology — they chose it because they needed to move fast.

#### "Traditional code isn't as reliable as we pretend."

Human-written code has bugs, security vulnerabilities, and technical debt too. AI-generated code may have different failure modes, but the idea that human code is inherently reliable is a myth.

#### "Code you don't understand is code you can't maintain."
Software spending is ~60% maintenance. If nobody understands the codebase, maintenance is impossible. You're not saving time — you're borrowing it from the future at a ruinous interest rate.

#### "Security requires understanding, not just testing."

You can test whether a login form works. You can't easily test whether passwords are properly hashed, session tokens are cryptographically secure, or APIs have rate limiting — unless you read the code.
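To make the point concrete, here is a hedged sketch of what a reviewer would hope to find when reading the code behind a login form. It uses only Python's standard library (`hashlib.scrypt`, `hmac.compare_digest`, `secrets`). The function names and the scrypt cost parameters are illustrative assumptions, not a recommendation for any particular application.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt cost parameters (assumptions; tune for your hardware).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _derive(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from the password and salt with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt,
        n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both are stored, the password never is."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)


def new_session_token() -> str:
    """Generate a session token from a cryptographically secure source."""
    return secrets.token_urlsafe(32)
```

None of these properties is visible from outside: a form backed by an unsalted fast hash and tokens from `random.random()` would pass the same functional tests.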

#### "It creates learned helplessness."

Developers who rely entirely on vibe coding lose fundamental skills. When the AI makes a mistake in a novel way, they have no fallback. Fragile teams build fragile systems.

#### "The economics don't work at scale."

Vibe coding is cheap upfront and expensive later. The $1.5 trillion tech debt projection isn't speculation; it's extrapolation from observed code churn, duplication, and architectural degradation.

#### Context Is Everything
The most reasonable position — and the one supported by data — is that vibe coding is a powerful tool with a specific and limited appropriate scope.

<div class="callout success">
  <div class="callout-icon">&#9989;</div>
  <div class="callout-content">
    **It excels for:** prototyping, validation, personal tools, learning, hackathons, and small-scale applications with limited security requirements.

  </div>
</div>
<div class="callout danger">
  <div class="callout-icon">&#10060;</div>
  <div class="callout-content">
    **It fails for:** production systems at scale, security-sensitive applications, regulated industries, and software that needs multi-year maintenance.

  </div>
</div>
**The winning model in 2026:** Vibe code the prototype, then bring in disciplined engineering for the production system. The companies dominating right now — the ones raising at $10B valuations, the ones with $1B ARR in six months — are all betting that this model scales. And the data supports them.

The critics are not wrong about the risks. But they are wrong about the trajectory. Every objection to vibe coding was once made about high-level languages, about frameworks, about cloud computing. The abstraction always wins. The question is never *whether* but *how*.

12. When to Vibe (and When Not To)

Updated March 6, 2026

🟢 Green Light: Vibe Code Away

- **Prototypes and MVPs**: Validate ideas before investing in production engineering
- **Internal tools**: Dashboards, data scripts, one-off analysis
- **Personal projects**: Only you use it, only you depend on it
- **Learning**: Trying new frameworks, languages, or patterns
- **Hackathons**: Speed is everything, longevity is nothing
- **UI prototyping**: Design exploration and layout testing
- **Automation scripts**: Repetitive tasks that eat your time

🟠 Yellow Light: Proceed with Caution

- **Customer-facing apps**: Vibe the prototype, then review and harden
- **Small SaaS**: Viable for launch, plan for rewrite
- **API integrations**: Fast to build, auth needs human review
- **Mobile apps**: UI can be vibe coded; data/security need attention
- **Team projects**: Works if one person understands the architecture

🔴 Red Light: Don't Vibe Code

- **Financial systems**: Payments, accounting, trading
- **Healthcare**: Patient data, clinical decisions, HIPAA
- **Auth & authz**: Login systems, permissions, tokens
- **Infrastructure**: Server config, network security, deployment
- **Regulated industries**: SOX, PCI-DSS, GDPR compliance
- **Distributed systems**: Microservices, message queues, cache invalidation
- **Cryptography**: Encryption, key management, certificates
💡
**The 80/20 Rule:** For most applications, 80% of the code is boilerplate, UI, and standard patterns that AI handles well. The remaining 20% — authentication, business logic, data integrity, security — deserves human attention. **Vibe code the 80%. Engineer the 20%.**

13. Mastering the Craft: Advanced Techniques

Updated March 6, 2026

If you're going to vibe code, do it well. These techniques separate productive vibe coders from frustrated ones.

The Art of the Initial Prompt

The single most important factor in vibe coding success. Spend 30 minutes writing a comprehensive description before generating a single line of code.

WHAT
What does it do? (user perspective)
WHO
Who uses it? (audience, skill level)
HOW
How should it look? (design, colors)
DATA
What entities? How do they relate?
EDGE
What happens when things go wrong?
TECH
Any framework/language preferences?
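The six questions above can be treated as a checklist. A minimal Python sketch, assuming a hypothetical `PromptSpec` helper (not part of any tool) that refuses to produce a prompt until every section is filled in. The example values are adapted from the strong prompt in this chapter, except the `edge` text, which is made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSpec:
    """One field per question in the checklist above (hypothetical helper)."""
    what: str  # What does it do, from the user's perspective?
    who: str   # Who uses it?
    how: str   # How should it look?
    data: str  # What entities exist and how do they relate?
    edge: str  # What happens when things go wrong?
    tech: str  # Framework or language preferences

    def render(self) -> str:
        """Return the prompt text, or raise ValueError if a section is empty."""
        missing = [f.name.upper() for f in fields(self)
                   if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"Missing sections: {', '.join(missing)}")
        return "\n".join(f"{f.name.upper()}: {getattr(self, f.name).strip()}"
                         for f in fields(self))


spec = PromptSpec(
    what="Project board for freelance designers to track client work",
    who="Solo freelancers managing 3-10 client projects",
    how="Clean, minimal, coral accent (#FF6B6B), dark mode",
    data="Projects contain tasks, notes, and time-log entries",
    edge="Auto-save; warn before deleting a project",  # illustrative only
    tech="localStorage, structured for future database migration",
)
print(spec.render())
```

Leaving any field blank raises an error naming the missing section, which is a cheap way to catch the gaps that most often lead to a disappointing first generation.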

Weak vs. Strong Prompts

**Weak:**

```
Build me a todo app
```

**Strong:**

```
Build a project management application for freelance designers.

Users: Solo freelancers managing 3-10 client projects.

Core features:
- Project board with columns: Incoming, In Progress, Review, Complete
- Each card: client name, title, deadline, progress bar
- Detail view with task checklist, file links, notes, time log
- Dashboard: projects due this week, hours logged, revenue summary

Design: Clean, minimal. Coral accent (#FF6B6B). Dark mode. Tablet-friendly.
Data: localStorage, structured for future database migration.
Behavior: Drag-and-drop cards. Auto-save. Keyboard shortcuts.
```

Key Patterns

Before requesting any significant change, save your current state. Vibe coding can regress working features while adding new ones.
  ```
Working: dashboard + project cards + drag-and-drop
-> Save/commit BEFORE adding: task checklist feature
  ```

    </div>
  </div>

  <div class="expand-section">
    <button class="expand-header" onclick="this.parentElement.classList.toggle('open')">
      <span class="expand-arrow">&#9654;</span> The "Explain Then Generate" Pattern
    </button>
    <div class="expand-body">
      For complex features, ask the AI to explain its approach before generating code:

      ```
Before writing any code, explain how you would implement
real-time collaborative editing in this application.
What approach? What trade-offs? Then implement it.
      ```

  This gives you architectural understanding even in a vibe coding workflow.

</div>
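The checkpoint habit from the first pattern above can be scripted so it happens before every significant prompt. A minimal Python sketch, assuming the project is already a Git repository; the `checkpoint` helper and its `run` parameter are hypothetical names introduced for illustration.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run_git(cmd: Sequence[str]) -> None:
    """Run one git command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


def checkpoint(working: str, next_change: str, run: Runner = _run_git) -> str:
    """Commit the current working state before asking the AI for a change."""
    message = f"checkpoint: {working} (before adding: {next_change})"
    run(["git", "add", "--all"])
    # --allow-empty records the checkpoint even if nothing changed.
    run(["git", "commit", "--allow-empty", "-m", message])
    return message
```

A call such as `checkpoint("dashboard + project cards + drag-and-drop", "task checklist feature")` records the known-good state. If the new feature breaks something, `git reset --hard HEAD` discards uncommitted changes to tracked files and returns to that checkpoint.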
Different models excel at different things:
- **Claude Opus 4.6 (via Claude Code)**: Complex reasoning, architecture, large codebases, agent teams for parallel work
- **GPT-5.2 (via Codex CLI)**: Code generation, systematic transformations, sandboxed execution
- **Gemini 3 Pro / Flash (via Jules or Gemini CLI)**: Multimodal (screenshots, diagrams), open-source CLI with skills system
- **GitHub Copilot Agent Mode**: Best for working within existing VS Code workflows with agent capabilities
- **v0**: React/Next.js UI generation
- **Bolt.new**: Full-stack prototypes you want immediately

**Bad:** "It's broken"
**Good:** "When I click 'Add Task', nothing happens. Console shows: `TypeError: Cannot read property 'push' of undefined at TaskList.addTask (app.js:47)`. This started after I added drag-and-drop."

Include: **action** (what you did), **actual** (what happened), **expected** (what should happen), **error** (verbatim), **context** (what changed recently).
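The five fields can be captured in a small template so that no report goes out incomplete. A minimal Python sketch; `BugReport` is a hypothetical helper, and the expected-behavior text in the example is an assumption added for illustration.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    action: str    # what you did
    actual: str    # what happened
    expected: str  # what should happen
    error: str     # verbatim error text
    context: str   # what changed recently

    def to_prompt(self) -> str:
        """Format the report as a prompt for the AI assistant."""
        return (
            f"Action: {self.action}\n"
            f"Actual: {self.actual}\n"
            f"Expected: {self.expected}\n"
            f"Error (verbatim): {self.error}\n"
            f"Recent changes: {self.context}"
        )


report = BugReport(
    action="Clicked 'Add Task'",
    actual="Nothing happens",
    expected="A new task appears in the list",  # assumed for this example
    error="TypeError: Cannot read property 'push' of undefined "
          "at TaskList.addTask (app.js:47)",
    context="Started after adding drag-and-drop",
)
print(report.to_prompt())
```

Because every field is required by the constructor, omitting the error text or the recent changes fails immediately instead of producing another "It's broken" message.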

14. Building a Sustainable Workflow

Updated March 6, 2026

Pure vibe coding is fast but fragile. Here's how to build a workflow that's both fast and sustainable.

Phase 1: Vibe and Validate (Days 1-3)
Pure vibe coding for a working prototype
Don't worry about code quality. Just get something that works and demonstrates the core value proposition. Goal: a demo for users, investors, or stakeholders.
Phase 2: Test and Tighten (Days 4-7)
Switch to Level 2-3, review critical paths
Review auth/authz, data storage, payment processing, input validation, and API endpoints. Use AI to generate comprehensive tests.
Phase 3: Harden for Production (Week 2)
Security scanning, proper error handling, monitoring
Run OWASP ZAP or Snyk. Review all DB queries. Add rate limiting, HTTPS, CORS, CSP. Set up logging. Review dependencies for known vulnerabilities.
Phase 4: Maintain and Evolve (Ongoing)
Document, automate, and plan cleanup sprints
Document architecture. Automated testing on every change. AI agents for routine updates. Human review for architectural and security changes. Periodic cleanup sprints.
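Several of the Phase 3 items are small once named explicitly. As one hedged illustration, the sketch below adds common security response headers in a framework-neutral way. The `with_security_headers` function and the specific header values are assumptions to adapt, not a complete hardening checklist.

```python
# Baseline values (illustrative; the CSP in particular must be tailored per app).
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}


def with_security_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the response headers with security defaults added.

    Headers the application already set are left untouched; the lookup is
    case-insensitive because HTTP header names are.
    """
    existing = {name.lower() for name in headers}
    merged = dict(headers)
    for name, value in SECURITY_HEADERS.items():
        if name.lower() not in existing:
            merged[name] = value
    return merged


response_headers = with_security_headers({"Content-Type": "text/html"})
```

In a real application this logic would usually live in the framework's middleware or in the reverse proxy configuration, so that every response passes through it.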
### The 80/20 Rule
Vibe code the 80% (UI, boilerplate, standard patterns).

Engineer the 20% (auth, business logic, data integrity, security).

15. The Business of Vibes

Updated March 6, 2026

Vibe coding isn't just changing how software is built. It's changing the economics of software businesses.

The New Cost Structure

**Traditional approach:**

- Hire 3-5 engineers at $150K-$250K each
- 3-6 months to MVP
- **Total cost to first version: $300K-$1M+**

**Vibe coding approach:**

- 1 technical founder + AI tools ($20-$500/month)
- 1-4 weeks to MVP
- **Total cost to first version: $500-$5,000**
<p style="margin-top:1rem;"><em>This doesn't mean you never need engineers. It means you can validate before investing.</em></p>

The New Archetypes

🏆
The 10-Person $10M Company
Small teams with AI agents handling work that traditionally required 50+ engineers
👨‍💻
The AI-Fluent Developer
Engineers who can specify precisely and evaluate AI output critically
👥
Agent-Augmented Teams
Each human manages 2-5 AI agents working in parallel

The Talent Shift

Companies are increasingly hiring for:

16. What Comes Next

Updated March 14, 2026

Now (Early 2026) — Already Happening

Chapter 17: The Complete Prompt Library

Updated April 13, 2026

209+ production-ready prompts for every stage of AI-native development. Updated monthly.


How to Use This Library

Each prompt is tagged with:

The prompts are designed to be copy-pasted directly. Customize the bracketed [sections] for your specific project.


Category 1: Project Kickoff Prompts

1.1 The Complete Spec Prompt (Expert)

Tool: Claude Code, Cursor Composer | Time: 30-60 min generation

I'm building [product name], a [type of application] for [target audience].

## Product Vision
[One-sentence description of what this product does and why it matters]

## Target Users
- Primary: [who, age range, technical skill level, key pain point]
- Secondary: [who, why they'd use it]

## Core Features (MVP - Priority Order)
1. [Feature 1]: [User story: "As a [user], I want to [action] so that [benefit]"]
2. [Feature 2]: [User story]
3. [Feature 3]: [User story]

## Data Model
- [Entity 1]: [fields and types]
- [Entity 2]: [fields and types]
- Relationships: [Entity 1] has many [Entity 2], etc.

## Design Direction
- Style: [modern/minimal/playful/corporate/brutalist]
- Color palette: [primary hex, accent hex, background]
- Typography: [sans-serif/serif/mono, reference sites]
- Layout: [single page / multi-page / dashboard / wizard]
- Responsive: [mobile-first / desktop-first / both]

## Technical Stack
- Framework: [Next.js / React / Vue / Svelte / vanilla]
- Styling: [Tailwind / CSS Modules / styled-components]
- Database: [Supabase / Firebase / localStorage / Prisma+PostgreSQL]
- Auth: [Supabase Auth / NextAuth / Clerk / none]
- Hosting: [Vercel / Netlify / Railway]

## What Success Looks Like
- A user can [core workflow] in under [N] steps
- The app loads in under [N] seconds
- [Specific measurable outcome]

## What This Is NOT
- Not a [common misunderstanding]
- Don't include [feature to avoid]
- Don't over-engineer [aspect]

Build the complete MVP. Start with the data model, then core layout, then features in priority order.

1.2 The Weekend Prototype Prompt (Beginner)

Tool: Bolt.new, Lovable, Replit Agent | Time: 15-30 min

Build a [type of app] that solves this problem: [describe the pain point in one sentence].

The main user is [who] and they need to:
1. [Core action 1]
2. [Core action 2]
3. [Core action 3]

Design: Clean and modern. Use [color] as the accent color. Dark mode preferred.
Store data in localStorage.
Make it work on mobile.

Keep it simple. I'd rather have 3 features that work perfectly than 10 that are buggy.

1.3 The "Clone This" Prompt (Intermediate)

Tool: Cursor, Claude Code | Time: 1-2 hours

Build a simplified version of [well-known app, e.g., Trello/Notion/Slack].

Include ONLY these features from the original:
1. [Feature to clone]
2. [Feature to clone]
3. [Feature to clone]

DO NOT include: [features to skip]

Match the general layout and UX patterns of the original but use your own design.
Use [tech stack]. Deploy-ready for Vercel.

Focus on making the core interaction feel as smooth as the original.

1.4 The Landing Page Prompt (Beginner)

Tool: v0, Bolt.new | Time: 15-30 min

Create a conversion-optimized landing page for [product name].

Product: [One line description]
Target audience: [Who would buy this]
Price: [Price point or "Free"]

Sections (in order):
1. Hero: Headline "[compelling headline]", subheadline "[supporting text]", CTA button "[button text]"
2. Problem: 3 pain points the audience faces
3. Solution: How the product solves each pain point (with icons or illustrations)
4. Social proof: [testimonials / stats / logos / "As seen in"]
5. Features: 3-6 key features with brief descriptions
6. Pricing: [pricing tiers if applicable]
7. FAQ: 4-5 common questions with answers
8. Final CTA: Repeat the main call-to-action

Design: Professional, trustworthy. Primary color [hex]. Lots of whitespace.
Mobile-responsive. Fast-loading (no heavy images).
Include Open Graph meta tags for social sharing.

Category 2: Feature Addition Prompts

2.1 Authentication System (Advanced)

Tool: Claude Code, Cursor | Time: 1-2 hours

Add a complete authentication system to this [framework] application.

Requirements:
- Email/password signup with email verification
- Login with session management (HTTP-only cookies, not localStorage)
- Password requirements: minimum 8 chars, 1 uppercase, 1 number, 1 special char
- "Forgot password" flow with email reset link (expires in 1 hour)
- "Remember me" option (extends session to 30 days, default is 24 hours)
- Rate limiting: max 5 failed attempts per IP per 15 minutes, then 30-min lockout
- CSRF protection on all auth forms
- Secure headers: HSTS, X-Content-Type-Options, X-Frame-Options

Auth provider: [Supabase Auth / NextAuth / Clerk / custom JWT]

Protected routes: [list routes that require auth]
Public routes: [list routes that don't require auth]

After login, redirect to [dashboard/home/previous page].
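Show clear error messages for: invalid credentials (use the same message for a wrong password and an unknown account, so the form doesn't reveal which emails are registered), account locked, email not verified.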

Write tests for: successful login, failed login, signup validation, session expiry, rate limiting.
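
When reviewing what the tool generates, the password rule above is easy to check by hand. A minimal sketch in Python, not tied to any auth provider: the function name `password_problems` is hypothetical, and "special char" is assumed to mean any character that is not a letter or digit.

```python
def password_problems(password: str) -> list[str]:
    """Return the unmet requirements; an empty list means the password passes."""
    problems = []
    if len(password) < 8:
        problems.append("at least 8 characters")
    if not any(c.isupper() for c in password):
        problems.append("at least 1 uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("at least 1 number")
    # Assumption: "special" means anything that is not a letter or a digit.
    if not any(not c.isalnum() for c in password):
        problems.append("at least 1 special character")
    return problems
```

Returning the list of unmet rules, rather than a bare true/false, lets the signup form show every problem at once.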

2.2 Payment Integration (Advanced)

Tool: Claude Code | Time: 2-3 hours

Add [Stripe / Paddle] subscription billing to this application.

Products:
- Free tier: [what's included, usage limits]
- Pro tier: $[price]/month - [what's included]
- [Optional: Enterprise tier: $[price]/month - [what's included]]

Implementation:
1. Pricing page showing all tiers with feature comparison
2. Checkout flow: user selects plan -> [Stripe Checkout / Paddle Overlay] -> redirect to success page
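3. Webhook handler for: subscription created, updated, and cancelled, plus failed invoice payments (in Stripe: customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, invoice.payment_failed; check your provider's docs for the exact event names)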
4. User dashboard showing: current plan, next billing date, usage this period, upgrade/downgrade buttons
5. Usage tracking: count [what metric] per billing period, enforce limits on free tier
6. Graceful downgrade: when subscription cancelled, access continues until period end
7. Failed payment handling: 3 retry attempts over 7 days, then downgrade to free

Store subscription status in [Supabase / database].
Add middleware to check subscription status on protected API routes.
Show upgrade prompts when free users hit limits.

Environment variables needed:
- [STRIPE_SECRET_KEY / PADDLE_API_KEY]
- [STRIPE_WEBHOOK_SECRET / PADDLE_WEBHOOK_SECRET]
- [STRIPE_PRO_PRICE_ID / PADDLE_PRO_PRICE_ID]
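
The graceful-downgrade rule (item 6) is where generated billing code most often goes wrong, so it helps to know what the check should look like. A rough sketch, assuming a subscription record with a `status` string and a `period_end` timestamp; the status labels and the function name `effective_plan` are illustrative, not any provider's exact values.

```python
from datetime import datetime, timezone

def effective_plan(status: str, period_end: datetime, now: datetime) -> str:
    """Return "pro" or "free" for a stored subscription record."""
    if status == "active":
        return "pro"
    # A cancelled subscription keeps paid access until the period it paid for ends.
    if status == "cancelled" and now < period_end:
        return "pro"
    return "free"
```

The middleware that guards protected API routes can call a function like this on every request instead of trusting a cached flag.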

2.3 Real-Time Features (Advanced)

Tool: Claude Code, Cursor | Time: 2-4 hours

Add real-time [collaboration / notifications / live updates] to this application.

What should update in real-time:
- [Specific data that changes: "new messages", "task status changes", "user presence"]

Technology: [Supabase Realtime / Socket.io / Pusher / Server-Sent Events]

Requirements:
- Changes made by User A appear for User B within [1 second / 500ms]
- Show [typing indicators / presence dots / live cursors] for active users
- Handle disconnection gracefully: show "reconnecting..." banner, auto-reconnect with exponential backoff
- Deduplicate messages that arrive during reconnection
- Use persistent connections (WebSocket or SSE) instead of polling
- Fall back to polling only if the persistent connection cannot be established

Optimize for:
- [N] concurrent users per [room / document / channel]
- Messages/updates of approximately [size] bytes each
- Mobile networks with intermittent connectivity

Show connection status indicator (green dot = connected, yellow = reconnecting, red = offline).
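
Two of the requirements above, reconnecting with exponential backoff and dropping duplicate messages, are small enough to sketch. This is an illustration in Python rather than any library's API; the base delay, the 30-second cap, and the names `backoff_delays`, `jittered`, and `Deduplicator` are all assumptions.

```python
import random

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Upper bound for each retry delay: base * 2^n seconds, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]

def jittered(delay: float) -> float:
    """Full jitter: wait a random time in [0, delay] so clients don't reconnect in lockstep."""
    return random.uniform(0, delay)

class Deduplicator:
    """Drops messages whose id was already delivered, e.g. replayed after a reconnect."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def accept(self, message_id: str) -> bool:
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A long-lived client would also need to evict old ids from the set; the sketch leaves that out.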

2.4 Search and Filter System (Intermediate)

Tool: Any | Time: 30-60 min

Add search and filtering to the [items/products/posts] list in this application.

Search:
- Full-text search across: [field 1], [field 2], [field 3]
- Debounced input (300ms delay before searching)
- Show "X results for 'query'" count
- Highlight matching text in results
- Empty state: "No results for 'query'. Try different keywords."

Filters:
- [Filter 1]: [type: dropdown/checkbox/range] with options [list options]
- [Filter 2]: [type] with options [list options]
- [Filter 3]: [type] with options [list options]
- Date range: from/to date pickers
- Sort by: [option 1 / option 2 / option 3], ascending/descending

Behavior:
- Filters combine with AND logic (search + filter1 + filter2)
- Show active filter count as badge on filter button
- "Clear all filters" button when any filter is active
- URL params reflect current filters (shareable filtered views)
- Persist last-used filters in localStorage (URL params take precedence when present)

Performance:
- Client-side filtering for under 1000 items
- Server-side (API) filtering for larger datasets
- Show loading skeleton while filtering
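
The AND rule under "Behavior" can be sketched as a plain function for the client-side case. This is a simplified stand-in: a case-insensitive substring match plays the role of full-text search, and `filter_items` and `result_label` are hypothetical names.

```python
def filter_items(items, query="", fields=(), **filters):
    """Keep items that match the search text AND every active filter."""
    needle = query.strip().lower()
    results = []
    for item in items:
        matches_search = not needle or any(
            needle in str(item.get(field, "")).lower() for field in fields
        )
        # A filter set to None is treated as inactive.
        matches_filters = all(
            item.get(key) == value for key, value in filters.items() if value is not None
        )
        if matches_search and matches_filters:
            results.append(item)
    return results

def result_label(count: int, query: str) -> str:
    """Text for the result count, including the empty state."""
    if count == 0:
        return f"No results for '{query}'. Try different keywords."
    return f"{count} results for '{query}'"
```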

Category 3: UI/UX Prompts

3.1 Dashboard Layout (Intermediate)

Tool: v0, Cursor | Time: 30-60 min

Build a dashboard layout for [application type].

Layout:
- Left sidebar: navigation menu (collapsible on mobile, icons + labels)
- Top bar: user avatar + dropdown menu, notification bell with count badge, search bar
- Main content area: responsive grid that adapts from 1 to 3 columns

Sidebar navigation items:
1. [Icon] Dashboard (home)
2. [Icon] [Section 1]
3. [Icon] [Section 2]
4. [Icon] [Section 3]
5. [Icon] Settings
6. [Icon] Help

Dashboard home shows:
- Row 1: 4 stat cards ([Metric 1]: [value], [Metric 2]: [value], etc.)
- Row 2: Main chart (line chart showing [metric] over [time period]) + recent activity feed
- Row 3: Quick actions grid (3-4 action cards with icons)

Design: [light/dark] theme. Accent color: [hex].
Use Tailwind CSS. Smooth transitions on sidebar toggle.
Mobile: sidebar becomes a hamburger drawer overlay.

3.2 Form with Validation (Beginner)

Tool: Any | Time: 15-30 min

Build a multi-step form for [purpose, e.g., "user onboarding", "job application", "event registration"].

Steps:
1. [Step name]: Fields: [field1 (type, required?), field2, field3]
2. [Step name]: Fields: [field4, field5, field6]
3. [Step name]: Review all entered data + submit button

Validation:
- Email: valid format + show error immediately on blur
- Phone: format as (XXX) XXX-XXXX as user types
- Required fields: show red border + error message
- [Custom validation]: [describe rule]

UX:
- Progress indicator showing current step (1/3, 2/3, 3/3)
- "Back" and "Next" buttons (Next disabled until current step is valid)
- "Save as draft" option (localStorage)
- Smooth slide transition between steps
- Auto-focus first field on each step
- Show success animation on submit

Accessible: proper labels, aria attributes, keyboard navigation (Tab through fields, Enter to submit).
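
The phone rule above implies formatting partial input, not just complete numbers. A small sketch of that logic, assuming 10-digit US numbers; `format_phone` is a hypothetical name, and in a real form the same logic would run in the input's change handler.

```python
def format_phone(raw: str) -> str:
    """Format whatever digits have been typed so far as (XXX) XXX-XXXX."""
    digits = "".join(c for c in raw if c in "0123456789")[:10]
    if len(digits) <= 3:
        return digits
    if len(digits) <= 6:
        return f"({digits[:3]}) {digits[3:]}"
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
```

Because it strips everything but digits first, the function can be re-applied to its own output on every keystroke.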

3.3 Data Table (Intermediate)

Tool: Any | Time: 30-60 min

Build a data table component for displaying [data type, e.g., "user list", "order history", "inventory"].

Columns:
1. [Column]: [type: text/number/date/status/avatar] - [width: narrow/medium/wide]
2. [Column]: [type] - [width]
3. [Column]: [type] - [width]
4. Actions: Edit, Delete, [custom action]

Features:
- Sort by clicking column headers (asc/desc, show arrow indicator)
- Select rows with checkboxes (select all, bulk actions)
- Inline editing: click cell to edit, Enter to save, Escape to cancel
- Pagination: 10/25/50 per page selector, page numbers, total count
- Responsive: on mobile, switch to card layout (one card per row)
- Empty state: illustration + "No [items] yet. Create your first one."
- Loading state: skeleton rows while data loads

Styling: Clean borders, alternating row colors, hover highlight.
Status column: colored badges (green=active, yellow=pending, red=inactive).

Category 4: API and Backend Prompts

4.1 REST API Scaffold (Advanced)

Tool: Claude Code | Time: 1-2 hours

Build a REST API for [application] with these resources:

Resources:
1. [Resource 1, e.g., "Users"]:
   - Fields: [id, name, email, role, created_at, updated_at]
   - Endpoints: GET /api/users, GET /api/users/:id, POST /api/users, PUT /api/users/:id, DELETE /api/users/:id

2. [Resource 2]:
   - Fields: [list fields]
   - Endpoints: [list CRUD endpoints]
   - Relationships: [belongs_to Resource1, has_many Resource3]

Response format (all endpoints):
Success: { data: {...}, meta: { page, limit, total } }
Error: { error: { code: "VALIDATION_ERROR", message: "Email is required", details: [...] } }

Requirements:
- Input validation with descriptive error messages
- Pagination: ?page=1&limit=20 (default limit=20, max=100)
- Filtering: ?status=active&role=admin
- Sorting: ?sort=created_at&order=desc
- Rate limiting: 100 requests per minute per IP
- CORS configured for [allowed origins]
- Request logging (method, path, status, duration)

Auth: Bearer token in Authorization header.
- Public endpoints: [list]
- Authenticated endpoints: [list]
- Admin-only endpoints: [list]

Framework: [Next.js API routes / Express / Fastify / Hono]
Database: [Supabase / Prisma / Drizzle]
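
The pagination rule (default limit=20, max=100) and the success envelope above fit in a few lines. A framework-neutral sketch with hypothetical helper names; it clamps bad values to the defaults, whereas a stricter API could return a VALIDATION_ERROR instead.

```python
def parse_pagination(params: dict[str, str]) -> tuple[int, int]:
    """Read ?page=&limit= with default limit=20 and max=100."""
    def to_int(value, default):
        try:
            return int(value)
        except (TypeError, ValueError):
            return default

    page = max(1, to_int(params.get("page"), 1))
    limit = min(100, max(1, to_int(params.get("limit"), 20)))
    return page, limit

def paginated_response(rows: list, page: int, limit: int) -> dict:
    """Build the success envelope: { data, meta: { page, limit, total } }."""
    start = (page - 1) * limit
    return {
        "data": rows[start:start + limit],
        "meta": {"page": page, "limit": limit, "total": len(rows)},
    }
```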

4.2 Database Schema Design (Advanced)

Tool: Claude Code | Time: 30-60 min

Design a database schema for [application type].

Entities:
1. [Entity 1]: [description of what it represents]
   - Required fields: [list]
   - Optional fields: [list]
   - Unique constraints: [list]

2. [Entity 2]: [description]
   - Fields: [list]
   - References: [Entity 1] (one-to-many / many-to-many)

Business rules:
- [Rule 1, e.g., "A user can only have one active subscription"]
- [Rule 2, e.g., "Orders must have at least one line item"]
- [Rule 3, e.g., "Soft delete for users, hard delete for sessions"]

Generate:
1. SQL migration file with CREATE TABLE statements
2. Indexes for common query patterns: [list queries, e.g., "find users by email", "get orders by date range"]
3. Row-level security policies (if Supabase)
4. Seed data: 10-20 realistic sample records per table
5. TypeScript types matching the schema

Optimize for: [read-heavy / write-heavy / balanced]
Database: [PostgreSQL / MySQL / SQLite]
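
Business rules like "a user can only have one active subscription" are best enforced in the schema itself, so it is worth checking that the generated migration does so. One way is a partial unique index, sketched here against in-memory SQLite; the table and index names are made up. PostgreSQL accepts the same `CREATE UNIQUE INDEX ... WHERE` form, while MySQL has no partial indexes and needs a different approach.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE subscriptions (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    status  TEXT NOT NULL CHECK (status IN ('active', 'cancelled'))
);
-- Business rule: at most one ACTIVE subscription per user.
CREATE UNIQUE INDEX one_active_subscription
    ON subscriptions(user_id) WHERE status = 'active';
"""

def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def add_subscription(conn: sqlite3.Connection, user_id: int, status: str) -> bool:
    """Insert a subscription; return False if a schema constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO subscriptions (user_id, status) VALUES (?, ?)",
            (user_id, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```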

Category 5: Testing and Quality Prompts

5.1 Comprehensive Test Suite (Advanced)

Tool: Claude Code | Time: 2-4 hours

Write a comprehensive test suite for this [application/module].

Testing framework: [Vitest / Jest / Playwright / Cypress]

Coverage targets:
- Unit tests: all utility functions and business logic (aim for 90%+ coverage)
- Integration tests: all API endpoints (happy path + error cases)
- Component tests: all interactive components (user events + state changes)
- E2E tests: [list 3-5 critical user flows]

For each test, include:
- Clear descriptive name: "should [expected behavior] when [condition]"
- Arrange-Act-Assert structure
- Realistic test data (not "test123" or "foo bar")
- Error case coverage (invalid input, timeout, auth failure)
- Edge cases ([list specific edge cases for this app])

Mock strategy:
- External APIs: mock with [MSW / jest.mock / vi.mock]
- Database: use [test database / in-memory / fixtures]
- Time-dependent tests: mock Date.now()
- File system: use temp directories

Run the complete suite after writing. Fix any failures.
Generate a coverage report.
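
For reference, this is roughly what one test meeting the criteria above looks like: a "should ... when ..." name, Arrange-Act-Assert structure, and a mocked clock. The sketch uses Python's `unittest` with a hypothetical `is_session_expired` helper; the same shape carries over to Vitest or Jest, with `Date.now()` mocked instead of `time.time()`.

```python
import time
import unittest
from unittest.mock import patch

def is_session_expired(created_at: float, ttl_seconds: float) -> bool:
    """Hypothetical helper under test: True once the session has outlived its TTL."""
    return time.time() - created_at >= ttl_seconds

class SessionExpiryTest(unittest.TestCase):
    def test_should_expire_session_when_ttl_has_elapsed(self):
        # Arrange: a session created at t=1000 with a 24-hour TTL
        created_at, ttl = 1_000.0, 24 * 60 * 60
        # Act: freeze the clock one second past the TTL
        with patch("time.time", return_value=created_at + ttl + 1):
            expired = is_session_expired(created_at, ttl)
        # Assert
        self.assertTrue(expired)

    def test_should_keep_session_when_ttl_has_not_elapsed(self):
        # Arrange
        created_at, ttl = 1_000.0, 24 * 60 * 60
        # Act: freeze the clock one hour into the session
        with patch("time.time", return_value=created_at + 3_600):
            expired = is_session_expired(created_at, ttl)
        # Assert
        self.assertFalse(expired)
```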

5.2 Security Audit Prompt (Expert)

Tool: Claude Code | Time: 1-2 hours

Perform a security audit of this codebase. Check for:

1. Authentication & Authorization:
   - Are passwords hashed with bcrypt/argon2 (not a fast hash like MD5 or plain SHA-256)?
   - Are sessions stored securely (HTTP-only cookies, not localStorage)?
   - Is CSRF protection implemented on state-changing requests?
   - Are API keys and secrets in environment variables (not hardcoded)?
   - Are authorization checks enforced server-side on every protected endpoint (not just in the frontend)?

2. Input Validation:
   - Is all user input validated server-side (not just client-side)?
   - Are SQL queries parameterized (no string concatenation)?
   - Is HTML output sanitized to prevent XSS?
   - Are file uploads validated (type, size, name)?
   - Are URL redirects validated against an allowlist?

3. Data Protection:
   - Is sensitive data encrypted at rest?
   - Is HTTPS enforced (HSTS headers)?
   - Are API responses filtered (no leaked password hashes or internal IDs)?
   - Is PII handled according to GDPR/CCPA requirements?
   - Are error messages generic (no stack traces to users)?

4. Infrastructure:
   - Are dependencies up to date (no known CVEs)?
   - Are security headers set (CSP, X-Frame-Options, etc.)?
   - Is rate limiting configured on auth and API endpoints?
   - Are CORS origins restricted (not "*")?
   - Are logs sanitized (no passwords or tokens in logs)?

For each issue found:
- Severity: Critical / High / Medium / Low
- Location: file path and line number
- Description: what's wrong and why it matters
- Fix: specific code change to resolve it
- Test: how to verify the fix works

Prioritize fixes by severity. Implement Critical and High fixes immediately.
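
To make the "parameterized SQL" check concrete, here is the pattern the audit should flag next to the fix. A minimal sketch against in-memory SQLite; the table and function names are illustrative only.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, email: str):
    # VULNERABLE: user input is concatenated into the SQL text.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = '" + email + "'"
    ).fetchall()

def find_user_safe(conn: sqlite3.Connection, email: str):
    # Parameterized: the driver passes `email` as data, never as SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()

def demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("ada@example.com",), ("grace@example.com",)],
    )
    return conn
```

With the input `nobody@example.com' OR '1'='1`, the unsafe version returns every row in the table while the safe version returns none.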

Category 6: Refactoring and Optimization Prompts

6.1 Performance Optimization (Advanced)

Tool: Claude Code | Time: 1-2 hours

This application is slow. Analyze and optimize performance.

Symptoms:
- [Specific symptom: "initial page load takes 4+ seconds"]
- [Specific symptom: "scrolling is janky with 500+ items"]
- [Specific symptom: "API response takes 2+ seconds"]

Investigate and fix:
1. Bundle size: analyze with [@next/bundle-analyzer or similar], remove unused dependencies, implement code splitting
2. Rendering: identify unnecessary re-renders, add React.memo/useMemo/useCallback where appropriate
3. Data fetching: implement caching, pagination, reduce payload sizes
4. Images: lazy load below-fold images, use next/image or responsive srcset, serve WebP
5. Database: add missing indexes, optimize N+1 queries, implement connection pooling
6. Network: enable gzip/brotli, set proper cache headers, minimize HTTP requests

For each optimization:
- Before: [metric measurement]
- After: [expected improvement]
- Method: [specific code change]

Run Lighthouse audit before and after. Target scores: Performance >90, Accessibility >95.
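
Of the items above, the N+1 query is the easiest to recognize once you have seen one. The sketch below counts queries for the slow and fast versions of the same lookup, using in-memory SQLite and a made-up authors/posts schema; `CountingDB` is a hypothetical wrapper, not a real library class.

```python
import sqlite3

class CountingDB:
    """Thin wrapper that counts how many queries are sent to SQLite."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0
        self.conn.executescript("""
            CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
            INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
            INSERT INTO posts VALUES (1, 1, 'A'), (2, 2, 'B'), (3, 3, 'C'), (4, 1, 'D');
        """)

    def query(self, sql: str, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()

def titles_with_authors_n_plus_1(db: CountingDB):
    # 1 query for the posts, then 1 more query per post for its author.
    posts = db.query("SELECT title, author_id FROM posts ORDER BY id")
    return [
        (title, db.query("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0])
        for title, author_id in posts
    ]

def titles_with_authors_joined(db: CountingDB):
    # A single JOIN returns the same rows in one round trip.
    return db.query(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    )
```

With four posts the first version issues five queries and the second issues one; the gap grows linearly with the number of rows.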

6.2 Code Cleanup (Intermediate)

Tool: Claude Code, Cursor | Time: 1-2 hours

Clean up this codebase without changing any functionality.

Tasks:
1. Remove dead code: unused imports, unreachable functions, commented-out blocks
2. Consolidate duplicated logic: find similar code patterns and extract shared utilities
3. Fix naming: rename variables/functions that don't describe their purpose
4. Organize file structure: group related files, consistent naming conventions
5. Add TypeScript types: replace 'any' with proper types, add interfaces for data shapes
6. Fix linting issues: run [ESLint / Prettier] and fix all warnings/errors
7. Update dependencies: check for outdated packages, update non-breaking versions
8. Add JSDoc comments to exported functions (not internal helpers)

Rules:
- Make small, focused commits (one type of change per commit)
- Run tests after each change to ensure nothing breaks
- Don't refactor code that has pending changes or open PRs
- Keep the diff readable: don't auto-format unrelated files

Category 7: Deployment and DevOps Prompts

7.1 Production Deployment Checklist (Advanced)

Tool: Claude Code | Time: 1-2 hours

Prepare this application for production deployment on [Vercel / AWS / Railway].

Pre-deployment checklist:
1. Environment variables: create .env.example with all required vars (no values), verify all are set in [hosting platform]
2. Error tracking: set up [Sentry / LogRocket / Bugsnag] for runtime error monitoring
3. Analytics: add [Vercel Analytics / Google Analytics / Plausible] for usage tracking
4. SEO: verify meta tags, Open Graph, Twitter cards, sitemap.xml, robots.txt
5. Performance: run Lighthouse, fix any scores below 80
6. Security: run npm audit, fix critical/high vulnerabilities, verify security headers
7. Database: verify connection pooling, set up backups if applicable
8. Caching: configure CDN caching headers, implement stale-while-revalidate for API routes
9. Monitoring: set up uptime monitoring (e.g., UptimeRobot, Checkly)
10. Domain: configure custom domain, SSL, www redirect

Create a deployment script or CI/CD pipeline that:
- Runs tests
- Runs linter
- Builds the application
- Deploys to [platform]
- Runs smoke tests against the deployed URL
- Notifies [Slack / Discord / email] on success/failure

Category 8: AI Agent Orchestration Prompts (Expert)

8.1 Multi-Agent Task Decomposition

Tool: Claude Code (subagents) | Time: 2-4 hours

I need to [describe large task, e.g., "add a complete user profile system with settings, avatar upload, activity history, and notification preferences"].

Decompose this into subtasks that can be worked on in parallel:

1. Data layer: schema changes, migrations, API endpoints
2. UI components: form components, display components, layouts
3. Business logic: validation rules, permission checks, notification triggers
4. Tests: unit tests, integration tests, E2E tests

For each subtask:
- Define the interface/contract (inputs, outputs, data shapes)
- List dependencies on other subtasks
- Identify which can run in parallel vs. must be sequential

Then implement each subtask, integrating them at the defined interfaces.
Run the full test suite after integration to catch any contract mismatches.
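
Once each subtask lists its dependencies, deciding what can run in parallel is a topological sort. A small sketch using Python's standard `graphlib`; the function name `parallel_waves` is hypothetical, and the dependency map in the usage note is an assumed example, not a fixed rule for these four subtasks.

```python
from graphlib import TopologicalSorter

def parallel_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For example, if business logic depends on the data layer and tests depend on everything else, the data layer and UI components form the first wave, business logic the second, and tests the third.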

8.2 Codebase Analysis and Improvement Plan

Tool: Claude Code | Time: 1-2 hours

Analyze this entire codebase and create an improvement plan.

Evaluate:
1. Architecture: Is the structure scalable? Are concerns properly separated?
2. Code quality: Consistency, readability, duplication, complexity (cyclomatic)
3. Error handling: Are errors caught, logged, and presented well?
4. Testing: Coverage, quality of tests, missing edge cases
5. Security: Common vulnerabilities (OWASP Top 10 applicable ones)
6. Performance: Obvious bottlenecks, missing optimizations
7. Developer experience: Build time, hot reload, debugging ease

Output:
- Score each category 1-10 with specific evidence
- Top 5 improvements ranked by impact/effort ratio
- Specific action items for each improvement
- Estimated time for each action item

Don't fix anything yet. Just analyze and plan.

Category 9: Content and Data Prompts

9.1 Seed Data Generator (Beginner)

Tool: Any | Time: 15-30 min

Generate realistic seed data for this application.

Data needed:
- [N] [entity type, e.g., "users"] with: [fields]
- [N] [entity type, e.g., "products"] with: [fields]
- [N] [entity type, e.g., "orders"] with: [fields]

Rules:
- Use realistic names (not "Test User 1")
- Dates spread across the last [time period]
- Prices/amounts in realistic ranges for [industry]
- Status distribution: [e.g., "60% active, 30% pending, 10% cancelled"]
- Include edge cases: [e.g., "one user with no orders, one product with 0 stock"]
- Relationships should be consistent (orders reference real user IDs and product IDs)

Output format: [JSON / SQL INSERT statements / TypeScript constants / CSV]
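
A seed script that follows the rules above might look like the sketch below. The names, price range, and status weights are placeholder assumptions, and `generate_seed` is a hypothetical function; the fixed random seed makes the output reproducible.

```python
import random

def generate_seed(n_users: int = 10, n_orders: int = 30, seed: int = 42) -> dict:
    rng = random.Random(seed)  # fixed seed so every run produces the same data
    first = ["Maya", "Jonas", "Priya", "Tomás", "Aiko", "Leila"]
    last = ["Okafor", "Lindqvist", "Raman", "Castillo", "Mori", "Haddad"]
    users = [
        {"id": i, "name": f"{rng.choice(first)} {rng.choice(last)}"}
        for i in range(1, n_users + 1)
    ]
    # Edge case: the last user never places an order.
    buyers = [user["id"] for user in users[:-1]]
    orders = [
        {
            "id": i,
            "user_id": rng.choice(buyers),  # always references a real user
            "amount": round(rng.uniform(5, 250), 2),
            # Weights are probabilities, so the realized split is approximate.
            "status": rng.choices(
                ["active", "pending", "cancelled"], weights=[60, 30, 10]
            )[0],
        }
        for i in range(1, n_orders + 1)
    ]
    return {"users": users, "orders": orders}
```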

9.2 API Documentation Generator (Intermediate)

Tool: Claude Code | Time: 30-60 min

Generate comprehensive API documentation for all endpoints in this application.

For each endpoint, document:
- Method and path (e.g., GET /api/users/:id)
- Description (one sentence)
- Authentication required? (yes/no, what type)
- Request: headers, query params, body schema with types and validation rules
- Response: status codes, body schema for success and each error case
- Example request (curl command)
- Example response (JSON)

Format: [Markdown / OpenAPI 3.0 spec / Swagger]
Include a table of contents.
Group endpoints by resource.
Add rate limiting info if applicable.

Category 10: Platform-Specific Prompts

10.1 Chrome Extension (Advanced)

Tool: Claude Code | Time: 2-4 hours

Build a Chrome Extension (Manifest V3) that [core functionality].

Features:
- Popup: [describe popup UI and what it shows]
- Content script: [what it does on web pages, e.g., "highlights [elements]"]
- Background service worker: [what it handles, e.g., "API calls, storage sync"]
- Options page: [settings the user can configure]

Permissions needed: [activeTab, storage, tabs, etc. - minimize permissions]

Storage:
- Use chrome.storage.sync for: [settings that sync across devices]
- Use chrome.storage.local for: [data that stays local]

Communication:
- Content script <-> Background: chrome.runtime.sendMessage
- Popup <-> Background: chrome.runtime.sendMessage, or shared state via chrome.storage

Include:
- manifest.json with all required fields
- Icon set (16x16, 48x48, 128x128) - use simple colored SVG converted to PNG
- README with installation instructions (load unpacked)
- Privacy policy text (required for Chrome Web Store submission)

Test on these sites: [list 3-5 target websites]

10.2 CLI Tool (Intermediate)

Tool: Claude Code | Time: 1-2 hours

Build a command-line tool in [Node.js / Python / Go / Rust] that [core functionality].

Commands:
- [tool] init: [what it sets up]
- [tool] [command 1] [args]: [what it does]
- [tool] [command 2] [args]: [what it does]
- [tool] --help: show all commands with descriptions

Features:
- Colored output (green for success, red for errors, yellow for warnings)
- Progress bars for long operations
- Interactive prompts for required input (with defaults)
- Config file (~/.toolrc or .toolrc in project root)
- --verbose flag for debug output
- --json flag for machine-readable output
- Meaningful exit codes (0 success, 1 error, 2 usage error)

Error handling:
- Clear error messages with suggested fixes
- Never show stack traces (unless --verbose)
- Graceful handling of Ctrl+C

Package for distribution via [npm / pip / brew / cargo].
Include README with installation, usage examples, and config reference.
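
The flag and exit-code conventions above can be sketched with Python's `argparse`. The `greet` command is a made-up placeholder for the tool's real commands; the point is the structure: `main` returns the exit code rather than calling `sys.exit` itself.

```python
import argparse
import json
import sys

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tool", description="Example CLI skeleton")
    parser.add_argument("--verbose", action="store_true", help="show debug output")
    parser.add_argument("--json", action="store_true", help="machine-readable output")
    commands = parser.add_subparsers(dest="command", required=True)
    greet = commands.add_parser("greet", help="print a greeting")  # hypothetical command
    greet.add_argument("name")
    return parser

def main(argv: list[str]) -> int:
    """Return the exit code: 0 success, 1 runtime error, 2 usage error."""
    try:
        args = build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse exits with 2 on a usage error and 0 after printing --help.
        return exc.code if isinstance(exc.code, int) else 2
    try:
        if not args.name.strip():
            raise ValueError("name is empty. Try: tool greet Ada")
        message = f"Hello, {args.name}!"
        print(json.dumps({"message": message}) if args.json else message)
        return 0
    except ValueError as err:
        if args.verbose:
            raise  # let the stack trace through only in verbose mode
        print(f"error: {err}", file=sys.stderr)
        return 1
```

The entry point would be a single line, `sys.exit(main(sys.argv[1:]))`, which also keeps `main` easy to test.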

Prompt Patterns Reference Card

The Constraint Sandwich

Do [action].
Include: [must-have list]
Do NOT include: [exclusion list]
Match existing: [patterns/styles to follow]

The Iterative Refinement

[After seeing initial output]
Keep: [what works]
Change: [what needs to change]
Add: [what's missing]
Remove: [what's unnecessary]
Don't touch: [what shouldn't change]

The Context Dump

Here's the current state:
- File: [path] does [function]
- File: [path] does [function]
- The bug is in: [location]
- Error message: [exact text]
- This worked before I: [recent change]
- I've already tried: [attempts]
Fix the bug without changing [protected areas].

The Scope Lock

ONLY modify [specific files/functions].
Do NOT touch: [protected files]
Do NOT change: [protected behavior]
Do NOT add: [unwanted additions]
Keep the diff as small as possible.

The Quality Gate

Before considering this done:
1. All existing tests pass
2. New tests cover: [specific scenarios]
3. No TypeScript errors (strict mode)
4. No ESLint warnings
5. Lighthouse performance score > [N]
6. [Custom quality criterion]

Category 11: MCP & Agent Team Prompts (New — March 2026)

11.1 MCP Server Discovery Prompt (Intermediate)

Tool: Claude Code | Time: 5-10 min

Search for MCP servers that can help with [task domain — e.g., "database management", "file processing", "API integration"].

For each relevant server:
1. What it does and what tools it exposes
2. How to install/configure it
3. Example tool calls I can make through it
4. Any security considerations (auth tokens, permissions)

Then add the best match to my Claude Code MCP config and verify it works with a test call.

11.2 Agent Team Task Decomposition (Advanced)

Tool: Claude Code Agent Teams | Time: 15-30 min

I need to [describe complex task — e.g., "add OAuth login with Google and GitHub, including database schema, API routes, and frontend components"].

Break this into parallel agent tasks:
1. Identify which subtasks can run simultaneously
2. Identify dependencies between subtasks
3. For each subtask, specify: scope, files to touch, acceptance criteria
4. Assign each to the right agent type (Explore for research, general-purpose for implementation)
5. Define the merge strategy for combining results

Execute with agent teams. Each agent should work in isolation and produce a clear deliverable.

11.3 MCP-Powered Research Pipeline (Advanced)

Tool: Claude Code + MCP | Time: 10-20 min

Using available MCP tools, research [topic] and produce a structured report:

1. Use web search MCP to find the latest data on [topic]
2. Use file system MCP to check if we have existing content on this topic
3. Cross-reference findings with our codebase for relevance
4. Produce a markdown report with:
   - Key findings (with source URLs)
   - Relevance to our project
   - Recommended actions
   - Code changes needed (if any)

Save the report to [output path].

Category 12: Agentic Engineering Prompts (New — March 2026)

Andrej Karpathy coined "agentic engineering" in early 2026 to describe the discipline of designing, orchestrating, and supervising AI agents that autonomously write, test, and deploy code. These prompts operationalize that workflow.

12.1 Supervised Agent Loop (Advanced)

Tool: Claude Code, Codex CLI | Time: 30-60 min per iteration

You are operating as a supervised autonomous agent. Follow this loop:

1. READ: Analyze the current state of [codebase/feature/module] — files, tests, dependencies, open issues
2. PLAN: Propose a concrete plan with numbered steps. Each step must specify: what changes, which files, expected outcome
3. WAIT: Present the plan and STOP. Do not implement until I approve or modify

After I approve:
4. IMPLEMENT: Execute each step, committing after each logical unit
5. VERIFY: Run the test suite, linter, and type checker. Report results
6. REPORT: Summarize what changed, what passed, what failed, and what you recommend next

Rules:
- Never skip the WAIT step. Human approval is required before implementation
- If any test fails after implementation, diagnose and fix before proceeding
- If a step requires a decision between multiple approaches, present options with trade-offs
- Keep commits small and reversible
- Log every file you read or modify

Current task: [describe what needs to be done]
Acceptance criteria: [list specific, testable criteria]

12.2 Agent-Driven CI/CD Pipeline (Expert)

Tool: Claude Code + GitHub Actions | Time: 1-2 hours setup, then autonomous

Set up an agentic CI/CD workflow where AI agents handle the full lifecycle of a code change:

Phase 1 — Agent writes code:
- Agent receives a task description from [Linear / GitHub Issue / Slack]
- Agent creates a feature branch, implements the change, writes tests
- Agent opens a draft PR with a structured description

Phase 2 — Agent reviews code:
- A second agent (or agent team) reviews the PR for:
  - Security vulnerabilities (OWASP Top 10)
  - Performance regressions
  - Test coverage gaps
  - Style and convention compliance
- Review comments are posted on the PR

Phase 3 — Human gate:
- PR is marked ready for review only after agent review passes
- Human reviewer sees both the code and the agent's review analysis
- Human approves, requests changes, or rejects

Phase 4 — Agent deploys:
- On merge, agent monitors the deployment pipeline
- If deployment fails, agent diagnoses the failure and either auto-fixes or escalates
- Agent posts deployment confirmation with health check results

Implement this for:
- Repository: [repo URL or path]
- CI platform: [GitHub Actions / GitLab CI / CircleCI]
- Deploy target: [Vercel / AWS / Railway]
- Notification channel: [Slack / Discord webhook URL]

Include rollback triggers: if error rate exceeds [N]% within [M] minutes post-deploy, auto-revert.
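
The rollback trigger in the last line reduces to a small predicate. A sketch under stated assumptions: request outcomes arrive as `(timestamp_seconds, is_error)` pairs, and `should_rollback` and `min_requests` are hypothetical names, the latter guarding against reverting on a handful of requests.

```python
def should_rollback(samples, deploy_time: float, max_error_rate_pct: float,
                    window_minutes: float, min_requests: int = 1) -> bool:
    """True if the error rate in the post-deploy window exceeds the threshold."""
    window_end = deploy_time + window_minutes * 60
    in_window = [is_error for ts, is_error in samples if deploy_time <= ts <= window_end]
    if len(in_window) < min_requests:
        return False  # not enough traffic to judge
    error_rate_pct = 100 * sum(in_window) / len(in_window)
    return error_rate_pct > max_error_rate_pct
```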

12.3 Multi-Agent Code Review (Advanced)

Tool: Claude Code Agent Teams | Time: 15-30 min

Perform a multi-perspective code review of [PR number / branch / file set] using specialized agent roles:

Agent 1 — Security Auditor:
- Check for injection vulnerabilities, auth bypass, data exposure
- Verify input validation and output encoding
- Flag hardcoded secrets or overly permissive CORS
- Reference OWASP Top 10 and CWE IDs for any findings

Agent 2 — Performance Engineer:
- Identify N+1 queries, missing indexes, unoptimized loops
- Check for memory leaks, unnecessary re-renders, large bundle imports
- Estimate impact: "This will add ~Nms to response time because..."

Agent 3 — Architecture Reviewer:
- Evaluate against existing patterns in the codebase
- Flag violations of established conventions
- Identify coupling, missing abstractions, or wrong layer placement
- Check if the change scales for [expected load / data volume]

Synthesis:
- Combine all three reviews into a single summary
- Categorize findings: Must Fix (blocks merge) / Should Fix (before next release) / Consider (future improvement)
- For each Must Fix, provide a specific code suggestion

Output the review as a structured markdown report I can paste into the PR.

Category 13: MCP & Tool Integration Prompts

New category added March 2026 as MCP becomes standard infrastructure for AI-native development.

13.1 MCP Server Bootstrap (Intermediate)

Tool: Claude Code | Time: 30-60 min

Create a production-ready MCP server in TypeScript that exposes [service name] to AI tools.

Service to wrap: [e.g., "our internal Postgres database", "Stripe API", "Linear project management"]

Resources to expose:
- [Resource 1]: [description and schema — e.g., "list of open tasks with id, title, status, assignee"]
- [Resource 2]: [description and schema]

Tools to implement:
- [Tool 1]: [action — e.g., "create_task(title, description, assignee)"]
- [Tool 2]: [action — e.g., "update_status(task_id, new_status)"]

Requirements:
- TypeScript strict mode
- Input validation with Zod schemas
- Error messages that are AI-readable (not just HTTP codes)
- Rate limiting: max [N] calls per minute
- Logging of all tool invocations with timestamp and caller info
- Auth via environment variable API key
- README with setup instructions and example Claude Code config block

Security requirements:
- Read-only by default; tools that write require explicit "write_mode: true" config flag
- Reject any resource URI that attempts path traversal or contains shell metacharacters
- Log and reject requests exceeding [N] tokens in a single call

Output: Complete server/index.ts, package.json, README.md, and .env.example
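
The security requirements above are the part most worth verifying in the generated server. The sketch below shows the intended logic in Python only to keep it compact; the server itself would express the same checks in TypeScript. The metacharacter list and both function names are assumptions, and the check decodes percent-encoding once, so double-encoded input would need a stricter policy.

```python
from urllib.parse import unquote

# Assumption: a conservative set of characters with special meaning to a shell.
SHELL_METACHARACTERS = set(";|&$`<>\\\n\"'")

def is_safe_resource_uri(uri: str) -> bool:
    """Reject URIs that attempt path traversal or carry shell metacharacters."""
    decoded = unquote(uri)  # catches %2e%2e-style encoded traversal
    if any(ch in SHELL_METACHARACTERS for ch in decoded):
        return False
    return ".." not in decoded.split("/")

def write_allowed(config: dict) -> bool:
    """Read-only by default: writes need write_mode set to the boolean True."""
    return config.get("write_mode") is True
```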

13.2 MCP Workflow Chaining (Advanced)

Tool: Claude Code | Time: 1-2 hours

Design and implement a multi-step workflow using MCP tool chaining for: [workflow description]

Example workflow: "When a new GitHub issue is labeled 'bug', automatically:
1. Read the issue and linked code file
2. Write a failing test that reproduces the bug
3. Attempt a fix
4. Create a PR with the fix and test linked to the original issue"

My MCP servers available:
- GitHub MCP (@modelcontextprotocol/server-github)
- Filesystem MCP (@modelcontextprotocol/server-filesystem)
- [any others]

Workflow to implement: [describe your workflow in plain English]

Requirements:
- Each step should verify its output before proceeding to the next step
- If any step fails, log the failure state and stop (don't proceed with partial data)
- Human-in-the-loop checkpoint after [step N]: show summary and ask for confirmation before proceeding
- Final output: summary of all actions taken with links to created artifacts

Implement as a Claude Code workflow with a CLAUDE.md that documents how to trigger it.
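
The control flow these requirements describe (verify each step, stop on failure, pause at a checkpoint) can be sketched without any MCP specifics. Everything here is illustrative: `run_workflow`, the `(name, run, verify)` step shape, and the `confirm` callback are hypothetical, and real steps would call MCP tools instead of plain functions.

```python
from typing import Any, Callable, Optional

Step = tuple[str, Callable[[Any], Any], Callable[[Any], bool]]  # (name, run, verify)

def run_workflow(
    steps: list[Step],
    initial: Any = None,
    checkpoint_after: Optional[str] = None,
    confirm: Callable[[str, Any], bool] = lambda name, output: True,
) -> dict:
    log, value = [], initial
    for name, run, verify in steps:
        try:
            value = run(value)
        except Exception as exc:
            log.append((name, f"failed: {exc}"))
            return {"completed": False, "log": log}
        # Never pass unverified output to the next step.
        if not verify(value):
            log.append((name, "failed verification"))
            return {"completed": False, "log": log}
        log.append((name, "ok"))
        # Human-in-the-loop checkpoint: stop unless the reviewer confirms.
        if name == checkpoint_after and not confirm(name, value):
            log.append((name, "stopped at human checkpoint"))
            return {"completed": False, "log": log}
    return {"completed": True, "log": log, "result": value}
```

The returned log doubles as the final summary of actions taken.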

13.3 MCP Security Audit (Expert)

Tool: Claude Code | Time: 15-30 min

Audit the MCP configuration in this project for security risks.

Check the following:
1. Permissions scope — are any MCP servers granted broader access than their use case requires?
   - filesystem: should only need the project directory, not ~/
   - github: should be read-only unless explicitly creating PRs
   - database: should use a read-only connection string for query-only workflows

2. Secret handling — are API keys and tokens:
   - Stored in environment variables (not hardcoded in settings.json)?
   - Excluded from version control (.gitignore includes .env and settings.json)?
   - Rotated within the last 90 days?

3. Tool invocation logging — is there an audit trail for:
   - Which tools were called?
   - What arguments were passed?
   - What was returned?

4. Community server provenance — for each non-official MCP server installed:
   - Is it from a verified publisher?
   - When was it last updated?
   - Do its package.json dependencies and source code match the advertised functionality (no unexpected network calls)?

5. Blast radius — if an MCP server is compromised, what's the worst case?
   - Can it read files outside the project?
   - Can it make outbound network calls?
   - Can it modify production data?

Output: Security report with PASS/FAIL/WARN for each check, plus specific remediation steps for any failures.
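
Checks 1 and 2 can be partly automated before you hand the rest to the agent. The sketch below assumes the config uses the common `mcpServers` layout (server name mapped to `command`, `args`, and `env`); the secret pattern and the list of broad paths are illustrative heuristics, not a complete rule set, and `audit_mcp_config` is a hypothetical helper.

```python
import re

# Heuristic only: well-known token prefixes, or a long token-like literal.
SECRET_RE = re.compile(r"^(sk-|ghp_|github_pat_|xox[abp]-)|^[A-Za-z0-9_\-]{32,}$")

# Paths that reach well beyond a single project directory.
BROAD_PATHS = {"/", "~", "~/", "$HOME", "/Users", "/home"}


def audit_mcp_config(config: dict) -> list[tuple[str, str, str]]:
    """Return (server, status, message) findings for an MCP config dict."""
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        for key, value in server.get("env", {}).items():
            # Values such as "${GITHUB_TOKEN}" defer to the environment;
            # literal values are stored in the file itself.
            if isinstance(value, str) and not value.startswith("$") and SECRET_RE.search(value):
                findings.append((name, "FAIL", f"env.{key} looks like a hardcoded secret"))
        for arg in server.get("args", []):
            if arg in BROAD_PATHS:
                findings.append((name, "WARN", f"argument {arg!r} grants access beyond the project directory"))
    return findings
```

Load the file with `json.load` and pass the resulting dict in. An empty list means only that these two heuristics found nothing; logging, provenance, and blast radius still need the full audit.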


Category 14: AI Code Review & Quality Assurance Prompts

Added March 14, 2026 — Responding to Anthropic's Code Review launch and the growing need for systematic AI-generated code quality gates.

14.1 Claude Code Review Gate (Intermediate)

Tool: Claude Code | Time: 5-10 min per PR

Review the changes in this git diff for logic errors, security issues, and integration risks.

Focus areas:
1. Logic correctness: Does the code do what the PR description claims? Walk through the happy path and 2-3 edge cases.
2. Security: Check for injection vulnerabilities, auth bypasses, and data exposure. Reference OWASP Top 10.
3. Integration assumptions: Does this code make assumptions about external services, database schemas, or API contracts that aren't validated?
4. Error handling gaps: What happens when network calls fail, database is unavailable, or inputs are malformed?
5. Performance: Will this code degrade under load? Flag any N+1 queries, synchronous file I/O, or unbounded loops.

For each issue found:
- Severity: CRITICAL / HIGH / MEDIUM / LOW
- Location: file:line
- Problem: what's wrong
- Fix: concrete code suggestion or approach

Conclude with: APPROVE / REQUEST_CHANGES / BLOCK with a one-sentence rationale.

14.2 Context Hub API Accuracy Check (Intermediate)

Tool: Claude Code with Context Hub MCP | Time: 10-20 min

I'm using [library/API name] version [X.Y.Z] in this codebase.

Using Context Hub to pull the current API docs, verify:
1. All method signatures in [file.ts / module name] match the current API spec
2. Deprecated methods that have replacement alternatives
3. Breaking changes in the last 3 minor versions that affect our usage
4. Any authentication or rate-limiting changes we should be aware of

For each discrepancy:
- File and line where we use the outdated API
- Current correct signature
- Migration path (with code example)

Output a migration checklist sorted by priority.

14.3 Multi-Agent Parallel Quality Review (Expert)

Tool: Claude Code with agent teams | Time: 20-40 min

Spin up a quality review using agent teams for this pull request.

Agent roles to create:
1. Security Agent: Focus exclusively on vulnerabilities, auth, input validation, secrets
2. Logic Agent: Focus on correctness, edge cases, business rule compliance
3. Performance Agent: Focus on database queries, caching opportunities, bundle size impact
4. Test Coverage Agent: Check which new code paths lack test coverage; write tests for critical gaps

Each agent should:
- Work independently on its area
- Output findings in the format: [SEVERITY] [FILE:LINE] [DESCRIPTION] [SUGGESTED FIX]
- Flag items that need human judgment before proceeding

After all agents complete, synthesize findings into a unified PR review comment, deduplicating overlapping findings and ordering by priority.

Final output: A ready-to-paste GitHub PR comment with all findings.
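
The synthesis step (deduplicate, then order by priority) is simple enough to sketch in Python. This version assumes two findings overlap when they point at the same `file:line`, and keeps the more severe one; that rule is an assumption for illustration, and a real merge might also compare descriptions. `Finding`, `merge_findings`, and `to_comment` are hypothetical names.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


@dataclass(frozen=True)
class Finding:
    severity: str
    location: str  # "file:line"
    description: str
    fix: str
    agent: str     # which reviewing agent reported it


def merge_findings(findings):
    """Keep the most severe finding per location, then sort by severity."""
    best = {}
    for f in findings:
        current = best.get(f.location)
        if current is None or SEVERITY_ORDER[f.severity] < SEVERITY_ORDER[current.severity]:
            best[f.location] = f
    return sorted(best.values(), key=lambda f: (SEVERITY_ORDER[f.severity], f.location))


def to_comment(findings):
    """Render findings in the [SEVERITY] [FILE:LINE] [DESCRIPTION] [SUGGESTED FIX] format."""
    return "\n".join(
        f"[{f.severity}] [{f.location}] [{f.description}] [{f.fix}]" for f in findings
    )
```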

Category 15: Supply Chain Security Prompts (New — March 2026)

Added in response to the PHANTOMRAVEN npm campaign (March 2026) — 88 packages using Remote Dynamic Dependencies to bypass registry scanners. These prompts help you audit and harden your dependency chain when using AI coding tools.

15.1 npm Dependency Security Audit Prompt (Intermediate)

Tool: Claude Code, Cursor Composer | Time: 15-30 min

Audit all dependencies in this project's package.json for supply chain security risks.

For each dependency:
1. Check if the package name could be a typosquat of a popular package (e.g., "util-logger-enhanced" vs "util-logger")
2. Identify any postinstall/preinstall/install scripts — list them explicitly
3. Flag any scripts that contain URL fetching (https://, fetch, axios, request, got)
4. Check for packages with very low download counts (<1000/week) that are in production dependencies

For flagged packages:
- Package name
- Risk type: TYPOSQUAT / REMOTE_DYNAMIC_DEPENDENCY / LOW_REPUTATION / SUSPICIOUS_SCRIPT
- The specific script or pattern that triggered the flag
- Recommended action: REMOVE / REPLACE_WITH / INVESTIGATE

Then check our .npmrc or npm config:
- Is ignore-scripts set? (Recommended for CI)
- Is package-lock.json committed and up to date?
- Are we using npm ci in CI pipelines instead of npm install?

Output: A prioritized security remediation list for our dependency chain.
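
Steps 2 and 3 of the per-dependency check are mechanical, so you can run them yourself and compare against what the agent reports. A minimal Python sketch, assuming you pass in the raw text of one package's package.json; `audit_manifest` is a hypothetical helper, and the pattern list mirrors the one in the prompt rather than covering every way a script can reach the network.

```python
import json
import re

LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")

# Same signals the prompt lists: URLs and common HTTP client names.
FETCH_RE = re.compile(r"https?://|\bfetch\b|\baxios\b|\brequest\b|\bgot\b")


def audit_manifest(package_json_text):
    """Return (hook, command, fetches_url) for each install-time script."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return [
        (hook, scripts[hook], bool(FETCH_RE.search(scripts[hook])))
        for hook in LIFECYCLE_HOOKS
        if hook in scripts
    ]
```

An install script on its own is not proof of malice, since native modules often need a build step. The `fetches_url` flag marks the ones worth reading line by line.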

15.2 AI-Generated Package Validation Prompt (Advanced)

Tool: Claude Code | Time: 10-20 min

I'm about to install the following packages that were suggested by an AI coding agent:
[paste package list or package.json]

Before I run npm install, perform a supply chain security pre-check:

1. For each package:
   - Is this a real, well-known package? (Check name against known-good packages)
   - Does the name match any typosquat patterns? (Common patterns: adding -enhanced, -helper, -core, -utils, -wrapper)
   - Is there a more official/canonical package that does the same thing?

2. For the overall dependency set:
   - Any packages that are redundant (installing two packages that do the same thing)?
   - Any packages that AI tools commonly hallucinate (packages that sound plausible but don't exist or are malicious)?
   - Are all packages available on the official npm registry at npmjs.com?

3. Generate the safe install command:
   - If any package is suspicious: exclude it and explain why
   - Include --ignore-scripts if any package has preinstall, install, or postinstall scripts
   - Suggest npm ci if a lockfile exists

Output: SAFE TO INSTALL / REVIEW REQUIRED / DO NOT INSTALL with specific reasoning for each package flagged.
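
The suffix patterns in step 1 can be checked locally before you even paste the list. The Python sketch below assumes you maintain your own `known_good` set of package names you already trust; `typosquat_candidates` is a hypothetical helper, and a hit means "look closer", not "malicious".

```python
TYPOSQUAT_SUFFIXES = ("-enhanced", "-helper", "-core", "-utils", "-wrapper")


def typosquat_candidates(package, known_good):
    """Return trusted package names that `package` may be imitating."""
    hits = []
    for suffix in TYPOSQUAT_SUFFIXES:
        if package.endswith(suffix):
            base = package[: -len(suffix)]
            # Flag only when the base is trusted and the suffixed name is not.
            if base in known_good and package not in known_good:
                hits.append(base)
    return hits
```

For example, with `known_good = {"util-logger"}`, calling `typosquat_candidates("util-logger-enhanced", known_good)` returns `["util-logger"]`. Plenty of legitimate packages end in `-core` or `-utils`, which is why the function only reports names that shadow something on your trusted list.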

15.3 Postinstall Script Inspector Prompt (Expert)

Tool: Claude Code with file system access | Time: 20-40 min

Scan all packages in our node_modules directory for Remote Dynamic Dependency (RDD) patterns.

This is a supply chain security audit targeting the PHANTOMRAVEN attack class (March 2026), where malicious packages appear clean to static scanners but fetch payloads from external URLs during install or runtime.

For each package in node_modules/:
1. Read package.json and extract: scripts.postinstall, scripts.preinstall, scripts.install
2. If any script exists: read the script file it references
3. Flag the package if the script contains ANY of:
   - URL fetching: https://, http://, fetch(), require('https'), require('http')
   - Command execution with dynamic content: execSync, exec, spawn, child_process
   - File writes to system locations: /etc/, /usr/, ~/.ssh/, ~/.bashrc
   - Environment variable exfiltration: process.env being sent to external URL
   - Base64 decoding followed by eval/exec (obfuscation pattern)

For each flagged package:
- Package name and version
- Script type (postinstall/preinstall/install)
- Exact suspicious line(s)
- Risk classification: CRITICAL (remote fetch + exec) / HIGH (exec with env vars) / MEDIUM (network access only) / LOW (local file operations)
- Recommended action

Output: JSON report of all findings, sorted by severity. Safe packages can be omitted.
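
To make the risk classification concrete, here is a minimal Python sketch of the four tiers as the prompt defines them. It works on the source text of an install script using plain regular expressions, so obfuscated code will slip past it; `classify` and `build_report` are hypothetical names, and the patterns are a subset of the signals listed above.

```python
import json
import re

REMOTE_RE = re.compile(r"https?://|\bfetch\(|require\(['\"]https?['\"]\)")
EXEC_RE = re.compile(r"\b(execSync|exec|spawn)\b|child_process")
ENV_RE = re.compile(r"process\.env")

RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def classify(script_source):
    """Map an install script's source text to a risk tier."""
    remote = bool(REMOTE_RE.search(script_source))
    runs = bool(EXEC_RE.search(script_source))
    env = bool(ENV_RE.search(script_source))
    if remote and runs:
        return "CRITICAL"  # remote fetch + exec
    if runs and env:
        return "HIGH"      # exec with env vars
    if remote:
        return "MEDIUM"    # network access only
    return "LOW"           # local operations


def build_report(scripts):
    """scripts maps "name@version" to script source; returns JSON sorted by severity."""
    findings = [{"package": pkg, "risk": classify(src)} for pkg, src in scripts.items()]
    findings.sort(key=lambda f: (RANK[f["risk"]], f["package"]))
    return json.dumps(findings, indent=2)
```

Treat a LOW result as "nothing matched", not "safe": Base64-plus-eval payloads are exactly the kind of thing a regex pass misses and the agent's line-by-line reading should catch.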


Category 16: Cursor Automations Prompts

Added March 16, 2026 — event-driven automation workflows using Cursor 2.6+ Automations.

16.1 PR Security Review Automation (Intermediate)

Tool: Cursor Automations 2.6+ | Trigger: GitHub PR Opened | Time: 2-3 min per PR

Review this pull request for security vulnerabilities and code quality issues.

## Security Checks
1. SQL injection risks — any database queries with user-controlled input concatenated as strings
2. Missing input validation — API endpoints that accept user data without sanitizing/validating
3. Hardcoded credentials or secrets — passwords, API keys, tokens in code or config files
4. Missing authentication checks — routes or functions that should require auth but don't
5. XSS vulnerabilities — user input rendered in HTML without escaping
6. Overly permissive CORS — origins: '*' or missing Content-Security-Policy headers

## Quality Checks
1. TypeScript strict mode violations — any 'any' types, missing return types, unsafe casts
2. Missing error handling — async operations without try/catch, promises without .catch()
3. Obvious logic errors — off-by-one errors, wrong comparison operators, inverted conditions

## Output Format
For each issue found:
- **Severity**: CRITICAL / HIGH / MEDIUM / LOW
- **File**: [filepath:line]
- **Issue**: [one sentence description]
- **Fix**: [corrected code snippet if < 10 lines]

If no issues found, post exactly: "✅ Security review passed — no issues found."

16.2 Automated Changelog Entry Generator (Beginner)

Tool: Cursor Automations 2.6+ | Trigger: GitHub PR Merged | Time: 1-2 min per merge

A pull request was just merged. Generate a changelog entry for it.

PR Title: [PR title]
Author: [PR author]
Files changed: [list of changed files]
PR description: [PR body]

Determine the entry type:
- feat: New feature or capability added
- fix: Bug fix or error correction
- perf: Performance improvement
- security: Security patch or hardening
- refactor: Code restructuring without behavior change
- docs: Documentation updates only

Generate: `[type]: [concise description] ([PR number])`

Rules:
- Under 100 characters
- Present tense ("add", "fix", "update")
- Focus on user-facing change, not implementation detail
- If PR touches multiple areas, generate separate entries

Output ONLY the changelog line(s), nothing else.
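
Because the automation posts its output unreviewed, it helps to validate each line before it lands in the changelog. A minimal Python sketch of the format and length rules, assuming PR numbers are written as `#123` or `123` (the prompt does not pin this down); `validate_entry` is a hypothetical helper.

```python
import re

ENTRY_TYPES = ("feat", "fix", "perf", "security", "refactor", "docs")

# Shape: "[type]: [concise description] ([PR number])"
ENTRY_RE = re.compile(r"^(?P<type>[a-z]+): (?P<description>.+) \((?P<pr>#?\d+)\)$")


def validate_entry(line):
    """Return a list of problems; an empty list means the entry is acceptable."""
    match = ENTRY_RE.match(line)
    if match is None:
        return ["does not match '[type]: [description] ([PR number])'"]
    problems = []
    if match["type"] not in ENTRY_TYPES:
        problems.append(f"unknown type {match['type']!r}")
    if len(line) >= 100:
        problems.append("must be under 100 characters")
    return problems
```

Tense and "user-facing, not implementation detail" are judgment calls this check cannot make, so it complements the prompt's rules rather than replacing them.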

16.3 Incident First Responder Automation (Advanced)

Tool: Cursor Automations 2.6+ | Trigger: PagerDuty P1/P2 Alert | Time: 3-5 min per incident

A production incident was just triggered. Perform automated initial diagnosis before the on-call engineer responds.

Alert name: [alert title]
Alert description: [alert body]
Triggered at: [timestamp]

Diagnosis Steps:
1. Review all commits in the past 24 hours — note files changed and summaries
2. Identify which files/functions match keywords in the alert
3. Analyze error handling patterns in those functions for known failure modes

Output Format:
**🚨 Automated Incident Analysis**

**Top 3 Most Likely Causes:**
1. [Cause] — [file:function] — [why likely]
2. [Cause] — [file:function] — [why likely]
3. [Cause] — [file:function] — [why likely]

**Recent Changes That May Be Related:**
- [commit hash] — [description] — [changed files]

**Immediate Debugging Suggestions:**
- Check [specific endpoint/function] for [specific condition]

*Automated code analysis only — verify against live logs.*


Category 17: Agentic Security & Engineering Prompts

Added March 18, 2026 — security-focused and agentic engineering workflows for the multi-agent Claude Code era.

17.1 The AI IDE Security Audit (Intermediate/Security)

Tool: Claude Code | Context: Any codebase | Time: 10-15 min

Perform a security audit of this codebase with focus on vulnerabilities that AI-assisted development commonly introduces.

## Scope
Analyze all source files for:

1. **Injection vulnerabilities** — SQL, command, LDAP, XPath injection from unsanitized user input
2. **Authentication gaps** — Missing auth checks on routes/functions that handle sensitive data
3. **Secret exposure** — Hardcoded API keys, passwords, tokens, or .env values in source files
4. **Insecure deserialization** — JSON.parse, eval(), or pickle on untrusted data
5. **SSRF vectors** — Server-side requests built from user-controlled URLs
6. **MCP trust boundaries** — Any code that processes MCP server responses without validation
7. **Prompt injection surfaces** — Any point where untrusted text reaches an LLM call

## Output Format
For each finding:
- **Severity**: CRITICAL / HIGH / MEDIUM / LOW
- **CWE Pattern**: Closest matching CWE identifier
- **File**: [filepath:line_number]
- **Vulnerable Code**: [snippet, max 5 lines]
- **Attack Scenario**: [one sentence — how this gets exploited]
- **Fix**: [corrected code snippet]

## Summary
After findings, provide:
- Total by severity
- Top 3 most urgent fixes
- Estimated remediation time

17.2 The Agentic Engineering Orchestrator (Advanced/Agents)

Tool: Claude Code Agent Teams | Context: Multi-feature project | Time: 30-90 min

You are the orchestrator in a multi-agent engineering team. Your job is to decompose the following work into parallel workstreams, assign each to a specialized sub-agent, and coordinate their outputs into a coherent result.

## Work to Decompose
[Describe the feature, refactor, or system to build]

## Available Sub-Agents
- **Architect**: Designs system structure, data models, API contracts
- **Frontend**: Implements UI components and user interactions
- **Backend**: Implements API routes, business logic, database operations
- **Security**: Reviews all code for vulnerabilities before integration
- **Tests**: Writes unit, integration, and E2E tests for all new code

## Orchestration Protocol
1. Architect produces: data model, API contract, component tree — share with all agents
2. Frontend and Backend work in parallel using the Architect's contracts
3. Security reviews Frontend and Backend output before integration
4. Tests writes tests against the finalized integrated code
5. You review the full result for coherence before declaring done

## Output Per Phase
After each phase, report:
- What each agent produced
- Any conflicts or gaps between agents
- What's needed before the next phase begins

## Definition of Done
[ ] All acceptance criteria met
[ ] Security review passed (0 CRITICAL, 0 HIGH issues)
[ ] Test coverage ≥ 80% on new code
[ ] No TypeScript errors in strict mode
[ ] API contracts match between frontend and backend

17.3 The Enterprise AI Safety Gate (Expert/Enterprise)

Tool: Claude Code | Context: Enterprise CI/CD pipeline | Time: 5-10 min per PR

You are an AI safety reviewer for enterprise code. A pull request is about to be merged. Your job is to evaluate whether this PR meets enterprise AI governance standards before it goes to production.

## PR Contents
[Paste diff or file list]

## Safety Gate Checklist

### Data Privacy
- [ ] No PII or PHI in logs, error messages, or API responses
- [ ] Database queries are parameterized (no string concatenation)
- [ ] Any new data collection has a stated retention and deletion policy
- [ ] Third-party data sharing follows documented agreements

### AI/LLM Specific
- [ ] LLM calls don't include user PII in prompts without explicit consent
- [ ] Prompt templates validated against injection — no raw user input concatenated directly
- [ ] Model outputs are validated before use in security-critical operations
- [ ] Fallback behavior defined for model unavailability or rate limits

### Access Control
- [ ] New endpoints have authentication checks
- [ ] Authorization is at the data level, not just route level
- [ ] Service-to-service calls use least-privilege tokens
- [ ] New environment variables documented and rotation-scheduled

### Observability
- [ ] New features have error logging (not just success paths)
- [ ] Alerts defined for failure modes that affect users
- [ ] No secrets or credentials in log statements

## Verdict
For each unchecked item, provide:
- **Risk**: What could go wrong
- **Fix**: Exact code change required
- **Blocking**: YES (must fix before merge) / NO (follow-up ticket acceptable)

Final verdict: APPROVED / APPROVED WITH CONDITIONS / BLOCKED
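
If you wire this gate into CI, the final verdict should follow mechanically from the checklist rather than from the model's mood. The Python sketch below assumes one plausible mapping: any failed blocking item means BLOCKED, any other failed item means APPROVED WITH CONDITIONS. The prompt does not spell this rule out, and `GateItem` and `verdict` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class GateItem:
    label: str
    passed: bool
    blocking: bool = False  # only meaningful when passed is False


def verdict(items):
    """Derive the final verdict from the checklist results."""
    failed = [item for item in items if not item.passed]
    if any(item.blocking for item in failed):
        return "BLOCKED"
    if failed:
        return "APPROVED WITH CONDITIONS"
    return "APPROVED"
```

Comparing the model's stated verdict against this function is a cheap consistency check: a mismatch means the review contradicts its own checklist and needs a human look.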

Category 18: Multi-Agent Coordination Prompts (New — March 2026)

Multi-agent workflows became table stakes in early 2026 when every major AI coding platform shipped concurrent agent support. These prompts complement Category 8 and Category 11 with new patterns for parallel decomposition, critic loops, and clean inter-agent handoffs.

18.1 The Parallel Feature Decomposer (Advanced)

Tool: Claude Code, Cursor Composer | Time: 5 min setup, parallel execution

I need to implement [feature name]. Break this into parallel subtasks that can be worked on simultaneously by separate agents with zero dependencies between them.

Feature: [description]
Tech stack: [stack]
Existing codebase patterns: [key patterns to follow]

For each subtask, specify:
1. Which files to create/modify (exact paths)
2. What to implement (precise scope)
3. What NOT to touch (boundaries)
4. Expected output format for handoff to reviewer

Produce 3-5 independent subtasks. If tasks have dependencies, order them and flag which ones can run in parallel within each phase.
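
"Zero dependencies" is easiest to verify at the file level: if two subtasks list the same path, they are not independent. A minimal Python sketch of that check, assuming you copy each subtask's file list out of the agent's plan; `file_conflicts` is a hypothetical helper and the paths in the example are made up.

```python
from itertools import combinations


def file_conflicts(subtasks):
    """subtasks maps a subtask name to the set of paths it may create or modify.

    Returns {(task_a, task_b): [shared paths]} for every overlapping pair.
    """
    conflicts = {}
    for (a, files_a), (b, files_b) in combinations(sorted(subtasks.items()), 2):
        shared = files_a & files_b
        if shared:
            conflicts[(a, b)] = sorted(shared)
    return conflicts
```

For example, if both an `api` and a `ui` subtask list `src/types.ts`, the result is `{("api", "ui"): ["src/types.ts"]}`, which tells you to move the shared type changes into an earlier phase. Disjoint file sets do not rule out logical coupling, so this is a first filter only.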

18.2 The Agent Role Briefing (Intermediate)

Tool: Claude Code, any agent-capable tool | Time: 2 min

You are the [ROLE] agent in a multi-agent workflow.

Your role: [Writer / Reviewer / Tester / Planner / Researcher]

Input you'll receive: [description of what the previous agent produced]
Your output: [exact format and content expected]
Scope: ONLY work on [specific files/areas]
Do NOT: [what other agents are handling]

Previous agent's output:
---
[paste output here]
---

Proceed with your role's task.

18.3 The Critic Loop Trigger (Advanced)

Tool: Claude Code | Time: 10-15 min for full loop

Review the following implementation as an adversarial code reviewer. Your goal is to find bugs, edge cases, and security issues — not to be constructive, but to be thorough.

Implementation to review:
[paste code]

For each issue you find:
1. Describe the exact failure scenario (input → unexpected output)
2. Rate severity: Critical / High / Medium / Low
3. Write a specific failing test case that exposes the issue
4. Suggest the minimal fix

Do NOT suggest style improvements, refactors, or optimizations — only real bugs and security issues.
After this review, a separate agent will patch the implementation based on your findings.

18.4 The Coordinator Prompt (Expert)

Tool: Claude Code with multiple sessions | Time: 15 min setup

You are the coordinator agent for a multi-agent coding session.

Task: [high-level description]
Codebase: [brief description of relevant structure]
Team: 3 worker agents (Writer, Reviewer, Tester)

Your job:
1. Break the task into a dependency graph
2. Identify which subtasks can run in parallel (Phase 1) vs sequentially (Phase 2, 3)
3. Write a briefing document for each agent with: their exact scope, input format, output format, and what to ignore
4. Estimate completion time assuming parallel execution

Output format:
## Dependency Graph
[ASCII or list showing task dependencies]

## Phase 1 (Parallel)
### Agent A: [role]
- Scope: [exact files]
- Task: [precise instructions]
- Output: [format]

### Agent B: [role]
...

## Phase 2 (Sequential, after Phase 1)
...

## Estimated total time: [X min]
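
Steps 1 and 2 of the coordinator's job amount to layering a dependency graph: everything whose prerequisites are done can run in the same phase. Python's standard `graphlib` module does this directly. In the sketch below, `plan_phases` and the task names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_phases(dependencies):
    """dependencies maps each task to the set of tasks it depends on.

    Returns a list of phases; tasks within one phase can run in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        phases.append(ready)
        sorter.done(*ready)
    return phases
```

Given `{"write_api": {"design_schema"}, "write_ui": {"design_schema"}, "review": {"write_api", "write_ui"}}`, the function returns three phases: the schema first, the two writers in parallel, then the review. Checking the coordinator's proposed phases against this output catches plans that schedule a task before its inputs exist.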

18.5 The Context Handoff (Intermediate)

Tool: Claude Code, any multi-session workflow | Time: 1 min

I'm handing off context from one agent session to another.

Previous agent summary:
- Role: [what the previous agent did]
- Files modified: [list with brief description of changes]
- Decisions made: [key choices and why]
- Known issues: [anything incomplete or that needs attention]
- Output artifacts: [files/data produced]

Your task as the next agent:
[specific instructions for the receiving agent]

Do not re-do work already completed. Pick up exactly where the previous agent left off.


Category 19: Voice & Automation Prompts

New in March 2026 — for Claude Code voice mode, Cursor Automations, and rapid-build workflows.

19.1 The Voice-Driven Feature Sprint (Intermediate)

Tool: Claude Code (voice mode) | Time: 15-30 min

Use with Claude Code's new voice mode (/voice push-to-talk). Speak naturally — Claude Code transcribes and acts.

[Speak naturally while reviewing code or pacing the room]

"I'm looking at [file or feature]. Here's what I want to happen:
[Describe the feature in plain English — no need to be precise]

Start by reading the relevant files, tell me what you see, then
propose an implementation plan before writing any code.

When you're ready to implement, walk me through each step out loud
so I can follow along and course-correct."


19.2 The Cursor Automations Trigger Template (Advanced)

Tool: Cursor (2.6+) | Time: 5 min setup, then runs automatically

Configure Cursor Automations to trigger on GitHub, Linear, Slack, or PagerDuty events. Replace bracketed sections.

# Cursor Automation: [Automation Name]

## Trigger
Event: [GitHub PR opened | Linear ticket moved to "In Progress" | Slack message in #[channel] | PagerDuty alert]
Filter: [branch: main | label: "needs-review" | priority: P1 | keyword: "deploy"]

## Agent Instructions
You are a [role: code reviewer / feature implementer / incident responder].

When triggered:
1. Read the context from the event (PR diff, ticket description, alert payload)
2. Identify the [files to review | feature to implement | system to check]
3. Take the following action:
   [specific action: add review comments | create implementation branch | run health check]
4. Report results to: [Slack channel | PR comment | Linear comment]

## Constraints
- Never modify [protected files or branches]
- Always [create a backup branch | request human approval] before [destructive action]
- If confidence < 80%, stop and ask for clarification via [channel]

## Memory
Persist these facts across runs:
- [Project architecture notes]
- [Known gotchas in this codebase]
- [Team preferences for [language/framework]]

19.3 The Rapid SaaS Build Sprint (Expert)

Tool: Claude Code, Cursor Composer | Time: 2-4 hours (full product)

The "I built a SaaS in 3 hours" prompt. Designed for experienced developers who know their stack.

I'm building [product name] — [one sentence description].

## Target User
[Who. What problem. Why they'd pay for it.]

## The One Core Action
If a user could only do ONE thing in this app, it's: [action]
Build this first and make it work perfectly. Everything else is secondary.

## Stack (non-negotiable)
- Framework: [Next.js App Router / FastAPI / etc.]
- Database: [Supabase / PlanetScale / SQLite]
- Auth: [Supabase Auth / Clerk / none for v1]
- Payments: [Stripe / Paddle / skip for v1]
- Deploy: [Vercel / Railway / Fly.io]

## Sprint Rules
1. No mock data, no placeholder UIs — real data from the first commit
2. Ship the database schema first, validate with 3 real records
3. Build auth only if the core action requires it — skip otherwise
4. Use [Tailwind / shadcn/ui] for all UI — no custom CSS
5. Every feature must have an error state and a loading state
6. Stop and check in with me after: schema done, core action done, auth done

## Definition of Done
A real user could sign up, perform [the core action], and have their data saved within 10 minutes of visiting the site.

Start with the database schema. Show me the SQL before running it.


Category 20: AI Benchmark & Selection Prompts

New March 21, 2026: Prompts for evaluating and selecting AI coding tools based on your specific workflow.

20.1 The Agent Benchmark Evaluator (Intermediate)

Tool: Claude Code | Time: 30-45 min

Use this to run your own benchmark across AI coding agents on a real task from your codebase — not synthetic tests.

I want to evaluate which AI coding agent is best for my specific workflow.

## My Codebase Context
- Language/framework: [e.g., Next.js 16, TypeScript, Supabase]
- Codebase size: [e.g., 50 files, 8K lines]
- Typical task type: [e.g., adding API endpoints, fixing bugs, writing tests]
- Team size: [solo / 2-5 / 10+]

## Benchmark Task
Run this real task from my codebase through [Agent A] and [Agent B]:
[Paste a real GitHub issue or feature request from your project]

## Evaluation Criteria (score 1-5 each)
1. Did it understand the full repository context without me explaining it?
2. Did the implementation match our existing code patterns?
3. How many clarifying questions did it ask vs. just proceeding?
4. How long did it take to produce a working diff?
5. Did I need to fix anything before the code was usable?

## What I Need
After running both agents, compare them on these criteria and recommend which tool to standardize on for [my specific task type]. Include specific examples from the output as evidence.

20.2 The Open-Source Model Cost Audit (Intermediate)

Tool: Claude Code | Time: 45-60 min

With open-source models closing the gap on frontier models at 60-70% lower cost, use this to audit where you can switch.

I want to identify which parts of my AI workflow I can shift to open-source models without sacrificing quality.

## Current Setup
- Primary model(s): [e.g., Claude Sonnet 4.6 for all tasks]
- Monthly AI API cost: [approximately $X]
- Top 3 use cases by volume:
  1. [e.g., code generation for new features — 40% of calls]
  2. [e.g., code review comments — 30% of calls]
  3. [e.g., documentation generation — 20% of calls]

## Candidate Open-Source Models
Consider: Xiaomi MiMo-V2-Pro (1T parameters, 67% cheaper than Claude Sonnet), Llama 4, Mistral Large 2, DeepSeek-V3

## Analysis Needed
For each of my top 3 use cases:
1. What is the minimum acceptable quality bar? (Give an example of an acceptable and unacceptable output)
2. Which open-source model would you test first for this use case and why?
3. What's the estimated cost saving if it works?
4. What's the risk if it fails in production?

Provide a migration plan: which use cases to migrate first, how to run A/B tests, and what rollback looks like.
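
For question 3, a back-of-the-envelope estimate is enough to rank use cases before any testing. The Python sketch below assumes spend is spread evenly across calls, so a use case with 30% of calls accounts for 30% of cost; that is rarely exact, because token counts differ by task. `estimated_monthly_saving` is a hypothetical helper.

```python
def estimated_monthly_saving(monthly_cost, share_of_calls, price_reduction):
    """Estimate the monthly saving from moving one use case to a cheaper model.

    monthly_cost:    current total monthly API spend
    share_of_calls:  fraction of calls in this use case (0 to 1)
    price_reduction: fractional price cut of the candidate model (0 to 1)
    """
    for name, value in (("share_of_calls", share_of_calls), ("price_reduction", price_reduction)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return monthly_cost * share_of_calls * price_reduction
```

With a $1,000 monthly bill, a use case at 30% of calls, and a model priced 67% lower (the figure quoted in the prompt), the estimate comes to roughly $201 a month. Replace the percentages with your own billing data before acting on the number.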

Category 21: Agentic Engineering Setup Prompts (New — March 2026)

New category based on the Karpathy "agentic engineering" framework — prompts for setting up disciplined AI-driven workflows with quality gates.

21.1 CLAUDE.md Architect (Advanced)

Tool: Claude Code | Time: 15-30 min

Analyze the codebase at [path] and generate a comprehensive CLAUDE.md file.

The CLAUDE.md should include:

## Project Overview
- What this codebase does (1-2 sentences)
- Primary language(s) and framework(s)
- Key architectural patterns used

## Core Conventions
- File naming conventions (with examples)
- Directory structure rules (what goes where and why)
- Import/export patterns

## Quality Gates (Non-Negotiable)
- Tests that MUST pass before any change is committed
- Linting rules to follow
- Type checking requirements

## What NOT to Do
- Patterns we've decided against and why
- Libraries not to add (and what to use instead)
- Anti-patterns specific to this codebase

## Agent Workflow
- How to run the dev server
- How to run tests
- How to build for production
- How to check for TypeScript errors

## Security Rules
- Files/directories agents must never modify
- Environment variables that must never be hardcoded
- External services agents should not call without explicit permission

Make the CLAUDE.md concrete and specific — generic advice is useless. Every rule should reference actual files or patterns in this codebase.

21.2 Agentic Quality Gate Builder (Intermediate)

Tool: Claude Code, Cursor | Time: 20-40 min

I want to add an automated quality gate to my [language/framework] project that runs after every AI-generated code change.

The gate should check:
1. **Test suite**: Run [test command] and fail if any tests fail
2. **Type safety**: Run [type check command] — zero errors tolerated
3. **Linting**: Run [lint command] — auto-fix where possible, fail on unfixable
4. **Security**: Check for hardcoded secrets, dangerous eval() patterns, SQL injection risks
5. **Import safety**: Verify no new dependencies added without explicit approval

Implementation requirements:
- Must run in under [time limit] seconds to not slow the feedback loop
- Should output a clear pass/fail with specific failure locations
- Should be invocable as a single command: [preferred command]
- Should be wirable into a pre-commit hook and CI/CD pipeline

Current stack: [describe stack]
Current test framework: [framework name]
CI/CD system: [GitHub Actions / GitLab CI / etc.]

Generate the quality gate script and the CI/CD configuration to run it automatically.
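
For reference, the core of such a gate is small. This Python sketch runs each check as a subprocess, prints PASS or FAIL per check, and reports overall success only if every command exits with 0. `run_gate` is a hypothetical helper, the timeout stands in for the prompt's [time limit], and the commands in the usage note are placeholders for your own.

```python
import subprocess


def run_gate(checks, timeout_seconds=120):
    """checks is a list of (name, argv) pairs; returns True only if all pass."""
    passed = True
    for name, argv in checks:
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout_seconds
            )
            ok = result.returncode == 0
            detail = (result.stdout + result.stderr).strip()
        except (subprocess.TimeoutExpired, FileNotFoundError) as exc:
            # A check that hangs or cannot start counts as a failure.
            ok, detail = False, str(exc)
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
        if not ok:
            passed = False
            print(detail)
    return passed
```

A TypeScript project might call it with `[("tests", ["npm", "test"]), ("types", ["npx", "tsc", "--noEmit"]), ("lint", ["npx", "eslint", "."])]` and exit non-zero when `run_gate` returns False, which is all a pre-commit hook or CI job needs. The secret and dependency checks from the prompt would be added as further entries.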

21.3 AI Security Code Review Prompt (Expert)

Tool: Claude Code, Claude Opus 4.6 | Time: 5-15 min per review

You are a senior security engineer performing a code review on AI-generated code.

Review the following code with heightened skepticism. AI coding agents have known failure patterns:
1. **Injection vulnerabilities** — SQL, command, LDAP, XPath injection from unvalidated inputs
2. **Auth logic errors** — Subtle flaws in authentication checks that appear correct but fail edge cases
3. **Memory safety** (in C/C++/Rust unsafe) — Use-after-free, buffer overflows, integer overflow
4. **Cryptographic misuse** — Incorrect IV reuse, weak algorithms, homemade crypto
5. **Secret leakage** — Hardcoded credentials, secrets in logs, insecure environment handling
6. **Race conditions** — Especially in async/concurrent code with shared state
7. **Denial of service** — Unbounded loops, unvalidated file sizes, ReDoS patterns

For each issue found:
- Severity: CRITICAL / HIGH / MEDIUM / LOW
- Location: [file:line]
- Description: What is wrong and why
- Exploitation: How an attacker could use this
- Fix: Specific corrected code

Code to review:
[PASTE CODE HERE]

Context: This code was generated by [AI tool] for [purpose]. It handles [data types] from [source: user input / internal / external API].


Category 22: Supabase Security Audit for Vibe-Coded Apps

Use case: Audit your Supabase database configuration after vibe coding a project. Catches the class of RLS vulnerabilities found in 10.3% of Lovable-generated apps (March 2026).

Tool: Claude Code, Claude Sonnet 4.6 | Time: 10-20 min

You are a Supabase security specialist auditing a vibe-coded application for database security vulnerabilities.

Perform a comprehensive Row Level Security (RLS) audit. Check the following:

## 1. RLS Enablement
Run this query and report any tables with RLS disabled:
SELECT schemaname, tablename, rowsecurity
FROM pg_tables
WHERE schemaname = 'public' AND rowsecurity = false;

## 2. Policy Quality Check
For each RLS policy, check:
- Is USING (true) present? (allows all rows — no security)
- Is WITH CHECK (true) present? (allows all writes — no security)
- Does it use auth.uid() to scope to the current user?
- Does it handle the anon role correctly?

## 3. Key Misconfigurations to Flag
- Tables with no policies despite having RLS enabled (blocks all access)
- SELECT policies missing USING (auth.uid() = user_id)
- Policies that expose other users' data via JOINs
- Service role key referenced in client-side code
- Missing UPDATE/DELETE policies on write-capable tables

## 4. Multi-Tenant Check (if applicable)
If this is a multi-tenant app:
- Verify org_id or team_id is used in all cross-tenant boundaries
- Check for tenant isolation gaps in shared tables

For each issue found, provide:
- Severity: CRITICAL (data exposure) / HIGH (bypass possible) / MEDIUM / LOW
- Table affected
- Current policy (or lack thereof)
- Corrected SQL to fix the issue

App context: [DESCRIBE YOUR APP — what data it stores, who the users are, whether it's multi-tenant]
Database schema: [PASTE YOUR schema.sql OR table definitions]
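
The first three items of the policy quality check can be pre-screened with a text scan of your migration files. The Python sketch below is a heuristic under stated assumptions: it looks at the SQL text of a single `CREATE POLICY` statement, does not parse SQL, and `audit_policy` is a hypothetical helper.

```python
import re

PERMISSIVE_RE = re.compile(r"\b(USING|WITH\s+CHECK)\s*\(\s*true\s*\)", re.IGNORECASE)

MESSAGES = {
    "USING": "USING (true) allows all rows",
    "WITH CHECK": "WITH CHECK (true) allows all writes",
}


def audit_policy(policy_sql):
    """Return human-readable warnings for one CREATE POLICY statement."""
    findings = []
    for match in PERMISSIVE_RE.finditer(policy_sql):
        clause = " ".join(match.group(1).upper().split())  # normalize whitespace
        findings.append(MESSAGES[clause])
    if "auth.uid()" not in policy_sql:
        findings.append("does not reference auth.uid()")
    return findings
```

Some tables are meant to be world-readable, so `USING (true)` on a SELECT policy for public content is a prompt for review, not an automatic failure. Policies that leak data through JOINs need the full audit above.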

Category 23: Claude Computer Use Automation Setup

Use case: Design automated workflows using Claude's new macOS computer use capability (launched March 2026). Build repeatable sequences for dev tasks, QA, and content pipelines.

Tool: Claude Pro/Max with Computer Use | Time: 15-30 min setup

You are designing an automation workflow using Claude's computer use capability on macOS.

I want to automate the following repeatable task:
[DESCRIBE TASK — e.g., "Deploy a Next.js app: run tests, build, commit, push, check Vercel deployment status"]

Design the workflow with:

## 1. Pre-conditions
What state must the system be in before starting? (files exist, servers running, env vars set)

## 2. Step-by-Step Sequence
For each step:
- Action: What Claude should do (click, type, navigate, wait)
- Expected outcome: What success looks like
- Failure check: What to do if this step fails

## 3. Error Handling
- Which steps are reversible vs. irreversible?
- When should Claude pause and ask for confirmation?
- What system state should be restored on failure?

## 4. Verification
Final checks to confirm the task completed successfully.

## 5. CLAUDE.md Entry
Write the CLAUDE.md instruction for this workflow so future runs can invoke it consistently.

Context about my setup:
- OS: macOS [version]
- Main tools: [list your dev tools — terminal, browser, editors]
- Frequency: [how often you'll run this]
- Risk tolerance: [low / medium — prefer confirmation prompts or full automation]
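
The step structure this prompt asks for (action, expected outcome, failure check, and a confirmation pause before irreversible steps) can be sketched as a small runner. This is a minimal illustration, not how Claude executes workflows; `Step`, `run_workflow`, and the `confirm` callback are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True when the expected outcome is observed
    reversible: bool = True     # irreversible steps require confirmation first


def run_workflow(steps: List[Step], confirm: Callable[[str], bool]) -> List[str]:
    """Run steps in order; stop at the first failure or declined confirmation."""
    log: List[str] = []
    for step in steps:
        # Pause before anything that cannot be undone (push, deploy, delete).
        if not step.reversible and not confirm(step.name):
            log.append(f"paused before: {step.name}")
            break
        ok = step.action()
        log.append(f"{'ok' if ok else 'failed'}: {step.name}")
        if not ok:
            break
    return log
```

Marking `git push` or a deploy as `reversible=False` is what forces the pause; a low risk tolerance simply means more steps carry that flag.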


Category 24: Framework-Agnostic Agent Design (GitAgent / Agent Portability)

Use case: Design AI agents that are portable across frameworks (LangChain, AutoGen, Claude Code) using the GitAgent spec pattern — decouple agent logic from runtime so you're not locked in. Directly applicable to AgenticNode workflows.

Tool: Claude Code, Cursor | Difficulty: Advanced | Time: 45-90 min

I am building an AI agent and want to make it framework-agnostic using the GitAgent portability pattern.

## Agent Description
Name: [AGENT NAME]
Purpose: [What this agent does — one sentence]
Inputs: [What data/context it receives]
Outputs: [What it produces or acts upon]
Steps: [High-level step sequence]

## Current Runtime
I'm currently using: [LangChain / AutoGen / Claude Code / custom]

## Design Requirements
1. Write a GitAgent-style YAML specification for this agent that captures:
   - Agent identity (name, version, description)
   - Inputs and output schema
   - Tool list (with name, description, and input/output for each tool)
   - Step sequence with branching logic
   - Error handling and retry policy
   - Context window management strategy

2. Show how the same spec would be executed on:
   - LangChain + LangGraph runtime
   - Claude Code (via CLAUDE.md + slash commands)
   - AutoGen (multi-agent teams)

3. Identify which parts of my agent logic are runtime-specific (cannot be abstracted) and flag them.

4. Recommend a testing harness to validate the agent behaves identically across runtimes.

Constraints:
- Prefer stateless steps where possible (easier to port)
- No framework-specific imports in the spec YAML
- Include a version field so specs can evolve without breaking existing runners


Category 25: AI Pipeline Security Audit (Langflow / n8n / Flowise)

Use case: Audit AI workflow pipelines built with Langflow, n8n, or Flowise for security vulnerabilities — critical after CVE-2026-33017 (Langflow unauthenticated RCE, CISA KEV, actively exploited) and CVE-2026-21858 (n8n CVSS 10.0 RCE). Vibe-coded projects frequently deploy these tools without hardening.

Tool: Claude Code, Cursor | Difficulty: Intermediate | Time: 20-40 min

You are a security engineer auditing an AI pipeline deployment for production readiness.

## Pipeline Details
Tool: [Langflow / n8n / Flowise / custom]
Version: [current version — verify against latest CVE advisories]
Deployment: [local / Docker / cloud / self-hosted VPS]
Public internet access: [yes / no]
Authentication: [none / basic auth / OAuth / API key]

## Audit Scope

1. Authentication & Exposure
   - Is the pipeline UI/API exposed to the public internet?
   - What authentication is in place? Is it sufficient?
   - Are any endpoints unauthenticated that should not be?
   - Check known CVEs: CVE-2026-33017 (Langflow RCE), CVE-2026-21858 (n8n RCE)

2. Secret Management
   - Are API keys (OpenAI, Anthropic, DB) in environment variables or hardcoded?
   - Scan all workflow files for hardcoded secrets
   - Are .env files excluded from version control?

3. Input Validation
   - What user inputs flow into LLM prompts? Could they carry prompt injection?
   - Are file upload nodes present? What types are accepted?
   - Are external webhook inputs validated before processing?

4. Network & Infrastructure
   - What ports are exposed? Should any be firewalled?
   - Are outbound connections restricted to necessary services?
   - Is TLS/HTTPS enforced for all external communication?

5. Output Safety
   - Do any nodes execute code based on LLM output? (Highest risk)
   - Are LLM outputs sanitized before use in DB queries or shell commands?

## Output Format
For each finding:
- Severity: Critical / High / Medium / Low
- Description: What the vulnerability is and how it could be exploited
- Evidence: Specific node, file, or configuration
- Fix: Exact remediation steps

End with a prioritized remediation checklist.
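
As a rough illustration of the secret-scanning step in item 2, the sketch below checks exported workflow text against a few patterns. The patterns are assumptions chosen for illustration (an AWS access key ID shape, a generic `sk-` prefixed key, an inline password field); a dedicated scanner covers far more cases, and `scan_text` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "sk_prefixed_api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "inline_password": re.compile(r'(?i)"?password"?\s*[:=]\s*"[^"]{4,}"'),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

Run it over each exported workflow JSON file and treat any hit as a finding to rotate and move into environment variables.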

Category 25b: Emergency Patch Protocol for CISA KEV Vulnerabilities

Use case: When a dependency in your project appears on the CISA Known Exploited Vulnerabilities catalog, use this prompt to rapidly assess exposure and patch safely.

Tool: Claude Code | Difficulty: Beginner | Time: 10-20 min | When to use: Immediately upon CVE notification

I need to emergency-patch a CISA KEV vulnerability in my project.

## CVE Details
CVE ID: [CVE-XXXX-XXXXX]
Affected component: [package name and affected versions]
My current version: [run: npm ls [pkg] OR pip show [pkg]]
CISA patch deadline: [date from KEV catalog]
Exploitation status: [actively exploited / proof-of-concept / theoretical]

## My Project Context
Project type: [Next.js / Laravel / Python / Node.js / etc.]
Where this component is used: [auth / API / background jobs / dev tooling only]
Deployed to: [Vercel / VPS / local only / Docker]
Can I take downtime: [yes, X minutes / no, zero-downtime required]

## Tasks

1. Assess actual exposure
   - Is this component reachable from the public internet?
   - What attack surface does this CVE target?
   - What would a successful exploit look like in my deployment?

2. Identify the safe patch version
   - Minimum version that resolves this CVE
   - Any breaking changes between my version and patched version?

3. Write the patch plan
   - Exact commands to update the package
   - Configuration changes needed post-update
   - Test commands to confirm patch applied and nothing broke

4. Post-patch verification checklist
   - How do I confirm the vulnerable version is no longer running?
   - What smoke tests should I run immediately after patching?

Execute the patch plan. If breaking changes exist, propose a migration path before making any changes.
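
Step 2 comes down to comparing your installed version with the first patched release. A minimal sketch, assuming plain numeric dotted versions with no pre-release tags (`parse_version` and `is_vulnerable` are hypothetical helpers; a real project would more likely use its package manager's own resolver):

```python
from typing import Tuple


def parse_version(v: str) -> Tuple[int, ...]:
    """Parse a plain dotted version such as '1.14.1' into (1, 14, 1)."""
    parts = [int(p) for p in v.strip().lstrip("v").split(".")]
    # Pad short versions so '1.2' compares equal to '1.2.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def is_vulnerable(installed: str, first_patched: str) -> bool:
    """True when the installed version is older than the first patched release."""
    return parse_version(installed) < parse_version(first_patched)
```

Comparing integer tuples avoids the classic string-comparison mistake where "1.9.3" sorts after "1.10.0".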

Category 26: MCP Integration Prompts (Added March 2026)

Model Context Protocol (MCP) is now the standard way to give AI coding assistants persistent context and tool access. These prompts help you integrate MCP correctly and securely.

26.1 MCP Server Setup Prompt

Tool: Claude Code | Difficulty: Intermediate | Time: 30-60 min

Set up an MCP (Model Context Protocol) server for my project that exposes the following tools to AI assistants:

## Tools to Expose
1. [Tool 1 name]: [what it does — e.g., "read_project_data: reads the projects.json registry"]
2. [Tool 2 name]: [what it does — e.g., "run_health_check: pings all deployment URLs"]
3. [Tool 3 name]: [what it does — e.g., "get_recent_errors: reads the last 50 error log lines"]

## Implementation Requirements
- Use the @modelcontextprotocol/sdk package
- Implement as stdio transport (not HTTP) for local use
- Each tool must have a clear JSON schema for inputs
- Each tool must return structured JSON output
- Add error handling that returns helpful error messages, not stack traces
- Include a test script that exercises each tool

## Configuration
Generate the MCP configuration block for claude_desktop_config.json:
{
  "mcpServers": {
    "[server-name]": {
      "command": "node",
      "args": ["path/to/server.js"]
    }
  }
}

## Context This Will Enable
When this MCP server is active, an AI assistant will be able to [describe what new capabilities this enables for your workflow].

Build the complete MCP server. Start with the tool definitions, then the handlers, then the test script.

26.2 CLAUDE.md Project Context Setup

Tool: Claude Code | Difficulty: Beginner | Time: 15 min

I'm setting up a project-level context file (CLAUDE.md) so Claude Code has persistent context about my project without me having to re-explain it every session.

Create a CLAUDE.md file that covers:

## Project Identity
- Name: [project name]
- Purpose: [one sentence]
- Stack: [tech stack]
- Current status: [active development / maintenance / paused]

## Key Files and Their Purpose
- [file path]: [what it contains and when to read it]
- [file path]: [what it contains and when to read it]

## Commands
- Build: [command]
- Dev server: [command]
- Test: [command]
- Deploy: [command]

## Architecture Decisions That Are NOT Up for Discussion
- [Decision 1]: [why it was made — do not suggest alternatives]
- [Decision 2]: [why it was made]

## Known Issues (Don't Re-Investigate)
- [Issue 1]: [known limitation, not a bug to fix]

## My Workflow
- I prefer [file-by-file / whole-feature] implementations
- Always [run tests / lint / build] before marking a task done
- When in doubt, [ask / make conservative choice / make opinionated choice]

Make the CLAUDE.md scannable and under 200 lines.

26.3 Next.js Secure Middleware Pattern (Security-critical — post-CVE-2025-29927)

Tool: Claude Code, Cursor | Difficulty: Intermediate | Time: 20 min

Add authentication to my Next.js app using the secure dual-layer pattern (required post-CVE-2025-29927).

## Protected Routes
- /dashboard/:path* — requires authenticated user
- /api/protected/:path* — requires authenticated user, returns 401 JSON (not redirect)
- /admin/:path* — requires authenticated user with admin role

## Auth Provider
I'm using: [NextAuth v5 / Supabase Auth / Clerk / custom JWT]

## Implementation Rules
1. Middleware ONLY for UX redirects (fast redirect to /login for protected pages)
2. Every /api/protected route MUST verify the session server-side independently
3. NEVER rely on middleware as the sole auth gate for API routes
4. Add a comment noting that the x-middleware-subrequest header (the CVE-2025-29927 bypass vector) must be stripped from external requests before they reach the app

## Pattern to Implement
For each protected API route:
// DO NOT rely on middleware alone — verify here
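// getServerSession is the NextAuth v4 API; with NextAuth v5, call auth() from your auth config instead
const session = await getServerSession(authOptions)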
if (!session) {
  return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
}

Generate:
1. middleware.ts with the correct matcher config and a comment explaining it is NOT a security boundary
2. A shared auth utility function (lib/auth-guard.ts) that API routes can call
3. One example protected API route using the utility
4. A test that verifies the API route returns 401 when no session exists


Category 27: Claude Sonnet 4.6 Agentic Workflow Optimization

Added April 6, 2026 — in response to Claude Sonnet 4.6 release with improved agentic search efficiency and raised Message Batches API limits (300K tokens).

27.1 Model Upgrade Audit Prompt (Intermediate)

Tool: Claude Code, Cursor | Time: 15-30 min

Use this prompt when upgrading an existing agent or AI-powered codebase from an older Claude model to Sonnet 4.6.

I'm upgrading my Claude-powered application from [old model, e.g. claude-sonnet-4-5] to claude-sonnet-4-6.

Please:
1. Search the codebase for all occurrences of the old model string and list them with file:line references
2. Identify any hardcoded max_tokens values that may be suboptimal for the new model's output limits
3. Check if I'm using the Message Batches API — if so, note I can now use up to 300K tokens per batch
4. Flag any places where I'm manually chunking context that the new model's improved agentic search might handle automatically
5. Generate a migration checklist with all changes needed

For each file that needs updating, show the exact before/after diff.

27.2 Token Efficiency Audit for Agentic Tasks (Advanced)

Tool: Claude Code | Time: 20-45 min

Use when your agent is burning too many tokens on multi-step tasks.

I have an agentic workflow that's consuming too many tokens. Here's my current implementation:

[paste your agent code or describe the flow]

Please analyze it and identify:

1. **Redundant search calls** — are we searching for the same files/content multiple times?
2. **Over-fetching** — are we loading full files when we only need specific sections?
3. **Context bloat** — are we accumulating context that could be cleared between tool calls?
4. **Inefficient query patterns** — are search queries too broad, returning large irrelevant results?

For each issue found:
- Show the exact code causing the inefficiency
- Explain why it's wasteful
- Provide the refactored version
- Estimate token savings (rough %)

Claude Sonnet 4.6 has improved internal context retention and chunked result prioritization — some of these fixes may happen automatically with a model upgrade, but the rest need code changes.

27.3 Batch API Migration Prompt (Advanced)

Tool: Claude Code, API | Time: 30-60 min

Use when you need to process many files, records, or items in parallel and want to migrate to the Message Batches API.

I need to process [N items] using Claude — currently I'm making sequential API calls which is slow and rate-limited.

Items to process: [describe: files, database records, API responses, etc.]
What I need from each: [describe output: classification, extraction, summary, audit, etc.]
Current code: [paste existing sequential implementation]

Please:
1. Refactor this to use Anthropic's Message Batches API (claude-sonnet-4-6 supports 300K tokens per batch)
2. Design the batch request structure with appropriate custom_id for tracking
3. Add batch polling logic with status checks (batches are async — poll until complete)
4. Handle partial failures gracefully (some items may fail; process the rest)
5. Add result aggregation that maps custom_id back to the original item

Requirements:
- Do not exceed 300K total tokens per batch
- Include a token estimation function to split large datasets across multiple batches if needed
- Output should be typed (TypeScript interfaces for the results)

Show the complete implementation including the polling loop and result handler.
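
The token estimation and batch-splitting requirement can be sketched as a greedy packer. This is an illustration under stated assumptions: the 4-characters-per-token ratio is a rough heuristic rather than a real tokenizer, the budget is passed in rather than hard-coded, and `estimate_tokens` and `split_into_batches` are hypothetical names.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def split_into_batches(items: List[str], max_tokens: int) -> List[List[int]]:
    """Greedily pack item indexes into batches whose estimated total stays within max_tokens."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, text in enumerate(items):
        cost = estimate_tokens(text)
        if cost > max_tokens:
            raise ValueError(f"item {index} alone exceeds the budget ({cost} > {max_tokens})")
        if current and used + cost > max_tokens:
            batches.append(current)  # close the full batch and start a new one
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Returning indexes rather than the texts themselves makes it straightforward to derive a `custom_id` such as `item-{index}` and map each result back to its source item.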

Category 27b: Supply Chain Attack Response Prompts

Added April 6, 2026 — in response to GHSA-fw8c-xr5c-95f9 (Axios npm supply chain compromise — RAT delivery via versions 1.14.1 and 0.30.4, actively exploited). Use these prompts when a dependency in your project is confirmed or suspected to be compromised.

27b.1 Dependency Compromise Triage (Expert)

Tool: Claude Code | Time: 15-30 min | Use: Immediately when a supply chain attack is confirmed

A critical supply chain compromise has been identified in [package name] versions [affected versions].

CVE/Advisory: [paste CVE ID or GHSA link]
Compromise type: [malware delivery / credential theft / backdoor / data exfiltration]
Safe versions: [version to upgrade to, or N/A if no safe version exists]

Please immediately:

1. **Inventory scan** — search the entire codebase and package.json/lock files for any reference to [package name]. Include:
   - Direct dependencies in package.json
   - Transitive dependencies (check package-lock.json / yarn.lock / pnpm-lock.yaml)
   - Any inline requires or dynamic imports

2. **Version check** — for each occurrence, identify the exact version installed (from lock file, not semver range)

3. **Exposure assessment** — where in the code is this package called? What data flows through it? Could attacker code have accessed:
   - Environment variables (API keys, secrets)
   - File system paths
   - Network request payloads
   - Database credentials

4. **Remediation plan** — provide exact commands to:
   a. Remove the compromised package
   b. Install the safe version (if available)
   c. Regenerate the lock file
   d. Verify the fix (how to confirm the compromised version is gone)

5. **Incident indicators** — what forensic artifacts should I look for to determine if the malicious code executed?

Output as an actionable checklist with bash commands I can run right now.
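
The inventory and version-check steps can be approximated by walking the lockfile directly. A minimal sketch, assuming an npm `package-lock.json` with lockfileVersion 2 or 3 (which lists every installed copy under `packages`); `find_compromised` is a hypothetical helper, and the lockfile is passed in as an already-parsed dict.

```python
from typing import Dict, List, Set, Tuple


def find_compromised(lock: Dict, name: str, bad_versions: Set[str]) -> List[Tuple[str, str]]:
    """Return (lockfile_path, version) for every installed copy of `name` in bad_versions."""
    hits: List[Tuple[str, str]] = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/axios"; the root project is "".
        if path == f"node_modules/{name}" or path.endswith(f"/node_modules/{name}"):
            version = meta.get("version", "")
            if version in bad_versions:
                hits.append((path, version))
    return sorted(hits)
```

Load the file with `json.load(open("package-lock.json"))` and call, for example, `find_compromised(lock, "axios", {"1.14.1", "0.30.4"})`; nested paths in the result show which parent dependency pulled the compromised copy in.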

27b.2 Post-Compromise Lockfile Hardening (Intermediate)

Tool: Claude Code | Time: 20-30 min | Use: After resolving a supply chain incident

I just resolved a supply chain attack via [package name]. Now I want to harden my project against future similar attacks.

Current package manager: [npm / yarn / pnpm / bun]
Current CI/CD: [GitHub Actions / GitLab CI / CircleCI / etc.]

Please implement the following supply chain hardening measures:

1. **Lockfile pinning** — ensure package-lock.json / yarn.lock / pnpm-lock.yaml is committed and `npm ci` (not `npm install`) is used in CI

2. **Subresource integrity** — if using CDN-delivered packages, add integrity hashes

3. **Dependency review workflow** — add a GitHub Actions step using `actions/dependency-review-action` to block PRs that introduce known-vulnerable packages

4. **npm audit in CI** — add `npm audit --audit-level=high` to the build pipeline so new high/critical CVEs fail the build

5. **Provenance verification** — if any packages support npm provenance (published with `--provenance` flag), add verification

6. **postinstall script detection** — create a CI step that flags packages with postinstall scripts (the Axios attack vector) for manual review:
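   ```bash
   # List packages that declare install-time scripts (preinstall, install, postinstall).
   # npm lockfiles (v2/v3) record this as hasInstallScript; they do not store the scripts themselves.
   jq -r '.packages | to_entries[] | select(.value.hasInstallScript == true) | .key' package-lock.json
   ```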

For each measure, provide the exact code/config to add, where to add it, and what it protects against.
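
For the install-script detection measure, a CI gate could look roughly like the sketch below. It assumes an npm lockfile (v2 or v3), where packages with preinstall, install, or postinstall scripts carry a `hasInstallScript` flag; `install_script_packages` and the allowlist are hypothetical.

```python
from typing import Dict, FrozenSet, List


def install_script_packages(lock: Dict, allowlist: FrozenSet[str] = frozenset()) -> List[str]:
    """Return lockfile paths of packages that run install-time scripts and are not allowlisted."""
    flagged = [
        path
        for path, meta in lock.get("packages", {}).items()
        if meta.get("hasInstallScript") and path not in allowlist
    ]
    return sorted(flagged)
```

In CI, load `package-lock.json`, call the function with the set of packages you have already reviewed, and exit non-zero if the returned list is not empty.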


---

## Category 28: AI Agent Framework Security Audit *(Added April 2026)*

*Prompted by CVE-2026-33017 (Langflow unauthenticated RCE, CISA KEV), CVE-2026-39888 (PraisonAI sandbox escape CVSS 9.9), and CVE-2026-35615 (PraisonAI path traversal). The eval() epidemic of 2026 — if your pipeline uses Langflow, n8n, Flowise, PraisonAI, or any framework executing dynamic code, run this audit.*

### 28.1 AI Framework Vulnerability Audit (Advanced)
**Tool**: Claude Code | **Time**: 20-40 min

Audit my AI agent framework setup for eval()/exec() vulnerabilities and unauthenticated execution endpoints.

Framework in use: [Langflow / n8n / Flowise / PraisonAI / custom]
Framework version: [x.x.x]
Deployment: [local / cloud VM / Docker / k8s]
Public endpoints: [yes / no / unknown]

Please audit the following attack vectors:

1. Unauthenticated code execution endpoints
   - Search for POST endpoints that accept flow/pipeline/component definitions
   - Verify each endpoint requires valid authentication before executing any logic
   - Flag any "public" or "tmp" endpoints that bypass auth

2. eval()/exec() usage in custom components
   - Scan all custom nodes, components, and tool definitions for eval/exec calls
   - For each occurrence: identify whether the input is user-controlled and what sandbox or validation is in place
   - Classify as: CRITICAL (user input, no sandbox), WARNING (user input, weak sandbox), INFO (internal use only)

3. File path validation
   - Find all file read/write operations that accept user-supplied paths
   - Verify path validation happens AFTER normalization, not before
   - Check for: validation that runs before os.path.normpath() or os.path.realpath(), and bare startswith() prefix tests where resolved absolute paths should be compared (for example with os.path.commonpath())

4. Dependency freshness
   - Check framework package version against latest release
   - Cross-reference against CISA KEV catalog entries for this framework
   - List any CVEs in installed version that are unpatched

For each finding, provide: severity (CRITICAL/HIGH/MEDIUM), file:line, description, and the exact fix code.
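
The "validate after normalization" rule in item 3 can be sketched as follows. This is a minimal example on POSIX-style paths, and `is_within` is a hypothetical helper: it resolves both paths first and then compares them with `os.path.commonpath`, which avoids the prefix bug where `/srv/app/data_evil` passes a `startswith("/srv/app/data")` test.

```python
import os


def is_within(base_dir: str, user_path: str) -> bool:
    """True only if user_path, resolved against base_dir, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    # Resolve "..", symlinks, and absolute overrides BEFORE checking containment.
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base
```

An absolute `user_path` replaces the base in `os.path.join`, so it is rejected by the same comparison unless it already points inside the base directory.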


### 28.2 Safe Dynamic Code Execution Pattern (Intermediate)
**Tool**: Claude Code, Cursor | **Time**: 30-60 min

I need to execute user-provided code snippets in my AI agent pipeline safely. Replace any existing eval()/exec() approach with a properly sandboxed implementation.

Current approach (what to replace): [paste your current eval/exec code here]

Requirements:

Implement one of these options based on my requirements:

- Option A (Python, Pyodide WASM): Run Python in a WebAssembly sandbox with no OS access
- Option B (any language, E2B cloud sandbox): Spin up an isolated cloud microVM per execution and destroy it afterwards
- Option C (Node.js, isolated-vm): V8 isolate with strict resource limits and no access to Node.js built-ins (avoid vm2, which is discontinued and has unpatched sandbox escapes)
- Option D (Docker exec): Container per execution with network isolation and a read-only filesystem

For the chosen option:

  1. Provide the complete implementation replacing my eval/exec
  2. Add resource limits (CPU time, memory, execution time)
  3. Add input sanitization before the sandbox boundary
  4. Add output validation after sandbox returns
  5. Add error handling for timeout, OOM, and sandbox escape attempts
  6. Estimate cost per 1,000 executions for cloud options
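
For Option D, the container-per-execution idea can be sketched by building the `docker run` command explicitly. This is a sketch under assumptions: the image name, the limits, and the helper names `docker_sandbox_argv` and `run_sandboxed` are illustrative, and a production version would also name the container and kill it on timeout, since the timeout below only stops the Docker client.

```python
import subprocess
from typing import List


def docker_sandbox_argv(code: str, image: str = "python:3.12-slim",
                        memory: str = "256m", cpus: str = "0.5") -> List[str]:
    """Build a docker command that runs `code` with no network and a read-only filesystem."""
    return [
        "docker", "run", "--rm",
        "--network", "none",    # no inbound or outbound network
        "--read-only",          # root filesystem mounted read-only
        "--memory", memory,     # hard memory cap
        "--cpus", cpus,         # CPU share cap
        "--pids-limit", "64",   # blocks fork bombs
        "--cap-drop", "ALL",    # drop Linux capabilities
        image, "python", "-I", "-c", code,
    ]


def run_sandboxed(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run untrusted code in a throwaway container; raises TimeoutExpired on overrun."""
    return subprocess.run(docker_sandbox_argv(code), capture_output=True,
                          text=True, timeout=timeout_s)
```

Passing the command as an argument list rather than a shell string keeps user-supplied code from being interpreted by the host shell.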

### 28.3 AI Pipeline Endpoint Hardening (Advanced)
**Tool**: Claude Code | **Time**: 15-30 min

Harden all AI agent pipeline API endpoints against unauthenticated execution attacks.

Stack: [FastAPI / Express / Next.js API routes / Flask / other]
Auth system: [JWT / session cookie / API key / Supabase Auth / Clerk / other]
Current endpoints that execute logic: [list them]

For each endpoint, implement:

1. Authentication middleware
   - Verify auth token/session on every request before any execution
   - Return 401 with no information leakage if auth fails
   - Implement rate limiting: [N] requests per minute per authenticated user

2. Request validation
   - Validate all fields with strict schemas (Zod / Pydantic / Joi)
   - Reject any field not in the schema (no passthrough of extra fields)
   - Max payload size: [N] KB (reject larger payloads with 413)

3. Execution authorization
   - Implement per-user or per-role permissions for which pipeline actions are allowed
   - Log all execution requests with: timestamp, user ID, action, IP, payload hash

4. Secrets isolation
   - Verify no API keys, database credentials, or secrets are accessible from within executed code
   - If using environment variables, confirm they are not readable from user-controlled execution context

Provide the middleware code, the updated endpoint handlers, and a test that verifies unauthenticated requests are rejected.
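
The execution log in item 3 records a payload hash rather than the payload itself, so logs stay useful for forensics without storing user data. A minimal sketch, with `audit_record` and its field names as hypothetical choices:

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(user_id: str, action: str, ip: str, payload: dict,
                 now: Optional[datetime] = None) -> dict:
    """Build one log entry: timestamp, user ID, action, IP, and a SHA-256 of the payload."""
    # Canonical JSON (sorted keys, no extra whitespace) so equal payloads hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    ts = now or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "user_id": user_id,
        "action": action,
        "ip": ip,
        "payload_sha256": hashlib.sha256(canonical).hexdigest(),
    }
```

Two requests with the same body produce the same hash regardless of key order, which makes replayed or repeated payloads easy to spot in the log.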


---

## Category 29: Claude Sonnet 4.6 — 1M Context & Agentic Search Prompts (April 2026)

*Claude Sonnet 4.6 introduced two capabilities that change how you structure prompts: a 1M token context window (beta) and GA web search/web fetch with code-execution-based result filtering. These prompts exploit both.*

### 29.1 The Whole-Codebase Refactor Prompt (Expert)
**Tool**: Claude Sonnet 4.6 via API or Claude Code | **Context required**: 200K–1M tokens

With the 1M context window, you can load an entire medium-sized codebase and ask for architectural analysis without chunking. This works for repositories up to ~150K lines.

Codebase Refactor Brief

Repository: [project-name]
Goal: [Specific refactor objective, e.g., "migrate from Pages Router to App Router", "replace all class components with hooks", "extract shared utilities from duplicated code"]
Constraints:

Files loaded below (entire codebase follows in this message): [Paste full codebase or use file upload — Claude Sonnet 4.6 handles up to 1M tokens]

Output requested:

  1. A prioritized list of refactor changes (most impactful first)
  2. For each change: which files are affected, what changes, and estimated risk level (low/medium/high)
  3. A proposed commit sequence (small atomic commits, safest order)
  4. Any architectural concerns that would block this refactor

Do NOT generate code yet — produce the analysis and plan first. I will confirm before implementation begins.


### 29.2 The Research-Then-Build Prompt (Intermediate)
**Tool**: Claude Sonnet 4.6 (web search GA) | **Time**: 15–30 min

Sonnet 4.6's web search and web fetch are GA, with dynamic result filtering via code execution. This prompt chains research directly into implementation — no context-switching between browser and editor.

Research-Then-Build Task

What I'm building: [Short description — e.g., "a rate limiter middleware for my Next.js API routes"]

Research phase (do this first — use web search):

  1. Search for: "[topic] best practices [current year]"
  2. Fetch the top 2–3 relevant documentation pages
  3. Identify: (a) the standard pattern, (b) common failure modes, (c) security considerations
  4. Write a 3-bullet summary of your findings before writing any code

Build phase (only after research summary is written):

Validation:

Start with the research phase. Do not write code until research summary is complete.


### 29.3 The Extended-Thinking Architecture Decision Prompt (Advanced)
**Tool**: Claude Sonnet 4.6 with extended thinking | **Time**: 5 min prompt, 10–20 min thinking

Extended thinking gives the model more compute budget before it commits to an answer. Use this for architecture choices where a wrong call means weeks of rework.

Architecture Decision Request

Decision to make: [e.g., "Should I use Supabase Realtime or polling for my live dashboard?"]

Context:

What I've already considered:

What I need:

  1. Evaluate both options against my specific constraints (not generic trade-offs)
  2. Identify what I'm missing or wrong about in my reasoning
  3. Recommend one option with confidence level (high/medium/low) and what would change your recommendation
  4. Give me the one question I should answer before committing

Take your time — a slow, thorough answer beats a fast, wrong one.


---

## Category 30: April 2026 — Agent Framework, Security Audit & Parallel Fleet Prompts

*Three new workflows unlocked by the April 2026 AI tooling wave: Microsoft Agent Framework 1.0 multi-agent orchestration, Claude Mythos-style security audit chaining, and Cursor 3 parallel agent fleet management.*

### 30.1 The Microsoft Agent Framework 1.0 Orchestration Prompt (Advanced)
**Tool**: Microsoft Agent Framework 1.0 (.NET or Python), Claude Code | **Time**: 30–60 min setup

Agent Framework 1.0 ships with A2A and MCP protocol support, enabling cross-runtime agent interoperability. Use this prompt to design multi-agent workflows that span different AI providers without lock-in.

Multi-Agent Workflow Design Request

Workflow goal: [What the agent system should accomplish end-to-end — e.g., "receive a GitHub issue, research the codebase, implement a fix, open a PR, and notify Slack"]

Agents needed (describe each):

Coordination protocol: A2A (agent-to-agent messages) | MCP (tool calls to shared context) | Both
Runtime: .NET | Python | Both

State management:

Error handling:

Output required:

  1. Agent architecture diagram (ASCII or described)
  2. Agent Framework 1.0 code scaffold for each agent class
  3. The A2A message schema for agent handoffs
  4. The MCP tools each agent needs registered
  5. DevUI configuration for browser-based debugging

Generate the scaffold. I will fill in the business logic per agent.


### 30.2 The AI Security Audit Chain Prompt (Expert)
**Tool**: Claude Sonnet 4.6 or Claude Code with CyberOS MCP | **Time**: 20–40 min per codebase

Inspired by Claude Mythos / Project Glasswing's defensive security workflow — systematically chain vulnerability discovery, triage, and remediation across a codebase without missing surface area.

AI-Powered Security Audit — Systematic Chain

Codebase: [Repo path or paste content]
Stack: [e.g., Next.js 14 + Supabase + Stripe + Python FastAPI backend]
Deployment: [Vercel + AWS Lambda | Self-hosted | Cloud provider]
Compliance scope: [OWASP Top 10 | SOC 2 | PCI-DSS | All]

Phase 1 — Attack Surface Map

List every:

Do not analyze yet. Only map. Output as a numbered list.

Phase 2 — Vulnerability Scan

For each item on the attack surface map, check for:

Classify each finding: CRITICAL / HIGH / MEDIUM / LOW / INFO. Include the CWE ID and the exact file:line where the issue exists.

Phase 3 — Remediation Plan

For each CRITICAL and HIGH finding:

  1. Explain the vulnerability in one sentence
  2. Write the fixed code (before/after diff)
  3. Explain why the fix works

Phase 4 — Verification

After remediations are applied:

Start with Phase 1. Do not proceed to Phase 2 until I confirm the attack surface map is complete.


### 30.3 The Cursor 3 Parallel Agent Fleet Prompt (Advanced)
**Tool**: Cursor 3 Agents Window | **Time**: 5 min to launch, 30–120 min execution

Cursor 3's Agents Window lets you run multiple AI agents simultaneously across local, SSH, and cloud environments. This prompt template structures how to decompose work across a fleet efficiently so agents don't conflict.

Parallel Agent Fleet Assignment

Project: [Brief description of the codebase]
Goal: [What needs to be accomplished, e.g., "ship the user dashboard feature including data layer, UI components, tests, and documentation"]

Fleet decomposition (define independent workstreams that can run in parallel):

Agent A — [Name: e.g., "Data Layer"]

Agent B — [Name: e.g., "UI Components"]

Agent C — [Name: e.g., "Tests & Docs"]

Conflict prevention:

Review order:

  1. Review Agent A output first
  2. Review Agent B output (may depend on A's types)
  3. Review Agent C output last (depends on both)

Launch in the Agents Window: Open one agent session per row above. Paste the Agent-specific block into each session. Start all simultaneously.
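
Before launching a fleet, it can help to check mechanically that no two agents have been assigned the same file. The sketch below is one hypothetical way to do that; `find_conflicts` and the assignment format are assumptions for illustration, not a Cursor feature.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def find_conflicts(assignments: Dict[str, Set[str]]) -> List[Tuple[str, str, List[str]]]:
    """Return (agent_a, agent_b, shared_files) for every pair of agents with overlapping files."""
    conflicts: List[Tuple[str, str, List[str]]] = []
    ordered = sorted(assignments.items(), key=lambda kv: kv[0])
    for (a, files_a), (b, files_b) in combinations(ordered, 2):
        shared = sorted(files_a & files_b)
        if shared:
            conflicts.append((a, b, shared))
    return conflicts
```

Any file that shows up in the result should be moved to a single owner, or turned into a read-only dependency for the other agent, before the sessions start.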


---

*This library is updated monthly with new prompts based on emerging tools, patterns, and reader requests. Last updated: April 13, 2026. Added: Category 30 (Agent Framework 1.0 orchestration, AI security audit chain, Cursor 3 parallel fleet management). Previous: Category 29 (Claude Sonnet 4.6: 1M Context & Agentic Search Prompts, April 10), Category 28 (AI Agent Framework Security Audit, April 9), Category 27 (Claude Sonnet 4.6 Agentic Workflow Optimization, April 6), Category 26 (MCP Integration, March 31).*

18. Tool Comparison Matrix

Updated March 22, 2026

A living comparison of every major vibe coding tool. Updated monthly.

AI-Native IDEs

| Tool | Price | Best For | Key Feature | Security Concern |
|---|---|---|---|---|
| Cursor | $20/mo | Full-stack dev, large codebases | Composer multi-file gen, Automations (event-driven agents), MCP Apps | CurXecute (CVE-2025-54135) |
| Windsurf (acquired) | N/A | Long-context projects | Memories (persistent context) | Memory poisoning via prompt injection |
| VS Code + Copilot | $10/mo | AI without switching editors | Inline suggestions, Agent Mode, chat | Lower risk (suggestions, not autonomous) |

Autonomous Agents

| Tool | Price | Best For | Autonomy | Differentiator |
|---|---|---|---|---|
| Claude Code | Usage-based | Enterprise codebases | High (subagent teams) | $2.5B+ ARR, 80.9% SWE-bench (#1 of 15 agents), multi-agent orchestration |
| Devin | $500/mo | Async tasks, migrations | Very High | Full AI employee model, Devin Review |
| Codex CLI | Usage-based | Open-source, Rust/systems | Medium | Open-source, sandboxed execution |
| Jules | Free-$125/mo | Async bugfixes, PR gen | High | Works while you sleep, Gemini 3 Pro |
| Amazon Q | Free-$19/mo | AWS-heavy projects | Medium | Deep AWS integration |

Browser Builders (No-Code)

| Tool | Price | Best For | Output Quality | Risk Level |
|---|---|---|---|---|
| Bolt.new | Free-$20/mo | Rapid full-stack prototypes | Good | Medium |
| v0 | Free-$20/mo | React/Next.js UI components | Excellent | Low (UI only) |
| Lovable | Free-$25/mo | Non-dev app creation | Good | High (170/1645 apps had vulns) |
| Replit Agent | Free-$25/mo | Complete apps from description | Good | Medium. $400M Series D, $9B valuation (Mar 2026). 75% of Replit AI users write zero code. |

Open-Source & Cost-Efficient Alternatives

For teams optimizing cost, data privacy, or running on self-hosted infrastructure.

| Model/Tool | Parameters | Cost vs Claude Sonnet | SWE-bench / Rank | Best For |
|---|---|---|---|---|
| MiMo-V2-Pro (Xiaomi) | 1 Trillion (Hunter Alpha) | 67% cheaper than Claude Sonnet 4.6 | 3rd globally on agent benchmarks (Mar 2026) | Cost-sensitive production workloads, batch jobs |
| Gemini CLI (Google) | N/A (cloud) | Free tier available | Competitive, Flash variant | Open-source terminal work, Google ecosystem |
| Codex CLI (OpenAI) | N/A (cloud) | Usage-based (GPT-5.4) | 77.3% Terminal-Bench | Sandboxed execution, CI/CD integration |
| obra/superpowers | N/A (framework) | Free + model API costs | 92,100 GitHub stars (Mar 2026) | Custom agent framework, multi-step workflows |
| OpenClaw | N/A (framework) | Free + model API costs | 210,000 GitHub stars (Mar 2026) | Open-source agent orchestration, self-hosted |

Choosing Your Stack

👨‍💻 Professional Developer
Claude Code + Cursor. Best reasoning + best IDE. Devin for async/overnight work.
🚀 Startup Founder
Cursor + Bolt.new. Cursor for core product, Bolt for rapid prototyping and validation.
👤 Non-Technical
Lovable or Bolt.new. But hire a security professional before handling user data.
🏢 Enterprise
Claude Code (team) + Devin (migrations) + human review gates.
🔗
**Watch tool demos:** See these tools in action on [YouTube @endofcoding](https://youtube.com/@endofcoding). Compare hands-on at [vibe-coding.academy](https://vibe-coding.academy).

19. The Security Playbook

Updated April 9, 2026

A practical guide to hardening vibe-coded applications before they touch real users.

**The reality:** The December 2025 Tenzai study found 69 vulnerabilities across just 15 AI-built applications. The February 2026 IDEsaster disclosure revealed 30+ vulnerabilities and 24 CVEs affecting 1.8M developers. AI-generated code is 2.74x more likely to introduce XSS than human code. Security is not optional.

The 30-Minute Security Checklist

Run this on every vibe-coded application before showing it to anyone outside your team:

🔒
Authentication (5 min)
- Passwords hashed with bcrypt or argon2 (not MD5, SHA, or plaintext)
- Sessions stored in HTTP-only, Secure, SameSite cookies (not localStorage)
- CSRF tokens on every form
- Rate limiting on login endpoint (5 attempts per 15 min)
- No credentials hardcoded in source code
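
The first item can be sketched with Python's standard library alone. The checklist names bcrypt or argon2, which need third-party packages; this sketch swaps in `hashlib.scrypt`, a different salted and deliberately slow key-derivation function, only so that it runs without installs. The function names and cost parameters are illustrative assumptions, not recommendations from the studies cited above.

```python
import hashlib
import hmac
import os

# Cost parameters are illustrative assumptions; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for every password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

Store the salt next to the digest. Because the salt is random, hashing the same password twice yields two different digests, which is what defeats precomputed lookup tables.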
📝
Input Handling (5 min)
- All database queries use parameterized statements (no string concatenation)
- HTML output sanitized (no raw user input rendered)
- File uploads validated (type, size, name; no path traversal)
- API request bodies validated server-side (not just client-side)
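
A minimal sketch of the first two items, using Python's built-in `sqlite3` and `html` modules. The table, column, and function names are hypothetical; the point is that the user-supplied value travels separately from the SQL text and is escaped before it reaches HTML.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))


def find_user(name: str) -> list[tuple]:
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes inside `name` cannot change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def render_greeting(name: str) -> str:
    # Escape user input before placing it in HTML output.
    return f"<p>Hello, {html.escape(name)}</p>"
```

With string concatenation, the input `alice' OR '1'='1` would match every row. With the placeholder it is treated as an odd-looking name and matches nothing.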
🛡
Data Protection (5 min)
- HTTPS enforced (HSTS header set)
- API responses don't leak internal data (no password hashes, debug info, stack traces)
- Sensitive data encrypted at rest (API keys, user PII)
- Error messages are generic (no "user not found" vs "wrong password" distinction)
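
Two of these items come down to small, testable functions. The sketch below assumes a hypothetical user record and field allowlist: responses are built from an explicit list of public fields, and every failed login returns the same message regardless of the cause.

```python
from typing import Optional

# Hypothetical allowlist: only these fields ever leave the server.
PUBLIC_FIELDS = ("id", "display_name")
GENERIC_LOGIN_ERROR = "Invalid email or password."


def to_public(record: dict) -> dict:
    """Copy only allowlisted fields, so new internal columns stay private by default."""
    return {key: record[key] for key in PUBLIC_FIELDS if key in record}


def login_error(user_exists: bool, password_ok: bool) -> Optional[str]:
    """Return None on success, otherwise one message for every failure cause."""
    if user_exists and password_ok:
        return None
    return GENERIC_LOGIN_ERROR
```

An allowlist fails safe: a field added to the database later is hidden until someone deliberately exposes it, whereas a blocklist leaks it until someone remembers to hide it.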
Infrastructure (5 min)
- `npm audit` shows no critical/high vulnerabilities
- Security headers: Content-Security-Policy, X-Frame-Options, X-Content-Type-Options
- CORS restricted to specific origins (not `*`)
- Environment variables for all secrets (not in code or git history)
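
The header and CORS items can be expressed as framework-independent functions that return plain dictionaries. The origin and the header values below are assumptions for illustration; a real Content-Security-Policy usually needs tuning per application.

```python
from typing import Optional

# Hypothetical allowlist of front-end origins.
ALLOWED_ORIGINS = {"https://app.example.com"}


def security_headers() -> dict[str, str]:
    """Baseline response headers covering the checklist items."""
    return {
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
        "Content-Security-Policy": "default-src 'self'",
        "X-Frame-Options": "DENY",
        "X-Content-Type-Options": "nosniff",
    }


def cors_headers(origin: Optional[str]) -> dict[str, str]:
    """Echo the origin only when it is allowlisted; never answer with '*'."""
    if origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}
```

Returning no CORS headers for an unknown origin lets the browser block the cross-origin read, which is the intended outcome.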
👥
Access Control (5 min)
- Authorization checked server-side on every endpoint
- Users can only access their own data (test by changing IDs in URL)
- Admin functions require admin role verification
- API keys have minimal permissions
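
The "change the ID in the URL" test targets a missing ownership check. A minimal sketch of that check, with a hypothetical in-memory invoice store standing in for a database:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    is_admin: bool = False


# Hypothetical data store; a real application would query a database.
INVOICES = {
    101: {"owner_id": 1, "total": 40},
    102: {"owner_id": 2, "total": 75},
}


def get_invoice(user: User, invoice_id: int) -> dict:
    """Return the invoice only if the caller owns it or is an admin."""
    invoice = INVOICES.get(invoice_id)
    # Use one error for "missing" and "not yours" so IDs cannot be enumerated.
    if invoice is None or (invoice["owner_id"] != user.id and not user.is_admin):
        raise PermissionError("not found")
    return invoice
```

The check lives next to the data access rather than in the UI, so it still holds when someone calls the endpoint directly.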
📈
Monitoring (5 min)
- Error tracking set up (Sentry or similar)
- Failed auth attempts logged
- Rate limiting returns 429 with Retry-After header
- No sensitive data in logs (passwords, tokens, PII)
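
The rate-limiting items (5 attempts per 15 minutes, a 429 response with Retry-After) can be sketched as a sliding-window counter. This version keeps state in process memory, which is an assumption that only holds for a single server; multi-instance deployments need a shared store.

```python
import math
from collections import defaultdict, deque

MAX_ATTEMPTS = 5
WINDOW_SECONDS = 15 * 60

# Attempt timestamps per key (for example an IP address or account name).
_attempts: dict[str, deque] = defaultdict(deque)


def check_login_attempt(key: str, now: float) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for an attempt by `key` at time `now` in seconds."""
    window = _attempts[key]
    # Drop attempts that have aged out of the window.
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_ATTEMPTS:
        # Tell the client when the oldest attempt will expire.
        retry_after = math.ceil(WINDOW_SECONDS - (now - window[0]))
        return 429, {"Retry-After": str(retry_after)}
    window.append(now)
    return 200, {}
```

Passing `now` in explicitly, instead of calling the clock inside the function, keeps the limiter easy to test.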

AI Tool Security Advisories

**March 2026 — Claude Code CVEs:** Two critical vulnerabilities were disclosed affecting Claude Code. **CVE-2025-59536** allowed remote code execution — malicious repositories could trigger arbitrary shell commands when Claude Code initialized project files. **CVE-2026-21852** enabled API key exfiltration through crafted project files. Both were patched in prior releases. **Action:** Ensure you're running the latest Claude Code version. Never open untrusted repositories with AI coding tools without reviewing their configuration files first.
💡
**Lesson:** AI coding tools themselves are attack surfaces. Malicious actors can craft repositories that exploit tool initialization to run code, steal API keys, or exfiltrate data. Always keep your AI coding tools updated and treat repository configuration files (.claude/, .cursor/, .github/copilot/) with the same suspicion as executable code.

MCP Supply Chain: The New Attack Surface

March 2026 — OpenClaw Supply Chain Attack: Antiy CERT confirmed 1,184 malicious skill packages across ClawHub — approximately one in five packages in the open-source MCP ecosystem. This is the largest confirmed supply chain attack targeting AI agent infrastructure to date. Separately, security researchers documented 30+ CVEs targeting MCP servers, clients, and infrastructure in just 60 days (Jan–Feb 2026).

Key MCP CVEs (March 2026):

  • CVE-2026-23744 (CVSS 9.8, MCPJam Inspector ≤ v1.4.2): A crafted HTTP request to a critical endpoint bound to 0.0.0.0 with no authentication can install an arbitrary MCP server and execute code on the host. No user interaction required.
  • Azure MCP Server RCE (CVSS 9.6, demonstrated at RSAC 2026): A vulnerability in Microsoft’s Azure MCP server capable of compromising cloud environments via the agent connection.
  • SSRF exposure: BlueRock Security analyzed 7,000+ MCP servers and found 36.7% potentially vulnerable to server-side request forgery.

How to protect yourself:

  • Audit all installed MCP servers. List your client's MCP configuration (for example `ls ~/.config/claude/mcp*`; the exact path varies by client and version) and remove any servers you didn’t explicitly install.
  • Only install MCP packages from verified, well-known authors with active maintenance history.
  • Pin MCP server versions in your configuration — don’t use @latest.
  • Check package provenance before installing from ClawHub or any MCP registry.
  • Treat MCP server packages as executable code with system access — because they are.
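
The version-pinning advice above can be checked mechanically. The sketch below assumes a configuration shaped like `{"mcpServers": {name: {"command": ..., "args": [...]}}}` with servers launched through `npx`; that shape is an assumption modeled on common MCP client config files, so verify it against your own client before relying on it.

```python
import re
from typing import Optional

EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def package_version(spec: str) -> Optional[str]:
    """Extract the version from 'pkg@1.2.3' or '@scope/pkg@1.2.3', if present."""
    start = 1 if spec.startswith("@") else 0  # skip the scope marker
    at = spec.find("@", start)
    return spec[at + 1:] if at != -1 else None


def unpinned_servers(config: dict) -> list[str]:
    """Names of npx-launched servers whose package is not pinned to an exact version."""
    flagged = []
    for name, server in config.get("mcpServers", {}).items():
        if server.get("command") != "npx":
            continue
        packages = [arg for arg in server.get("args", []) if not arg.startswith("-")]
        if not packages:
            continue
        version = package_version(packages[0])
        if version is None or not EXACT_VERSION.match(version):
            flagged.append(name)
    return flagged
```

A server flagged here is not necessarily malicious. It only means the next launch may pull code you have not reviewed.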

Supply Chain Attacks: April 2026 Alert

Critical — Week of March 31, 2026: A North Korean state-linked threat actor (UNC1069) compromised the npm account of the lead maintainer of axios — a package with ~100 million weekly downloads — publishing malicious versions 1.14.1 and 0.30.4. The packages deployed the WAVESHAPER.V2 cross-platform RAT on Windows, macOS, and Linux. The malicious versions were live for approximately 3 hours before detection. This is one of the most impactful supply chain compromises in npm history.

April 2026 Supply Chain Attack Summary:

| Package / Tool | Date | Impact | Attribution |
|---|---|---|---|
| axios 1.14.1, 0.30.4 | March 31 | WAVESHAPER.V2 RAT; ~100M weekly downloads | UNC1069 (North Korea/DPRK) |
| LiteLLM 1.82.7, 1.82.8 | March 24 | Multi-stage credential stealer (SSH keys, cloud tokens, K8s secrets, .env files) | Unknown |
| Langflow ≤ 1.8.2 (CVE-2026-33017) | March 17 | Unauthenticated RCE via public endpoint; exploited within 20h; CISA KEV | Active threat actors |
| Trivy Docker Hub images (CVE-2026-33634) | March 19 | Malicious code in Aqua Security's Trivy scanner images | TeamPCP |

Langflow CVE-2026-33017 detail: Critical code injection in the AI agent framework's public flow build endpoint. No authentication required. Exploitation was observed in the wild within 20 hours of public disclosure and CISA added it to the Known Exploited Vulnerabilities catalog. If you run Langflow, upgrade to 1.8.3+ immediately.

Trivy Cascade extended (April 2026): The Trivy compromise (CVE-2026-33634) evolved into a much larger incident. Attackers force-pushed malicious code to 75 of 76 trivy-action GitHub Actions tags, then published additional malicious Docker images during the remediation effort (taking 5 days to fully evict). The attack then spawned CanisterWorm — a self-propagating npm worm that hit 64+ packages using blockchain-based command-and-control infrastructure, making it resistant to traditional domain seizure. CanisterWorm spread to Checkmarx KICS and AST GitHub Actions, and separately reached LiteLLM (95 million monthly PyPI downloads). Any CI/CD pipeline that used Trivy, Checkmarx KICS, or LiteLLM between March 19 and April 10 should be treated as potentially compromised and audited.

What this means for vibe coders:

  • Dependencies installed by AI-generated code are attack vectors. Always run `npm audit` after any AI-generated package.json change or install step.
  • AI coding tools themselves (Langflow, LiteLLM, MCP servers, security scanners) are now priority targets for supply chain attackers.
  • Security tooling is not immune — Trivy (a vulnerability scanner) was itself the vector. Audit your audit tools.
  • Pin exact dependency versions. Don't use `@latest` or loose semver ranges for packages you can't quickly audit.
  • Enable npm provenance verification and `--ignore-scripts` in CI pipelines to limit post-install attack surface.
  • Blockchain-based C2 is increasingly being used to make supply chain worms resistant to takedown — conventional domain blocklists are insufficient.
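
As a rough illustration of the pinning advice, the sketch below reads a package.json and reports two things: version specs that are not exact, and exact pins that match the malicious axios versions named in this section. It is a small supplement to `npm audit`, not a replacement, and it does not inspect lockfiles or transitive dependencies.

```python
import json
import re

EXACT = re.compile(r"^\d+\.\d+\.\d+$")
# The malicious axios releases reported in this section.
KNOWN_BAD = {"axios": {"1.14.1", "0.30.4"}}


def audit_dependencies(package_json_text: str) -> dict[str, list[str]]:
    """Flag loose version ranges and exact pins on known-compromised releases."""
    manifest = json.loads(package_json_text)
    findings: dict[str, list[str]] = {"loose": [], "compromised": []}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                # ^, ~, *, and 'latest' can all resolve to a newly published release.
                findings["loose"].append(f"{name}@{spec}")
            elif spec in KNOWN_BAD.get(name, set()):
                findings["compromised"].append(f"{name}@{spec}")
    return findings
```

A loose range such as `^1.14.0` is exactly how a project that never asked for 1.14.1 could still have installed it during the window when the malicious version was live.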

Vibe-Coded App Vulnerability Research

💡
Georgia Tech Vibe Security Radar (March 2026): Researchers analyzed 5,600 publicly deployed vibe-coded applications and found 2,000+ vulnerabilities, 400+ exposed secrets, and 175 instances of exposed PII. The 30-minute checklist in this chapter exists because these are the exact failure modes that recur across AI-generated codebases.

AI-generated code CVE trend:

| Month | CVEs attributed to AI-generated code |
|---|---|
| January 2026 | 6 |
| February 2026 | 15 |
| March 2026 | 35 |

The accelerating rate reflects both more AI-generated code in production and improved attribution tooling. Per Autonoma research, 53% of AI-generated code contains security holes. The pattern in these CVEs is consistent: AI models tend to generate working functionality quickly but skip authentication checks, hardcode credentials, and mis-scope data access — exactly the failures the 30-minute checklist is designed to catch.
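
To put a number on "accelerating", the month-over-month ratios can be computed directly from the three figures above. Three data points are too few to support a forecast, so read this as arithmetic on the reported counts only.

```python
monthly_cves = {"January 2026": 6, "February 2026": 15, "March 2026": 35}

counts = list(monthly_cves.values())
# Ratio of each month to the month before it.
growth = [round(later / earlier, 2) for earlier, later in zip(counts, counts[1:])]
print(growth)  # [2.5, 2.33]
```

Both ratios are above 2, meaning the reported count more than doubled in each of the two intervals.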

The Coming Paradigm: AI as Autonomous Vulnerability Researcher

💡
April 2026, Project Glasswing: Anthropic's Claude Mythos model (announced April 7, restricted to cybersecurity defense) scored 93.9% on SWE-bench. It autonomously discovered CVE-2026-4747, a 17-year-old remote code execution vulnerability in FreeBSD, and found thousands of zero-day vulnerabilities across every major OS and browser. Anthropic restricted public access specifically because the model can both discover and exploit software vulnerabilities autonomously and at scale. Access is limited to Project Glasswing defense partners (AWS, Google, Microsoft, CrowdStrike, Palo Alto Networks, and ~50 others) for defensive use only.

This is a meaningful shift. For years, the security community discussed AI as a tool to help humans find bugs faster. Claude Mythos demonstrates a model that can operate the entire vulnerability research workflow autonomously — including exploitation. The implications for vibe-coded applications:

  • The attack surface is permanent. Security is not a one-time audit. Autonomous vulnerability research tools will continuously discover new issues in deployed applications. Shipping and forgetting is no longer viable.
  • AI finds what humans miss. A 17-year-old RCE in FreeBSD escaped human detection for nearly two decades. AI can find deep logic bugs and memory-corruption patterns at scale.
  • Defense must scale too. The same AI capabilities that find bugs can also be used defensively to scan your code before it ships. Use AI-powered security scanning in your CI/CD pipeline — not as a replacement for the 30-minute checklist, but as an additional layer.
  • The vibe-coded app risk is elevated. CVEs attributed to AI-generated code reached 35 in March 2026 alone. As autonomous vulnerability finders become more capable, that code will be scanned faster and more thoroughly by both defenders and attackers.

The practical response for vibe coders: treat every public-facing application as permanently under automated security review. Build with authentication, input validation, and secrets management from the first commit — not as an afterthought.

Security Prompts for AI Tools

Review this codebase for OWASP Top 10 vulnerabilities.
For each issue found: severity (Critical/High/Medium/Low),
file and line number, what's wrong, the fix, and how to test it.
Prioritize by severity.
🔗
**Deep dive:** Read the full IDEsaster analysis in [Chapter 10: The Dark Side](#ch10). Practice security scanning at [vibe-coding.academy](https://vibe-coding.academy).

Chapter 20: Video Tutorials -- Embedded Remotion-Generated Walkthroughs

Updated March 6, 2026

Bite-sized, binge-worthy video tutorials that show real vibe coding workflows in action. Each video is 60-120 seconds, focused on one specific technique, and embedded directly in the interactive ebook using Remotion components. Updated monthly with 2-4 new videos.


Why Video Tutorials Inside an Ebook

Reading about vibe coding is one thing. Watching a real app materialize from a single prompt in under ninety seconds is something else entirely.

Traditional ebooks give you text and screenshots. This one gives you motion. Every video in this chapter is a self-contained Remotion composition -- a React component that renders to video. That means each tutorial is versioned, reproducible, and embedded natively in the interactive ebook without relying on external hosting. You can watch them inline, pause on any frame, and in the web version, interact with the code snippets directly.

The videos are grouped into three series, each designed for a different purpose:

  1. Prompt to Product -- Viral-format demonstrations of complete apps built from single prompts. Optimized for shareability and shock value.
  2. The Prompt That... -- Educational deep-dives with a comedic edge. Each video dissects one prompt and its unexpected consequences.
  3. Tool Face-Off -- Head-to-head comparisons between competing tools, scored on speed, quality, and developer experience.

Every video follows the same production pipeline: markdown script, Remotion composition with screen recordings and motion graphics, AI-generated narration, and branded end cards. The result is a library that grows over time and works across platforms -- full-length on YouTube, clipped for TikTok/Reels/Shorts, and embedded here in the ebook.


Video Series 1: "Prompt to Product" (Viral Potential)

Each video in this series shows a complete, functional application being built from a single natural-language prompt. A real-time countdown timer runs in the corner. The screen recording is unedited -- what you see is what actually happened. The final reveal shows the deployed app running in a browser.

Series format:


Video #1: 60-Second SaaS (Bolt.new)

Title/Hook: "I built a $9/month SaaS in 60 seconds"

Tool: Bolt.new

Concept: Starting from a completely blank Bolt.new session, a single prompt generates a fully functional micro-SaaS -- a link shortener with analytics, user accounts, and a Stripe-ready pricing page. The countdown timer hits zero just as the app deploys.

Tone: Breathless, slightly disbelieving. The narration captures the genuine absurdity of how fast this is.

Script Outline (170 words): Open on a blank browser tab. The narrator says: "I'm going to build a SaaS product that charges $9 a month. I have 60 seconds." The countdown starts. Cut to the Bolt.new interface. The prompt appears on screen as it is typed: a link shortener with user authentication, click analytics dashboard, custom short domains, and a pricing page with free and pro tiers. Bolt.new starts generating. The split screen shows the prompt on the left, the live preview assembling on the right -- components appearing in real time, a login form, a dashboard with charts, a pricing table with toggle between monthly and annual. The timer passes 30 seconds. The app is taking shape. At 50 seconds, the deployment starts. At 58 seconds, a live URL appears. The timer hits zero. Cut to the deployed app in a fresh browser: working signup, working dashboard, working pricing page. End card: "Total cost: $0. Total code written by a human: 0 lines."

Visual Concepts for Remotion:


Video #2: Portfolio Speedrun (v0 + Vercel)

Title/Hook: "Your portfolio shouldn't take longer than your morning coffee"

Tools: v0 by Vercel, Vercel deployment

Concept: A developer's portfolio website -- hero section, project grid, about page, contact form, dark mode toggle -- goes from blank prompt to live Vercel deployment while a coffee timer ticks down. The coffee metaphor runs throughout: the video opens with pouring coffee, and each section of the site appears as the coffee cools.

Tone: Relaxed and conversational, contrasting with the speed of what is happening on screen. The humor comes from the mismatch between the casual narration and the absurd pace.

Script Outline (180 words): Open on a close-up of coffee being poured. The narrator says: "The average developer spends 3 weeks on their portfolio. I'm going to finish mine before this coffee is cool enough to drink." Cut to v0. The prompt describes a developer portfolio: dark theme, animated hero with a typewriter effect showing "I build things," a responsive project grid pulling from a JSON file, an about section with a timeline, a contact form, and a dark/light mode toggle. v0 generates the first component. The narrator walks through what is appearing while keeping the tone casual -- "Oh, that's a nice grid layout... didn't ask for that hover effect but I'm keeping it." At 40 seconds, the design is complete. The code is exported to a GitHub repo. Vercel picks up the push and begins deploying. The narrator takes a sip of coffee. The Vercel build completes. The live site loads: responsive, polished, with real content. "Still too hot to drink. I should probably build a second portfolio."

Visual Concepts for Remotion:


Video #3: The $0 Startup (Lovable)

Title/Hook: "This app makes money. I didn't write a single line."

Tool: Lovable

Concept: A non-technical founder builds a complete SaaS product using only Lovable -- from idea to deployed, revenue-generating application. The video emphasizes that the person building this has no programming background. The "reveal" is not just the app, but a real Stripe dashboard showing the first payment.

Tone: Inspirational but grounded. Not "anyone can do this" hype -- more "here's exactly what the process looks like when you've never coded before."

Script Outline (190 words): Open on a text overlay: "I'm not a developer. I'm a marketing manager." The narrator continues: "Last month, I had an idea for a tool that helps freelancers track their invoices. This morning, I built it." Cut to Lovable. The prompt is detailed and specific -- it describes an invoice tracker with client management, recurring invoice templates, PDF export, and a simple dashboard showing outstanding payments. Lovable begins generating. The narration explains the key decisions: why the prompt specifies Supabase for the backend, why it asks for Row Level Security so each user only sees their own data, why it mentions Stripe Connect for future payment processing. At 45 seconds, the app is running in Lovable's preview. The narrator tests the core workflow: create a client, generate an invoice, export to PDF. Everything works. At 70 seconds, the app deploys. Cut to a real Stripe dashboard showing a $12 test payment. "I didn't write code. I didn't hire a developer. I described what I needed. Total investment: a Lovable subscription and one afternoon of prompt writing."

Visual Concepts for Remotion:


Video #4: Clone Wars (Cursor)

Title/Hook: "I showed AI a screenshot of Notion. Here's what happened."

Tool: Cursor (Agent mode with Composer)

Concept: A screenshot of Notion's interface is fed to Cursor's AI, along with a prompt asking it to recreate the core functionality. The video follows the agent as it plans the architecture, generates the components, and builds a working Notion-like workspace -- pages, blocks, drag-and-drop, slash commands -- all from a single image and a paragraph of context.

Tone: Playful and slightly mischievous. The "clone wars" framing leans into the controversy of AI-generated clones while keeping it lighthearted.

Script Outline (185 words): Open on a screenshot of Notion's interface. The narrator says: "This is Notion. 400 engineers built this over 10 years. I'm going to see how close AI can get in 2 minutes." The screenshot is dragged into Cursor's Composer. The prompt is brief but precise: recreate a note-taking workspace with a sidebar, nested pages, rich text blocks, slash command menu for adding headers/lists/toggles, and drag-to-reorder blocks. Cursor's agent starts planning. An overlay shows the agent's thought process -- the file tree it is creating, the components it has decided to build, the libraries it is installing. At 30 seconds, the first components render: a sidebar with a page tree. At 60 seconds, the editor is working: typing, formatting, slash commands. At 90 seconds, drag-and-drop is functional. The narrator does a side-by-side comparison with the original screenshot. Some elements are strikingly close. Others are clearly AI-generated. "Is it Notion? No. Could you use it? Absolutely. Did a human write any of this code? Not a single character."

Visual Concepts for Remotion:


Video #5: The Debug Olympics (Claude Code)

Title/Hook: "Can AI fix a bug faster than Stack Overflow?"

Tool: Claude Code

Concept: A real, nasty bug -- the kind that would send a developer to Stack Overflow for an hour -- is presented to Claude Code. The screen is split: on the left, a simulated "Stack Overflow search" shows the traditional debugging path (finding related questions, reading answers, trying solutions). On the right, Claude Code analyzes the error, traces the root cause through multiple files, and delivers a working fix. A race timer tracks both sides.

Tone: Competitive and high-energy, like a sports broadcast. The narration calls the race like a commentator.

Script Outline (175 words): Open on a terminal showing a cryptic error: a React hydration mismatch caused by a timezone-dependent date format in a server component. The narrator, in a sports-announcer voice: "In the left corner, the defending champion: Stack Overflow and pure human tenacity. In the right corner, the challenger: Claude Code. The bug: a hydration error that has already cost this developer 45 minutes. Let the race begin." The split screen activates. Left side: a browser opens Stack Overflow, searches the error message, scrolls through three different answers, tries a solution that does not work, goes back. Right side: Claude Code receives the error, opens the relevant files, traces the date formatting issue across server and client components, identifies the mismatch, proposes a fix, and applies it. Claude Code finishes in 23 seconds. The left side is still reading the second Stack Overflow answer. "The AI finished before the human found the right question to ask."
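
The bug class in this script is easy to reproduce outside React. The sketch below is Python rather than the JavaScript shown in the video, and the timestamp and UTC-8 offset are hypothetical: the same instant is formatted once in the server's timezone and once in the client's, and the two strings disagree, which is what a hydration check would flag.

```python
from datetime import datetime, timedelta, timezone


def render_date(instant: datetime, tz: timezone) -> str:
    """Format an instant as a calendar date in the given timezone."""
    return instant.astimezone(tz).strftime("%Y-%m-%d")


instant = datetime(2026, 3, 1, 2, 30, tzinfo=timezone.utc)
server_tz = timezone.utc
client_tz = timezone(timedelta(hours=-8))  # hypothetical client at UTC-8

server_html = render_date(instant, server_tz)  # "2026-03-01"
client_html = render_date(instant, client_tz)  # "2026-02-28"
print(server_html == client_html)  # False: the rendered text differs
```

One common remedy is to format dates in a single fixed timezone on both sides, or to format them only on the client after the initial render.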

Visual Concepts for Remotion:


Video Series 2: "The Prompt That..." (Educational + Humor)

This series takes a single prompt and follows it to its logical (and sometimes illogical) conclusion. Each video is educational at its core -- you learn prompt engineering techniques, tool capabilities, and common pitfalls -- but the framing is comedic. The "The Prompt That..." naming convention is designed for curiosity-driven clicks.

Series format:


Video #6: The Prompt That Built a Game

Title/Hook: "The Prompt That Built a Game"

Tool: Claude Code + Remotion (for the game rendering)

Concept: A single, carefully crafted prompt generates a complete browser game -- not a trivial one, but a polished arcade game with physics, particle effects, a scoring system, leaderboard, and mobile touch controls. The video walks through the prompt's structure, explaining why each sentence matters, then shows the game coming to life.

Tone: Enthusiastic and educational. The narrator genuinely enjoys playing the result.

Script Outline (190 words): Open on the prompt, displayed as a sticky note. The narrator reads it aloud, pausing to annotate key phrases: "Notice I specified 'physics-based' -- without this, the AI defaults to simple collision rectangles." "I said 'particle effects on collision' -- this forces the AI to implement a particle system, which makes the game feel premium." The prompt is sent to Claude Code. The terminal comes alive with file creation. The narrator explains the AI's architectural decisions as they happen: "It chose HTML Canvas over DOM elements -- good call for performance." "It's implementing a game loop with requestAnimationFrame -- exactly right." At 50 seconds, the game runs for the first time. It has bugs: a sprite clips through a wall. The error is pasted back. At 65 seconds, the game runs cleanly. The narrator plays it for 20 seconds, showing the physics, particles, and scoring in action. "One prompt. One paste of an error message. A game that would have taken a junior developer a week. The lesson: specificity in your prompt is not optional. Every adjective earns its keep."

Visual Concepts for Remotion:


Video #7: The Prompt That Broke Everything

Title/Hook: "The Prompt That Broke Everything"

Tool: Bolt.new

Concept: A seemingly reasonable prompt -- "refactor the entire codebase to use TypeScript strict mode" -- is applied to a working JavaScript project. The video documents the cascade of failures: type errors multiply exponentially, the AI tries to fix them but introduces new ones, the build breaks, and the project enters what the narrator calls "the error spiral." The video then shows the recovery: how to scope refactoring prompts correctly.

Tone: Darkly comedic, building to genuine relief. The narrator treats the error messages like a horror movie.

Script Outline (185 words): Open on a working application. Green checkmarks everywhere. The narrator says: "This app works perfectly. It has 47 files, zero bugs, and 100% of its tests pass. I am about to destroy it with one sentence." The prompt appears: "Refactor this entire codebase to use TypeScript strict mode with no 'any' types." The AI begins. At first, it looks productive -- .js files become .tsx files. Then the errors start. The error count appears as a rising counter in the corner: 12... 47... 134... 312. The narrator's tone shifts from confident to concerned to horrified. "It's adding type assertions everywhere. Those are band-aids. The types are lying." At 60 seconds, the build fails completely. The recovery begins: the narrator shows how to scope the same refactoring into small, file-by-file prompts with test verification between each step. The error count drops. The builds pass. "The lesson: AI can refactor anything. But 'anything' and 'everything at once' are different requests."
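
The recovery strategy in this script (one file at a time, with test verification between each step) can be sketched as a loop. The three callables are hypothetical hooks: in practice `apply_change` would send a scoped prompt to the AI tool, `run_tests` would run the project's test suite, and `revert` would restore the file from version control.

```python
from typing import Callable, Iterable


def refactor_incrementally(
    files: Iterable[str],
    apply_change: Callable[[str], None],
    run_tests: Callable[[], bool],
    revert: Callable[[str], None],
) -> dict[str, list[str]]:
    """Convert files one at a time, keeping a change only if the tests still pass."""
    result: dict[str, list[str]] = {"converted": [], "reverted": []}
    for path in files:
        apply_change(path)
        if run_tests():
            result["converted"].append(path)
        else:
            # Undo only this file, so the failure stays isolated and easy to inspect.
            revert(path)
            result["reverted"].append(path)
    return result
```

Because the project is known to be green before each step, any failure points at exactly one file, which is what keeps the error count from snowballing.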

Visual Concepts for Remotion:


Video #8: The Prompt That Got Me Fired (Hypothetically)

Title/Hook: "The Prompt That Got Me Fired (Hypothetically)"

Tool: Claude Code

Concept: A developer accidentally uses a vibe coding workflow on a production codebase -- accepting all changes without review, pushing without tests, deploying on a Friday afternoon. The video is a dramatized worst-case scenario that teaches real lessons about when NOT to vibe code. Every mistake is a real mistake that real developers have made.

Tone: Mock-serious, documentary style. Presented like a true-crime investigation of a deployment gone wrong.

Script Outline (180 words): Open on a dramatic title card: "INCIDENT REPORT: February 14, 2026." The narrator, in a deadpan documentary voice: "The following is a reconstruction of actual events. Names have been changed. The code has not." The prompt is revealed: a developer asked the AI to "update the user billing logic to handle the new pricing tiers" on the production branch. Without reading the diff. Without running tests. On a Friday at 4:47 PM. The AI changed the billing calculation -- and introduced a rounding error that charged every customer $0.01 extra per transaction. The video shows the cascade: the deploy, the first customer complaint, the Slack messages, the rollback attempt that failed because there was no checkpoint. "By Monday morning, 47,000 transactions were affected." The recovery section shows what should have happened: feature branch, test suite, staging deployment, code review. "Vibe coding is a superpower. And like every superpower, using it in the wrong context has consequences."
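
The script does not say how the rounding error arose. One hypothetical way such a bug appears is converting a float price to cents with a ceiling, sketched below. The function names and the $1.10 price are illustrative assumptions.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def cents_float(price: float) -> int:
    # Buggy: binary floats cannot represent 1.10 exactly, so 1.10 * 100
    # evaluates to slightly more than 110 and ceil() rounds it up to 111.
    return math.ceil(price * 100)


def cents_decimal(price: str) -> int:
    # Exact decimal arithmetic: "1.10" * 100 is exactly 110.
    cents = (Decimal(price) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


extra_per_transaction = cents_float(1.10) - cents_decimal("1.10")  # 1 cent
total_overcharge_cents = 47_000 * extra_per_transaction  # 47000 cents, i.e. $470
```

A unit test that compares the two functions on a handful of prices would have caught this before the deploy, which is the point of the recovery section in the script.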

Visual Concepts for Remotion:


Video #9: The Prompt That Replaced My Intern

Title/Hook: "The Prompt That Replaced My Intern"

Tool: Cursor + Claude Code

Concept: A tech lead has a list of 23 tedious but necessary tasks that would normally be assigned to a junior developer or intern: rename variables to follow conventions, add JSDoc comments to exported functions, update deprecated API calls, create missing test stubs, fix all ESLint warnings. One prompt handles all of them. The video compares the estimated "intern hours" with the actual AI minutes.

Tone: Sympathetic and slightly guilty. The narrator acknowledges the awkwardness of the topic while being honest about the productivity gains.

Script Outline (175 words): Open on a task list -- 23 items, each with an estimated time: "Rename callbacks to follow naming convention (2 hours)," "Add JSDoc to all exported functions (4 hours)," "Update deprecated moment.js calls to dayjs (3 hours)." Total estimate: 34 hours of intern work. The narrator says: "I used to give this list to our summer intern. It would take them a full work week. This morning I gave it to the AI." A single, structured prompt appears, listing all 23 tasks with clear specifications. Claude Code begins. A progress bar tracks completed tasks. The terminal output shows files being modified, tests passing. At 45 seconds into the video, 23 of 23 tasks are done. The narrator reviews the changes: "The variable renames are consistent. The JSDoc comments are accurate. The moment-to-dayjs migration handles edge cases I didn't think of." Total real time: 8 minutes. "The intern now works on architecture decisions and feature design. The AI handles the checklist."

Visual Concepts for Remotion:


Video #10: The Prompt That Even My Mom Could Use

Title/Hook: "The Prompt That Even My Mom Could Use"

Tool: Lovable

Concept: The narrator's actual non-technical parent uses Lovable to build a small app -- a recipe organizer -- from scratch, using only natural language. The video is screen-recorded over the parent's shoulder (with permission). The charm is in the completely non-technical prompt language: "I want a thing where I can put my recipes and find them later, like a cookbook but on the computer."

Tone: Warm, genuine, and slightly humorous. The non-technical language in the prompts is endearing, not mocking.

Script Outline (185 words): Open on a text overlay: "I gave my mom a Lovable account and one instruction: build whatever you want." Cut to the screen. The prompt is typed in plain, non-technical English: "I want to save my recipes. Each recipe should have a name, the ingredients, the steps, and a photo. I want to search by ingredient so when I have chicken I can find all my chicken recipes. Make it pretty with a warm color like my kitchen." Lovable generates the app. The narrator points out that "make it pretty with a warm color like my kitchen" resulted in a terracotta-and-cream color scheme that actually looks good. The recipe form works. The search works. Photo upload works. The narrator's parent adds a real recipe -- handwritten notes visible on the desk for reference. The app works exactly as described. "She didn't say 'database.' She didn't say 'component.' She didn't say 'responsive.' She said 'like a cookbook but on the computer.' And that was enough."

Visual Concepts for Remotion:


Video #11: The Prompt That Fooled the Senior Dev

Title/Hook: "The Prompt That Fooled the Senior Dev"

Tool: Claude Code

Concept: A blind code review experiment. A senior developer is shown two pull requests: one written by a mid-level human developer, one generated entirely by AI from a single prompt. The senior reviews both, provides feedback, and guesses which is which. The reveal shows whether they guessed correctly -- and what the AI code got right that the human code got wrong (and vice versa).

Tone: Fair and balanced. This is not an "AI is better" video -- it is an honest comparison that reveals strengths and weaknesses on both sides.

Script Outline (195 words): Open on two code editors, labeled "Developer A" and "Developer B." The narrator explains: "A senior engineer with 12 years of experience is going to review two implementations of the same feature -- a real-time notification system. One was written by a mid-level developer in 6 hours. The other was generated by Claude Code from a single prompt in 4 minutes. The reviewer doesn't know which is which." Cut to the review. The senior developer's comments appear as overlays: "Developer A has clean separation of concerns... but this error handling is naive." "Developer B's type safety is impressive... but this abstraction feels over-engineered." The senior guesses: "A is the human, B is the AI. The human code feels more intentional. The AI code is technically thorough but lacks personality." The reveal: they got it backwards. Developer A was the AI. Developer B was the human. The narrator unpacks the implications: the AI's code was structurally cleaner, but the human's code had more creative architectural choices. "Neither was strictly better. They were differently excellent."

Visual Concepts for Remotion:


Video Series 3: "Tool Face-Off" (Comparison)

This series puts competing tools head-to-head on identical tasks. Same prompt, same requirements, same hardware. The evaluation is structured and scored across consistent categories: speed, code quality, developer experience, and output completeness. These are the videos developers watch before choosing their next tool.

Series format:


Video #12: Round 1 -- IDE Showdown (Cursor vs Claude Code vs Codex CLI)

Title/Hook: "Round 1: IDE Showdown -- Cursor vs Claude Code vs Codex CLI"

Tools: Cursor (Agent mode), Claude Code, OpenAI Codex CLI

Concept: All three tools receive the same prompt: build a task management API with authentication, CRUD operations, and automated tests. The video captures all three attempts simultaneously using a triple split-screen. Each tool is scored on time to completion, test pass rate, code quality (measured by a linting score), and developer experience (subjective rating of the interaction).

Tone: Fair, analytical, and energetic. This is a sports broadcast, not a product review. Every tool gets genuine praise for its strengths.

Script Outline (200 words): Open on a tournament bracket graphic. The narrator, in an announcer voice: "Three tools. One prompt. One winner. This is the IDE Showdown." The prompt appears: a task management REST API with JWT authentication, full CRUD, input validation, pagination, and a test suite. The rules: no human intervention after the prompt is submitted, tools are scored on four categories, each worth 25 points. "Round 1: Speed." The triple split-screen activates. Cursor's agent starts planning, showing its step-by-step approach. Claude Code opens multiple files simultaneously, working fast. Codex CLI takes a methodical, file-by-file approach. Time stamps appear as each tool finishes. "Round 2: Tests." Each tool's test suite runs. Pass rates appear on the scoreboard. "Round 3: Code Quality." ESLint scores flash on screen. "Round 4: Developer Experience." The narrator rates the interaction quality: how clear was the agent's communication, how easy was it to follow along, how much manual intervention was needed. The scorecard fills in. The verdict is revealed. "All three built a working API. The differences are in the details."
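The four-category, 25-points-each scoring described in the outline can be sketched as a small scorecard. This is an illustrative sketch only: the category keys, `totalScore`, and `rankTools` are hypothetical names, and the outline does not say how ties are broken (here they keep input order).

```typescript
// The four categories named in the outline; each is worth up to 25 points.
const CATEGORIES = ["speed", "tests", "codeQuality", "developerExperience"] as const;
type CategoryName = (typeof CATEGORIES)[number];
type Scorecard = Record<CategoryName, number>;

const MAX_PER_CATEGORY = 25;

// Sum one tool's scorecard, rejecting values outside 0..25.
function totalScore(card: Scorecard): number {
  let total = 0;
  for (const category of CATEGORIES) {
    const points = card[category];
    if (!isFinite(points) || points < 0 || points > MAX_PER_CATEGORY) {
      throw new RangeError(`${category} must be between 0 and ${MAX_PER_CATEGORY}`);
    }
    total += points;
  }
  return total;
}

// Rank tools from highest to lowest total (maximum 100).
function rankTools(cards: Record<string, Scorecard>): Array<{ tool: string; total: number }> {
  return Object.keys(cards)
    .map((tool) => ({ tool, total: totalScore(cards[tool]) }))
    .sort((a, b) => b.total - a.total);
}
```

Keeping the category list in one constant means the on-screen scoreboard and the validation cannot drift apart if a category is renamed.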

Visual Concepts for Remotion:


Video #13: Round 2 -- Builder Battle (Bolt.new vs Lovable vs Replit Agent)

Title/Hook: "Round 2: Builder Battle -- Bolt.new vs Lovable vs Replit Agent"

Tools: Bolt.new, Lovable, Replit Agent

Concept: The browser-based builders compete on a task suited to their strengths: build a complete landing page with a waitlist form, social proof section, feature comparison, and email capture that stores submissions to a real database. Scoring covers design quality, functionality, mobile responsiveness, and deployment speed.

Tone: Enthusiastic and visual. Since these are design-heavy tools, the video emphasizes how each app looks and feels rather than focusing purely on code.

Script Outline (190 words): Open on the challenge card: "Build a startup landing page with working waitlist signup. You have 3 minutes." Each builder gets the same prompt: a landing page for a fictional AI writing tool called "DraftPilot," with a hero section, three feature cards, a testimonial carousel, a pricing comparison, and a waitlist form that saves emails to Supabase. The triple split-screen shows all three tools working simultaneously. The narrator calls attention to interesting differences in real time: "Bolt.new went straight for the hero section -- it's already looking polished." "Lovable is building the database connection first -- solid fundamentals." "Replit Agent just asked a clarifying question about the color scheme -- that's a nice touch." At 90 seconds, the designs are compared side-by-side: mobile views, desktop views, scroll behavior, form functionality. Each tool's waitlist form is tested with a real email submission. The scoring covers design (how good does it look), function (does the form actually save data), responsiveness (mobile rendering), and speed (time to deployable state). "Each builder has a personality. The question is which personality matches yours."

Visual Concepts for Remotion:


Video #14: Round 3 -- Agent Arena (Devin vs Jules vs Claude Code)

Title/Hook: "Round 3: Agent Arena -- Devin vs Jules vs Claude Code"

Tools: Devin, Google Jules, Claude Code

Concept: The autonomous agents tackle a more complex task: given an existing open-source project with 15 open issues, each agent is assigned 5 issues and must work independently to create pull requests. Scoring covers issue resolution rate, PR quality, test coverage of the fix, and how well the agent communicated its approach.

Tone: Analytical with a sense of drama. These are the most powerful tools in the landscape, and the comparison is genuinely informative for teams making purchasing decisions.

Script Outline (200 words): Open on a GitHub issues page showing 15 open issues. The narrator: "Welcome to the Agent Arena. Three autonomous AI agents. Five GitHub issues each. No human help. Who writes the best pull requests?" The issues range from a CSS bug to a database query optimization to a feature request for dark mode. Each agent receives its 5 issues and a cloned copy of the repo. The video shows a triple timeline: Devin working in its cloud VM, Jules working asynchronously through Google Cloud, Claude Code working in the terminal. Key moments are highlighted: "Devin just opened a PR for the CSS bug -- let's see the diff." "Jules is running the test suite before committing -- smart." "Claude Code found a related bug while fixing issue #7 and filed a new issue for it -- above and beyond." After all agents submit their PRs, a senior developer reviews them. Scoring: issues resolved (did the PR actually fix it), code quality (clean diff, no regressions), test coverage (did the agent add tests), and communication (how clear was the PR description and commit message). "At this level, the differences are subtle. But subtle differences matter at scale."

Visual Concepts for Remotion:


Video #15: Round 4 -- Speed vs Quality (Bolt vs Claude Code)

Title/Hook: "Round 4: Speed vs Quality -- Bolt.new vs Claude Code"

Tools: Bolt.new, Claude Code

Concept: This is the philosophical face-off: the fastest browser builder against the most thorough terminal agent. The same prompt -- a complete habit-tracking app with streaks, charts, and reminders -- goes to both tools. Bolt.new finishes in minutes. Claude Code takes longer but produces more robust code. The question is not "which is better" but "which is better for what."

Tone: Thoughtful and balanced. This video acknowledges that "better" depends entirely on context.

Script Outline (195 words): Open on a scale graphic: "Speed" on one side, "Quality" on the other. The narrator: "Every developer makes this trade-off. Today we make it explicit." The prompt: a habit tracker with daily check-ins, streak counting with freeze days, progress charts using a real charting library, push notification reminders, and data export. Bolt.new starts. The app assembles rapidly in the browser -- UI components appear, the habit list renders, the chart populates. Time: 3 minutes and 12 seconds. It looks good. It works. Claude Code starts. The terminal is busier -- it is setting up a proper project structure, adding TypeScript types, writing utility functions with edge case handling, creating a test file. Time: 14 minutes and 47 seconds. It also works. Now the comparison. The narrator stress-tests both: "What happens when the streak crosses a month boundary?" Bolt's version has a bug. Claude Code's handles it correctly. "What about the UI?" Bolt's is more visually polished out of the box. "Both answers are right. The question is what you need right now: a working prototype by lunch, or a production foundation by end of week."
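The month-boundary stress test is easy to illustrate. The sketch below is not the code either tool produced; it assumes check-ins arrive as "YYYY-MM-DD" strings and that a freeze day bridges one missed day without adding to the count. `toDayIndex` and `currentStreak` are hypothetical names.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Convert "YYYY-MM-DD" to a whole-day index. Comparing day indexes instead of
// day-of-month numbers is what keeps month boundaries from breaking a streak.
function toDayIndex(isoDate: string): number {
  const [year, month, day] = isoDate.split("-").map(Number);
  return Math.floor(Date.UTC(year, month - 1, day) / MS_PER_DAY);
}

// Count the streak ending on `today`, walking backwards one day at a time.
// Assumption: a missed day consumes one freeze day and does not add to the count.
function currentStreak(checkIns: string[], today: string, freezeDays = 0): number {
  const done: Record<number, boolean> = {};
  for (const date of checkIns) {
    done[toDayIndex(date)] = true;
  }
  let streak = 0;
  let freezesLeft = freezeDays;
  for (let day = toDayIndex(today); ; day--) {
    if (done[day]) {
      streak++;
    } else if (freezesLeft > 0 && streak > 0) {
      freezesLeft--; // bridge the gap without counting it
    } else {
      break;
    }
  }
  return streak;
}

// January 30 through February 1 is three consecutive days.
currentStreak(["2026-01-30", "2026-01-31", "2026-02-01"], "2026-02-01"); // 3
```

A version that compares only the day-of-month (31 followed by 1) is the kind of shortcut that produces the bug described in the outline.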

Visual Concepts for Remotion:


Video Production Workflow

Every video in this chapter follows the same five-stage production pipeline. This section documents the pipeline so that new videos can be produced consistently and efficiently.

Stage 1: Script Writing

Every video begins as a markdown file. Scripts follow a strict format:

---
video_id: PTP-001
series: prompt-to-product
title: "I built a $9/month SaaS in 60 seconds"
duration_target: 60-90s
tool: Bolt.new
status: production
last_updated: 2026-02-25
---

## Hook (0:00 - 0:03)
[Opening visual description]
NARRATOR: "Opening line designed to stop the scroll."

## Setup (0:03 - 0:08)
[Screen state description]
NARRATOR: "Context setting. What we are about to do and why it matters."

## Build (0:08 - 0:55)
[Screen recording cues with timestamps]
NARRATOR: "Running commentary on what the AI is doing. Call out
interesting decisions. Keep energy high."

## Reveal (0:55 - 1:05)
[Final product display]
NARRATOR: "The payoff. Show the deployed result. Land the key stat."

## End Card (1:05 - 1:10)
[Branding overlay]
NARRATOR: "Call to action -- next video, ebook link, subscribe."
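Because the section timestamps must chain together and land inside `duration_target`, a script can be checked mechanically before recording. The following is a minimal sketch under stated assumptions: the sections have already been extracted from the markdown headings, and `ScriptSection`, `parseTimestamp`, and `checkTiming` are hypothetical names, not part of any existing tooling.

```typescript
type ScriptSection = { title: string; start: string; end: string };

// Parse an "M:SS" timestamp into seconds.
function parseTimestamp(stamp: string): number {
  const match = /^(\d+):([0-5]\d)$/.exec(stamp);
  if (!match) {
    throw new Error(`Invalid timestamp: ${stamp}`);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Return a list of problems; an empty list means the timing is consistent.
function checkTiming(sections: ScriptSection[], minSeconds: number, maxSeconds: number): string[] {
  const problems: string[] = [];
  let previousEnd = 0;
  for (const section of sections) {
    const start = parseTimestamp(section.start);
    const end = parseTimestamp(section.end);
    if (start !== previousEnd) {
      problems.push(`${section.title}: starts at ${start}s, expected ${previousEnd}s`);
    }
    if (end <= start) {
      problems.push(`${section.title}: end is not after start`);
    }
    previousEnd = end;
  }
  if (previousEnd < minSeconds || previousEnd > maxSeconds) {
    problems.push(`total runtime ${previousEnd}s is outside ${minSeconds}-${maxSeconds}s`);
  }
  return problems;
}

// The five sections from the template above, checked against its 60-90s target.
const templateSections: ScriptSection[] = [
  { title: "Hook", start: "0:00", end: "0:03" },
  { title: "Setup", start: "0:03", end: "0:08" },
  { title: "Build", start: "0:08", end: "0:55" },
  { title: "Reveal", start: "0:55", end: "1:05" },
  { title: "End Card", start: "1:05", end: "1:10" },
];
checkTiming(templateSections, 60, 90); // [] (70s total, no gaps)
```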

Script guidelines:

Stage 2: Visuals (Remotion Compositions)

Each video is a Remotion composition -- a React component that renders frame-by-frame to produce video output. The compositions combine three types of visual content:

Screen Recordings

Motion Graphics

Code Animations

Composition structure:

src/
  compositions/
    prompt-to-product/
      PTP001-SaaS60.tsx        # Main composition
      PTP001-assets/            # Screen recordings, images
    the-prompt-that/
      TPT001-Game.tsx
      TPT001-assets/
    tool-face-off/
      TFO001-IDEShowdown.tsx
      TFO001-assets/
  components/
    CountdownTimer.tsx
    ScoreBoard.tsx
    SplitScreen.tsx
    EndCard.tsx
    StickyNote.tsx
    CodeBlock.tsx
    ProgressTracker.tsx
    RaceTimer.tsx
  styles/
    theme.ts                   # Shared colors, fonts, spacing
    animations.ts              # Shared spring configs

Stage 3: Audio

Narration

Sound Design

Stage 4: Branding

Every video carries the EndOfCoding brand identity consistently:

Logo

Color Palette

Typography

End Card (last 5 seconds of every video)

Stage 5: Distribution

Each video exists in multiple formats for different platforms:

Full-Length (YouTube + Ebook Embed)

Short-Form Clips (TikTok / Instagram Reels / YouTube Shorts)

Ebook Embed

SEO and Metadata

YouTube Optimization

Cross-Linking


Embedding Videos in the Interactive Ebook

The interactive web version of this ebook uses the Player component from Remotion's @remotion/player package to embed videos directly in the reading experience. This means videos are not external links; they are native elements of the page, rendered inline alongside the text.

Technical Implementation

Each video is embedded using a VideoTutorial React component:

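import type { ComponentType } from "react";
import { Player } from "@remotion/player";
import { PTP001 } from "../compositions/prompt-to-product/PTP001-SaaS60";

// Props for one embedded tutorial. `duration` is a display label such as "60-90s".
type VideoTutorialProps = {
  compositionId: string;
  title: string;
  duration: string;
  tools: string[];
  transcript: string;
};

// Lookup from composition ID to component, so `compositionId` selects the video.
// Only PTP001 is registered here; the "PTP001" key is illustrative.
const compositions: Record<string, ComponentType> = {
  PTP001,
};

export const VideoTutorial = ({
  compositionId,
  title,
  duration,
  tools,
  transcript,
}: VideoTutorialProps) => {
  const component = compositions[compositionId];
  if (!component) {
    return null; // Unknown ID: render nothing rather than the wrong video.
  }
  return (
    <section className="video-tutorial">
      <h3>{title}</h3>
      <div className="video-meta">
        <span className="duration">{duration}</span>
        <span className="tools">{tools.join(" + ")}</span>
      </div>
      <Player
        component={component}
        compositionWidth={1920}
        compositionHeight={1080}
        durationInFrames={2700} // 90s at 30fps
        fps={30}
        controls
        style={{ width: "100%", maxWidth: 800 }}
      />
      <details className="transcript">
        <summary>View Transcript</summary>
        <p>{transcript}</p>
      </details>
    </section>
  );
};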
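The `2700 // 90s at 30fps` value is plain arithmetic: seconds multiplied by frames per second. A small helper makes that conversion explicit. This is a sketch with hypothetical names (`secondsToFrames`, `frameRange`); the `from` and `durationInFrames` field names are chosen to mirror the props that Remotion's `Sequence` component accepts.

```typescript
// Convert a duration in seconds to a whole number of frames.
function secondsToFrames(seconds: number, fps: number): number {
  if (seconds < 0 || fps <= 0) {
    throw new RangeError("seconds must be >= 0 and fps must be > 0");
  }
  return Math.round(seconds * fps);
}

// Convert a [start, end) window in seconds to a start frame plus a length in frames.
function frameRange(startSeconds: number, endSeconds: number, fps: number) {
  const from = secondsToFrames(startSeconds, fps);
  return { from, durationInFrames: secondsToFrames(endSeconds, fps) - from };
}

secondsToFrames(90, 30); // 2700
frameRange(8, 55, 30);   // { from: 240, durationInFrames: 1410 }
```

The second call corresponds to the 0:08 to 0:55 "Build" window from the script template.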

Reader Experience

When a reader scrolls to a video in the ebook:

  1. Poster frame -- A thumbnail of the most visually interesting moment loads immediately (lazy-loaded image, minimal bandwidth)
  2. Play button overlay -- A single click starts playback. Videos do not autoplay
  3. Inline controls -- Play/pause, scrub bar, volume, fullscreen, and playback speed (0.5x to 2x)
  4. Transcript toggle -- A collapsible section below the video contains the full narration transcript, making the content accessible and searchable
  5. Chapter links -- If the video references tools or concepts covered in other chapters, inline links appear below the video

Offline and Static Fallbacks

For the markdown and Word versions of the ebook (which cannot embed video):

For the static HTML version (no JavaScript):


Video Production Schedule

New videos are added on a monthly cadence. The production schedule follows the tool landscape -- when a major tool update ships, a new video is produced within two weeks to document the changed workflow.

Month | Planned Videos | Series
March 2026 | #1 60-Second SaaS, #6 Game Builder | Prompt to Product, The Prompt That
April 2026 | #12 IDE Showdown, #7 Broke Everything | Tool Face-Off, The Prompt That
May 2026 | #2 Portfolio Speedrun, #13 Builder Battle | Prompt to Product, Tool Face-Off
June 2026 | #3 The $0 Startup, #8 Got Me Fired | Prompt to Product, The Prompt That
July 2026 | #14 Agent Arena, #9 Replaced My Intern | Tool Face-Off, The Prompt That
August 2026 | #4 Clone Wars, #10 Mom Could Use | Prompt to Product, The Prompt That
September 2026 | #15 Speed vs Quality, #11 Fooled Senior Dev | Tool Face-Off, The Prompt That
October 2026 | #5 Debug Olympics, New TBD | Prompt to Product, TBD

The schedule prioritizes alternating between series to maintain variety. High-impact tool launches (new Cursor version, Claude Code update, new entrant) can preempt the schedule.


Video Index

A quick-reference table of all videos in this chapter:

# | Title | Series | Tool(s) | Duration | Status
1 | I built a $9/month SaaS in 60 seconds | Prompt to Product | Bolt.new | 60-90s | Pre-production
2 | Your portfolio shouldn't take longer than your morning coffee | Prompt to Product | v0 + Vercel | 60-90s | Pre-production
3 | This app makes money. I didn't write a single line. | Prompt to Product | Lovable | 60-90s | Pre-production
4 | I showed AI a screenshot of Notion. Here's what happened. | Prompt to Product | Cursor | 60-90s | Pre-production
5 | Can AI fix a bug faster than Stack Overflow? | Prompt to Product | Claude Code | 60-90s | Pre-production
6 | The Prompt That Built a Game | The Prompt That | Claude Code | 90-120s | Pre-production
7 | The Prompt That Broke Everything | The Prompt That | Bolt.new | 90-120s | Pre-production
8 | The Prompt That Got Me Fired (Hypothetically) | The Prompt That | Claude Code | 90-120s | Pre-production
9 | The Prompt That Replaced My Intern | The Prompt That | Cursor + Claude Code | 90-120s | Pre-production
10 | The Prompt That Even My Mom Could Use | The Prompt That | Lovable | 90-120s | Pre-production
11 | The Prompt That Fooled the Senior Dev | The Prompt That | Claude Code | 90-120s | Pre-production
12 | IDE Showdown: Cursor vs Claude Code vs Codex CLI | Tool Face-Off | Cursor, Claude Code, Codex CLI | 90-120s | Pre-production
13 | Builder Battle: Bolt.new vs Lovable vs Replit Agent | Tool Face-Off | Bolt.new, Lovable, Replit Agent | 90-120s | Pre-production
14 | Agent Arena: Devin vs Jules vs Claude Code | Tool Face-Off | Devin, Jules, Claude Code | 90-120s | Pre-production
15 | Speed vs Quality: Bolt.new vs Claude Code | Tool Face-Off | Bolt.new, Claude Code | 90-120s | Pre-production

Measuring Video Impact

Each video is tracked across platforms with the following metrics:

Engagement Metrics

Conversion Metrics

Quality Metrics

Videos with below-average retention in the first 5 seconds get their hooks rewritten. Videos with above-average ebook-to-YouTube conversion get promoted in the chapter ordering.
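The two rules above (rewrite the hook when early retention is below average, promote when ebook-to-YouTube conversion is above average) can be expressed as a small review pass. This is a sketch only: the field names, the use of integer percentages, and the use of a simple mean across all videos as "average" are assumptions, since the chapter does not define how the average is computed.

```typescript
// Hypothetical per-video stats, both expressed as whole percentages.
type VideoStats = { id: string; retentionAt5sPct: number; conversionPct: number };

function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// Apply both rules and return the IDs that need action.
function reviewVideos(stats: VideoStats[]): { rewriteHook: string[]; promote: string[] } {
  const avgRetention = mean(stats.map((s) => s.retentionAt5sPct));
  const avgConversion = mean(stats.map((s) => s.conversionPct));
  return {
    rewriteHook: stats.filter((s) => s.retentionAt5sPct < avgRetention).map((s) => s.id),
    promote: stats.filter((s) => s.conversionPct > avgConversion).map((s) => s.id),
  };
}
```

A video sitting exactly at the average is left alone by both rules, which matches the "below-average" and "above-average" wording.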


This chapter is updated monthly with 2-4 new videos as the vibe coding tool landscape evolves. Each update includes new video entries, refreshed comparisons when tools ship major versions, and community-requested tutorials. Last updated: March 2026.

21. Monthly Intelligence Brief: April 2026

Updated April 15, 2026

What changed in the vibe coding world this month. Updated on the 1st of each month for subscribers.

📰
Headline: Cursor 3 reimagines the IDE around multi-agent orchestration. Anthropic's Claude Mythos scores 93.9% on SWE-bench and autonomously discovers zero-days in FreeBSD — but is restricted to cybersecurity defense only via Project Glasswing. Meta Superintelligence Labs debuts Muse Spark. The Trivy supply chain attack cascades into a self-propagating npm worm hitting 64+ packages with blockchain C2 infrastructure. Claude suffers three consecutive days of outages. GitHub Copilot announces it will train on user code by default from April 24. New (April 10–15): Vercel discloses 7 CVEs in Cloudflare's AI-built Vinext — the only confirmed production deploy is CIO.gov. GLM-5.1 becomes the first fully open-source model to top SWE-Bench Pro, beating all closed-source models. Claude Code ships worktree switching, PreCompact hooks, and auto-stream-abort.
PRODUCT
Cursor 3: Agents Window, Design Mode, Cloud-to-Local Handoff
Anysphere launched Cursor 3 on April 2 — a ground-up redesign focused on multi-agent orchestration rather than traditional code editing. The new Agents Window replaces the Composer pane with a full-screen workspace where multiple AI agents run simultaneously in side-by-side, grid, or stacked tabs. Design Mode lets you click any element in a browser preview and direct agents to modify that exact component visually, closing the design-to-code loop. Cloud-to-local handoff carries agent session context seamlessly. New Automations can be triggered by external services. The Await tool lets agents pause for background shell commands. Memory is lighter; large-file diffs are faster. MCP Apps now support structured content. Cursor 3 represents the maturation from "AI-augmented IDE" to "agent orchestration platform."
AI MODEL — RESTRICTED
Claude Mythos: 93.9% SWE-bench — Restricted to Cybersecurity Defense
On April 7, Anthropic announced its most capable model to date — Claude Mythos — via Project Glasswing. It is not publicly available. Access is restricted to cybersecurity defense organizations: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, NVIDIA, Palo Alto Networks, Linux Foundation, and ~40 others. Benchmarks: 93.9% on SWE-bench (+13.1 percentage points over Opus 4.6), 97.6% on USAMO, 83.1% on CyberGym (vs 66.6% for Opus 4.6). During testing, Mythos autonomously discovered CVE-2026-4747 — a 17-year-old remote code execution vulnerability in FreeBSD — and found thousands of zero-day vulnerabilities across every major OS and browser. It is restricted specifically because it can autonomously both discover and exploit software vulnerabilities at scale. Project Glasswing channels its capabilities exclusively toward defense: patch prioritization, vulnerability remediation, and threat analysis for partner organizations.
AI MODEL
Meta Muse Spark: Meta Superintelligence Labs Debuts
On April 8, Meta released Muse Spark — the first model from its newly formed Meta Superintelligence Labs (built after the ~$14B deal to bring in Scale AI CEO Alexandr Wang). Muse Spark is natively multimodal with reasoning, tool-use, visual chain of thought, and multi-agent orchestration. It is not open source — unlike Llama, it's API-only in private preview. Benchmarks: 86.4 on CharXiv Reasoning (vs Gemini 3.1 Pro 80.2 and GPT-5.4 82.8), 50.2 on Humanity's Last Exam in Contemplating mode (vs Gemini 3.1 Deep Think 48.4). Meta claims 10x less compute than Llama 4 Maverick for equivalent capability. Muse Spark powers Meta AI across WhatsApp, Instagram, Facebook, and Messenger — reaching approximately 3 billion users. Coding is not a current strength; science, reasoning, and health benchmarks are where it leads.
SECURITY
Trivy Cascade Extends: CanisterWorm Self-Propagates Across 64+ npm Packages
The Trivy supply chain attack (CVE-2026-33634, first reported late March) grew into a much larger incident in early April. Attackers had force-pushed malicious code to 75 of 76 trivy-action GitHub Actions tags. Fully evicting them took five days, and during that remediation window they published additional malicious Docker images. The attack then spread into CanisterWorm, a self-propagating npm worm that hit 64+ packages and used blockchain-based command-and-control infrastructure, which makes it unusually resistant to takedown. CanisterWorm subsequently infected Checkmarx KICS and AST GitHub Actions, and separately reached LiteLLM (95 million monthly PyPI downloads). The combined blast radius makes this the most extensive supply chain cascade in AI developer tooling history. Treat any Trivy, Checkmarx, or LiteLLM pipeline that ran between March 19 and April 10 as potentially compromised.
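The exposure window in the last sentence can be turned into a quick triage filter over CI history. The sketch below assumes the window is inclusive of both dates, is evaluated in UTC, and refers to 2026. `PipelineRun` and the lowercase tool labels are hypothetical and would need to be mapped from whatever the CI system actually records.

```typescript
// Window from the brief: March 19 through April 10 (assumed 2026, UTC, inclusive).
const WINDOW_START_MS = Date.UTC(2026, 2, 19); // months are zero-based: 2 = March
const WINDOW_END_MS = Date.UTC(2026, 3, 11);   // exclusive bound, so all of April 10 counts
const AFFECTED_TOOLS = ["trivy", "checkmarx", "litellm"];

// Hypothetical record of one CI run: an ISO 8601 timestamp and the tools it invoked.
type PipelineRun = { id: string; ranAt: string; tools: string[] };

function isPotentiallyCompromised(run: PipelineRun): boolean {
  const ranAtMs = Date.parse(run.ranAt);
  if (isNaN(ranAtMs)) {
    throw new Error(`Unparseable timestamp on run ${run.id}: ${run.ranAt}`);
  }
  const inWindow = ranAtMs >= WINDOW_START_MS && ranAtMs < WINDOW_END_MS;
  const usesAffectedTool = run.tools.some(
    (tool) => AFFECTED_TOOLS.indexOf(tool.toLowerCase()) !== -1,
  );
  return inWindow && usesAffectedTool;
}
```

A filter like this only narrows down which runs to investigate; it says nothing about whether a given run actually pulled a malicious artifact.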
RELIABILITY
Claude Down: Three Consecutive Days of Outages (April 6–8)
Anthropic's Claude services suffered three consecutive days of disruptions in the week of April 6. On April 6, a 10-hour outage generated 8,000+ Downdetector reports, with chat and login failures affecting Claude.ai and Claude Code users. On April 7, elevated errors ran from 14:32 to 15:12 UTC, affecting authentication across Claude.ai and Claude Code. On April 8, Sonnet 4.6 errors continued from 23:00 PT to 1:50 PT. No single root cause was publicly disclosed. For teams running autonomous Claude Code workflows, this week underscored the importance of retry logic, fallback providers, and error handling with alerting around any scheduled mission-critical agent task.
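Retry logic with a fallback provider might look like the sketch below. It is provider-agnostic on purpose: each provider is just an async function, and `ModelProvider`, `backoffDelayMs`, and `callWithFallback` are hypothetical names rather than part of any vendor SDK. The backoff defaults (500 ms base, 8 s cap, 3 attempts) are illustrative values, not recommendations from the vendors involved.

```typescript
// One callable per provider; `label` is only used in results and error messages.
type ModelProvider<T> = { label: string; call: () => Promise<T> };

// Exponential backoff: base, 2x base, 4x base, and so on, capped at `capMs`.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Try each provider in order, retrying with backoff before falling back to the next.
async function callWithFallback<T>(
  providers: ModelProvider<T>[],
  maxAttempts = 3,
  baseMs = 500,
): Promise<{ provider: string; value: T }> {
  const failures: string[] = [];
  for (const provider of providers) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        const value = await provider.call();
        return { provider: provider.label, value };
      } catch (err) {
        failures.push(`${provider.label} attempt ${attempt + 1}: ${String(err)}`);
        // Skip the sleep after the final attempt for this provider.
        if (attempt < maxAttempts - 1) {
          await sleep(backoffDelayMs(attempt, baseMs));
        }
      }
    }
  }
  // Surface every failure so alerting sees the full picture.
  throw new Error(`All providers failed: ${failures.join("; ")}`);
}
```

The final `throw` is the hook for alerting: a scheduled agent task that catches it can page someone instead of failing silently.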
PRIVACY
GitHub Copilot to Train on User Code by Default from April 24
GitHub announced that starting April 24, 2026, interaction data for Copilot Free, Pro, and Pro+ users — including inputs, outputs, and code snippets — will be used for AI model training by default. Users must actively opt out in their GitHub account settings. Enterprise and Business plans are not affected. For teams working with proprietary code, client code, or regulated data, this policy change requires action before April 24. Meanwhile, April also brought the Copilot SDK (public preview, April 2) for embedding Copilot agentic capabilities into custom apps and workflows, and Autopilot mode (public preview) for fully autonomous agent execution with self-approval and auto-retry.
CRITICAL SECURITY
Axios npm Supply Chain Attack — North Korean State Actor
On March 31, attackers attributed to UNC1069 (a North Korea-nexus, financially motivated threat group) compromised the npm account of the axios lead maintainer and published malicious versions 1.14.1 and 0.30.4. The packages installed a hidden dependency “plain-crypto-js” that deployed the WAVESHAPER.V2 backdoor — a cross-platform remote access trojan targeting Windows, macOS, and Linux. Axios has approximately 100 million weekly downloads, making this one of the most impactful npm supply chain attacks ever recorded. The malicious versions were live for roughly 3 hours before being removed. Attribution confirmed by Google Threat Intelligence Group. Rotate all credentials in any environment that installed these versions.
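Checking an environment for the indicators named above (the two malicious axios versions and the hidden plain-crypto-js dependency) can be sketched as follows. The input is a hypothetical flat map of package name to resolved version; a real lockfile can hold several versions of the same package at different depths, so this illustrates the check rather than a complete scanner.

```typescript
// Indicators as reported in the brief above.
const COMPROMISED_AXIOS_VERSIONS = ["1.14.1", "0.30.4"];
const MALICIOUS_DEPENDENCY = "plain-crypto-js";

// `installed` is a hypothetical flat map of package name -> resolved version.
function findAxiosExposure(installed: Record<string, string>): string[] {
  const findings: string[] = [];
  const has = (pkg: string) => Object.prototype.hasOwnProperty.call(installed, pkg);

  if (has("axios") && COMPROMISED_AXIOS_VERSIONS.indexOf(installed["axios"]) !== -1) {
    findings.push(`axios@${installed["axios"]} is one of the malicious releases`);
  }
  if (has(MALICIOUS_DEPENDENCY)) {
    findings.push(`${MALICIOUS_DEPENDENCY} is present and should not be`);
  }
  return findings;
}
```

Any non-empty result means the environment should be treated as compromised and its credentials rotated, as the brief advises.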
SECURITY
LiteLLM and Langflow Supply Chain Attacks Hit AI Infrastructure
The week of March 24 saw two more high-severity supply chain attacks targeting the AI developer ecosystem. LiteLLM versions 1.82.7 and 1.82.8 were compromised with a multi-stage credential stealer harvesting SSH keys, cloud provider tokens, Kubernetes secrets, cryptocurrency wallets, and .env files — precisely the kind of secrets that accumulate in AI developer environments. Separately, CVE-2026-33017 disclosed a critical code injection in Langflow (the popular AI agent framework) affecting versions ≤ 1.8.2: an unauthenticated attacker could trigger remote code execution via the public flow build endpoint. Exploitation was observed within 20 hours of disclosure. CISA added CVE-2026-33017 to its Known Exploited Vulnerabilities catalog. Also disclosed: CVE-2026-33634 — malicious code embedded in Aqua Security’s Trivy scanner Docker Hub images, attributed to TeamPCP.
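A range such as "≤ 1.8.2" needs a numeric comparison, because comparing version strings alphabetically would wrongly place 1.10.0 below 1.8.2. The sketch below assumes plain dotted numeric versions with no pre-release tags; `compareVersions`, `isLangflowAffected`, and `isLiteLLMCompromised` are hypothetical helper names.

```typescript
// Compare dotted numeric versions. Returns -1, 0, or 1.
// Assumption: no pre-release or build suffixes such as "-rc1".
function compareVersions(a: string, b: string): number {
  const left = a.split(".").map(Number);
  const right = b.split(".").map(Number);
  if (left.some(isNaN) || right.some(isNaN)) {
    throw new Error(`Unsupported version format: ${a} or ${b}`);
  }
  const parts = Math.max(left.length, right.length);
  for (let i = 0; i < parts; i++) {
    const l = i < left.length ? left[i] : 0;
    const r = i < right.length ? right[i] : 0;
    if (l !== r) return l < r ? -1 : 1;
  }
  return 0;
}

// CVE-2026-33017 affects Langflow versions up to and including 1.8.2.
function isLangflowAffected(version: string): boolean {
  return compareVersions(version, "1.8.2") <= 0;
}

// The two compromised LiteLLM releases named in the brief.
function isLiteLLMCompromised(version: string): boolean {
  return version === "1.82.7" || version === "1.82.8";
}

isLangflowAffected("1.8.2");  // true
isLangflowAffected("1.10.0"); // false
```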
RESEARCH
Georgia Tech: 2,000+ Vulnerabilities in 5,600 Vibe-Coded Apps
The Georgia Tech Vibe Security Radar project released its analysis of 5,600 publicly deployed vibe-coded applications, finding over 2,000 vulnerabilities, 400+ exposed secrets, and 175 instances of exposed PII. Separately, tracking data shows AI-generated code now contributes 35 CVEs per month — up from 6 in January and 15 in February 2026. Autonoma research puts 53% of AI-generated code as having security holes. The pattern is consistent: AI models generate functional code quickly but skip authentication checks, leave credentials in source, and mis-scope data access. The backlash narrative is shifting from “vibe coding is dangerous” to “treat AI output like code from a fast junior developer — review it.”
MILESTONE
MCP Hits 97 Million Monthly Downloads in 5 Months
As of March 25, the Model Context Protocol SDK has reached 97 million monthly downloads, up from approximately 2 million at the time of its November 2025 launch, representing 4,750% growth in five months. There are now 5,800+ community and enterprise MCP servers, and every major AI lab (OpenAI, Google DeepMind, Cohere, Mistral) has integrated MCP support. The protocol has become the de facto standard for tool connectivity in agentic AI systems, reaching ecosystem-wide adoption faster than any previous developer infrastructure standard.
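The 4,750% figure follows from the two download counts: growth is the increase divided by the starting value. A short check, using a hypothetical `percentGrowth` helper and the rounded figures quoted above:

```typescript
// Percentage growth from `before` to `after`.
function percentGrowth(before: number, after: number): number {
  if (before <= 0) {
    throw new RangeError("before must be positive");
  }
  return ((after - before) / before) * 100;
}

percentGrowth(2000000, 97000000); // 4750
```

Note that this is growth, not a multiple: 97 million is 48.5 times the starting figure, which corresponds to a 4,750% increase.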
PRODUCT
Cursor Self-Hosted Cloud Agents
Cursor launched self-hosted cloud agents on March 25 — a direct response to enterprise security requirements. Code, tool execution, build outputs, and secrets now stay entirely within the customer’s own network. The product also includes security automation templates: agents that review 3,000+ internal pull requests per week, catching 200+ vulnerabilities across large engineering organizations. This positions Cursor as enterprise-grade infrastructure, not just an IDE, and directly addresses the objection that AI coding tools require sending proprietary code to third-party servers.
CULTURE
Vibe Coding Turns One Year Old
February–March 2026 marks the first anniversary of Andrej Karpathy’s viral X post coining “vibe coding.” Collins English Dictionary named it the Word of the Year for 2025. Retrospective content flooded technical media: daily.dev, DEV Community, Taskade’s State of Vibe Coding 2026 report, and CodeRabbit’s semantic history of the term. Anthropic is publicly pushing a “vibe working” framing — extending the concept beyond code to all knowledge work done with AI. LogRocket’s March 2026 Power Rankings put Windsurf #1, followed by Google Antigravity, Cursor, Claude Code, and Codex. A year in: the tools are mature, the workflows are real, and the debate has moved from “will this work?” to “how do we do it safely?”
AI MODELS
SWE-Bench Convergence: Six Models Within 0.8 Points
The March 2026 SWE-bench Verified leaderboard shows an unprecedented convergence at the top: Claude Opus 4.6 (80.8%), Gemini 3.1 Pro (80.6%), GPT-5.4 (77.2%), and Claude Sonnet 4.6 (~75.6%) are now within striking distance of each other. Six models in total fall within 0.8 points. The era of any single model dominating coding benchmarks appears to be over. Qwen 3.5 — fully rolled out in early March — leads the 7–9B parameter class on HumanEval, continuing the open-weights pressure on proprietary pricing.

Previous Month: March 2026

Key Developments

ECOSYSTEM
The Open Source Crisis
Researchers across four universities found vibe coding creates a negative feedback loop for open source. Tailwind CSS downloads climbed while docs traffic fell 40% and revenue dropped 80%. cURL shut down its bug bounty after AI submissions drove valid rates to 5%. Ghostty banned AI-generated code. tldraw auto-closes all external PRs. RedMonk calls it "AI Slopageddon."
PRODUCT
Gemini 3 Powers Jules
Google rolled out Gemini 3 Pro to Jules, its async coding agent. Gemini 3 surpasses 2.5 Pro at coding with stronger intent alignment and improved agentic workflows. Jules now includes Tools for terminal access, CLI extension, and API access.
PRODUCT
Cursor 2.6: Automations, JetBrains, and MCP Apps
Cursor 2.6 shipped three major features in one week: always-on Automations (agents triggered by Slack, Linear, GitHub, PagerDuty with persistent memory), JetBrains IDE support via Agent Client Protocol (IntelliJ, PyCharm, WebStorm), and interactive MCP Apps (Figma, Amplitude, tldraw in chat). Team plugin marketplaces. Composer’s proprietary model runs at 2x the speed of Sonnet 4.5. Market share holds at ~25% of GenAI clients.
PLATFORM
Copilot Opens Multi-Model Access
Since Feb 26, all paid GitHub Copilot users can choose Claude, Codex, or Copilot as their agent model, assigning the same issue to all three simultaneously. 26M+ users across 6+ IDEs. Copilot’s coding agent spins up Actions VMs and opens draft PRs autonomously.
ENTERPRISE
Pega Makes Vibe Coding Enterprise-Ready
On March 5, Pegasystems announced a full vibe coding experience in Pega Blueprint. Users converse with app designs via text or speech, with security protocols, third-party compatibility, and performance metrics for large-scale operations. First major enterprise platform to brand its AI features as “vibe coding.”
AI MODEL
Opus 4.6 Agent Teams Mature
Anthropic’s Opus 4.6 is now the default for Claude Code Max/Team subscribers. The “agent teams” feature splits work across multiple coordinated agents. 16 Opus 4.6 agents wrote a C compiler in Rust capable of compiling the Linux kernel (~$20K cost).
PRODUCT
Devin 2.2 and SWE-1.6
Cognition shipped Devin 2.2 — the most important update since launch with dramatically fewer bugs. SWE-1.6 training preview began March 1. PR merge rate improved from 34% to 67%. Security fixes average 1.5 min vs 30 min for humans. Cognition raised $500M at ~$10B valuation, with combined Devin+Windsurf ARR more than doubling post-acquisition.
AI MODEL
GPT-5.4 Launches with Computer Use
On March 5, OpenAI released GPT-5.4 in Standard, Thinking, and Pro variants. Native computer-use capabilities, 1M token context, and 33% fewer errors vs GPT-5.2. ChatGPT for Excel/Sheets integration and financial tools (FactSet, MSCI, Moody’s) signal a major enterprise push. First OpenAI model with built-in computer use — directly competing with Anthropic’s computer-use features.
GEOPOLITICS
Pentagon Labels Anthropic a Supply-Chain Risk
The DOD labeled Anthropic a supply-chain risk — the first time an American company has received this designation, typically reserved for foreign adversaries. The dispute centers on Anthropic’s refusal to support autonomous weapons and domestic surveillance use cases. Defense tech firms are actively dropping Claude. CEO Dario Amodei called OpenAI’s messaging about their competing Pentagon deal “straight up lies.” Negotiations reportedly resumed as of March 5.
PRODUCT
Claude Code: Voice Mode and Security Patches
Anthropic is rolling out voice mode (/voice push-to-talk) to ~5% of users. Speech-to-text (STT) expanded to 20 languages. New MCP management via /mcp dialog, Claude API skill, and session naming. Two critical CVEs patched: CVE-2025-59536 (RCE via malicious repos) and CVE-2026-21852 (API key exfiltration through project files). Both vulnerabilities allowed malicious repositories to trigger arbitrary shell commands on tool initialization.
NEW TOOL
Kilo Code: Open-Source Multi-Agent Coding
Kilo Code, launched by a GitLab co-founder, has already attracted 1.5M+ users. Orchestrator mode with planner/coder/debugger sub-agents. 500+ model support. Available in VS Code, JetBrains, and CLI. $19/mo or BYO API key. Directly challenges Claude Code, Copilot, and Cursor in the AI coding agent space.
AI MODEL
Qwen 3.5 and Open Weights Push
Alibaba released Qwen 3.5 in four sizes (0.8B, 2B, 4B, 9B) with open weights. It scores 74.1% on LiveCodeBench v6, among the strongest results for real-world coding tasks. The open-weights trend continues to pressure proprietary model pricing.
PRODUCT
Claude Code /loop: Autonomous Scheduled Tasks
Claude Code versions 2.1.63–2.1.76 shipped in rapid succession through March 2026, adding the /loop command (cron-like session-scoped task scheduler), Skills.md for persistent agent behaviors, a 1-million-token context window, and increased max output to 64k tokens for Opus 4.6 (128k upper bound for both Opus 4.6 and Sonnet 4.6). MCP servers can now request structured input mid-task via interactive dialogs. /loop turns Claude Code into a background worker for PR reviews, deployment monitoring, and recurring analysis tasks — the closest any tool has come to a fully autonomous development partner.
FUNDING
Replit $400M Series D at $9B Valuation
On March 11, Replit closed a $400M Series D led by Georgian Partners at a $9 billion valuation — triple its $3B valuation from September 2025. Participants include a16z, Coatue, Y Combinator, Accenture Ventures, and Databricks Ventures. Replit is targeting $1B ARR by year-end. 75% of Replit AI users write zero code themselves. The round signals that browser-based full-stack builders remain one of the hottest segments in AI tooling.
STRATEGY
Lovable Goes on the Acquisition Hunt
On March 23, Lovable CEO Anton Osika announced the $6.6B vibe-coding platform is actively hunting acquisitions. The company hit $400M ARR by March 12 (up from $200M at end-2025) with only 146 employees, and is now deploying M&A as a competitive weapon against Cursor, Replit, and Bolt. It previously acquired cloud provider Molnett. Target criteria: “builder-first, high-agency teams” who move fast. This is an unusual posture for a 3-year-old startup — a sign of how rapidly vibe-coding market share is being contested.
PRODUCT
Cognition Ships Devin Review and Windsurf Codemaps
Cognition launched two products in late March. Devin Review is a free code review tool that reads any GitHub PR (public or private) and not only flags issues but spins up a cloud agent to test and propose fixes. Windsurf Codemaps are AI-annotated structured maps of entire codebases, powered by SWE-1.5 and Claude Sonnet 4.5, giving developers navigable context over large repositories before they start making changes. Both tools reflect Cognition's strategy to dominate the full developer workflow — from understanding code to shipping fixes.
PLATFORM
GitHub Copilot JetBrains Agentic Capabilities Go GA
On March 11, GitHub made core agentic capabilities — custom agents, sub-agents, and Plan Agent mode — generally available in GitHub Copilot for JetBrains IDEs, with agent hooks entering preview. On March 12, a new GitHub Copilot Student plan launched, maintaining free access for verified students while restricting self-selection of premium models (GPT-5.4, Claude Opus/Sonnet) in favor of Copilot Auto mode.
SECURITY
OpenClaw Supply Chain Attack: 1,184 Malicious MCP Packages
The largest confirmed supply chain attack targeting AI agent infrastructure: Antiy CERT confirmed 1,184 malicious skills across ClawHub — approximately one in five packages in the open-source MCP ecosystem. Simultaneously, security researchers documented 30+ CVEs targeting MCP servers in just 60 days. Highlights include CVE-2026-23744 (CVSS 9.8, MCPJam Inspector ≤ v1.4.2 — any crafted HTTP request could install an arbitrary MCP server and execute code with no user interaction), a CVSS 9.6 RCE in Microsoft’s Azure MCP server, and BlueRock Security finding 36.7% of 7,000+ analyzed MCP servers potentially vulnerable to SSRF. Treat MCP server packages with the same scrutiny you’d apply to executable binaries.

Numbers Update (April 9, 2026)

93.9%: Claude Mythos on SWE-bench (restricted; Project Glasswing defense partners only)
64+: npm packages infected by CanisterWorm (Trivy cascade, April 2026)
97M: MCP monthly SDK downloads (Mar 25, 2026), up from 2M at launch
35: CVEs/month attributed to AI-generated code (March 2026, up from 6 in January)
29%: Developers with "high trust" in AI tool output (down from 70%+ in 2023)
75%: Reduction in PR turnaround time for AI-tool teams (9.6 days → 2.4 days)
73%: Developers using AI tools daily globally (Stack Overflow Q1 2026)
20M+: GitHub Copilot paid users (April 2026)

What to Watch in May 2026

🔗
Stay current: Get daily updates at EndOfCoding.com. Subscribe to the ebook for monthly intelligence briefs with full analysis, data, and actionable insights. Try hands-on courses at Vibe Coding Academy.

Chapter 22: Community Showcase

Updated March 6, 2026

Real projects built by real people using vibe coding. Updated monthly.


Welcome to the Showcase

This chapter is different from the rest of the book. It is not written by us -- it is written by you.

Every project featured here was built using the techniques, tools, and philosophies described in the preceding chapters. Some were built by seasoned developers experimenting with a new workflow. Others were built by people who had never written a line of code before picking up Cursor or Bolt.new. All of them went from idea to deployed software using AI-native development.

The community showcase exists for three reasons:

  1. Proof that it works. Theory is useful. Seeing a non-technical product manager ship an internal dashboard in four hours is more useful.
  2. Shared knowledge. Every submission includes the prompts that worked, the mistakes that cost time, and the metrics that followed. This is a living library of hard-won lessons.
  3. Inspiration. The gap between "I should build something" and "I shipped something" is often just seeing someone in a similar position who already did it.

We review submissions monthly and feature the most instructive projects -- not necessarily the most impressive ones. A weekend prototype that taught the builder three critical lessons about prompt structure is more valuable here than a polished SaaS with no story behind it.


How to Submit Your Project

We welcome submissions from anyone who has built and deployed something using AI-native development tools. Your project does not need to be generating revenue. It does not need to be technically sophisticated. It needs to be real, deployed, and accompanied by an honest account of how it was built.

Submission Template

Copy the template below, fill it in, and submit it to showcase@endofcoding.com or post it in the #showcase channel on our community Discord.

## Project Submission

**Project Name:**
[Your project name]

**Live URL:**
[Link to the deployed project]

**Builder Name:**
[Your name or handle]

**Builder Background:**
[Developer / Designer / Product Manager / Non-technical / Student / Other]
[Brief bio: 1-2 sentences about your experience level and day job]

**Tools Used:**
[List all AI tools: Cursor, Claude Code, Bolt.new, v0, Lovable, Replit Agent, etc.]
[List supporting tools: Vercel, Supabase, Stripe, Tailwind, etc.]

**Timeline:**
[Time from first prompt to deployed: e.g., "6 hours over a weekend"]

**Key Prompts (1-3 of your best prompts that made the biggest difference):**

Prompt 1:
"""
[Paste the actual prompt text you used]
"""
Why it worked: [Brief explanation]

Prompt 2:
"""
[Paste the actual prompt text]
"""
Why it worked: [Brief explanation]

Prompt 3 (optional):
"""
[Paste the actual prompt text]
"""
Why it worked: [Brief explanation]

**What Went Right:**
- [Bullet point]
- [Bullet point]
- [Bullet point]

**What Went Wrong:**
- [Bullet point]
- [Bullet point]
- [Bullet point]

**Metrics (share what you are comfortable sharing):**
- Users: [number or range]
- Revenue: [if applicable]
- Other: [downloads, signups, press mentions, job offers, etc.]

**One Sentence of Advice for Someone Starting Today:**
[Your best tip]

Submission Guidelines


Featured Projects

Project 1: WaitlistWizard -- SaaS Micro-Tool Built in a Weekend

What it is: A standalone waitlist management tool for indie makers launching products. Users create a waitlist page with a custom domain, collect emails with referral tracking, and send launch-day notifications. Includes an analytics dashboard showing signup velocity, referral sources, and geographic distribution.

Builder Profile: Marcus Chen, 29. Full-stack developer at a mid-size fintech company during the week. Side-project builder on weekends. Had used GitHub Copilot for two years but had never tried a full vibe coding workflow until this project.

Tools Stack:

Build Timeline: 14 hours across a Saturday and Sunday. First prompt at 9 AM Saturday. Deployed and shared on X at 11 PM Sunday.

Key Prompts:

Prompt 1 -- The initial spec:

Build a waitlist management SaaS with Next.js 14 App Router and Supabase.

Core features:
1. Landing page builder: user creates a waitlist page with custom title,
   description, and color scheme. Each page gets a unique slug (/w/[slug]).
2. Email collection: visitors enter email, get position number.
   Referral link generated automatically. Each referral moves the referrer
   up 3 positions.
3. Dashboard: real-time count of signups, chart of signups over time,
   top referrers table, geographic breakdown (from IP geolocation).
4. Launch notification: one-click send to all collected emails.

Auth: Supabase Auth with GitHub and Google OAuth.
Database: Supabase PostgreSQL with RLS policies.
Styling: Tailwind with a clean, minimal aesthetic. Dark mode default.

Start with the database schema and RLS policies, then build the
dashboard, then the public-facing waitlist pages.

Why it worked: Front-loading the database schema and RLS policies meant the entire data layer was solid before any UI code was written. This prevented three or four rounds of restructuring that typically happen when you build UI first.

Prompt 2 -- Referral tracking logic:

Add referral tracking to the waitlist system.

When a user signs up for a waitlist:
1. Generate a unique referral code (8 char alphanumeric)
2. Create a shareable URL: [domain]/w/[slug]?ref=[code]
3. When someone signs up via a referral link, record the referral
4. Move the referrer up 3 positions in the queue
5. Send the referrer an email: "Someone joined through your link!
   You moved up to position [X]."

Store referral chains (who referred whom) for the dashboard analytics.
Prevent self-referral. Cap position boost at top 10% of the list.
Handle edge cases: expired waitlists, duplicate signups from same email,
referral codes for non-existent waitlists.

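Why it worked: Explicitly listing edge cases in the prompt eliminated two bugs that would have appeared in production. The AI handled all four cases (self-referral, expired waitlists, duplicate signups from the same email, and referral codes for non-existent waitlists) correctly on the first generation.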
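To make the referral rules in Prompt 2 concrete, here is a minimal sketch of the queue logic in Python. It is not Marcus's code (his project used Next.js and Supabase), and the function name `apply_referral_boost` and the in-memory list are hypothetical. The prompt's "cap position boost at top 10%" is open to interpretation; this sketch assumes a boost can never lift someone past the edge of the top 10%, and that people already inside the top 10% stay where they are. Expired waitlists and unknown referral codes are left out.

```python
def apply_referral_boost(queue, referrer, new_signup, boost=3, cap_percent=10):
    """Return a new queue after `new_signup` joins through `referrer`'s link.

    `queue` is a list of emails ordered by position (index 0 is position 1).
    All names and defaults here are illustrative assumptions.
    """
    if new_signup == referrer:
        raise ValueError("self-referral is not allowed")
    if new_signup in queue:
        raise ValueError("duplicate signup")
    if referrer not in queue:
        raise ValueError("unknown referrer")

    updated = queue + [new_signup]  # the newcomer joins at the back

    # Last index that still counts as "top 10%" (integer ceiling division).
    cap_index = (len(updated) * cap_percent + 99) // 100 - 1

    current = updated.index(referrer)
    if current <= cap_index:
        return updated  # already inside the capped zone: no movement

    # Move up by `boost` places, but never past the cap boundary.
    target = max(current - boost, cap_index)
    updated.pop(current)
    updated.insert(target, referrer)
    return updated
```

With 21 people on the list, the top 10% is the first three positions, so a referrer in position 5 lands in position 3 instead of position 2.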

Prompt 3 -- The analytics dashboard:

Build the waitlist analytics dashboard. The user is logged in and
viewing their waitlist's stats.

Show:
- Total signups (big number with daily change indicator, green up/red down)
- Signup velocity chart (line chart, last 30 days, using Recharts)
- Top 10 referrers table (name, referral count, conversion rate)
- Geographic distribution (top 5 countries as horizontal bar chart)
- Recent signups feed (last 20, real-time updates via Supabase Realtime)

All data fetched server-side with React Server Components.
The recent signups feed is a Client Component with real-time subscription.
Loading states: skeleton UI for each card while data loads.
Empty states: friendly message + illustration when no data yet.

Why it worked: Separating server components from client components in the prompt gave the AI clear architectural guidance. The result needed zero restructuring.
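The two headline widgets in Prompt 3 (the 30-day velocity chart and the daily change indicator) reduce to a small aggregation. The sketch below shows one way to compute them in Python; the function names and the in-memory list of dates are hypothetical, and a real build would more likely run this as a database query.

```python
from collections import Counter
from datetime import date, timedelta


def signup_velocity(signup_dates, today, days=30):
    """Return [(day, count), ...] for the last `days` days, oldest first.

    Days with no signups are filled with 0 so the chart has no gaps.
    """
    counts = Counter(signup_dates)
    start = today - timedelta(days=days - 1)
    series = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        series.append((day, counts.get(day, 0)))
    return series


def daily_change(series):
    """Compare the latest day with the day before it.

    Returns (delta, direction), where direction is "up", "down", or "flat".
    Expects a series with at least two days.
    """
    delta = series[-1][1] - series[-2][1]
    if delta > 0:
        return delta, "up"
    if delta < 0:
        return delta, "down"
    return delta, "flat"
```

The "up" and "down" labels map onto the green and red indicators the prompt asks for.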

Before/After: Marcus had previously attempted to build a similar waitlist tool using traditional development. He spent three weekends on it, got about 60% through the feature set, and abandoned it when the referral position tracking logic became tangled. With vibe coding, the complete feature set was done in one weekend, including features he had not originally planned (geographic analytics, real-time feed).

Lessons Learned:

Outcome: Posted on X and Hacker News the following Monday. 340 upvotes on HN. 2,100 signups in the first week. 180 paying users ($9/month) within 60 days. Currently at $1,620 MRR and growing. Marcus has not yet quit his day job but is now building his second product using the same workflow.


Project 2: FieldSync -- Internal Tool Built by a Non-Technical PM

What it is: An internal field operations dashboard for a 40-person landscaping company. Tracks crew assignments, job status, equipment location, client notes, and daily route optimization. Replaced a mess of shared spreadsheets, WhatsApp groups, and sticky notes on the dispatch office wall.

Builder Profile: Rachel Torres, 34. Operations manager at GreenScape Landscaping in Austin, TX. No programming experience. Had taken one HTML course in college a decade ago. Uses Excel daily and considers herself "tech-comfortable but not technical."

Tools Stack:

Build Timeline: Three evenings after work (roughly 3 hours each) plus most of a Saturday. Total: approximately 16 hours.

Key Prompts:

Prompt 1 -- The initial description:

I manage a landscaping company with 8 crews of 5 people each.
Every morning I assign crews to jobs using a spreadsheet and a
WhatsApp group. I need an app that:

1. Shows today's jobs on a map with crew assignments
2. Lets me drag and drop to reassign crews to different jobs
3. Crews can update job status from their phones (not started /
   in progress / done / issue)
4. Tracks which equipment trailer is with which crew
5. Stores client notes that persist between visits
6. Shows me a daily summary: jobs completed, revenue, crew utilization

Make it simple. My crews are not tech people. The mobile view needs
to be dead simple -- big buttons, minimal text.

I want to log in as admin and see everything. Crews log in with a
simple PIN code and only see their assigned jobs for today.

Why it worked: Writing from the perspective of the actual problem -- not in technical terms -- gave the AI everything it needed. Rachel did not know what a "database" or "REST API" was. She described her day, and the AI built the system to match it.

Prompt 2 -- Fixing the mobile experience:

The crew mobile view is too complicated. They need to see ONLY:
- Their jobs for today, in order
- A big button to change status (green = done, yellow = issue)
- A notes field for each job
- Nothing else

Remove the navigation menu on mobile. Remove the map on mobile.
Remove the equipment section on mobile. Crews do not need any of that.
Just the job list and status buttons. Make the buttons large enough
to tap with work gloves on.

Why it worked: The first version had given crews the same interface as the admin. This prompt stripped it down to exactly what a landscaper standing in a yard with dirty gloves needs. The "work gloves" detail led the AI to generate oversized touch targets (minimum 56px) -- better than many professional mobile apps.

Before/After: Before: Rachel spent 45 minutes every morning in dispatch, managing the spreadsheet, texting crew leaders, and calling clients. Crews often arrived at jobs without knowing the client's gate code or special instructions. Equipment went missing for days because nobody tracked which trailer went where.

After: Morning dispatch takes 10 minutes. Crews see their assignments on their phones before they leave the yard. Client notes (gate codes, dog warnings, irrigation shutoff locations) carry over automatically between visits. Equipment tracking reduced "lost trailer" incidents from two per month to zero in the first quarter.

Lessons Learned:

Outcome: FieldSync has been in daily use at GreenScape for five months. All eight crews use it. Rachel estimates it saves 6 hours of administrative time per week across the company. The owner asked her to "sell it to other landscaping companies," which she is now exploring. Total build cost: $0 (Bolt.new free tier was sufficient for the prototype; Lovable's free tier handled the refinements). Ongoing cost: $25/month (Supabase) + $8/month (Google Maps API).


Project 3: Resonance -- Startup MVP That Got Into Y Combinator

What it is: An AI-powered customer feedback analysis platform. Companies connect their support channels (Zendesk, Intercom, email), and Resonance automatically categorizes feedback by theme, sentiment, and urgency. Surfaces product insights that typically take a research team weeks to compile.

Builder Profile: David Park and Jenna Liu, both 27. David is a former ML engineer at a mid-tier AI startup. Jenna was a product manager at Salesforce. Neither had built a full-stack consumer product before. They quit their jobs in September 2025 with savings to cover six months.

Tools Stack:

Build Timeline: Three weeks from first prompt to a working MVP. One additional week for polish before the YC application. Total: four weeks with two people working full-time.

Key Prompts:

Prompt 1 -- System architecture:

Design the architecture for a customer feedback analysis platform.

Data flow:
1. INGEST: Connect to Zendesk, Intercom, and email (IMAP) to pull
   customer messages. Webhook listeners for real-time ingestion.
   Dedup messages that appear in multiple channels.

2. PROCESS: For each message:
   - Generate embedding (OpenAI text-embedding-3-small)
   - Classify sentiment (positive/neutral/negative/urgent)
   - Extract themes (use clustering on embeddings, auto-generate
     theme labels)
   - Score urgency (1-5 based on sentiment + keywords + customer tier)

3. STORE: PostgreSQL for structured data. Supabase pgvector for
   embeddings. Link every insight back to source messages.

4. SURFACE: Dashboard showing:
   - Theme clusters with message counts and trends
   - Sentiment distribution over time
   - Urgent items requiring immediate attention
   - Weekly auto-generated summary of top themes and shifts

Multi-tenant: each company sees only their own data. RLS enforced
at the database level. API keys scoped per integration per company.

Build the ingestion pipeline first. I want to connect a test Zendesk
instance and see messages flowing into the database within the first
session.

Why it worked: David wrote this prompt like a system design document. The level of specificity on data flow, multi-tenancy, and storage separation meant Claude Code generated a clean, well-separated architecture on the first pass. The instruction to get data flowing in the first session kept the AI focused on the critical path.
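The INGEST step asks for deduplication across channels but does not say how. One common approach, sketched here in Python as an assumption and not as the Resonance implementation, is to fingerprint each message by sender and normalized body and keep the first copy seen. The field names `customer_email` and `body` are hypothetical.

```python
import hashlib


def fingerprint(customer_email, body):
    """Hash the sender plus a whitespace- and case-normalized body."""
    normalized_body = " ".join(body.lower().split())
    key = f"{customer_email.strip().lower()}\n{normalized_body}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedup(messages):
    """Keep the first occurrence of each message, preserving input order.

    `messages` is a list of dicts with "customer_email" and "body" keys.
    """
    seen = set()
    unique = []
    for message in messages:
        fp = fingerprint(message["customer_email"], message["body"])
        if fp not in seen:
            seen.add(fp)
            unique.append(message)
    return unique
```

Exact-match fingerprints will miss near-duplicates, such as an email reply that quotes the original ticket, so a production pipeline would likely need something looser.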

Prompt 2 -- The insight generation engine:

Build the weekly insight report generator.

Input: All feedback messages from the past 7 days for a given company.

Process:
1. Cluster messages by theme (using cosine similarity on embeddings,
   threshold 0.82)
2. For each cluster with 5+ messages:
   - Generate a theme label (3-5 words)
   - Count messages and calculate sentiment breakdown
   - Identify the most representative message (closest to centroid)
   - Compare to previous week: is this theme growing, shrinking, or new?
3. Rank themes by: (message_count * urgency_avg * growth_rate)
4. Generate executive summary using Claude:
   - 3 paragraphs maximum
   - Lead with the most important shift
   - Include specific numbers
   - End with a recommended action

Output: Structured JSON with themes array and summary text.
Store in reports table. Send via email to company admin.

Handle edge cases: company with fewer than 10 messages that week
(skip report, send "not enough data" note), themes that appear
for the first time (flag as "emerging"), themes that disappear
(flag as "resolved").

Why it worked: The mathematical specificity (cosine similarity threshold, minimum cluster size, ranking formula) gave the AI enough constraints to produce a working implementation without guessing. Jenna later said the ranking formula in the prompt became the actual production ranking formula -- it was that well-specified.
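The numbers in Prompt 2 translate almost directly into code. The Python sketch below uses the prompt's 0.82 cosine threshold and its ranking formula, but the clustering strategy (a greedy single pass that compares each message with existing cluster centroids) is an assumption, since the prompt does not name an algorithm. The prompt also does not define `growth_rate`, so it is taken here as an input.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cluster(embeddings, threshold=0.82):
    """Greedy single-pass clustering; returns lists of message indices.

    Each message joins the most similar existing cluster if the similarity
    to that cluster's centroid is at least `threshold`; otherwise it starts
    a new cluster.
    """
    clusters = []
    for index, vector in enumerate(embeddings):
        best, best_similarity = None, threshold
        for members in clusters:
            center = centroid([embeddings[i] for i in members])
            similarity = cosine_similarity(vector, center)
            if similarity >= best_similarity:
                best, best_similarity = members, similarity
        if best is None:
            clusters.append([index])
        else:
            best.append(index)
    return clusters


def theme_score(message_count, urgency_avg, growth_rate):
    """Ranking formula from the prompt."""
    return message_count * urgency_avg * growth_rate
```

A theme with 12 messages, an average urgency of 3.5, and a growth rate of 1.5 scores 63.0. Because the result of a greedy pass depends on message order, a batch job might prefer a standard clustering library instead.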

Before/After: Before: David and Jenna had a pitch deck, three notebooks of customer research, and a Figma prototype. No working software. Their previous attempt at building the MVP with traditional development (David coding the backend, contracting a frontend developer) had consumed six weeks and $12,000 in contractor fees with only the auth system and a basic dashboard to show for it.

After: A fully functional platform that could ingest from Zendesk, classify feedback, cluster themes, and generate weekly reports. Three beta customers were using it with real data. The YC demo showed live feedback flowing in and being categorized in real time.

Lessons Learned:

Outcome: Accepted into Y Combinator W26 batch. Raised a $500K pre-seed round before the batch started. Currently at $8,400 MRR with 14 paying companies. David estimates the vibe coding approach saved them three months and $40,000+ in development costs compared to traditional development, which directly extended their runway.


Project 4: karandev.co -- Developer Portfolio That Landed a Job

What it is: A personal developer portfolio site with interactive project showcases, a working blog with MDX support, an AI chatbot trained on the builder's resume and projects, and a live "what I'm working on" status pulled from GitHub and Spotify APIs.

Builder Profile: Karan Patel, 22. Recent computer science graduate from a state university. Solid fundamentals in Python and Java from coursework, but limited experience with modern web frameworks. Had applied to 47 junior developer positions with a plain HTML resume site. Zero callbacks.

Tools Stack:

Build Timeline: One full week of focused work during winter break. Approximately 40 hours total.

Key Prompts:

Prompt 1 -- Portfolio design direction:

Build a developer portfolio site that will make a hiring manager stop
scrolling. Next.js 14 App Router with Tailwind CSS.

Design: Dark theme. Subtle grain texture background. Smooth scroll.
Minimal but not boring. Accent color: electric blue (#3B82F6).
Typography: Inter for body, JetBrains Mono for code snippets.

Sections:
1. Hero: My name in large type. One-line tagline that rotates between
   3 phrases (typed animation effect). Small "scroll down" indicator.
2. About: 2-paragraph bio. Photo (circular, subtle border glow).
   Tech stack icons grid (React, Python, TypeScript, etc.) with
   hover tooltips.
3. Projects: 3-4 cards in a grid. Each card: screenshot, title,
   one-line description, tech tags, links to live demo + GitHub.
   Cards tilt slightly on hover (3D transform). Click to expand
   into full case study.
4. Blog: Latest 3 posts pulled from MDX files. Title, date, read time,
   excerpt. Link to full post.
5. Contact: Simple email form (Resend API). Social links row.

Page transitions: smooth with Framer Motion. Sections fade-in on scroll.
Performance: 95+ Lighthouse score. No layout shift.

Why it worked: The prompt read like a creative brief, not a feature list. Details like "grain texture background," "cards tilt slightly on hover," and "typed animation effect" gave the AI a clear visual direction to execute against. The Lighthouse score target acted as a quality gate.

Prompt 2 -- The resume chatbot:

Add an AI chatbot to the portfolio that answers questions about me.

It should be a small floating chat bubble in the bottom right corner.
When opened, it expands into a chat window. Powered by OpenAI GPT-4o-mini
via the Vercel AI SDK.

System prompt for the chatbot:
"You are a helpful assistant on Karan Patel's portfolio website.
You answer questions about Karan's skills, experience, projects,
and education based on the context provided. You are friendly,
concise, and professional. If asked something not covered in the
context, say you don't have that information and suggest emailing
Karan directly. Never make up information about Karan."

Context document (embed this in the system prompt):
[I will paste my resume and project descriptions here]

Features:
- Streaming responses (token by token appearance)
- Suggested starter questions: "What are Karan's top skills?",
  "Tell me about his projects", "What is his education background?"
- Rate limit: max 20 messages per session to control API costs
- Chat history persists in the browser session (sessionStorage)
- Mobile responsive: full-width chat panel on screens under 640px

Why it worked: Providing the exact system prompt within the development prompt eliminated a round of iteration. The rate limit and cost control details showed practical thinking that the AI translated directly into implementation.
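The 20-messages-per-session limit is the part of this prompt that protects the API bill, so it is worth seeing how little logic it needs. The Python sketch below shows only the counting; Karan's site used the Vercel AI SDK, and the class name `SessionRateLimiter` and the in-memory dictionary are hypothetical. The count lives on the server in this sketch because anything stored only in the browser can be cleared by the visitor.

```python
class SessionRateLimiter:
    """Allow at most `max_messages` chat messages per session id."""

    def __init__(self, max_messages=20):
        self.max_messages = max_messages
        self._counts = {}  # session id -> messages used so far

    def allow(self, session_id):
        """Record one message and return True, or return False if over the limit."""
        used = self._counts.get(session_id, 0)
        if used >= self.max_messages:
            return False
        self._counts[session_id] = used + 1
        return True

    def remaining(self, session_id):
        """Messages the session may still send."""
        return self.max_messages - self._counts.get(session_id, 0)
```

A handler would call `allow(session_id)` before forwarding each message to the model and show a "please email Karan" note once it returns False.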

Before/After: Before: A single-page HTML resume with a white background, Times New Roman font, and three bullet-pointed project descriptions. Karan described it as "what you'd get if you exported a Google Doc to HTML." Forty-seven applications sent. Zero interviews.

After: A polished portfolio with smooth animations, interactive project showcases, a working blog, and an AI chatbot that could answer recruiter questions about Karan's experience at any hour, even 2 AM. The chatbot alone generated over 600 conversations in the first month.

Lessons Learned:

Outcome: Karan posted the portfolio on r/webdev, Twitter, and LinkedIn. The Reddit post received 1,200 upvotes. The portfolio has had 14,000 unique visitors in three months. He received 11 interview requests in the first two weeks after launching. Accepted a junior full-stack developer role at a Series B startup in San Francisco. Starting salary: $135,000 -- $30,000 more than the median offer for new grads from his university. His manager later told him: "The portfolio showed us you could ship, not just code."


Project 5: Dungeon of Echoes -- A Game Built by a Teenager

What it is: A browser-based roguelike dungeon crawler with procedurally generated levels, pixel art aesthetics, turn-based combat, and a permadeath mechanic. Players descend through floors, collect loot, fight monsters, and try to reach floor 50. Leaderboard tracks the deepest floor reached.

Builder Profile: Aiden Nakamura, 16. High school junior in Portland, OR. Plays video games constantly. Had completed a Python basics course on Codecademy and built a few simple scripts. No web development or game development experience. Started this project during a snow day when school was cancelled.

Tools Stack:

Build Timeline: Two weeks of after-school sessions (2-3 hours each) plus two full weekend days. Total: approximately 35 hours.

Key Prompts:

Prompt 1 -- The game concept:

Build a roguelike dungeon crawler game in HTML5 Canvas and JavaScript.
No frameworks, just vanilla JS.

The player starts on floor 1 of a dungeon. Each floor is a grid of
rooms generated randomly. The player moves with arrow keys. Each room
can contain: nothing, a monster, a treasure chest, a health potion,
or stairs down to the next floor.

Combat is turn-based. Player and monster take turns attacking. Damage
is based on attack stat minus defense stat plus a random factor.
When a monster dies, it drops gold and maybe an item.

Items: sword (increase attack), shield (increase defense), potion
(restore health). Items have rarity levels: common (white), rare (blue),
epic (purple). Higher rarity = better stats.

Permadeath: when the player dies, the run is over. Show a death screen
with stats: floors cleared, monsters killed, gold collected, time played.

Visual style: 16x16 pixel art aesthetic using simple colored squares
and basic shapes. Dark background. The dungeon should feel gloomy.

Start with movement and room generation. Add combat second.
Add items third. Add the death screen last.

Why it worked: Breaking the build into a clear sequence (movement, then combat, then items, then death screen) matched how game development actually works -- you get the core loop right before adding layers. Aiden said the AI "built each layer perfectly because it always had the previous layer working first."
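The first two layers of that sequence, room generation and the damage rule, fit in a few lines. This is a sketch in Python and not Aiden's code (his game is vanilla JavaScript). The prompt specifies "attack minus defense plus a random factor" and the room contents; the random spread of 2, the minimum damage of 1, the single staircase per floor, and the empty starting room at the top-left are all assumptions added for illustration.

```python
import random

ROOM_TYPES = ["empty", "monster", "chest", "potion"]


def roll_damage(attack, defense, rng, spread=2, minimum=1):
    """Attack minus defense plus a random factor, never below `minimum`."""
    return max(minimum, attack - defense + rng.randint(-spread, spread))


def generate_floor(width, height, rng):
    """Return a height x width grid of room types with exactly one staircase.

    Needs at least two rooms so the stairs can sit away from the start.
    """
    grid = [[rng.choice(ROOM_TYPES) for _ in range(width)] for _ in range(height)]
    grid[0][0] = "empty"  # assumed player start
    candidates = [
        (row, col)
        for row in range(height)
        for col in range(width)
        if (row, col) != (0, 0)
    ]
    stairs_row, stairs_col = rng.choice(candidates)
    grid[stairs_row][stairs_col] = "stairs"
    return grid
```

Passing in a seeded `random.Random` makes a floor reproducible, which helps when debugging a run that crashed on a specific layout.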

Prompt 2 -- Making combat feel satisfying:

Combat feels boring. When I attack a monster or it attacks me,
nothing happens visually. Make it feel impactful:

1. Screen shake: brief shake (3 frames) when any attack lands
2. Damage numbers: float upward from the target and fade out, red for
   damage, green for healing
3. Flash effect: the hit target flashes white for 2 frames
4. Death animation: when a monster dies, it fades out and drops
   pixel particles downward
5. Sound: I know we can't do real sound easily, so fake it --
   flash the screen border red briefly on hit to give visual "impact"

Keep the turn-based system. These are just visual effects layered on
top of the existing combat logic. Do not change how damage calculation
works.

Why it worked: The constraint "do not change how damage calculation works" prevented the AI from rewriting the combat system while adding effects. Aiden had learned from an earlier mistake where asking for "better combat" caused the AI to replace his entire combat module.

Before/After: Before: Aiden had tried to build a game three times previously. Attempt one: followed a YouTube tutorial for a platformer in Unity, got stuck on collision detection, gave up after four hours. Attempt two: tried Godot, spent a weekend learning the editor, never got past the main menu. Attempt three: started a text adventure in Python, finished it, but wanted something visual.

After: A fully playable, visually polished (for a browser game) roguelike with 50 floors of content, seven monster types, fifteen items, a working leaderboard, and combat that "actually feels fun to play" according to the comments on his Reddit post.

Lessons Learned:

Outcome: Posted on r/roguelikes and r/IndieGaming. The Reddit post received 480 upvotes. The game has been played over 8,000 times. Aiden's computer science teacher gave him extra credit and invited him to present the project to the class. He is now building a multiplayer version and has started learning React "for real" because he wants to understand what the AI was generating. He says: "Vibe coding got me through the door. Now I actually want to learn what's behind the door."


Project 6: The Copper Pot -- E-Commerce Site for a Small Business

What it is: A full e-commerce storefront for an artisanal cookware shop in Asheville, NC. Features a product catalog with high-resolution image galleries, size/finish variants, a shopping cart with saved-cart recovery, Stripe checkout, order tracking, and an admin panel for inventory management.

Builder Profile: Linda Brennan, 52. Owner of The Copper Pot, a brick-and-mortar cookware shop she has run for 18 years. Zero programming experience. Previously paid a local agency $8,500 to build a Shopify store that she found difficult to update and expensive to maintain ($79/month for Shopify, plus an agency retainer for changes). Heard about vibe coding from her nephew, who is a software developer.

Tools Stack:

Build Timeline: Five days of working on it during slow hours at the shop, plus two evenings. Total: approximately 20 hours.

Key Prompts:

Prompt 1 -- The storefront:

Build an online store for my cookware shop called "The Copper Pot."

I sell high-end copper pots, pans, and kitchen tools. My customers
are home cooks aged 35-65 who appreciate craftsmanship. The feel
should be warm, artisanal, and trustworthy. Think: exposed brick,
natural tones, and beautiful product photography.

Pages:
1. Home: hero image with tagline "Handcrafted Copper Cookware Since
   2008", featured products grid (6 items), testimonial carousel,
   Instagram-style gallery of kitchen photos
2. Shop: filterable product grid. Filters: category (pots, pans,
   tools, sets), price range, material. Sort by price, newest,
   popularity.
3. Product detail: large image gallery (click to zoom), product
   description, size/finish selector, price, add to cart button,
   "You might also like" section with 3 related products.
4. Cart: line items with quantity adjustment, subtotal, shipping
   estimate, proceed to checkout.
5. About: our story, photo of the shop, craftsmanship values.
6. Contact: form + shop address + embedded Google Map.

Colors: warm cream background (#FDF8F0), copper accent (#B87333),
dark text (#2D2926). Font: serif headers (Playfair Display),
sans-serif body (Lato).

Mobile must be perfect. Most of my customers browse on their phones.

Why it worked: Linda described her customers and brand feeling, not technical specifications. The AI translated "warm, artisanal, and trustworthy" and "exposed brick, natural tones" into a design that Linda said "looks exactly like my shop feels." The color hex codes were her nephew's contribution -- he helped her pick colors that matched her physical store's palette.

Prompt 2 -- Admin inventory management:

Add an admin panel that only I can access (password protected).

I need to:
1. Add new products: name, description, price, category, images
   (upload multiple), sizes available, stock count for each size
2. Edit existing products: change any field, reorder images
3. Mark products as "sold out" (shows badge on storefront but
   keeps the page live) or "hidden" (removes from storefront)
4. View orders: list with date, customer name, items, total,
   status (paid / shipped / delivered). Click to see full details.
5. Update order status and add tracking number (customer gets
   an email when I mark it as shipped)
6. Simple dashboard: total revenue this month, number of orders,
   top selling products

Keep it simple. I am not technical. Big buttons, clear labels.
When I upload images, automatically resize them for the web
(I take photos on my phone and they are very large files).

Why it worked: "I am not technical. Big buttons, clear labels." This single line shaped the entire admin interface. The AI generated an admin panel with a significantly simpler layout than a typical CMS, with confirmations on every destructive action and undo options. The automatic image resizing solved a real problem -- Linda's phone photos were 4MB each.
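The automatic resizing step comes down to a small dimension calculation. The example below is a minimal Python sketch, not the code the AI generated for Linda (her store runs on a JavaScript stack). The function name `fit_within` and the 1600-pixel limit are assumptions chosen for illustration.

```python
def fit_within(width: int, height: int, max_side: int = 1600) -> tuple[int, int]:
    """Return a new (width, height) whose longest side is at most max_side.

    The aspect ratio is preserved and images that already fit are left
    unchanged. max_side=1600 is an illustrative default, not a value
    from the case study.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale small images
    scale = max_side / longest
    # max(1, ...) guards against a zero-pixel side on extremely thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))


# A typical 12-megapixel phone photo (4032 x 3024) shrinks to 1600 x 1200.
print(fit_within(4032, 3024))
```

A real upload handler would pass these dimensions to an image library and re-encode the file; the calculation itself stays the same.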

Before/After:

Before: A Shopify store that cost $8,500 to build and $79/month to maintain. Linda could not update product descriptions without emailing her agency and waiting 48 hours. Adding a new product incurred a $150-per-change agency fee. The site looked generic -- it used a standard Shopify theme shared by thousands of other stores.

After: A custom storefront that matches The Copper Pot's physical brand identity. Linda updates products herself through the admin panel. No monthly platform fees beyond Supabase ($25/month) and Vercel ($0 -- free tier). Stripe charges are 2.9% + $0.30 per transaction (same as Shopify).
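The recurring costs above reduce to simple arithmetic. As a rough illustration in Python, the sketch below computes the Stripe fee on one order and the fixed monthly platform cost, using only the rates quoted in this case study. The $250 order value and the function name `stripe_fee` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

STRIPE_PERCENT = Decimal("0.029")   # 2.9% per transaction (from the case study)
STRIPE_FIXED = Decimal("0.30")      # plus $0.30 per transaction
SUPABASE_MONTHLY = Decimal("25")    # Supabase plan
VERCEL_MONTHLY = Decimal("0")       # Vercel free tier


def stripe_fee(order_total: Decimal) -> Decimal:
    """Processing fee for a single order, rounded to the nearest cent."""
    fee = order_total * STRIPE_PERCENT + STRIPE_FIXED
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


order = Decimal("250.00")  # hypothetical order value
print(stripe_fee(order))                    # 7.55
print(SUPABASE_MONTHLY + VERCEL_MONTHLY)    # 25
```

Using `Decimal` instead of floats avoids rounding surprises when money amounts are added up over many orders.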

Lessons Learned:

Outcome: Online sales in the first three months: $23,400. Previous Shopify store's best three-month period: $9,100. The warm, custom design and improved product photography drove a 34% increase in conversion rate compared to the old Shopify store. Linda's monthly tech costs dropped from $79 (Shopify) + agency retainer to $25 (Supabase). She saved approximately $3,000 in the first year on platform and agency fees alone. Three other local shop owners have asked Linda to help them build similar stores.


Community Stats

Aggregated from 247 community submissions received between October 2025 and January 2026.

Submissions Overview

| Metric | Value |
| --- | --- |
| Total submissions received | 247 |
| Featured projects (all-time) | 38 |
| Countries represented | 23 |
| Youngest builder | 14 (high school student, built a study flashcard app) |
| Oldest builder | 67 (retired accountant, built a family recipe archive) |

Builder Background Distribution

| Background | Percentage |
| --- | --- |
| Professional developer | 41% |
| Student / recent graduate | 19% |
| Non-technical professional | 17% |
| Designer / creative | 11% |
| Founder / entrepreneur | 8% |
| Other (retired, career switcher, hobbyist) | 4% |

Most Popular Tools

| Rank | Tool | Usage Rate |
| --- | --- | --- |
| 1 | Cursor | 62% |
| 2 | Claude Code | 47% |
| 3 | Bolt.new | 34% |
| 4 | Lovable | 28% |
| 5 | v0 | 24% |
| 6 | Replit Agent | 19% |
| 7 | GitHub Copilot | 16% |
| 8 | Windsurf | 11% |

Note: Percentages sum to more than 100% because most projects used more than one tool.
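To see why usage rates can total more than 100%, consider how they are counted: each project can list several tools, and every tool's rate is computed against the number of projects, not the number of tool mentions. The Python sketch below uses four made-up projects, not the real submission data.

```python
from collections import Counter

# Hypothetical sample: each project lists every AI tool it used.
projects = [
    {"Cursor", "Claude Code"},
    {"Cursor"},
    {"Bolt.new", "Cursor", "v0"},
    {"Claude Code"},
]

# Count how many projects mention each tool.
mentions = Counter(tool for tools in projects for tool in tools)

# Usage rate = projects using the tool / total projects.
usage = {tool: round(100 * n / len(projects)) for tool, n in mentions.items()}

for tool, pct in sorted(usage.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{tool}: {pct}%")

print("sum of rates:", sum(usage.values()))
```

In this sample Cursor appears in three of four projects (75%), and the rates add up to 175% even though there are only four projects.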

Supporting Technology

| Category | Most Popular Choice |
| --- | --- |
| Framework | Next.js (58%) |
| Styling | Tailwind CSS (71%) |
| Database | Supabase (52%) |
| Hosting | Vercel (64%) |
| Payments | Stripe (89% of projects with payments) |
| Auth | Supabase Auth (44%) |

Build Time Distribution

| Time Range | Percentage |
| --- | --- |
| Under 4 hours | 12% |
| 4-12 hours | 27% |
| 12-24 hours (1-2 days) | 31% |
| 1-2 weeks | 22% |
| Over 2 weeks | 8% |

Average time from first prompt to deployment: 18.4 hours
Median time from first prompt to deployment: 14 hours
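The average sits above the median because a few long builds pull the mean upward while the median stays put. A minimal Python illustration with invented build times (not the community data):

```python
from statistics import mean, median

# Hypothetical build times in hours; one long-running project skews the mean.
build_hours = [3, 8, 12, 14, 16, 20, 60]

print(f"mean:   {mean(build_hours):.1f} hours")   # 19.0
print(f"median: {median(build_hours)} hours")     # 14
```

For planning purposes the median is usually the better guide to a "typical" build, since it is not distorted by outliers.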

Project Categories

| Category | Count | Percentage |
| --- | --- | --- |
| SaaS / web application | 72 | 29% |
| Internal / business tool | 48 | 19% |
| Portfolio / personal site | 37 | 15% |
| E-commerce | 29 | 12% |
| Game | 21 | 9% |
| Mobile app | 18 | 7% |
| Chrome extension | 12 | 5% |
| CLI tool / developer utility | 10 | 4% |
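The percentage column follows directly from the counts. As a quick check, the Python sketch below recomputes each share from the counts listed above and confirms that they add up to the 247 submissions.

```python
# Counts copied from the Project Categories table.
category_counts = {
    "SaaS / web application": 72,
    "Internal / business tool": 48,
    "Portfolio / personal site": 37,
    "E-commerce": 29,
    "Game": 21,
    "Mobile app": 18,
    "Chrome extension": 12,
    "CLI tool / developer utility": 10,
}

total = sum(category_counts.values())

# Share of all submissions, rounded to the nearest whole percent.
shares = {name: round(100 * count / total) for name, count in category_counts.items()}

print("total submissions:", total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```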

Outcome Metrics

| Metric | Value |
| --- | --- |
| Projects still actively maintained (after 3+ months) | 68% |
| Projects generating revenue | 31% |
| Average MRR for revenue-generating projects | $840 |
| Highest reported MRR | $12,400 |
| Builders who reported getting hired because of their project | 14 |
| Builders who transitioned to full-time on their project | 9 |

Success Patterns

From analyzing all 247 submissions, the projects most likely to succeed shared these characteristics:

  1. Specific problem, specific user. "A tool for landscaping dispatchers" beats "a project management app" every time.
  2. Prompt specificity. Builders who shared detailed, structured prompts (average 150+ words per prompt) had measurably better outcomes than those using short, vague prompts.
  3. Early deployment. Projects deployed within the first 25% of total build time had a 73% continuation rate. Projects that waited until "done" to deploy had a 41% continuation rate.
  4. Real users during build. 82% of revenue-generating projects had at least one real user testing before the builder considered it complete.
  5. Two tools, not five. The most successful builders typically used one primary AI coding tool and one supporting tool. Projects that used four or more AI tools had lower completion rates, likely due to context-switching overhead.

Monthly Spotlight

March 2026 Spotlight: FleetTrack

Category: B2B SaaS / Logistics
Builder: Raj Patel, 27, operations analyst at a logistics company
Tools: Claude Code (Opus 4.6), Next.js 16, Supabase, Mapbox, Vercel
Build time: 18 hours over one weekend

The Story: Raj managed a fleet of 40 delivery vehicles using spreadsheets and phone calls. He had never written production code before but had been following vibe coding tutorials on the EndOfCoding YouTube channel. When his manager complained about the lack of real-time visibility into delivery routes, Raj decided to build a solution himself.

His opening prompt to Claude Code:

Build a real-time fleet tracking dashboard with Next.js 16 and Supabase.

Core features:
1. Map view showing all active vehicles with live GPS positions
   (use Mapbox GL JS). Each vehicle is a colored dot -- green for
   on-schedule, yellow for delayed, red for stopped.
2. Sidebar with vehicle list, sortable by status, driver name, or
   ETA to next stop. Clicking a vehicle centers the map and shows
   route history for today.
3. Driver mobile view: a simple page where drivers tap "Arrived"
   at each stop. Auto-captures GPS coordinates. Works offline and
   syncs when back online.
4. Daily summary: auto-generated at 6 PM showing total deliveries,
   average time per stop, vehicles that went off-route, and fuel
   estimates based on distance traveled.

Auth via Supabase magic link. Role-based: admin sees everything,
drivers see only their own route. Use Supabase real-time subscriptions
for live vehicle position updates.

The dashboard must feel fast. Sub-200ms updates on the map.

Raj had a working prototype by Saturday night. By Sunday evening, he had added route optimization suggestions using a simple nearest-neighbor algorithm. He deployed to Vercel and showed it to his manager on Monday morning. Within two weeks, all 40 vehicles were being tracked in FleetTrack. The company cancelled its $800/month fleet management subscription.
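The nearest-neighbor heuristic is simple enough to sketch. The Python below is an illustrative version, not FleetTrack's code: it treats stops as flat (x, y) points with straight-line distance, whereas a real system would use road distances or at least great-circle distance on GPS coordinates. The function and variable names are assumptions.

```python
from math import dist

Point = tuple[float, float]


def nearest_neighbor_route(start: Point, stops: list[Point]) -> list[Point]:
    """Order stops by repeatedly visiting the closest unvisited one.

    This greedy heuristic is fast and easy to explain, but it does not
    guarantee the shortest possible route.
    """
    remaining = list(stops)      # copy so the caller's list is untouched
    route: list[Point] = []
    current = start
    while remaining:
        nxt = min(remaining, key=lambda p: dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route


depot = (0.0, 0.0)
stops = [(5.0, 5.0), (1.0, 0.0), (2.0, 2.0), (6.0, 5.0)]
print(nearest_neighbor_route(depot, stops))
# [(1.0, 0.0), (2.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
```

For a few stops per vehicle, this kind of suggestion is often good enough; larger routing problems usually call for a dedicated solver.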

Why we selected it: FleetTrack represents the next wave of vibe coding impact: non-developers building real B2B tools that replace expensive SaaS subscriptions. Raj's prompt combines strong domain expertise with specific technical requirements -- the sweet spot where vibe coding delivers maximum value. The offline-sync requirement for drivers shows thoughtful product thinking that an AI tool is unlikely to suggest unprompted.
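The offline-sync behavior in Raj's prompt boils down to a small pattern: record each "Arrived" tap locally, then upload the backlog once a connection is available. The Python sketch below models that pattern in memory. It is an illustration under stated assumptions, not FleetTrack's implementation, which would run in the browser and persist the queue across reloads. `Arrival`, `ArrivalQueue`, `record`, `flush`, and the `send` callback are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Arrival:
    stop_id: str
    lat: float
    lon: float
    timestamp: float  # seconds since epoch, captured at tap time


class ArrivalQueue:
    """Buffers arrivals while offline and uploads them in order later."""

    def __init__(self, send: Callable[[Arrival], bool]) -> None:
        self._send = send            # returns True when the upload succeeded
        self._pending: deque[Arrival] = deque()

    def record(self, arrival: Arrival) -> None:
        """Always store locally first so a tap is never lost."""
        self._pending.append(arrival)

    def flush(self) -> int:
        """Upload queued arrivals oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break                # still offline; keep the rest queued
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)


# Simulated connection: uploads fail until `online` is switched on.
online = False
uploaded: list[Arrival] = []


def fake_send(arrival: Arrival) -> bool:
    if not online:
        return False
    uploaded.append(arrival)
    return True


queue = ArrivalQueue(fake_send)
queue.record(Arrival("stop-1", 40.71, -74.00, 1_700_000_000.0))
queue.record(Arrival("stop-2", 40.73, -73.99, 1_700_000_600.0))

print(queue.flush(), len(queue))   # offline: 0 sent, 2 still queued
online = True
print(queue.flush(), len(queue))   # back online: 2 sent, 0 queued
```

Removing an item only after its upload succeeds means a dropped connection mid-sync leaves the remaining arrivals queued for the next attempt.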


Previous: February 2026 Spotlight: QuietPage

Category: Productivity tool
Builder: Sana Mirza, 31, UX designer at a remote-first company
Tools: Cursor, Next.js, Supabase, Vercel
Build time: 11 hours over three evenings

The Story: Sana was frustrated by every writing app she tried. Google Docs felt corporate. Notion was too feature-heavy. iA Writer was beautiful but did not sync across devices. She wanted a writing tool that was quiet, distraction-free, synced to the cloud, and had exactly one feature beyond basic text editing: a daily word count streak tracker.

Sana opened Cursor on a Tuesday evening with this prompt:

Build a minimal writing app. I mean truly minimal.

One page. No sidebar. No toolbar. No menus visible by default.
Just a white page with a blinking cursor. The user types.

Auto-save to Supabase every 30 seconds and on every pause longer
than 2 seconds. Show a subtle "saved" indicator that fades in and
out -- bottom right corner, small gray text, disappears after 1 second.

One feature: daily word count streak. If the user writes at least
200 words today, the streak continues. Show the streak as a small
flame icon with a number in the top right corner. That is the only
UI element visible while writing.

Keyboard shortcuts (show on hover over a small "?" icon, bottom left):
- Cmd+B: bold
- Cmd+I: italic
- Cmd+Shift+H: toggle heading
- Cmd+/: toggle dark mode

No sign-up wall. Auth via magic link only. No password to remember.

If the writing app does not feel calm, it has failed.

The result was a writing app that four of Sana's coworkers started using within a week. She posted it on Hacker News with the title "I built the quietest writing app on the internet." It hit the front page. Within a month, QuietPage had 2,800 registered users and Sana was considering adding a $5/month premium tier for features like version history and export to PDF.

Why we selected it: QuietPage demonstrates that vibe coding is not just for building complex systems. Sometimes the hardest product decision is what to leave out. Sana's prompt is a masterclass in constraint-driven design, and the result is a product people genuinely prefer over established alternatives -- not because it does more, but because it does less, better.
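The streak rule in Sana's prompt (at least 200 words in a day keeps the streak alive) can be expressed in a few lines. The Python sketch below is one possible reading of that rule, not QuietPage's code. In particular, it assumes that a day still in progress does not break the streak, and the function name and data shape are hypothetical.

```python
from datetime import date, timedelta

DAILY_GOAL = 200  # words per day, from the prompt


def current_streak(words_by_day: dict[date, int], today: date) -> int:
    """Count consecutive days ending today that met the daily goal.

    If today's goal is not met yet, counting starts from yesterday so an
    unfinished day does not reset the streak (an assumption of this sketch).
    """
    day = today
    if words_by_day.get(day, 0) < DAILY_GOAL:
        day -= timedelta(days=1)
    streak = 0
    while words_by_day.get(day, 0) >= DAILY_GOAL:
        streak += 1
        day -= timedelta(days=1)
    return streak


log = {
    date(2026, 2, 1): 250,
    date(2026, 2, 2): 90,    # missed the goal: streak resets
    date(2026, 2, 3): 410,
    date(2026, 2, 4): 200,   # exactly 200 words counts
    date(2026, 2, 5): 35,    # today, still writing
}
print(current_streak(log, today=date(2026, 2, 5)))  # 2
```

A production version would also need to decide which time zone defines "today", a detail the prompt leaves open.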


Have a project that should be featured in next month's spotlight? Submit it using the template above.


Explore Further


This chapter is updated monthly with new featured projects and refreshed community stats. Last updated: March 2026.

★ What Level Are You?

Updated March 6, 2026

Answer 6 questions to discover your vibe coding level.

★ Glossary

Updated March 6, 2026
Vibe Coding
AI-assisted development where the developer describes intent in natural language and evaluates output through execution, not code review.
Accept All
The practice of accepting all AI-generated code changes without reviewing diffs.
Coding Agent
An autonomous AI system that can plan, implement, test, and deploy code changes independently.
Composer
A mode in AI IDEs (like Cursor) that generates multi-file code from natural language descriptions.
Error-Driven Development
Debugging by copy-pasting error messages to the AI rather than reading and understanding the code yourself.
MCP (Model Context Protocol)
Anthropic's open protocol allowing AI assistants to connect to external tools and data sources.
Prompt Engineering
The skill of crafting effective natural language instructions to produce desired AI outputs.
Vibe Coding Hangover
The phenomenon of teams struggling to maintain, extend, or debug AI-generated codebases. Documented by Fast Company in September 2025.
Zombie App
An application that is functional but unmaintainable because nobody understands the AI-generated code.
Complexity Ceiling
The point at which a vibe-coded application can no longer be extended because the underlying code is too tangled.
Hybrid Workforce
An organization where AI agents work alongside human engineers, as pioneered by Goldman Sachs with Devin.
The 80/20 Rule
Vibe code the 80% (UI, boilerplate, standard patterns). Engineer the 20% (auth, security, business logic).
Agent Teams
A feature in Claude Code (introduced with Opus 4.6) allowing multiple AI agents to work in parallel on different aspects of a project, coordinating autonomously.
Agent Mode
A capability in coding tools (GitHub Copilot, Cursor, etc.) where the AI autonomously identifies subtasks, makes multi-file edits, runs tests, and fixes errors without step-by-step human guidance.
Devin Wiki / Devin Search
Cognition's documentation generation and code search tools built into the Devin platform, enabling AI-generated documentation and natural language querying of codebases.
Multimodal Coding
An emerging trend combining voice, visual, and text-based inputs for AI code generation, including screenshot-to-code and voice-to-code workflows.

★ Resources

Updated March 6, 2026

Tools to Try

What's New

Updated April 1, 2026

Every update to this ebook is tracked here. Subscribers get monthly updates with new content, revised chapters, and fresh prompts.


April 2026

April 9, 2026

April 1, 2026

March 2026

March 25, 2026

March 7, 2026

March 6, 2026

March 1, 2026

February 2026

February 25, 2026