Copy & deploy · 12 prompts

12 Claude prompts for engineering.

These 12 prompts cover the work a software team actually ships: reviewing diffs, debugging errors, writing tests, planning refactors and upgrades, and documenting decisions. Each one is annotated, uses [BRACKET] tokens for your code and context, and is ready to paste into Claude or run in Claude Code. Bring the repo, keep your judgment on the output, and never merge anything you have not read.
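
If you reuse these prompts from a script rather than pasting them by hand, the [BRACKET] tokens can be filled programmatically. The sketch below is one way to do that, not part of the prompts themselves: `fill_prompt` and the `TOKEN` pattern are hypothetical names, and the pattern assumes tokens are written in capitals, as in [PASTE DIFF]. It raises on any token left unfilled so a half-completed prompt never gets sent.

```python
import re

# Assumption: tokens are all-caps text in square brackets, e.g. [PASTE DIFF].
TOKEN = re.compile(r"\[([A-Z][A-Z0-9 /,.\-]*)\]")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace each [TOKEN] with its value; raise if any token is left unfilled."""
    missing = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            missing.append(name)
            return match.group(0)  # leave the token in place so it is easy to spot
        return values[name]

    filled = TOKEN.sub(substitute, template)
    if missing:
        raise KeyError(f"Unfilled tokens: {sorted(set(missing))}")
    return filled


template = "Diff:\n[PASTE DIFF]\n\nIntent: [DESCRIBE INTENT]"
prompt = fill_prompt(
    template,
    {"PASTE DIFF": "- total = a\n+ total = a + b", "DESCRIBE INTENT": "include b in the total"},
)
```

Tokens such as [OPTIONAL] are treated like any other here, so pass an empty string when you have nothing to add.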

REVIEW AND DEBUG

Code review, debugging, and tests

Review a diff or PR for bugs and clarity

Use when: a pull request is open and you want a thorough second read before approving

You are a senior engineer reviewing a pull request. Be rigorous and specific, not polite.

Context:
- What this change is supposed to do: [DESCRIBE INTENT]
- Language / framework: [e.g. TypeScript, React]
- Anything I am unsure about: [OPTIONAL]

Diff:
[PASTE DIFF OR FILES]

Review it:
1. Correctness: find actual bugs, off-by-one errors, null and undefined handling, race conditions, and unhandled error paths. Quote the exact line.
2. Edge cases the change does not cover, with a concrete input that would break it.
3. Clarity and naming: where the intent is hard to follow and how to make it obvious.
4. Anything that contradicts the stated intent above.
5. A short ranked list: must-fix before merge, should-fix, and nits.

Do not invent issues to seem thorough. If a section is clean, say so. Show me the line, the problem, and the fix.

Why it works: stating the intent and asking for line-level quotes turns a vague "looks good" into specific, checkable findings ranked by what actually blocks the merge.

Debug an error with a ranked hypothesis list

Use when: something is failing and you want a structured root-cause path instead of guesswork

Act as a debugging partner. I have an error and I want hypotheses ranked by likelihood, not a single guess.

Error / stack trace:
[PASTE ERROR]

Relevant code:
[PASTE THE FUNCTION OR FILES INVOLVED]

What I have already tried: [OPTIONAL]
Environment: [language, runtime version, OS, framework]

Do this:
1. Restate what the error is actually saying in plain English.
2. List the 5 most likely root causes, ranked, each with the reasoning and a fast way to confirm or rule it out.
3. Tell me the single cheapest check that would eliminate the most hypotheses first.
4. For the top hypothesis, give the specific fix and how to verify it worked.
5. Flag anything in the code that would cause a similar failure elsewhere.

Be honest about uncertainty. If the trace is insufficient, tell me exactly what to log or capture next.

Why it works: forcing a ranked hypothesis list with a cheapest-first check mirrors how strong engineers actually debug, and stops the model from committing to one wrong theory.

Generate unit and integration tests for a function

Use when: a function has thin coverage and you want a real test suite, not three happy-path cases

Write tests for the function below. I want coverage of the cases that actually break things.

Function:
[PASTE FUNCTION]

Test framework: [e.g. Jest, pytest, Go testing]
Conventions to follow: [naming, mocking style, fixtures, OPTIONAL]

Produce:
1. A short list of the behaviors and edge cases worth testing, including empty inputs, boundaries, error paths, and any concurrency or ordering concerns.
2. Unit tests covering each, with clear arrange-act-assert structure and descriptive names.
3. At least one integration-style test that exercises this function with its real collaborators where that is meaningful, mocking only true external boundaries.
4. Call out any behavior that is hard to test because of how the code is structured, and the smallest refactor that would fix it.

Do not test trivial getters for the sake of a coverage number. Each test should be able to fail for a real reason.

Why it works: asking for the behavior list first, plus honesty about untestable structure, produces tests that catch regressions instead of inflating a coverage metric.
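
As a rough illustration of what the prompt asks for, here is a small arrange-act-assert suite using Python's standard `unittest` module. `parse_port` is a hypothetical function invented for this example; the point is that each test targets a boundary or error path and can fail for a real reason.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical function under test: parse a TCP port number from text."""
    port = int(value.strip())  # raises ValueError on empty or non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTests(unittest.TestCase):
    def test_accepts_upper_boundary(self):
        # Arrange
        raw = "65535"
        # Act
        port = parse_port(raw)
        # Assert
        self.assertEqual(port, 65535)

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(parse_port(" 8080\n"), 8080)

    def test_rejects_zero(self):
        with self.assertRaises(ValueError):
            parse_port("0")

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            parse_port("")
```

Saved to a file, this runs with `python -m unittest`. Swap in your own framework and conventions; the structure carries over.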

REFACTOR AND ARCHITECTURE

Refactors, decision records, and specs

Plan a refactor you can land safely

Use when: a module needs restructuring and you want a staged plan, not a risky big-bang rewrite

Help me plan a safe refactor. The goal is to improve the code without changing behavior, in steps I can ship and verify.

Current code:
[PASTE THE MODULE OR DESCRIBE IT]

What I want to improve: [e.g. testability, coupling, naming, performance]
Constraints: [public API I cannot break, deadlines, test coverage I have today]

Produce:
1. The target shape and why it is better, in a few sentences.
2. A sequence of small, independently shippable steps, each one behavior-preserving, with the safety check after each (tests, a characterization test, a feature flag).
3. The riskiest step and how to de-risk it.
4. What to write a characterization test around before touching anything, if coverage is thin.
5. A rollback plan if a step goes wrong in production.

Prefer many small reversible steps over one large change. If a true rewrite is genuinely safer than incremental work, say so and justify it.

Why it works: demanding independently shippable, behavior-preserving steps plus a rollback plan keeps the refactor reversible and reviewable instead of a long-lived risky branch.
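
A characterization test, mentioned in the prompt, pins what the code does today, quirks included, so each refactor step can be checked against it. The sketch below shows the idea under stated assumptions: `legacy_format_name`, `RECORDED`, and `check_characterization` are hypothetical names, and the recorded outputs stand in for values you would capture by running your real code.

```python
def legacy_format_name(first: str, last: str) -> str:
    """Hypothetical legacy code with a quirk: an empty first name leaves a leading space."""
    return (first + " " + last).upper().rstrip()


# Outputs recorded by running the code as it exists today, quirks included.
RECORDED = {
    ("ada", "lovelace"): "ADA LOVELACE",
    ("ada", ""): "ADA",
    ("", "lovelace"): " LOVELACE",  # current behavior, so it is pinned too
}


def check_characterization(func) -> list[str]:
    """Describe every recorded input whose output no longer matches."""
    failures = []
    for args, expected in RECORDED.items():
        actual = func(*args)
        if actual != expected:
            failures.append(f"{args!r}: expected {expected!r}, got {actual!r}")
    return failures


def refactored_format_name(first: str, last: str) -> str:
    """A behavior-preserving rewrite: same output, clearer expression."""
    return f"{first} {last}".upper().rstrip()


def tidied_format_name(first: str, last: str) -> str:
    """A 'cleanup' that silently changes behavior by dropping the leading space."""
    return " ".join(part for part in (first, last) if part).upper()
```

Here `check_characterization(refactored_format_name)` returns no failures, while `tidied_format_name` is flagged for the empty-first-name case. Whether that quirk should be fixed is a separate, deliberate change, not something to slip into a refactor.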

Draft an Architecture Decision Record

Use when: you made or are about to make a real architectural choice and want it documented for the next engineer

Draft an Architecture Decision Record (ADR) for a decision my team is making.

Decision: [WHAT WE ARE DECIDING, e.g. message queue choice, auth approach]
Context and forces: [the problem, constraints, scale, team skills, deadlines]
Options considered: [OPTION A, OPTION B, OPTION C with what you know about each]
Leaning toward: [OPTIONAL]

Write the ADR with these sections:
1. Title and status (proposed / accepted).
2. Context: the problem and the forces at play, neutrally stated.
3. Options: each one with honest pros, cons, and the cost or risk it carries.
4. Decision: what we chose and the specific reasoning that made it win.
5. Consequences: what becomes easier, what becomes harder, and what we are now committed to.
6. What would make us revisit this later.

Be even-handed about the options we did not pick; an ADR that strawmans the alternatives is useless to the next reader. Keep it tight.

Why it works: the fixed ADR sections and the explicit "do not strawman the alternatives" rule capture the reasoning future engineers need, not just the conclusion.

Draft a technical spec or RFC

Use when: a feature is big enough that the team should align on the design before anyone writes code

Help me write a technical spec / RFC for a feature so the team can review the design before implementation.

Feature: [WHAT WE ARE BUILDING AND WHY]
Users and use cases: [WHO, AND THE JOBS THEY NEED DONE]
Known constraints: [systems it must integrate with, scale, latency, compliance]
Open questions I have: [OPTIONAL]

Produce a spec with:
1. Problem statement and goals, plus explicit non-goals.
2. Proposed design: the main components, data flow, and key interfaces or schemas.
3. Alternatives considered and why the proposal wins.
4. Data model and API surface changes, with backward-compatibility notes.
5. Failure modes, rollout plan, and how we measure success.
6. Open questions and the decisions reviewers need to make.

Write it for reviewers who will push back. Surface the weak points yourself rather than hiding them; the goal is a better design, not approval.

Why it works: non-goals, named failure modes, and self-surfaced weak points make the RFC a real design review instead of a pitch, which is what gets useful pushback early.

DOCS AND COMMUNICATION

API docs, PR descriptions, and explanation

Write or improve API documentation

Use when: an endpoint or library exists but the docs are missing, stale, or unusable

You are writing developer-facing documentation for the API below. Write for an engineer who has never seen it and needs to integrate today.

Code / endpoint definition:
[PASTE THE HANDLER, FUNCTION SIGNATURES, OR SCHEMA]

Existing docs, if any: [PASTE OR SAY NONE]

Produce:
1. A one-line summary of what this does and when to use it.
2. Each parameter or field: name, type, required or optional, constraints, and a real example value.
3. A complete request and response example with realistic data, including an error response.
4. Edge cases and gotchas a caller will hit (pagination, rate limits, auth, idempotency).
5. A copy-paste "quickstart" call that works end to end.

Document only what the code actually does; if behavior is ambiguous from the code, mark it as "verify" rather than guessing. Do not describe parameters that do not exist.

Why it works: anchoring to the actual signature and flagging ambiguous behavior as "verify" prevents the confident, wrong API docs that erode trust faster than no docs at all.

Write a clear PR description from a diff

Use when: the code is ready but the description is "wip" and your reviewer deserves better

Write a pull request description from this diff so a reviewer can understand it quickly.

Diff:
[PASTE DIFF]

Linked issue or context: [OPTIONAL]

Produce:
1. A one-sentence summary of what changed and why.
2. A short "what and why" section: the problem, the approach, and any decision a reviewer might question.
3. A bullet list of the notable changes by area, so the reviewer knows where to focus.
4. How it was tested, and anything the reviewer should verify manually.
5. Risk and rollout notes: migrations, feature flags, config, or anything that needs care on deploy.

Be accurate to the diff; do not claim changes that are not there. Keep it skimmable. Flag the parts of the diff that most deserve careful review.

Why it works: pointing the reviewer at the parts that deserve scrutiny, grounded strictly in the diff, gets you faster and better reviews than a wall of generated prose.

Explain unfamiliar code

Use when: you inherited a file or are reading a dependency and need to understand it fast

Explain this code to me like I am a competent engineer who is new to this specific codebase. I want understanding, not a line-by-line paraphrase.

Code:
[PASTE THE FILE OR FUNCTION]

Anything I already know about the context: [OPTIONAL]

Do this:
1. State what this code is responsible for and where it likely sits in the larger system.
2. Walk through the control flow at the level of intent: what happens, in what order, and why.
3. Call out the non-obvious parts: clever bits, implicit assumptions, side effects, and anything that would surprise a reader.
4. Flag likely bugs, smells, or dead code you notice while reading.
5. List the questions I should ask the original author or go verify, and what I would test to confirm my understanding.

Be clear about what you are inferring versus what the code states. If something is genuinely unclear, say so instead of inventing a rationale.

Why it works: asking for intent and explicit assumptions, with inference separated from fact, builds a real mental model and surfaces the questions worth taking back to the author.

OPERATE AND MAINTAIN

Incidents, upgrades, and code comments

Write a blameless incident retrospective

Use when: an incident is resolved and you need a postmortem that drives fixes, not finger-pointing

Help me write a blameless incident retrospective. The goal is to learn and prevent recurrence, not to assign blame.

What happened (paste timeline, alerts, what we did):
[PASTE INCIDENT NOTES AND TIMELINE]

Impact: [who and what was affected, duration, severity]
Root cause as I understand it: [OPTIONAL]

Produce:
1. A clear summary: what happened, the impact, and the timeline in plain language.
2. Root cause and contributing factors, focused on systems and conditions rather than individuals. Use "the system allowed" language, not "the person failed."
3. What went well in the response, honestly.
4. Action items to prevent recurrence and to detect or mitigate faster next time, each with an owner placeholder and a rough priority.
5. The deeper "why" behind the proximate cause, so we fix the class of problem, not just this instance.

Keep it blameless and specific. Do not name individuals as causes. If the notes are thin on a phase, flag the gap rather than filling it with assumptions.

Why it works: the "system allowed, not person failed" framing plus a push to the deeper why produces action items that fix the class of failure, which is the whole point of a postmortem.

Plan a dependency or framework upgrade

Use when: a major version bump or framework migration is looming and you want a plan before you touch it

Help me plan an upgrade safely. I want the breaking changes mapped and a staged path before I start.

Upgrade: [FROM VERSION] to [TO VERSION] of [DEPENDENCY / FRAMEWORK]
How we use it: [the main features, patterns, and integration points in our code]
Our test coverage and CI: [what safety net I have today]

Do this:
1. Summarize the breaking changes and notable deprecations between these versions that are likely to affect how we use it.
2. Map each breaking change to where it probably touches our usage, and rank by risk and effort.
3. Lay out a staged upgrade path: what to do first, what can be done incrementally, and where a hard cutover is unavoidable.
4. Tell me what to put under test or behind a flag before starting, and how to verify after each stage.
5. Give a rollback plan and the signals that should make me roll back.

Base the breaking-change list on what is well established for these versions; where you are not sure a change applies, say so and tell me to check the changelog. Do not assume APIs that may have moved.

Why it works: mapping breaking changes to your actual usage and staging the cutover turns a scary migration into ranked, verifiable steps, with an honest nudge to confirm specifics against the changelog.

Generate docstrings and meaningful comments

Use when: working code needs documentation that explains the why, not comments that restate the code

Add docstrings and comments to this code. I want documentation that helps a future reader, not noise that restates the obvious.

Code:
[PASTE CODE]

Language and docstring style: [e.g. Python with Google-style, JSDoc, Go doc comments]

Do this:
1. Write a docstring for each public function, class, or module: what it does, parameters, return value, raised errors or exceptions, and any important side effects.
2. Add inline comments only where intent is non-obvious: why this approach, why this edge case is handled, why a tempting simpler version would be wrong.
3. Do not add comments that merely repeat the code (no "increment i by one").
4. Flag any place where the code is confusing enough that a comment is a band-aid and a rename or small refactor would be the real fix.

Keep the code itself unchanged except for comments and docstrings. Match the existing style. Where behavior is ambiguous, mark it "verify" in the docstring rather than asserting something the code may not do.

Why it works: documenting the why and explicitly banning code-restating comments yields docstrings that pay off on the tenth read, while flagging where a rename beats a comment.
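
For a sense of the target output, here is a hypothetical function documented in that spirit, using Google-style docstrings. `retry_delay` and its parameters are invented for illustration. The docstring covers the contract, and the single inline comment explains why a tempting simpler version would be wrong rather than restating the code.

```python
def retry_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Return the wait in seconds before retry number `attempt`.

    Args:
        attempt: Zero-based retry count; 0 is the first retry.
        base: Delay for the first retry, in seconds.
        cap: Upper bound on the returned delay, in seconds.

    Returns:
        The delay, doubling with each attempt and never exceeding `cap`.

    Raises:
        ValueError: If `attempt` is negative.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent before multiplying: 2 ** attempt is an unbounded int
    # in Python, and multiplying a float by a huge one raises OverflowError.
    exponent = min(attempt, 32)
    return min(cap, base * (2 ** exponent))
```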

FAQ

Frequently asked questions

How do I handle proprietary or confidential code with Claude?

On Claude Team or Enterprise, your prompts and uploads are not used to train Anthropic's models and stay inside your workspace, which makes those plans appropriate for most internal source code. Set a clear line anyway. Never paste live secrets, customer data, private keys, or credentials, even on a Team plan, and strip them from any file before it goes in. For code under strict regulatory or contractual confidentiality, get sign-off from security first and consider whether a self-hosted or VPC deployment through a cloud provider is required rather than the public app. A simple tiering rule works: ordinary application code is fine in Claude directly, while anything covered by a specific NDA, export control, or data-residency clause goes through review before it leaves your environment.

How carefully should I review AI-suggested code before merging?

Never ship Claude's output unreviewed. Treat every suggestion the way you would a pull request from a fast but unfamiliar contributor: read it line by line, run it, and run the test suite. Claude is strong at structure, edge cases, and boilerplate, but it can introduce subtle logic errors, call APIs that do not exist, or miss a constraint that lives in your codebase rather than the prompt. The reviewer of record is the human who approves the merge, not the model. Use the prompts here to draft and to widen your thinking, then apply the same standard you would to any other change before it lands on a branch that ships.

Which Claude model should engineering teams use for coding?

For most day-to-day coding, the current Sonnet model is the right default. It handles multi-file context, code review, test generation, and refactor planning at a speed and price that let you keep it in the loop all day. Reach for the larger Opus model on harder problems: a gnarly debugging session across many files, a large architectural change, or a migration where reasoning depth matters more than latency. Keep Haiku for high-volume narrow tasks like generating docstrings or summarizing a diff. The prompts on this page work across all three, but Sonnet usually hits the best balance for engineering work.

How does Claude fit into my IDE and CLI workflow?

You have three main entry points. The web app and desktop app are best for planning, writing specs and ADRs, and reasoning over pasted snippets. Claude Code, the official command-line tool, runs in your terminal, reads your repository, and can edit files and run commands, which makes it the right home for the review, refactor, test, and debugging prompts here. There are also editor integrations and the API for building Claude into your own tooling. A practical setup: use Claude Code for anything that touches the working tree, and keep a Claude Project in the app loaded with your architecture docs and conventions for the writing and planning prompts.

When should I not use Claude for an engineering task?

Skip it when the cost of a subtle wrong answer is high and verification is hard. Cryptographic implementations, anything safety-critical, low-level concurrency, and exact financial or tax calculations are places to lean on established libraries and human experts rather than generated code. Do not use it as a system of record or let it make irreversible changes without a human gate. And do not use it to launder a decision you have not actually made: if you cannot evaluate the output, you are not ready to ship it. Claude is a force multiplier on work you can review, not a replacement for understanding the system.

What security and secret-scanning hygiene should I keep?

Build a habit that runs before any code goes into a prompt and before any generated code gets committed. Keep secrets out of prompts entirely: scrub API keys, tokens, connection strings, and customer data from files first, and never paste a .env. On the way back out, run a secret scanner and your normal SAST and dependency checks on generated code before it merges, since a model can echo a credential it saw or suggest a dependency with a known vulnerability. Treat any third-party package Claude proposes as unverified until you have checked it exists, is maintained, and is the one you meant. The goal is simple: the same guardrails you already trust for human-written code, applied to AI-assisted code with no exceptions.
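
As a minimal sketch of the "scrub first" habit, the snippet below redacts a few common secret shapes from text before it goes into a prompt. The patterns and the `redact` helper are illustrative assumptions, not a complete rule set, and they are no substitute for a dedicated secret scanner in CI.

```python
import re

# Illustrative patterns only; a dedicated secret scanner covers far more formats.
# Each entry: (name, compiled pattern, replacement).
PATTERNS = [
    ("aws-access-key-id", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
    (
        "private-key-block",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
        "[REDACTED]",
    ),
    (
        # Keeps the key name and separator, replaces only the value.
        "credential-assignment",
        re.compile(r"(?i)(api[_-]?key|secret|token|password)(\s*[:=]\s*)(\S+)"),
        r"\1\2[REDACTED]",
    ),
]


def redact(text: str) -> tuple[str, list[str]]:
    """Return the redacted text plus the names of the patterns that matched."""
    fired = []
    for name, pattern, replacement in PATTERNS:
        text, count = pattern.subn(replacement, text)
        if count:
            fired.append(name)
    return text, fired
```

Treat a non-empty list of matched pattern names as a signal to stop and look at the file, not as proof that everything sensitive was caught.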

Setup

Making these prompts compound

Run the review, debug, refactor, and test prompts in Claude Code so the model can read your repository, edit files, and run your test suite in the loop. Keep the spec, ADR, and documentation prompts in a Claude Project loaded with your architecture docs, coding conventions, and past decisions. With that context, the prompts above produce far better output than they do in a blank chat.

Want this wired into how your team actually builds? Start with an AI Audit to find the highest-leverage workflows, then have Treetop stand them up.

Wire Claude into how your team ships code

Treetop turns prompts like these into a configured Claude setup with your conventions, architecture docs, and review standards built in. Take the assessment, start with an AI Audit, or book a call to map the highest-leverage workflows for your engineering team.
