The companion to "You're Not Leveraging AI. You're Leveraging a Fast Typewriter" and "The PM Skills Map, Redrawn for AI." Those pieces are about the theory: what AI changes about PM leverage and PM skills. This one is about the practice: which parts of your actual work to hand to AI, and which to protect.

---

"Use AI for your PM work" is like saying "use a calculator for your math work" without specifying whether you're doing arithmetic or proving theorems.

Most AI-for-PMs advice fails because it treats the PM job as one thing. It is not one thing. It is six things arranged in three layers, and the AI decision is different for each.

The governing layer: Orchestration (when to deploy what). The work layer: Generation, Context Management, Evaluation, and Influence. The learning layer: Calibration (updating your models based on outcomes).

Orchestration is the real-time governor: what should I do right now? Calibration is the longitudinal governor: how should I update my approach over time? The four work modes sit between them, each with a mechanical component AI can handle and a judgment component it cannot.

The useful question isn't "should I use AI for this?" It's "which part of this should I hand to AI, and which part am I protecting?"

---

The Governing Layer: Orchestration

Hand to AI: The raw signals that inform orchestration decisions. A deadline approaching. A dependency that shifted. A stakeholder who hasn't been updated in two weeks. AI is good at surfacing inputs you'd otherwise miss.

Protect: The decisions themselves.

AI has no model for organizational timing. It cannot sense that this quarter's political dynamics make this the wrong week to escalate, or that the head of engineering is still recovering from a weekend production outage and will read your market analysis as pressure rather than insight. The orchestration inputs are offloadable. The orchestration judgment, when to deploy what, given everything you know about the humans involved, is the thing that separates the PM who generates useful artifacts from the PM whose useful artifacts land badly.

If the first two posts in this series established that judgment is the game, orchestration is where that game is won or lost in real time.

---

The Work Layer

Generation

Hand to AI: First drafts, option enumeration, format translation, blank canvas problems. AI gets you from zero to a starting point fast. This is where most PMs are already operating, and correctly.

Protect: What to generate. How to frame it. The creative leap that changes the problem rather than solving the stated one.

The diagnostic I keep coming back to: if you and your competitor both prompt AI with the same context, does the output look meaningfully different? If not, the generation is commodity. Your job is the non-consensus insight AI doesn't produce, the reframe, the conviction the data doesn't yet support, the creative leap that changes which problem you're solving. AI produces the median. Your job is the outlier.

---

Context Management

This is the mode nobody talks about, and it may be where AI creates the most legitimate leverage.

Context management is assembling, maintaining, and distributing the information substrate your team runs on. Connecting what engineering knows with what sales knows with what the customer said last Thursday. Most frameworks treat this as overhead. It is a skill, and a consequential one.

Hand to AI: Storage and retrieval. Cross-referencing. Summarization. Context assembly, pulling together the relevant information for a specific decision or meeting. AI is genuinely excellent at this, and freeing yourself from the mechanical burden of context assembly is high-leverage time recovery.

Protect: What context matters. The smell test. And above all: sensing what's missing from the information landscape that should be there.

The smell test is the instinct that makes you ask "has anyone actually talked to the customer about this?" when nothing in the data suggests you should. It's the pattern-match against absence, and it's trained by having been wrong about what context mattered enough times that your internal model now includes what's typically missing.

AI can tell you what's in the data. It cannot tell you what's not in the data that should be. The PM who notices that nobody has talked to the enterprise segment about the pricing change, not because the data shows a gap, but because their mental model of the customer landscape includes a segment the data doesn't, is doing something AI cannot do.

There's a second dimension here that's even less offloadable: the relationship-building that makes information flow to you in the first place. The reason the sales lead texts you about a competitor move before it shows up in the CRM. The reason the support engineer flags a pattern they haven't filed a ticket for yet. Context management at its best is a human network effect. AI can process the context that arrives. It cannot build the relationships that determine which context arrives.

---

Evaluation

Hand to AI: Consistency checking. Structured comparison against known criteria. First-pass data interpretation. Red-teaming your own thinking. These are legitimate uses and I'd push PMs to do more of them.

Protect: Taste. Choosing what criteria to evaluate against in the first place. And any evaluation that involves people or organizations, where the relevant signals are political, relational, and largely undocumented.

Here's the risk with evaluation that I haven't seen anyone articulate clearly, and it's the reason I want to spend time on this mode specifically.

When you outsource generation to AI, the cost is visible. You see the output. You can tell when it's mediocre. You edit it. The feedback loop is intact.

When you outsource evaluation to AI, the cost is invisible. You don't know what judgment you failed to develop because you never did the evaluation yourself. AI gives you confident, well-structured evaluation that pattern-matches to what good evaluation looks like. It evaluates against consensus. But the skill that actually matters, identifying quality before consensus confirms it, doesn't develop through reading AI assessments. It develops through the struggle of making judgment calls yourself, getting some wrong, and building the internal model that distinguishes your taste from anyone else's.

The generation outsourcing problem is obvious. The evaluation outsourcing problem is silent. That asymmetry is why this mode deserves more caution than most PMs give it.

---

Influence

Hand to AI: Message drafting and refinement. Argument stress-testing. Tracking influence logistics: who needs alignment, their last known position, what commitments have been made. This is legitimate mechanical support for a fundamentally human activity.

Protect: Reading the room. Choosing who to influence, in what order, with what framing. Modeling how a specific person thinks given their history, incentives, and current political position. The courage to say the hard thing when saying the easy thing would be more comfortable.

AI can help you find the right words. It cannot tell you who needs to hear them, or when. At senior levels, influence is not a component of the job. It is the job. The part AI handles, message construction, was never the bottleneck. The bottleneck was always the judgment about what to say to whom and when. AI makes that bottleneck more visible, not less.

---

The Learning Layer: Calibration

Hand to AI: Prediction logging. Outcome tracking. Pattern surfacing.

This is where AI as calibration infrastructure gets genuinely interesting. Imagine a system that records your predictions ("this feature will move retention by 3 points"), matches them against outcomes ("it moved by 0.5"), and surfaces patterns over time: "You've overestimated retention impact in seven of your last ten launches, and the common factor is you overweight feedback from power users."

That infrastructure doesn't exist for most PMs today. AI can build it. And it's high-leverage because calibration is the mechanism by which all five modes above improve over time.

Protect: The actual model update.

When AI shows you the pattern ("you overestimate retention; you overweight power users"), the question is why. Is it because you're talking to the wrong customers? Because you're anchoring on what you want to be true? Because your mental model of the product includes assumptions from two years ago that are no longer valid? Diagnosing why your judgment was off, and genuinely changing how you see, is work AI can prompt but cannot perform. The pattern is the input. The update is yours.

---

The Decision That Matters

For any PM task, the question is concrete:

Which mode am I in right now? What's the mechanical component I can hand to AI? What's the judgment component I need to protect?

There's a career dimension to the protection decision that nobody discusses. When every PM can generate polished artifacts in minutes, the PM who visibly does the judgment work, who builds the context map by hand, who makes evaluation calls and tracks them against outcomes, who reads the room rather than drafting the perfect memo, is sending an expensive signal about the quality of their thinking. The hand-built judgment is valuable precisely because a cheap alternative exists. Protecting the judgment layer isn't just good for your capability development. It's how you differentiate yourself when everyone else's mechanical output is indistinguishable.

The meta-lesson across all six modes is this: the mechanical components were never the hard part of the PM job. They were the visible part. AI strips them away and reveals what was always underneath: judgment, taste, timing, courage, and the relationships that make information and influence flow. Those were the job all along. Now there's nowhere to hide.