AI rollout playbook: month-by-month for a 10-50 person team
A four-month plan from "we should do AI" to "AI is running in production." Specific milestones each month. What to commit to. What to kill.
- A four-month plan for a 10-50 person team going from "we should do AI" to "AI is in production."
- Month 1: pick exactly one workflow. Month 2: ship MVP. Month 3: measure + iterate. Month 4: expand or kill.
- The most common failure mode is doing three things at once. Resist it.
- Templates and checklists you can copy.
Every team we talk to has the same instinct: "let's try AI across customer support, sales, and finance at the same time." Three months later they have three half-built AI deployments, none in production. Below is the playbook that works — a focused four-month rhythm starting with exactly one workflow.
Month 1 — Pick one workflow + design the AI
Week 1 — Survey the team
- Ask every team lead: "What repetitive task eats five-plus hours of your week?"
- Compile a list of 15-20 candidate workflows.
- For each, note the team size, the hours per week, and the rough cost of an error.
Week 2 — Pick one
Pick based on this scorecard. Score each candidate 1-5:
- Time eaten — high score = more hours per week
- Reversibility — high score = mistakes are easy to fix
- Boundedness — high score = task fits in a clear input/output shape
- Owner motivation — high score = team lead actively wants this
- Data availability — high score = past examples + clear templates exist
Total each. Pick the highest. If two tie, pick the one with the most motivated owner — that person will carry the rollout.
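The scorecard and tie-break above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Candidate` class, the `pick_workflow` function, and the field names are hypothetical labels for the five criteria.

```python
from dataclasses import dataclass
from typing import List

# The five scorecard criteria, each scored 1-5.
CRITERIA = (
    "time_eaten",
    "reversibility",
    "boundedness",
    "owner_motivation",
    "data_availability",
)


@dataclass
class Candidate:
    name: str
    time_eaten: int
    reversibility: int
    boundedness: int
    owner_motivation: int
    data_availability: int

    def total(self) -> int:
        """Sum the five scores, rejecting anything outside 1-5."""
        scores = [getattr(self, c) for c in CRITERIA]
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"{self.name}: every score must be between 1 and 5")
        return sum(scores)


def pick_workflow(candidates: List[Candidate]) -> Candidate:
    """Highest total wins; a tie goes to the more motivated owner."""
    if not candidates:
        raise ValueError("no candidates to score")
    return max(candidates, key=lambda c: (c.total(), c.owner_motivation))
```

Sorting on the pair `(total, owner_motivation)` keeps the tie-break rule in one place, so nobody has to re-argue it once the scores are in.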
Week 3-4 — Design the AI
- Define the input shape (what does the AI see?)
- Define the output shape (what does it produce?)
- Define the human approval point (who signs off, on what fraction of outputs?)
- Define the success metric (one number — see our AI ROI post)
- Pick build vs buy: vertical product, custom build, or off-the-shelf tool. See our framework.
- Measure the baseline for the success metric — two weeks of pre-AI data.
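One way to keep the six design decisions honest is to write them down as a single record and check what is still blank at the end of week 4. The sketch below assumes a hypothetical `WorkflowDesign` record and `open_items` helper; the field names are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional

# The three build-vs-buy options named in the checklist.
SOURCING_OPTIONS = ("vertical product", "custom build", "off-the-shelf tool")


@dataclass
class WorkflowDesign:
    input_shape: str = ""                      # what the AI sees
    output_shape: str = ""                     # what it produces
    approver: str = ""                         # who signs off
    approval_fraction: Optional[float] = None  # share of outputs reviewed, 0-1
    success_metric: str = ""                   # the one number
    sourcing: str = ""                         # one of SOURCING_OPTIONS
    baseline: Optional[float] = None           # from two weeks of pre-AI data


def open_items(d: WorkflowDesign) -> List[str]:
    """Return the design decisions that are still undefined."""
    items = []
    if not d.input_shape.strip():
        items.append("input shape")
    if not d.output_shape.strip():
        items.append("output shape")
    fraction_ok = d.approval_fraction is not None and 0 < d.approval_fraction <= 1
    if not d.approver.strip() or not fraction_ok:
        items.append("human approval point")
    if not d.success_metric.strip():
        items.append("success metric")
    if d.sourcing not in SOURCING_OPTIONS:
        items.append("build vs buy")
    if d.baseline is None:
        items.append("baseline")
    return items
```

An empty list from `open_items` is a reasonable gate for starting the Month 2 build.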
Month 2 — Ship the MVP
Week 5-6 — Build (or configure)
Build the simplest thing that could work. Resist scope creep. Three rules:
- Single workflow. The MVP solves one task end-to-end. Not five tasks partially.
- Human approval gate. Every AI output goes through human approval in the MVP. Loosen this only after week 8 when quality is measured.
- Observability from day one. Every AI action logged with the 8 minimum fields (see our production AI properties).
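The approval gate and the logging rule can live in the same wrapper. The sketch below is an assumption-heavy illustration: `run_with_approval`, `draft_fn`, and `approve_fn` are hypothetical names, and the logged fields are placeholders, not the eight minimum fields from the referenced post.

```python
import json
import time
import uuid
from typing import Callable, List, Optional


def run_with_approval(
    task_input: str,
    draft_fn: Callable[[str], str],          # hypothetical: calls the AI
    approve_fn: Callable[[str, str], bool],  # hypothetical: the human decision
    log: List[str],
) -> Optional[str]:
    """Draft with the AI, ask a human, log the action, release only if approved."""
    started = time.time()
    draft = draft_fn(task_input)
    approved = approve_fn(task_input, draft)
    # Placeholder fields; substitute your own minimum field set.
    log.append(json.dumps({
        "id": str(uuid.uuid4()),
        "started_at": started,
        "input": task_input,
        "output": draft,
        "approved": approved,
        "elapsed_s": round(time.time() - started, 3),  # draft plus review time
    }))
    return draft if approved else None
```

Because every call appends a log line whether or not the draft is approved, the rejected outputs are still available for the Month 3 failure-mode review.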
Week 7 — Internal pilot
- One team member uses the AI on real work for one week.
- Collect feedback daily — what's wrong, what's surprising, what's slow.
- Fix the worst three issues. Ship a v0.2.
Week 8 — Soft launch to the team
- The full target team starts using the AI.
- Daily 15-minute standup for two weeks specifically about the AI.
- Track adoption rate — what % of the team is using it daily.
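Adoption rate here is simply the share of the target team that used the AI on a given day. A minimal sketch, assuming you can export usage as `(date, user)` pairs; `daily_adoption` is a hypothetical helper name.

```python
from datetime import date
from typing import Dict, Iterable, Set, Tuple


def daily_adoption(
    usage_events: Iterable[Tuple[date, str]],
    team: Set[str],
) -> Dict[date, float]:
    """Percentage of the target team that used the AI on each day with usage."""
    if not team:
        raise ValueError("team must not be empty")
    users_by_day: Dict[date, Set[str]] = {}
    for day, user in usage_events:
        if user in team:  # ignore people outside the target team
            users_by_day.setdefault(day, set()).add(user)
    return {
        day: 100.0 * len(users) / len(team)
        for day, users in sorted(users_by_day.items())
    }
```

Days with no usage at all do not appear in the result, so treat a missing day as 0% when charting.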
Month 3 — Measure + iterate
Week 9-12 — Run the experiment
Continue using the AI. Do not change anything major. The point is to see real production behaviour without confounders.
- Weekly: review the outcome metric vs baseline
- Weekly: review adoption rate
- Weekly: review the human approval queue — what % of AI outputs are rejected?
- Weekly: review failure modes — which inputs does the AI handle badly?
Allow at most three patches during this period, and only for clear quality problems, not feature requests. Resist scope creep.
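The weekly review of outcome, rejections, and failure modes can be reduced to one small summary. This sketch assumes reviewers tag each rejection with a short reason; `Review`, `failure_tag`, and `weekly_summary` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Review:
    approved: bool
    failure_tag: Optional[str] = None  # reviewer's short reason for a rejection


def weekly_summary(reviews: List[Review], metric_value: float, baseline: float) -> Dict:
    """Summarise one week: metric vs baseline, rejection rate, top failure modes."""
    total = len(reviews)
    rejected = [r for r in reviews if not r.approved]
    tags = Counter(r.failure_tag for r in rejected if r.failure_tag)
    return {
        "metric_delta": metric_value - baseline,  # this week minus baseline
        "rejection_rate_pct": 100.0 * len(rejected) / total if total else 0.0,
        "top_failure_modes": tags.most_common(3),
    }
```

Keeping the same four numbers every week makes it easier to separate a real quality problem from a feature request.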
Month 4 — Decide
Week 13 — Run the numbers
With 6 weeks of post-launch data, compute the ROI:
- Outcome metric delta (post-launch average minus baseline)
- Convert to value, using whichever fits the metric: hours saved × hourly cost, extra units × revenue per unit, or errors avoided × cost per error
- Subtract AI cost + internal time
- Annualise
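The four steps above amount to a short calculation. The sketch below assumes the outcome metric is measured per week and that a single `value_per_unit` converts it to money; the function name and parameters are hypothetical.

```python
def annualised_roi(
    baseline: float,                 # pre-AI weekly average of the metric
    post_launch: float,              # post-launch weekly average
    value_per_unit: float,           # e.g. hourly cost, revenue per unit, cost per error
    ai_cost_per_week: float,
    internal_hours_per_week: float,  # time your team spends running the AI
    internal_hourly_cost: float,
    lower_is_better: bool = True,    # True for hours or errors, False for units sold
    weeks_per_year: int = 52,
) -> float:
    """Net annual value: metric delta converted to money, minus costs, annualised."""
    delta = post_launch - baseline
    improvement = -delta if lower_is_better else delta
    weekly_value = improvement * value_per_unit
    weekly_cost = ai_cost_per_week + internal_hours_per_week * internal_hourly_cost
    return (weekly_value - weekly_cost) * weeks_per_year
```

The `lower_is_better` flag is there because "post-launch minus baseline" is negative when hours or errors fall, and that drop is the improvement being valued.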
Week 14 — The three-option decision
Based on the ROI and adoption data, pick one of three paths:
- Expand — ROI is positive, adoption is high. Roll the AI out to adjacent teams, OR pick a second workflow with the same playbook.
- Tune — ROI is positive but adoption is low, OR adoption is high but ROI is borderline. Spend month 5 on adoption + quality. Re-measure.
- Kill — ROI is negative after an honest attempt, OR the team genuinely does not want it. Shut it down. Free the team. Try a different workflow.
Killing is the underrated option. A failed pilot that gets shut down cleanly is better than a half-alive deployment that drags on for a year.
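Writing the three-option rule down before week 14 makes it harder to rationalise a weak result. A minimal sketch follows; the `decide` function and both thresholds (`high_adoption_pct`, `borderline_band`) are assumptions for illustration, and you should set your own before looking at the data.

```python
def decide(
    annual_roi: float,
    adoption_pct: float,
    team_wants_it: bool = True,
    high_adoption_pct: float = 70.0,  # assumed cut-off for "adoption is high"
    borderline_band: float = 5000.0,  # assumed: |ROI| at or below this is "borderline"
) -> str:
    """Return 'expand', 'tune', or 'kill' from ROI and adoption."""
    if not team_wants_it:
        return "kill"
    high_adoption = adoption_pct >= high_adoption_pct
    borderline = abs(annual_roi) <= borderline_band
    if high_adoption and borderline:
        return "tune"  # adoption is high but ROI is borderline
    if annual_roi > 0:
        return "expand" if high_adoption else "tune"
    return "kill"      # ROI is negative after an honest attempt
```

Checking `team_wants_it` first reflects the rule that a team that genuinely does not want the AI is a kill regardless of the numbers.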
Common failure modes
"Let's start with three workflows at once"
The most common failure mode. Three half-built deployments are worse than one production-quality one. Pick one.
"We will figure out approval later"
Translation: nobody owns the AI's output quality. Define the human approval point in week 4. No exceptions for the MVP.
"We will measure when we have time"
Translation: we will never measure. Set the baseline in week 4. Set up the dashboard in week 5. Both are non-negotiable.
"The AI works; the team is just resistant"
Maybe. More likely: the AI does not fit the actual workflow as well as the demo suggested. Spend a day shadowing the team. Watch where it breaks.
"Let's launch big"
Internal pilot in week 7. Team soft launch in week 8. Don't go company-wide until at least week 12.
What this looks like at a 25-person company
A 25-person services company we have talked with followed this exact shape:
- Month 1 — surveyed 4 team leads. Picked client-update messaging (used to take 4 hours/week per project manager × 6 PMs = 24 hours/week).
- Month 2 — shipped an AI drafter integrated with their project tool. Internal pilot in week 7. Team launch in week 8.
- Month 3 — measured: hours dropped from 24 to ~8. Adoption ~85%. Approval rate ~75% (PMs edit before sending).
- Month 4 — decision: expand to a second workflow (proposal drafts). Same playbook, same owner.
This is the rhythm. It is not flashy. It works.
What this means for you
- Block 4 months on the calendar. Resist the urge to do three workflows at once.
- The hardest week is week 2 — picking. Use the 5-criterion scorecard, score honestly, pick the highest.
- Week 4 is when most teams fail to set up measurement. Do it.
- Read the readiness checklist before week 1 to confirm you are ready.
- Read the ROI post before week 4 to pick your metric.
Want a second pair of eyes on the workflow you are picking, or the rollout plan? Book a 30-minute call. We will walk through it with you.