wits
    Use Cases · May 25, 2026 · 12 min read

    AI rollout playbook: month-by-month for a 10-50 person team

    A four-month plan from "we should do AI" to "AI is running in production." Specific milestones each month. What to commit to. What to kill.

    TL;DR
    • A four-month plan for a 10-50 person team going from "we should do AI" to "AI is in production."
    • Month 1: pick exactly one workflow. Month 2: ship MVP. Month 3: measure + iterate. Month 4: expand or kill.
    • The most common failure mode is doing three things at once. Resist it.
    • Templates and checklists you can copy.
    Quick answer
    How do I roll out AI in my company?
    Pick one repeated, expensive workflow. Spend month one designing the AI for that single workflow. Ship an MVP in month two with a clear human-in-the-loop and observability. Measure for six weeks. Decide in month four whether to expand the AI to a second workflow or kill it. The mistake is starting three workflows at once.

    Every team we talk to has the same instinct: "let's try AI across customer support, sales, and finance at the same time." Three months later they have three half-built AI deployments, none in production. Below is the playbook that works — a focused four-month rhythm starting with exactly one workflow.

    Month 1 — Pick one workflow + design the AI

    Week 1 — Survey the team

    • Ask every team lead: "What repetitive task eats five-plus hours of your week?"
    • Compile a list of 15-20 candidate workflows.
    • For each, note the team size, the hours per week, and the rough cost of an error.

    Week 2 — Pick one

    Pick based on this scorecard. Score each candidate 1-5:

    • Time eaten — high score = more hours per week
    • Reversibility — high score = mistakes are easy to fix
    • Boundedness — high score = task fits in a clear input/output shape
    • Owner motivation — high score = team lead actively wants this
    • Data availability — high score = past examples + clear templates exist

    Total each. Pick the highest. If two tie, pick the one with the most motivated owner — that person will carry the rollout.
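
    The scorecard and its tie-break rule can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Candidate` class, the `pick_workflow` function, and the three example workflows with their scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate workflow, scored 1-5 on each scorecard criterion."""
    name: str
    time_eaten: int
    reversibility: int
    boundedness: int
    owner_motivation: int
    data_availability: int

    def total(self) -> int:
        scores = (self.time_eaten, self.reversibility, self.boundedness,
                  self.owner_motivation, self.data_availability)
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"{self.name}: every score must be between 1 and 5")
        return sum(scores)


def pick_workflow(candidates: list[Candidate]) -> Candidate:
    """Highest total wins; owner motivation breaks ties."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: (c.total(), c.owner_motivation))


# Hypothetical scores, for illustration only.
candidates = [
    Candidate("client updates", 5, 4, 4, 5, 4),    # total 22
    Candidate("invoice matching", 5, 3, 5, 4, 5),  # total 22
    Candidate("lead research", 3, 5, 3, 3, 3),     # total 17
]
print(pick_workflow(candidates).name)  # client updates
```

    The first two candidates tie at 22, so the more motivated owner decides it.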

    Week 3-4 — Design the AI

    • Define the input shape (what does the AI see?)
    • Define the output shape (what does it produce?)
    • Define the human approval point (who signs off, on what fraction of outputs?)
    • Define the success metric (one number — see our AI ROI post)
    • Pick build vs buy: vertical product, custom build, or off-the-shelf tool. See our framework.
    • Measure the baseline for the success metric — two weeks of pre-AI data.
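
    One way to keep the week 3-4 decisions honest is to write them down as a single record and check which are still open. The sketch below assumes a hypothetical `WorkflowDesign` record whose field names simply mirror the bullets above; the example values are invented.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class WorkflowDesign:
    """One field per design decision; None means 'not decided yet'."""
    input_shape: Optional[str] = None
    output_shape: Optional[str] = None
    approval_point: Optional[str] = None
    success_metric: Optional[str] = None
    build_or_buy: Optional[str] = None
    baseline_value: Optional[float] = None


def missing_decisions(design: WorkflowDesign) -> list[str]:
    """Return the names of the decisions that are still open."""
    return [f.name for f in fields(design) if getattr(design, f.name) is None]


# Invented example values.
design = WorkflowDesign(
    input_shape="project notes plus the last client message",
    output_shape="draft client update, under 200 words",
    approval_point="PM approves every draft before sending",
    success_metric="hours per week spent on client updates",
)
print(missing_decisions(design))  # ['build_or_buy', 'baseline_value']
```

    An empty list at the end of week 4 means the design is complete enough to start building.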

    Month 2 — Ship the MVP

    Week 5-6 — Build (or configure)

    Build the simplest thing that could work. Resist scope creep. Three rules:

    • Single workflow. The MVP solves one task end-to-end. Not five tasks partially.
    • Human approval gate. Every AI output goes through human approval in the MVP. Loosen this only after the week 8 launch, once quality has actually been measured.
    • Observability from day one. Every AI action logged with the 8 minimum fields (see our production AI properties).
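
    The approval gate and the logging rule can live in the same code path, so that nothing is sent unlogged or unapproved. The sketch below is one possible shape, under stated assumptions: `run_with_approval` and `ActionLog` are hypothetical names, the logged fields are illustrative placeholders (not the eight fields from the post referenced above), and the model and the reviewer are replaced by stub functions.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass
from typing import Callable, Optional


@dataclass
class ActionLog:
    # Illustrative fields only; substitute the fields your own logging standard requires.
    action_id: str
    timestamp: float
    workflow: str
    input_text: str
    output_text: str
    reviewer: str
    approved: bool


def run_with_approval(
    workflow: str,
    input_text: str,
    generate: Callable[[str], str],
    review: Callable[[str], bool],
    reviewer: str,
    log_sink: list[str],
) -> Optional[str]:
    """Generate a draft, ask a human to approve it, and log the action either way."""
    output_text = generate(input_text)
    approved = review(output_text)
    record = ActionLog(
        action_id=str(uuid.uuid4()),
        timestamp=time.time(),
        workflow=workflow,
        input_text=input_text,
        output_text=output_text,
        reviewer=reviewer,
        approved=approved,
    )
    log_sink.append(json.dumps(asdict(record)))
    # Nothing leaves the gate without approval.
    return output_text if approved else None


# Stub model and stub reviewer, standing in for the real AI call and the real human.
log: list[str] = []
sent = run_with_approval(
    workflow="client-updates",
    input_text="Sprint 4 finished; two tickets slipped.",
    generate=lambda text: f"Draft update: {text}",
    review=lambda draft: len(draft) < 200,
    reviewer="pm-1",
    log_sink=log,
)
print(sent)
print(len(log))
```

    Rejected drafts are logged too, which is what makes the rejection rate measurable later.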

    Week 7 — Internal pilot

    • One team member uses the AI on real work for one week.
    • Collect feedback daily — what's wrong, what's surprising, what's slow.
    • Fix the worst three issues. Ship a v0.2.

    Week 8 — Soft launch to the team

    • The full target team starts using the AI.
    • Daily 15-minute standup for two weeks specifically about the AI.
    • Track adoption rate — what % of the team is using it daily.
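
    Adoption rate here is simple arithmetic: distinct team members who used the AI on a given day, divided by team size. A minimal sketch, assuming usage is available as hypothetical `(user, date)` pairs exported from whatever tool logs usage:

```python
from datetime import date


def daily_adoption_rate(team: set[str], usage_events: list[tuple[str, date]], day: date) -> float:
    """Percentage of the target team that used the AI at least once on `day`."""
    if not team:
        raise ValueError("team must not be empty")
    # Count each person once, and ignore users outside the target team.
    active = {user for user, used_on in usage_events if used_on == day and user in team}
    return 100.0 * len(active) / len(team)


# Invented example data.
team = {"ana", "ben", "chloe", "dev"}
events = [
    ("ana", date(2026, 6, 1)),
    ("ana", date(2026, 6, 1)),    # repeat use by the same person counts once
    ("ben", date(2026, 6, 1)),
    ("chloe", date(2026, 6, 1)),
    ("zoe", date(2026, 6, 1)),    # not on the target team, ignored
    ("dev", date(2026, 6, 2)),    # different day
]
print(daily_adoption_rate(team, events, date(2026, 6, 1)))  # 75.0
```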

    Month 3 — Measure + iterate

    Week 9-12 — Run the experiment

    Continue using the AI. Do not change anything major. The point is to see real production behaviour without confounders.

    • Weekly: review the outcome metric vs baseline
    • Weekly: review adoption rate
    • Weekly: review the human approval queue — what % of AI outputs are rejected?
    • Weekly: review failure modes — which inputs does the AI handle badly?
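
    The approval-queue and failure-mode reviews can share one weekly summary. The sketch below assumes each queue item is a dict with an `approved` flag and, for rejections, a free-text `reject_reason`; both field names and the sample data are hypothetical.

```python
from collections import Counter


def weekly_review(queue: list[dict]) -> tuple[float, list[tuple[str, int]]]:
    """Return the rejection rate (%) and the three most common rejection reasons."""
    if not queue:
        raise ValueError("no outputs were reviewed this week")
    rejected = [item for item in queue if not item["approved"]]
    rejection_rate = 100.0 * len(rejected) / len(queue)
    reasons = Counter(item.get("reject_reason", "unspecified") for item in rejected)
    return rejection_rate, reasons.most_common(3)


# Invented week: 12 reviewed outputs, 3 rejected.
queue = [{"approved": True} for _ in range(9)] + [
    {"approved": False, "reject_reason": "wrong tone"},
    {"approved": False, "reject_reason": "wrong tone"},
    {"approved": False, "reject_reason": "missing data"},
]
rate, top_reasons = weekly_review(queue)
print(rate)         # 25.0
print(top_reasons)  # [('wrong tone', 2), ('missing data', 1)]
```

    The top reasons are the natural candidates for the few patches allowed in this period.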

    Allow at most three patches during this period, and only for clear quality problems, not feature requests. Resist scope creep.

    Month 4 — Decide

    Week 13 — Run the numbers

    With 6 weeks of post-launch data, compute the ROI:

    1. Outcome metric delta (post-launch average minus baseline)
    2. Convert to value, using whichever fits the metric: hours saved × hourly cost, extra units × revenue per unit, or errors avoided × cost per error
    3. Subtract AI cost + internal time
    4. Annualise
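
    For an hours-based metric, the four steps reduce to a short calculation. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the hourly cost, AI cost, internal time, and 48 working weeks per year are invented figures. Only the 24-to-8 hours change echoes the example later in this post.

```python
def annualised_roi(
    baseline_hours_per_week: float,
    post_launch_hours_per_week: float,
    hourly_cost: float,
    ai_cost_per_week: float,
    internal_hours_per_week: float,
    weeks_per_year: int = 48,  # assumption: working weeks per year
) -> float:
    """Net annual value of the AI for a time-saving metric."""
    # Step 1: outcome metric delta (post-launch average minus baseline).
    delta_hours = post_launch_hours_per_week - baseline_hours_per_week
    # Step 2: convert to value. A negative delta means hours saved.
    weekly_value = -delta_hours * hourly_cost
    # Step 3: subtract the AI cost and the internal time spent running it.
    weekly_cost = ai_cost_per_week + internal_hours_per_week * hourly_cost
    # Step 4: annualise.
    return (weekly_value - weekly_cost) * weeks_per_year


# Invented cost figures; hours go from 24 to 8 per week.
roi = annualised_roi(
    baseline_hours_per_week=24.0,
    post_launch_hours_per_week=8.0,
    hourly_cost=60.0,
    ai_cost_per_week=150.0,
    internal_hours_per_week=2.0,
)
print(roi)  # 33120.0
```

    With these numbers the weekly value is 16 × 60 = 960, the weekly cost is 150 + 2 × 60 = 270, and the net 690 per week annualises to 33,120.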

    Week 14 — The three-option decision

    Based on the ROI and adoption data, pick one of three paths:

    • Expand — ROI is positive, adoption is high. Roll the AI out to adjacent teams, OR pick a second workflow with the same playbook.
    • Tune — ROI is positive but adoption is low, OR adoption is high but ROI is borderline. Spend month 5 on adoption + quality. Re-measure.
    • Kill — ROI is negative after an honest attempt, OR the team genuinely does not want it. Shut it down. Free the team. Try a different workflow.
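
    The three paths can be written as a small decision function, which forces the team to agree on thresholds before looking at the numbers. In the sketch below, the `decide` function and both thresholds (what counts as "borderline" ROI and "high" adoption) are assumptions to replace with your own.

```python
from enum import Enum


class Decision(Enum):
    EXPAND = "expand"
    TUNE = "tune"
    KILL = "kill"


def decide(
    annual_roi: float,
    adoption_pct: float,
    team_wants_it: bool,
    borderline_roi: float = 5000.0,    # assumption: below this, ROI counts as borderline
    high_adoption_pct: float = 70.0,   # assumption: at or above this, adoption counts as high
) -> Decision:
    """Map ROI and adoption data onto the expand / tune / kill decision."""
    if not team_wants_it or annual_roi < 0:
        return Decision.KILL
    high_adoption = adoption_pct >= high_adoption_pct
    borderline = annual_roi < borderline_roi
    if high_adoption and not borderline:
        return Decision.EXPAND
    # Positive ROI with low adoption, or high adoption with borderline ROI.
    return Decision.TUNE


print(decide(33120.0, 85.0, True))   # Decision.EXPAND
print(decide(33120.0, 40.0, True))   # Decision.TUNE
print(decide(-2000.0, 90.0, True))   # Decision.KILL
```

    Fixing the thresholds in week 4, alongside the success metric, makes it harder to rationalise a weak result in week 14.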

    Killing is the underrated option. A failed pilot that gets shut down cleanly is better than a half-alive deployment that drags on for a year.

    Common failure modes

    "Let's start with three workflows at once"

    The most common failure mode. Three half-built deployments are worse than one production-quality one. Pick one.

    "We will figure out approval later"

    Translation: nobody owns the AI's output quality. Define the human approval point in week 4. No exceptions for the MVP.

    "We will measure when we have time"

    Translation: we will never measure. Set the baseline in week 4. Set up the dashboard in week 5. Both are non-negotiable.

    "The AI works; the team is just resistant"

    Maybe. More likely: the AI does not fit the actual workflow as well as the demo suggested. Spend a day shadowing the team. Watch where it breaks.

    "Let's launch big"

    Internal pilot in week 7. Team soft launch in week 8. Don't go company-wide until at least week 12.

    What this looks like at a 25-person company

    A 25-person services company we have talked with followed exactly this shape:

    • Month 1 — surveyed 4 team leads. Picked client-update messaging (used to take 4 hours/week per project manager × 6 PMs = 24 hours/week).
    • Month 2 — shipped an AI drafter integrated with their project tool. Internal pilot in week 7. Team launch in week 8.
    • Month 3 — measured: hours dropped from 24 to ~8. Adoption ~85%. Approval rate ~75% (PMs edit before sending).
    • Month 4 — decision: expand to a second workflow (proposal drafts). Same playbook, same owner.

    This is the rhythm. It is not flashy. It works.

    What this means for you

    • Block 4 months on the calendar. Resist the urge to do three workflows at once.
    • The hardest week is week 2 — picking. Use the 5-criterion scorecard, score honestly, pick the highest.
    • Week 4 is when most teams fail to set up measurement. Do it.
    • Read the readiness checklist before week 1 to confirm you are ready.
    • Read the ROI post before week 4 to pick your metric.

    Want a second pair of eyes on the workflow you are picking, or the rollout plan? Book a 30-minute call. We will walk through it with you.
