Assessment deliverable
AI Ways of Working
Assessment Report
Confidential — for internal use only
Executive summary
This report presents the findings of a three-week assessment of product and technology ways of working at Acme Digital. The engagement mapped the end-to-end delivery flow, identified bottlenecks and root causes, and produced a maturity baseline plus a pilot plan ready to start in week 4. The assessment was interview- and artefact-based; no audit was performed.
Key takeaway: AI adoption is near zero and delivery practices are inconsistent in ways that block effective use of AI (no shared definitions, no agent-ready workflows). Release is gated by manual ops. Recommendations are prioritised into quick wins, medium-term AI initiatives, and strategic bets, with a crawl—walk—run AI implementation approach and a clear pilot roadmap.
Approach
The assessment ran over three weeks, aligned to the standard assessment plan. Key people across product, design, engineering, and leadership were interviewed — not to audit, but to understand how work actually flows, where handoffs happen, and where context is lost.
- Week 1 — Operating model mapping: role-by-role interviews across your teams and leaders; mapping of the actual delivery chain (not the org chart); identification of where AI is already in use and where it is not.
- Week 2 — Bottleneck analysis & readiness check: interviews synthesised into highest-leverage intervention points; AI adoption assessed across the full lifecycle; readiness check for embedding agents (CI/CD, quality standards, testing automation).
- Week 3 — Pilot design, ready to start: pilot team selected and workflow redesigned end-to-end; agent integration and quality gates defined; baseline metrics set so the difference is measurable from day one. No gap before implementation.
Deliverables at the end of week 3: an operating model map, an AI adoption profile, a prioritised bottleneck analysis, a readiness assessment for agentic delivery, and a pilot roadmap (which team, which workflow, which agents) ready to implement from week 4.
Current state flow
The end-to-end product delivery flow was mapped from idea intake through to production release and feedback. Bottlenecks are highlighted below.
Findings & root causes
Findings from operating model mapping and bottleneck analysis; root causes target the real constraint so recommendations (including AI) can land.
1. Inconsistent delivery practices (blocks AI)
Teams use different Scrum variants, with no shared definition of done (DoD) or backlog hygiene. Refinement and handoff practices (Jira vs Confluence vs Slack) vary widely. Without consistent artefacts and definitions, AI tools cannot be applied reliably: prompts, agents, and quality checks need a stable format to be effective.
Root cause: No adopted PDLC playbook; team leads improvised. Ownership for defining “how we deliver” is diffuse; no shared standard for agent-ready workflows.
2. Slow cycle time (11-day avg)
Cycle time “ready to build” to “released” averages 11 days, with work frequently sitting in “In Review” or “Ready for QA” for 3—4 days. Manual QA and inconsistent PR turnaround cap throughput. AI-assisted coding can speed build, but without flow discipline and automation the gain is limited.
Root cause: Manual QA is the main constraint; no WIP limits or flow policies; automation beyond unit tests not prioritised.
3. AI adoption near zero
Only two engineers use Copilot informally; no team AI tooling, policy, or evaluation. Product and design use no AI for discovery or specs. No governance or baseline to measure improvement when a pilot starts.
Root cause: No clear view of how AI fits the lifecycle; no approved tools or guardrails; IP/security concerns unaddressed; agent readiness (quality gates, review) not assessed.
4. Discovery disconnected from delivery
Discovery runs in isolation; no dual-track or clear handoff. Epics land large and loose; feedback from production doesn't feed discovery. Delivery often works on unvalidated scope; no structured input for AI-assisted spec or PRD drafting.
Root cause: No discovery playbook; 1 PM to 3 teams; discovery treated as a phase before delivery, not a parallel track with clear artefacts.
5. Limited metrics & visibility
No flow/cycle-time/throughput dashboards; leadership relies on manual status. Inconsistent “done” definitions block reliable reporting and make it hard to measure AI impact (e.g. cycle time before/after agents).
Root cause: Jira used inconsistently; no single source of truth or dedicated metrics capacity; dashboards not prioritised.
6. Release gating by ops
A single ops engineer runs manual checklists and deploys, allowing at most one release per team per week. This bottleneck limits deployment frequency and prevents the fast feedback loops that make AI-assisted iteration valuable.
Root cause: Low CI/CD maturity; no standardised pipeline, quality gates, or rollback; automation deferred and ownership unclear.
Maturity assessment
Each dimension is scored 1–5 from interview evidence, artefact review, and survey data. The table below shows the current-state baseline alongside a realistic 6-month target.
| Dimension | Current | 6‑month target |
|---|---|---|
| Discovery | 2.2 | 3 |
| Delivery | 2.8 | 3.2 |
| AI Adoption | 1.4 | 2.5 |
| Ways of Working | 2.5 | 3 |
| Metrics & Visibility | 1.8 | 2.5 |
| Playbooks & Standards | 1.6 | 2.4 |
Recommendations
Recommendations are prioritised by impact and feasibility, with a strong focus on creating the conditions for effective AI adoption. Quick wins establish shared practices and visibility so that AI tools and agents can be applied consistently; medium-term initiatives embed AI in the delivery lifecycle with governance and quality gates; strategic bets scale agent-assisted workflows and remove structural bottlenecks that limit the value of AI. Each item maps to the numbered findings above.
1. Standardise definition of done & backlog hygiene (AI-ready)
Align all 6 teams on shared DoD, story format, and refinement checklist so that acceptance criteria and artefacts are consistent. This creates a stable format for AI-assisted drafting (PRDs, specs, acceptance criteria) and for agent quality checks. Without it, prompts and tools produce output that doesn’t match team expectations. Estimated: 1—2 sprints.
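Once the shared story format exists, a lightweight lint in CI or refinement can keep backlog items agent-ready. The sketch below assumes a hypothetical JSON export where `summary`, `acceptance_criteria`, and `dod_checked` are the agreed fields; adapt the field names to your actual Jira schema.

```python
# Sketch: lint backlog items against a shared "agent-ready" story format.
# Field names (summary, acceptance_criteria, dod_checked) are hypothetical
# placeholders for whatever the teams agree in the shared DoD.

REQUIRED_FIELDS = ["summary", "acceptance_criteria", "dod_checked"]

def lint_story(story: dict) -> list[str]:
    """Return a list of problems; an empty list means the story is agent-ready."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not story.get(field):
            problems.append(f"missing or empty field: {field}")
    # Acceptance criteria should be a testable list, not a prose blob.
    criteria = story.get("acceptance_criteria") or []
    if isinstance(criteria, str) and criteria:
        problems.append("acceptance_criteria should be a list of testable statements")
    return problems

stories = [
    {"summary": "Export report as PDF",
     "acceptance_criteria": ["PDF downloads in under 5s"], "dod_checked": True},
    {"summary": "Fix login bug", "acceptance_criteria": ""},
]
for story in stories:
    issues = lint_story(story)
    print(story["summary"], "->", issues or "OK")
```

Running a check like this on every refinement session makes the "stable format" concrete: AI drafting tools and agent quality checks can then rely on the same fields being present in every story.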
2. Introduce WIP limits and flow metrics dashboards
Implement Jira WIP limits and a shared dashboard for cycle time, throughput, and work age. Establishes a baseline to measure the impact of AI (e.g. cycle time before and after Copilot or agent-assisted review) and prevents queues that dilute AI productivity gains. Estimated: 1 sprint.
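The baseline itself is simple to compute once "ready" and "released" timestamps are recorded consistently. A minimal sketch, assuming issue transitions exported from Jira (keys and dates below are illustrative, not real data):

```python
# Sketch: compute average cycle time and throughput from issue transitions,
# e.g. a Jira export. Keys and dates are illustrative only.
from datetime import date

issues = [
    {"key": "ACME-101", "ready": date(2024, 5, 1), "released": date(2024, 5, 14)},
    {"key": "ACME-102", "ready": date(2024, 5, 3), "released": date(2024, 5, 12)},
    {"key": "ACME-103", "ready": date(2024, 5, 6), "released": date(2024, 5, 17)},
]

cycle_times = [(i["released"] - i["ready"]).days for i in issues]
avg_cycle = sum(cycle_times) / len(cycle_times)
throughput = len(issues)  # items released in the reporting window

print(f"avg cycle time: {avg_cycle:.1f} days, throughput: {throughput}")
# → avg cycle time: 11.0 days, throughput: 3
```

The same calculation, run weekly from a consistent "done" definition, is the before/after yardstick for any AI intervention.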
3. Deploy AI coding assistant across all teams
Roll out GitHub Copilot with a clear AI usage policy, security and IP guardrails, and team AI champions. Include prompt patterns and review expectations so agent output is consistently evaluated. This is the foundation for crawl—walk—run: get coding assistance in place, then layer in AI-assisted review and test generation. Estimated: 2—3 sprints.
4. Establish dual-track discovery cadence
Introduce a discovery playbook and reallocate PM capacity (1 PM : 2 teams max) with a weekly discovery sync. Clear discovery outputs (validated opportunities, testable hypotheses) give AI tools a defined input for PRD drafting, spec generation, and acceptance criteria, and close the loop between delivery and discovery. Estimated: 2—4 sprints.
5. Automate CI/CD and remove release gating
Implement pipeline automation, feature flags, and automated smoke tests to enable continuous deployment and remove the single-ops bottleneck. Faster, safer releases create the feedback loops that make AI-assisted iteration valuable — agents can suggest changes and teams can verify them in production without waiting on manual deploy. Estimated: 4—6 sprints.
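A feature flag is the key mechanism that decouples deploy from release: code ships dark and is switched on per team, removing the manual release gate. A minimal sketch; the flag name is hypothetical, and in practice a managed flag service or config store would replace the in-memory dict:

```python
# Sketch: a minimal feature-flag check so changes deploy dark and are
# enabled per team without waiting on a manual release window.
# "new-checkout-flow" is a hypothetical flag; a real setup would read
# flags from a managed service or config store rather than a dict.

FLAGS = {
    "new-checkout-flow": {"enabled": True, "teams": {"pilot"}},
}

def is_enabled(flag: str, team: str) -> bool:
    cfg = FLAGS.get(flag)
    return bool(cfg and cfg["enabled"] and team in cfg["teams"])

# Deployed code branches on the flag instead of on a release schedule.
def checkout(team: str) -> str:
    if is_enabled("new-checkout-flow", team):
        return "new flow"
    return "legacy flow"

print(checkout("pilot"))  # pilot team sees the new flow
print(checkout("core"))   # everyone else stays on the legacy flow
```

Combined with automated smoke tests in the pipeline, this lets the pilot team verify agent-suggested changes in production quickly and roll back by toggling a flag rather than redeploying.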
6. AI-assisted QA and document generation
Introduce AI test generation in CI, PRD/spec drafting agents, and AI-generated acceptance criteria across the PDLC. Build on the standardised DoD and discovery artefacts so generated content fits existing workflows. Measure against the flow and cycle-time baselines established in recommendation 2. Estimated: 4—8 sprints.
AI implementation approach
Agents are embedded directly into the pilot workflow — crawl → walk → run, aligned to delivery readiness and quality gates.
Foundation & governance
AI usage policy and security guardrails; Copilot for pilot team; quality gates for agent output; pilot team training on prompt patterns.
Agents in the workflow
AI-assisted code review, PR summarisation, and acceptance-criteria generation in the pilot team; fortnightly upskilling; cycle time and flow velocity tracked against the week-3 baselines.
Advanced use cases
AI test generation in CI; PRD/spec drafting agents; discovery synthesis. Evaluate vs baselines and decide on expansion.
Pilot plan
The pilot plan follows the standard assessment roadmap: three weeks of discovery and pilot design, then implementation from week 4 with no gap. Workstreams align to the deliverables described in the approach.
| Workstream | Week 1—2 Discovery | Week 3 Pilot design | Week 4+ Implement |
|---|---|---|---|
| Operating model & delivery flow | Stakeholder interviews; start mapping how work actually flows | Synthesis; bottleneck analysis; prioritised leverage points; pilot team selected. Operating model for pilot defined — handoffs, who does what, agent touchpoints | Model in play; refine as the pilot runs |
| Pilot design & agent architecture | — | Pilot workflow design; agent integration plan; quality gates defined | Agents go live; iterate on output; expand scope as you learn |
| Measurement & baselines | Readiness check; baseline data collection (runs through week 2) | Baseline metrics agreed — cycle time, flow, AI adoption | Track flow and velocity vs baseline; weekly report |
| Team enablement | — | Coaching prep; team intro. Playbook drafted; coaching plan | Squad coaching; 1-to-1 slots; AI upskilling (from week 3, ongoing) |
| Organisation enablement | — | Leadership briefing; AI governance policy drafted | Fortnightly AI upskilling; community of practice; champions forum (ongoing) |