How Small Teams Can Use Codex Like Endava

Endava, a global software contracting firm with engineers across Europe, the Americas, and Asia, published a case study with OpenAI describing how it reorganized software delivery around Codex. The results are specific enough to be useful: smaller teams shipping at a pace that previously required larger ones, junior developers producing senior-quality output, and multi-week analysis tasks compressed into two-hour working sessions.

Most teams reading this aren’t Endava. But the underlying approach — treating Codex as a system tool rather than a personal shortcut — is something any engineering team can adapt at small scale.

What Endava actually did

According to OpenAI’s case study, Endava now calls itself an “agentic organization”: one where senior expertise is encoded into agents that work alongside teams across the full project lifecycle. That lifecycle includes client intake, requirements analysis, design, build, and operations.

Joe Dunleavy, Endava’s CTO for Europe, describes the shift: “We went from producing a lot of the code ourselves to now overseeing the work that Codex can produce.”

Mike Krolnik, Endava’s Global SVP of Agentic Architecture, explains what that means in practice. Senior architects encode their judgment into Codex so that junior developers get senior-level guidance as they execute. A specific example: one of Endava’s legal teams brought engineering a problem involving thousands of pages of contracts to review. Translating that into a workable requirements specification would normally take weeks of back-and-forth. Instead, the team recorded a two-hour meeting with legal stakeholders, fed the transcript to Codex, and generated a usable requirements spec — completed in two one-hour meetings.

The OpenAI source for this case study is at openai.com/index/endava.

What to borrow and what not to copy

Endava operates at enterprise scale with dedicated roles like “Global SVP of Agentic Architecture.” You don’t need any of that infrastructure. What transfers is the mindset: Codex works best as a documented part of the delivery system, not a personal productivity tool used inconsistently.

What to copy:

  • Start with narrow, well-defined use cases: test generation, code explanation, refactoring support, documentation drafts, bug investigation, prototype scaffolding, requirements summarization from meeting transcripts
  • Make Codex part of pull requests rather than a replacement for them — keep human review in the loop
  • Use it for non-coding workflows first: Endava found the fastest value outside of code generation, in requirements analysis and client communication

What not to copy:

  • Enterprise-scale language and organizational structures
  • Any assumption that an “agentic organization” is a goal in itself for a small team
  • The implied claim that Codex automatically improves productivity. Results depend on how it is integrated into review and delivery workflows

A practical small-team pilot plan

Before starting, establish a baseline: pick one repository and two or three specific use cases, and define what “working” looks like for each.

Week 1: Choose the repo and the use cases. Define what data doesn’t go into prompts (no sensitive credentials, client-confidential material, or proprietary IP without policy review). Agree on a labeling convention for PRs where Codex materially contributed.
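
The data rule is easier to follow when part of it can be checked mechanically. Below is a minimal sketch in Python of a pre-prompt screen, assuming the team wants to scan text before pasting it into a prompt. The pattern list and the name find_policy_violations are illustrative assumptions, not part of Codex or of Endava's process, and a regex screen only catches the obvious cases.

```python
import re

# Illustrative patterns only (an assumption); a real policy would be
# broader and specific to the team's own secrets and client data.
BLOCKED_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(
        r"\b(password|passwd|secret|api_key)\s*[=:]\s*\S+", re.IGNORECASE
    ),
}


def find_policy_violations(prompt_text: str) -> list[str]:
    """Return the names of blocked patterns that appear in the prompt text."""
    return [
        name
        for name, pattern in BLOCKED_PATTERNS.items()
        if pattern.search(prompt_text)
    ]
```

An empty list means nothing obvious was found. It does not mean the text is safe to share, so the written policy still applies.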

Week 2: Define the guardrails in writing. No unreviewed AI-generated code in production. Required tests for Codex-assisted output. Clear prompting conventions that include relevant context.
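
Written guardrails can also be turned into a small automated check. The sketch below assumes a hypothetical PullRequest record, a hypothetical "codex-assisted" label, and a convention that tests live under tests/. None of these come from the case study, and a real check would read the data from whatever code host the team uses.

```python
from dataclasses import dataclass, field

CODEX_LABEL = "codex-assisted"  # hypothetical label name


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    labels: set[str]
    human_approvals: int
    changed_files: list[str] = field(default_factory=list)


def guardrail_violations(pr: PullRequest) -> list[str]:
    """List the guardrails a Codex-assisted PR does not yet satisfy."""
    if CODEX_LABEL not in pr.labels:
        return []  # the rules below apply only to labeled PRs

    problems = []
    if pr.human_approvals < 1:
        problems.append("needs at least one human approval")
    # Assumes tests live under tests/; adjust to the repo's layout.
    if not any(path.startswith("tests/") for path in pr.changed_files):
        problems.append("no test files changed")
    return problems
```

Keeping the rules in one function makes it easy to adjust them at the end of the pilot without hunting through CI configuration.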

Weeks 3–4: Track a handful of metrics: cycle time on targeted tasks, PR review load, defects found in review, and developer satisfaction with the workflow. Note any rework caused by Codex output that required significant correction.
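
A spreadsheet is enough for these numbers, but a short script keeps the comparison consistent from week to week. The sketch below assumes a hypothetical TaskRecord shape filled in by hand for each completed task. It covers cycle time, review defects, and rework. PR review load and developer satisfaction would need their own inputs.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class TaskRecord:
    """Hypothetical per-task record kept during the pilot."""
    cycle_time_hours: float
    review_defects: int
    codex_assisted: bool
    needed_major_rework: bool


def summarize(records: list[TaskRecord]) -> dict[str, Optional[float]]:
    """Compare Codex-assisted tasks against the rest of the sample."""
    assisted = [r for r in records if r.codex_assisted]
    baseline = [r for r in records if not r.codex_assisted]

    def median_cycle(group: list[TaskRecord]) -> Optional[float]:
        return median(r.cycle_time_hours for r in group) if group else None

    count = len(assisted)
    return {
        "median_cycle_hours_assisted": median_cycle(assisted),
        "median_cycle_hours_baseline": median_cycle(baseline),
        "defects_per_task_assisted": (
            sum(r.review_defects for r in assisted) / count if count else None
        ),
        "rework_rate_assisted": (
            sum(r.needed_major_rework for r in assisted) / count if count else None
        ),
    }
```

With only a few weeks of data the samples will be small, so treat the output as a prompt for discussion rather than proof of a speedup.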

End of pilot: Make a decision — expand the use cases, adjust the guardrails, or stop. Don’t expand before you understand the failure modes from the first four weeks.

Risks worth naming before you start

Coding agents can produce plausible but incorrect code. They miss architectural context that isn’t in the prompt. They can introduce security vulnerabilities that look reasonable on first review. Junior developers using them without senior oversight can develop blind spots about what the code is actually doing.

The workflow Endava describes works because senior judgment stays in the loop: Codex extends the architect rather than replacing them. For small teams, that means the senior person on the team needs to understand what Codex is contributing and where its outputs require the most careful review.

Teams handling regulated data, client-confidential code, or proprietary systems need a policy review before using cloud AI coding tools. “We used Codex and it worked” is not a compliance posture.

Before starting your pilot — a checklist

  • Use case: what specific tasks will Codex handle?
  • Repo scope: which repository and which branches?
  • Data policy: what can and can’t go into prompts?
  • Review rule: no unreviewed AI output in production
  • Test rule: what test coverage is required for Codex-assisted PRs?
  • Measurement plan: what will you track over the four weeks?
  • Owner: who is responsible for running the pilot?
  • Stop condition: what outcome triggers ending or pausing the experiment?
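
One way to keep the checklist honest is to store the answers next to the repo and refuse to start while any are blank. The sketch below assumes a hypothetical PilotPlan record whose fields mirror the eight items above. The field names and the helper unanswered_items are illustrative, not a prescribed format.

```python
from dataclasses import dataclass, fields


@dataclass
class PilotPlan:
    """Hypothetical record with one free-text answer per checklist item."""
    use_cases: str = ""
    repo_scope: str = ""
    data_policy: str = ""
    review_rule: str = ""
    test_rule: str = ""
    measurement_plan: str = ""
    owner: str = ""
    stop_condition: str = ""


def unanswered_items(plan: PilotPlan) -> list[str]:
    """Return the checklist items that are still blank, in checklist order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]
```

The pilot is ready to start when unanswered_items returns an empty list.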

Endava’s core advice for teams just starting out is direct: “Don’t just think about it, really try it.” Pick a non-coding workflow first — requirements analysis, design documentation, or meeting summarization. The fastest way to see how Codex fits into your team is to use it somewhere your team has never used a coding tool before.
