This case study describes a generic mid-size B2B content marketing agency. Specific firm and customer cohort have been anonymised per AgentsBooks's privacy policy.
The starting state
The agency operated as a B2B content-marketing practice with a handful of clients in the technology + financial-services verticals. Production capacity: ~8 long-form essays per month, ~30 short-form posts, ~1 newsletter per client per week. Capacity-bound by senior-strategist time (research + outline) + senior-editor time (review + polish).
The agency was turning away inbound. Adding senior staff was the obvious lever but took 6–9 months per hire to ramp to full productivity.
The 8-primitive shape
The agency deployed a four-agent content fleet:
- Research agent (Identity:
content-research). Runs the topic-research sprint (R1..R8 dimensions per the AgentsBooks research-cadence pattern). Producesresearch-notes.mdwith citation set + counter-narratives. Heart:manualtriggered by senior strategist on topic-pick. - Drafting agent (Identity:
content-drafter). Takes the research notes + strategist's outline and produces a long-form draft. Heart:A2Afrom research agent. - Editor agent (Identity:
content-editor). Reviews drafts against the agency's voice guide + client style guide; produces a redline. Senior editor signs off. Heart:A2Afrom drafter. - Cross-post agent (Identity:
crosspost). On publish, adapts the piece for dev.to / Medium / Hashnode / LinkedIn — each with canonical link pointing home (per the crosspost script in agentsbooks-marketing/generator/crosspost.py). Heart:webhookon publish.
Senior strategist + senior editor stayed in the loop on the high-judgment phases (topic-pick + final sign-off). Junior copyeditor seat was eliminated by attrition.
What changed
After 4 months of HITL operation:
| Metric | Pre | Post | Δ |
|---|---|---|---|
| Long-form essays/month | 8 | 32 | +4× |
| Short-form posts/month | 30 | 120 | +4× |
| Newsletters/client/week | 1 | 1 (unchanged — bottleneck was client capacity) | 0% |
| Cross-post coverage | manual ~20% | automated 100% | dramatic |
| Senior strategist hours/essay | 4 h | 1.5 h | -63% |
| Senior editor hours/essay | 3 h | 1 h | -67% |
Revenue per senior FTE roughly tripled. The agency took on two additional retainer clients without senior hiring.
What didn't work the first time
Three corrections:
- The drafting agent's voice drifted. Without a strong voice guide as Knowledge, drafts came out generic-corporate. Codifying the agency's voice into a Knowledge document — and citing it in the drafter's system prompt — was the fix.
- The research agent over-cited. Initial drafts had 25+ outlinks per essay, which read as link-dumping. Tuning the research-to-draft prompt for citation density (≥1 outlink per substantive claim, ≤8 per 1000 words) recovered readability.
- Cross-post auto-publishing went live before review. A dev.to post went up with a placeholder. Changing the cross-post agent to leave drafts in "unpublished" state (the default in our crosspost.py implementation) for the strategist's morning review caught two more.
The economics
- Token spend: ~$800/month at steady state. Heavy use of prompt caching on the voice guide + per-client style guide.
- Revenue impact: substantial — two additional retainer clients added without proportional cost.
- Payback: ~2 months on the deployment effort.
What this case is not
It's not "AI writes the agency's content end-to-end". The senior strategist + senior editor remained the quality bar; the agent fleet amplified their throughput. Authentic voice + factual accuracy still trace to the human team.
It's not a recipe for fully automated content. The cases where the substrate falls down: original-research essays (need primary interviews), case studies (need real customer stories), and any content where the agency's reputation is the differentiator. Those stayed human-led.
FAQ
Q: Did clients notice a quality difference?
A: Clients noticed faster turn-around. Blind quality reviews showed no measurable difference once the voice-guide Knowledge was tuned. The "AI flavor" most clients fear shows up when the voice guide is thin or absent.
Q: How does this map to the 8 primitives?
A: Identity per agent. Heart triggers each phase. Friends edges between research → draft → edit → cross-post. Knowledge holds voice + style guides. Memory holds per-piece audit trail. Control connects to the agency's CMS + the cross-post platforms.
Q: What about disclosure that content is AI-assisted?
A: Different clients have different policies. The agency disclosed at the brand level (a page on the agency site noting agent-assisted production); individual pieces don't typically need per-piece disclosure under current US/UK guidance.
Want to see the firm-starter for content agencies? Start free →