Case study

Scalable Multimodal AI System

Enterprise AI

Client: Confidential
Industry: Multimodal AI Research
Engagement: 4 months
48K Visual Prompts · 7 Scientific Domains · 600 Expert Makers · 90% Pass Rate

The challenge

The customer was scaling a multimodal model into seven scientific domains and needed a partner who could keep up, not just on volume but on consistency across text, image, and audio modalities. They had been burned before by vendors whose pass rate dropped sharply once headcount grew.

Our approach

Pod-of-pods structure

Rather than one large pool of annotators, we ran the program as seven domain pods (one per scientific area) with a central review layer. Each pod reported to a senior expert from that field; the central layer enforced cross-pod consistency.

Calibrated growth

The team grew to roughly 600 expert makers over four months. Every new annotator went through the same calibration set as the founding cohort, so quality held steady as headcount climbed.

LLM-assisted pre-labelling

For high-volume image and audio prompts, we used model-assisted pre-labelling with human-in-the-loop verification. Reviewers spent their time on edge cases, not on copy-paste work.
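As a minimal sketch of how a pre-labelling pipeline like this can route work (the names, confidence field, and threshold here are illustrative assumptions, not the program's actual tooling):

```python
from dataclasses import dataclass

@dataclass
class PreLabel:
    prompt_id: str
    label: str
    confidence: float  # model's self-reported confidence, 0.0 to 1.0

def route_for_review(prelabels, threshold=0.9):
    """Split model pre-labels into auto-accept and human-review queues.

    Labels at or above the confidence threshold are accepted (subject to
    spot checks); everything below it goes to an expert as an edge case.
    """
    auto_accept, needs_review = [], []
    for p in prelabels:
        (auto_accept if p.confidence >= threshold else needs_review).append(p)
    return auto_accept, needs_review

batch = [
    PreLabel("img-001", "cell nucleus", 0.97),
    PreLabel("img-002", "mitochondrion", 0.62),
    PreLabel("aud-003", "bird call", 0.91),
]
accepted, review_queue = route_for_review(batch)
```

The point of the split is economic: expert time is the scarce resource, so the queue an expert sees should already be filtered down to the ambiguous cases.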

Outcome

  • 48,000 high-quality visual prompts delivered across seven scientific domains.
  • ~600 vetted expert makers active by month four.
  • 90% sustained pass rate on the customer's hold-out evaluation.
  • Full ramp from zero to delivery in four months.

What made it work

The pod-of-pods structure meant that scaling did not dilute domain expertise. The customer was able to hand us a new domain mid-program without losing speed in the existing six.

Ready to run a similar program?

Let's scope a pilot in days, not months.

Talk to an expert
