What actually drives data annotation cost
Annotation cost is a function of several interacting variables, chiefly data type, task complexity, required accuracy, and throughput. A simple yes/no image classification task can cost two orders of magnitude less per asset than pixel-level semantic segmentation of a surgical scene. A one-pass label with no quality review costs less than a three-pass QA workflow with senior-reviewer adjudication. The headline rate alone tells you almost nothing about which engagement you are buying.
The lever buyers most often under-estimate is QA tier. A vendor quoting at half the market average is almost always running a single-pass workflow with no published inter-annotator agreement metric. The dataset costs less in line-item terms; it costs more in model performance, retraining cycles, and time-to-production.
The main cost drivers:
- Data type: text and image annotation sit at the cheap end; video tracking, 3D point cloud, and clinical imaging command meaningful premiums. The complexity of the annotation task – not the size of the asset – is the cost driver.
- Task complexity: simple classification, polygon segmentation, and multi-attribute labelling each carry their own per-unit time budget. Multi-attribute tasks can easily cost 5–10× a single-attribute baseline.
- Domain expertise: general annotation vs. medical, legal, financial, or engineering domain knowledge. Specialist reviewers add cost but are often the only path to a defensible dataset in regulated work.
- Accuracy requirements: single-pass labelling vs. multi-pass QA with inter-annotator agreement checks. The "third pass" – senior-reviewer adjudication on the decision boundary – is the discipline that most often separates production-grade datasets from noisy ones.
- Volume: high-volume continuous programmes benefit from tiered pricing; one-off batches cost more per unit because the setup overhead is amortised across fewer assets.
- Language: English annotation is the lowest-cost baseline; Southeast Asian, regional Indian, and low-resource APAC languages require specialist teams and price closer to expert work than commodity labelling.
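The volume effect described above is simple arithmetic: a one-time setup cost spread over fewer assets raises the effective per-asset price. The sketch below illustrates this under assumed figures; the setup fee, unit rate, and batch sizes are hypothetical and not drawn from any vendor's rate card.

```python
def effective_unit_cost(setup_fee: float, unit_rate: float, n_assets: int) -> float:
    """Per-asset cost once a one-time setup fee is spread across the batch."""
    if n_assets <= 0:
        raise ValueError("n_assets must be positive")
    return unit_rate + setup_fee / n_assets


# Hypothetical figures, for illustration only.
SETUP_FEE = 2_000.0  # one-time onboarding and schema setup
UNIT_RATE = 0.10     # quoted rate per asset

one_off = effective_unit_cost(SETUP_FEE, UNIT_RATE, 5_000)
programme = effective_unit_cost(SETUP_FEE, UNIT_RATE, 500_000)

print(f"one-off batch:        {one_off:.3f} per asset")
print(f"continuous programme: {programme:.3f} per asset")
```

With these assumed numbers the quoted rate is identical in both cases, yet the one-off batch carries several times the effective cost per asset because the setup fee dominates.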
Pricing ranges by data type (market reference)
The bands below are public market reference ranges for professional annotation vendors, as cited in industry benchmarks tracked by the Stanford HAI AI Index and discussed in FinOps Foundation analyses of AI workload economics. They are market-wide, not vendor-specific, and intentionally framed as bands rather than spot rates.
- Text classification: cents per item for simple two-class labelling; rising into the lower-dollar range when multi-class taxonomy or context is required.
- Named-entity recognition (NER): low-dollar range per thousand tokens for general-domain English; meaningfully higher for medical, legal, or low-resource APAC languages.
- Sentiment analysis: cents per item for binary or ternary sentiment; higher for fine-grained aspect-based sentiment with attribute extraction.
- Image bounding boxes: cents per box for general classification with one or two attributes; the per-box rate rises sharply with class count and per-box attribute complexity.
- Image semantic segmentation: low-dollar per simple image to mid-dollar per dense urban or medical scene; the time-per-image dominates the cost equation.
- Video tracking and frame-level annotation: priced per minute or per frame; multi-object tracking with re-identification through occlusion is materially more expensive than single-object frame labelling.
- Audio transcription and diarisation: per-minute rates that scale with required accuracy (single-speaker vs. multi-speaker meeting with overlapping speech) and language.
- 3D point cloud and LiDAR annotation: per-scene pricing in the dollar-to-multi-dollar range, reflecting the depth of the cuboid plus per-class semantic segmentation work.
- Medical, legal, and financial specialist annotation: per-hour pricing in the expert-rate range, reflecting the licensed-professional requirement on the QA panel.
Per-item vs. per-hour vs. fixed-project pricing
Vendors typically offer three pricing models, and the right one depends on how stable your schema is and how much shared risk you want to carry.
- Per-item pricing works best for well-defined, repeatable tasks where the scope and schema are stable. It aligns vendor incentives toward throughput and is easy to forecast. The risk is that vendors maximise items per hour at the expense of edge-case quality on complex tasks.
- Per-hour pricing suits exploratory or rapidly-evolving tasks where throughput is hard to predict, or work that requires senior reviewer or domain-specialist time. It carries scope risk for the buyer but produces better quality on multi-pass and specialist work.
- Fixed-project pricing works for end-to-end engagements with a defined deliverable – a dataset of size N at accuracy bar M, delivered by date D. It shifts schedule and quality risk to the vendor but requires very clear up-front scoping.
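One way to compare the first two models is to convert an hourly rate into an effective per-item cost at a given throughput. The sketch below does that and computes the break-even throughput at which both models cost the same. The rates and throughputs are hypothetical, chosen only to make the arithmetic visible.

```python
def hourly_cost_per_item(hourly_rate: float, items_per_hour: float) -> float:
    """Effective per-item cost when paying by the hour."""
    if items_per_hour <= 0:
        raise ValueError("items_per_hour must be positive")
    return hourly_rate / items_per_hour


def break_even_throughput(hourly_rate: float, per_item_rate: float) -> float:
    """Items per hour at which hourly and per-item pricing cost the same."""
    if per_item_rate <= 0:
        raise ValueError("per_item_rate must be positive")
    return hourly_rate / per_item_rate


# Hypothetical quotes for the same task, for illustration only.
HOURLY_RATE = 12.0
PER_ITEM_RATE = 0.25

threshold = break_even_throughput(HOURLY_RATE, PER_ITEM_RATE)
print(f"break-even throughput: {threshold:.0f} items/hour")

for items_per_hour in (30, 48, 60):
    cost = hourly_cost_per_item(HOURLY_RATE, items_per_hour)
    print(f"{items_per_hour} items/hour -> {cost:.3f} per item on the hourly model")
```

Below the break-even throughput, the per-item quote looks cheaper to the buyer on paper. That is also the situation in which the vendor's margin on per-item work is thinnest, so it is worth checking what throughput the quote assumes.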
The pricing model trap to avoid
Be cautious of per-item pricing for complex tasks – vendors with tight margins can rationally rush labels to maximise items-per-hour throughput, degrading quality. For anything requiring specialist knowledge or genuine multi-pass QA, per-hour or fixed-project pricing aligns incentives better than per-item.
The other pattern to watch: a per-item rate that is dramatically below the market floor. Annotation, like most expert labour, has a real cost basis. A rate 60–70% below the market median is rarely a clever sourcing win – it is almost always a thinner QA tier, more junior reviewers, or an SLA that does not enforce rework. The dataset you receive will reflect those choices.
Hidden costs to budget for
The line item on the initial quote rarely captures everything you will pay over the life of a real annotation programme. The teams that budget realistically include these line items from day one:
- Rework and corrections – budget 10–20% of the headline volume for re-annotation if initial quality falls short of the SLA. A vendor confident in their QA tier will rework at their cost; a vendor with thin margin will quietly bill it.
- Tooling and setup – some vendors charge onboarding fees for labelling-platform licensing (Labelbox, SuperAnnotate, V7, Encord, Scale Nucleus), pipeline configuration, or schema setup. Whether these are folded into the per-asset rate or invoiced separately is the question to ask up front.
- Data transfer and storage – large video, 3D point cloud, or LiDAR datasets require secure transfer infrastructure. For regulated data, an on-premise or VPC deployment may shift the cost line item from data transfer to infrastructure but typically nets to a lower total.
- Project management overhead – a dedicated PM is the single most reliable predictor of a smooth engagement, and PM time typically adds a fixed percentage on hourly-priced projects. The "no dedicated PM" vendors are not cheaper once you account for the buyer-side coordination cost.
- Quality audits – third-party accuracy audits or external IAA sampling are sometimes worth commissioning, particularly for regulated work or model-risk submissions. They cost extra but produce documentation regulators expect.
- Guideline iteration time – the first two weeks of any engagement are spent iterating the labelling guideline. Reputable vendors include this as part of onboarding; vendors who do not will bill it as out-of-scope when complex edge cases surface in week three.
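The line items above can be rolled into a rough budget band. The sketch below is a simplified model under stated assumptions: rework is budgeted at the 10–20% range mentioned above, PM overhead is modelled as a flat share of labelling spend, and setup and audit fees are one-off amounts. Every figure and field name is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetAssumptions:
    n_assets: int
    unit_rate: float         # quoted rate per asset
    setup_fee: float = 0.0   # onboarding, tooling, schema setup
    pm_share: float = 0.0    # PM overhead as a fraction of labelling spend
    audit_fee: float = 0.0   # third-party accuracy audit, if commissioned


def fully_loaded_cost(a: BudgetAssumptions, rework_share: float) -> float:
    """Headline labelling spend plus rework, PM, setup, and audit."""
    if not 0.0 <= rework_share <= 1.0:
        raise ValueError("rework_share must be between 0 and 1")
    labelling = a.n_assets * a.unit_rate
    rework = labelling * rework_share
    pm = labelling * a.pm_share
    return labelling + rework + pm + a.setup_fee + a.audit_fee


# Hypothetical programme, for illustration only.
plan = BudgetAssumptions(
    n_assets=100_000, unit_rate=0.20,
    setup_fee=1_500.0, pm_share=0.10, audit_fee=2_000.0,
)

headline = plan.n_assets * plan.unit_rate
low = fully_loaded_cost(plan, rework_share=0.10)
high = fully_loaded_cost(plan, rework_share=0.20)

print(f"headline quote:    {headline:,.0f}")
print(f"fully-loaded band: {low:,.0f} to {high:,.0f}")
```

The model deliberately leaves out data transfer and guideline iteration time, which are harder to express as a flat rate. Add them as further fields if your vendor invoices them separately.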
How to get an accurate quote
The fastest path to a fair price is to share a representative sample dataset (100–500 items) with two or three vendors and request a scoped quote. A good vendor will annotate the sample, return the labelled batch with an inter-annotator agreement report, and quote a per-asset rate plus realistic timeline. Avoid any vendor who quotes without seeing data – annotation complexity varies too much across schemas for blind quotes to be reliable.
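If a vendor returns the sample with labels from two annotators, you can sanity-check the agreement figure yourself. The sketch below computes Cohen's kappa, one common inter-annotator agreement metric for two annotators. The text above does not specify which metric a vendor will report, so treat this as an illustrative choice. The labels are made up.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: share of items where both annotators agree.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n

    # Chance agreement: from each annotator's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    classes = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in classes) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up sample: ten items, two disagreements.
annotator_a = ["cat", "cat", "dog", "dog", "cat", "dog", "cat", "dog", "cat", "dog"]
annotator_b = ["cat", "cat", "dog", "cat", "cat", "dog", "cat", "dog", "dog", "dog"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Raw percentage agreement overstates quality when one class dominates, which is why a chance-corrected metric is usually the more useful number to request in the report.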
When comparing quotes, do not compare headline rates alone. Compare: QA tier described in the proposal, named-reviewer credentials for any specialist work, accuracy SLA in writing (the metric, the floor, and the measurement protocol), rework policy, deployment model (cloud, VPC, on-premise), and the project-management overhead each vendor includes. Two vendors with the same per-asset rate can deliver datasets of very different quality.
We recommend asking each shortlisted vendor for a written quote that itemises: per-asset (or per-hour) rate by task type, included QA passes, included project management, on-premise vs. cloud rate variance, and any onboarding or tooling fees. The clean version of this quote can sit on one page and is the right artefact to put in front of finance for sign-off.
Offshore vs. onshore pricing in 2026
Offshore annotation teams (Vietnam, Philippines, India) typically deliver equivalent work at a meaningful discount to onshore teams (US, UK, Australia) – the price gap is narrower than it was in 2018, but still material. The quality gap has also narrowed substantially as offshore vendors have invested in QA infrastructure, specialist training, and modern annotation tooling. For most standard production tasks, mature offshore teams deliver to the same accuracy bar as onshore vendors.
The right comparison is not "what does offshore save" but "what is the fully-loaded cost of the work" – per-asset rate plus QA tier plus rework allowance plus PM overhead plus security and compliance posture. Vietnam-based pods, in particular, sit in the sweet spot for APAC-facing AI teams because of time-zone alignment and Southeast Asian language coverage; India remains stronger for English-language scale; the Philippines is competitive on conversational and voice work.
DataX Annotation operates from Hanoi, Vietnam, with delivery footprints across APAC. Our engagement structure is per-task pricing with no minimum commitment, multi-pass QA with published inter-annotator agreement, and an on-premise deployment option for regulated work. We share a written quote within 24 hours of receiving a representative sample dataset.
Send us a sample dataset and receive a scoped, itemised quote within 24 hours - no commitment required.
Frequently asked questions about annotation pricing
The questions enterprise AI teams ask most often when sizing an annotation budget:
- Is the lowest data annotation quote ever the right answer?
  Rarely. The cheapest quote is correlated with thinner QA, more junior reviewers, and a higher rework rate. The right benchmark is the fully-loaded cost of a production-grade dataset, not the headline rate.
- What share of the model-development budget should data annotation take?
  Across most enterprise AI programmes, annotation takes 20–40% of total model-development cost in year one, dropping to 10–25% in subsequent years as the gold panel matures and active-learning routing reduces full-asset re-labelling.
- How should we structure payment for a long-running annotation programme?
  The pattern that aligns incentives best is monthly invoicing against per-asset volume, with a rework clause that obliges the vendor to rework any batch that misses the SLA at their cost. Avoid up-front fixed-price contracts for ongoing labelling work.
- How quickly does a Vietnam-based annotation vendor turn around a pilot quote?
  Mature vendors return a written quote within 24–48 hours of receiving a sample dataset, with a paid pilot starting within 5 business days of NDA signature.
- Can we negotiate volume discounts on data annotation pricing?
  Yes, almost always. Reputable vendors publish a volume tier or will negotiate one explicitly. The negotiation should be transparent ("at X assets per month the rate drops to Y"), not blended into a single headline number.
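A transparent volume tier of the "at X assets per month the rate drops to Y" kind can be written down as a small lookup table. The sketch below assumes whole-volume tiering, where every asset in the month is billed at the rate of the tier reached. The thresholds and rates are hypothetical.

```python
# (minimum monthly assets, rate per asset), sorted by ascending threshold.
# Hypothetical tiers, for illustration only.
TIERS = [(0, 0.12), (50_000, 0.10), (200_000, 0.08)]


def tier_rate(monthly_assets: int, tiers=TIERS) -> float:
    """Rate per asset for the highest tier whose threshold is met."""
    if monthly_assets < 0:
        raise ValueError("monthly_assets must be non-negative")
    rate = tiers[0][1]
    for threshold, candidate in tiers:
        if monthly_assets >= threshold:
            rate = candidate
    return rate


def monthly_invoice(monthly_assets: int, tiers=TIERS) -> float:
    """Whole-volume tiering: all assets are billed at the tier rate reached."""
    return monthly_assets * tier_rate(monthly_assets, tiers)


for volume in (10_000, 50_000, 250_000):
    print(f"{volume:>7,} assets -> rate {tier_rate(volume):.2f}, "
          f"invoice {monthly_invoice(volume):,.2f}")
```

Whole-volume tiering creates a step at each threshold: with these assumed tiers, an invoice just under 50,000 assets is higher than one at exactly 50,000. If that matters for your forecasting, ask whether the vendor applies tiers to the whole volume or only to the assets above each threshold.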


