2D image annotation for an autonomous-driving programme
Pixel-precise 2D bounding boxes, semantic segmentation, and lane labels across 1.8M frames for a Tokyo-based mobility company – Japan-context edge cases handled by a locally-fluent annotation pod.
Challenge
A Tokyo-based mobility company training 2D perception models for urban driving needed annotation that respected Japan-specific traffic context – kanji and hiragana signage, narrow back-street lane geometries, and dense cyclist/pedestrian scenes that generalist overseas vendors had been mislabelling.
Their previous supplier had returned a 6.3% rework rate on Japanese-context edge cases, breaking sprint timelines and forcing in-house engineers to spend evenings re-labelling rather than improving the model.
Approach
We assembled a Japan-fluent annotation pod with reviewers experienced in JIS road-sign conventions and Japan's left-hand-traffic road geometry. The schema covered 2D bounding boxes for vehicles, pedestrians, cyclists, and signs; per-pixel semantic segmentation for road surfaces and lane lines; and free-form notes for ambiguous Japan-specific cases.
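As a purely illustrative sketch, a per-frame record under a schema like this could look as follows – every field name and value here is a hypothetical example, not the client's actual format.

```python
# Hypothetical per-frame label record; field names and values are
# illustrative assumptions, not the client's actual schema.
frame_label = {
    "frame_id": "tokyo_000001",
    # 2D bounding boxes for vehicles, pedestrians, cyclists, and signs
    "boxes": [
        {"cls": "cyclist", "xyxy": [412, 238, 468, 371]},
        {"cls": "sign", "xyxy": [90, 55, 132, 97], "script": "kanji"},
    ],
    # Per-pixel semantic segmentation for road surface and lane lines,
    # stored as a reference to a mask image
    "segmentation_mask": "tokyo_000001_mask.png",
    # Free-form note for ambiguous Japan-specific cases
    "note": "Lane line partially worn; extent inferred from kerb geometry.",
}
```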
Every frame went through a two-pass workflow with adjudication, and we co-located a senior reviewer with the client team during sprint kick-offs in Tokyo so guidelines could be iterated face-to-face.
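The adjudication step can be pictured as a disagreement check between the two passes. The sketch below is a minimal, assumed illustration – the function names, box format, and IoU threshold are ours, not the production pipeline's.

```python
def iou(a: list, b: list) -> float:
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def needs_adjudication(pass_a: list, pass_b: list, iou_min: float = 0.9) -> bool:
    """Route a frame to a senior reviewer when two independent
    annotation passes disagree on count, class, or box geometry.
    Boxes are paired naively by class-sorted order here; a real
    pipeline would use proper assignment (e.g. Hungarian matching)."""
    if len(pass_a) != len(pass_b):
        return True
    key = lambda box: box["cls"]
    for box_a, box_b in zip(sorted(pass_a, key=key), sorted(pass_b, key=key)):
        if box_a["cls"] != box_b["cls"] or iou(box_a["xyxy"], box_b["xyxy"]) < iou_min:
            return True
    return False
```

In this simplified picture, only frames the check flags consume senior-reviewer time; the rest flow straight through, which is what keeps a two-pass workflow affordable at this volume.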
Outcome
Delivered 1.8M annotated frames over nine months at 99.4% field-level accuracy on the validation set – clearing the client's 98.5% acceptance threshold from the first sprint onward.
Their 2D perception model gained 5.7 points of mAP on a Tokyo-specific evaluation slice after the first three sprints, and rework on Japanese-context frames fell from 6.3% to 0.7%, returning roughly 22 engineering hours per week to the in-house team.
Let's build what's next
Share your challenge – AI, data, or infrastructure. We'll scope your project and put the right team on it.