Facial recognition at large scale, responsibly

Most facial recognition demos work. Most production deployments stall — not on accuracy in isolation, but on the four constraints that production systems have to meet simultaneously: accuracy, throughput, governance and cost. Designing for one at a time is comfortable. Designing for all four is what separates a pilot from a platform.

The four constraints, in tension

Accuracy at the operating point that matters — not the cherry-picked dataset, but the lighting, optics and demographics of your actual sites.
Throughput — the number of faces per second per camera you have to process at peak, with the latency the use case actually tolerates.
Governance — watch-list scoping, retention, consent posture, audit trail, role-based access and the policy that makes the system legitimate to use.
Cost — GPU spend, network spend, ops spend and the cost of keeping models current as faces, optics and policy evolve.

Pull on any one of these and the others move. Raising the accuracy target usually means heavier models or higher-resolution input, which drops throughput. Stricter governance limits the data available for model tuning. Lower cost typically means weaker GPUs, which squeezes both accuracy and throughput.
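To make the throughput and cost tension concrete, here is a minimal capacity sketch. The function name, the 70% utilization headroom and all the numbers in the usage note are illustrative assumptions, not figures from any real deployment; measure per-GPU throughput on your own models and hardware.

```python
import math


def gpus_needed(
    cameras: int,
    peak_faces_per_sec_per_camera: float,
    faces_per_sec_per_gpu: float,
    target_utilization: float = 0.7,  # assumed headroom, not a recommendation
) -> int:
    """Estimate how many GPUs are needed to absorb the peak face rate.

    Sizes for peak load rather than average load, and leaves headroom by
    planning to run each GPU at `target_utilization` of its measured rate.
    """
    if cameras < 0 or peak_faces_per_sec_per_camera < 0:
        raise ValueError("load inputs must be non-negative")
    if faces_per_sec_per_gpu <= 0:
        raise ValueError("faces_per_sec_per_gpu must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    peak_load = cameras * peak_faces_per_sec_per_camera
    usable_rate_per_gpu = faces_per_sec_per_gpu * target_utilization
    return math.ceil(peak_load / usable_rate_per_gpu)


if __name__ == "__main__":
    # Hypothetical site: 400 cameras, 2 faces/sec each at peak.
    print(gpus_needed(400, 2.0, faces_per_sec_per_gpu=500.0))
    # Same site with a heavier model that halves per-GPU throughput.
    print(gpus_needed(400, 2.0, faces_per_sec_per_gpu=250.0))
```

Under these assumed numbers, swapping in a model that halves per-GPU throughput moves the estimate from three GPUs to five, which is the accuracy, throughput and cost trade-off expressed as a single figure.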

Architectural choices that matter

Edge vs. core inference — run detection at the edge to bound bandwidth and latency; reserve recognition for a smaller, fast core fleet. Hybrid pipelines almost always outperform pure edge or pure core at scale.
Watch-list scoping — a 200-person watch-list is a different system from a 200,000-person watch-list. Architect for the smallest list that solves the use case, then plan the upgrade path explicitly.
Embedding pipelines — separate detection, alignment and embedding so each stage can be upgraded independently as models improve.
Decision boundaries — publish the precision/recall trade-off you are operating at, and tune it per use case (a watch-list alert tolerates fewer false positives than a footfall analytic).
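The link between watch-list size and decision boundaries can be sketched with a little arithmetic. The example below assumes each comparison against a watch-list entry is an independent trial with the same per-comparison false match rate (FMR). Real comparisons are not fully independent, so treat this as a first-order estimate; the function names are hypothetical.

```python
import math


def false_alert_probability(fmr: float, watchlist_size: int) -> float:
    """Chance that one non-enrolled face falsely matches at least one entry.

    Computes 1 - (1 - fmr) ** watchlist_size, using log1p/expm1 so the
    result stays accurate when fmr is very small.
    """
    if not 0 <= fmr < 1:
        raise ValueError("fmr must be in [0, 1)")
    if watchlist_size < 0:
        raise ValueError("watchlist_size must be non-negative")
    return -math.expm1(watchlist_size * math.log1p(-fmr))


def required_fmr(target_false_alert_probability: float, watchlist_size: int) -> float:
    """Per-comparison FMR needed to hit a per-face false-alert target."""
    if not 0 <= target_false_alert_probability < 1:
        raise ValueError("target must be in [0, 1)")
    if watchlist_size <= 0:
        raise ValueError("watchlist_size must be positive")
    return -math.expm1(math.log1p(-target_false_alert_probability) / watchlist_size)


if __name__ == "__main__":
    for size in (200, 200_000):
        p = false_alert_probability(1e-5, size)
        print(f"watch-list of {size}: false-alert probability per face = {p:.4f}")
```

With an assumed per-comparison FMR of 1 in 100,000, a 200-person list falsely alerts on roughly 0.2% of non-enrolled faces, while a 200,000-person list does so on roughly 86%. Holding the alert rate constant on the larger list requires a far stricter threshold, which in turn costs recall.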

Governance is part of the architecture, not an afterthought

Production-grade facial recognition (FR) systems treat governance as a first-class component:

Watch-list lifecycle: who can add, remove and review entries, with an audit trail for every change.
Retention windows for each data class (raw frames, embeddings, alert events) tuned to policy.
Role-based access tied to operational roles, not platform roles — so a site manager sees only their site, not the chain.
Decision logs that record the inputs, model version and threshold for any consequential alert.
Documented hand-off to investigation workflow with chain-of-custody preserved.
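As one way to picture these components in code, the sketch below models per-class retention windows, a decision log entry and a site-scoped visibility check. Every name, field and retention value here is a hypothetical placeholder; actual windows and fields must come from your own policy and legal review.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per data class; set these from policy.
RETENTION = {
    "raw_frame": timedelta(hours=24),
    "embedding": timedelta(days=30),
    "alert_event": timedelta(days=365),
}


def is_expired(data_class: str, created_at: datetime, now: datetime) -> bool:
    """True when a record has outlived the window for its data class."""
    return now - created_at > RETENTION[data_class]


@dataclass(frozen=True)
class DecisionLogEntry:
    """Immutable record of what produced a consequential alert decision."""
    site_id: str
    watchlist_entry_id: str
    frame_sha256: str   # hash of the input frame, not the frame itself
    model_version: str
    threshold: float
    similarity: float
    alerted: bool
    decided_at: str     # ISO 8601 timestamp


def log_decision(frame_bytes: bytes, site_id: str, watchlist_entry_id: str,
                 model_version: str, threshold: float, similarity: float,
                 now: datetime) -> DecisionLogEntry:
    """Build a log entry capturing inputs, model version and threshold."""
    return DecisionLogEntry(
        site_id=site_id,
        watchlist_entry_id=watchlist_entry_id,
        frame_sha256=hashlib.sha256(frame_bytes).hexdigest(),
        model_version=model_version,
        threshold=threshold,
        similarity=similarity,
        alerted=similarity >= threshold,
        decided_at=now.isoformat(),
    )


def visible_to(entry: DecisionLogEntry, user_site_ids: set) -> bool:
    """Operational-role check: a user sees only entries for their sites."""
    return entry.site_id in user_site_ids


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    entry = log_decision(b"example-frame", "site-12", "wl-7",
                         "embedder-v3", 0.62, 0.71, now)
    print(entry)
    print(visible_to(entry, {"site-12"}))
```

Storing a hash of the frame rather than the frame itself lets the decision log outlive the raw-frame retention window while still supporting chain-of-custody checks against any frame that was lawfully preserved.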

How to keep accuracy from rotting in production

Periodic accuracy audits against ground truth, per site — not per dataset.
Drift monitoring on the embedding distribution, which gives an early signal that something has changed, often before users complain.
A clean process for retraining or threshold-tuning, with rollback.
Vendor diversification on the model layer where the architecture allows — reduces the blast radius of any one model regression.
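Drift monitoring on the embedding distribution can take many forms. The sketch below uses one deliberately simple proxy: the cosine distance between the mean embedding of a baseline window and that of a recent window. The choice of metric, the function names and the 0.05 alert threshold are all assumptions for illustration; a production monitor would likely track richer statistics per site.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def centroid(embeddings: Sequence[Vector]) -> list:
    """Element-wise mean of a non-empty batch of equal-length embeddings."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("embeddings must share one dimension")
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def drift_score(baseline: Sequence[Vector], current: Sequence[Vector]) -> float:
    """Centroid shift between a baseline window and a recent window."""
    return cosine_distance(centroid(baseline), centroid(current))


def has_drifted(baseline: Sequence[Vector], current: Sequence[Vector],
                alert_threshold: float = 0.05) -> bool:  # assumed threshold
    """Flag the window for review when the centroid shift exceeds the threshold."""
    return drift_score(baseline, current) > alert_threshold


if __name__ == "__main__":
    baseline = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]
    recent = [[0.5, 0.5, 0.2], [0.4, 0.6, 0.1]]
    print(drift_score(baseline, recent), has_drifted(baseline, recent))
```

A centroid shift will not catch every change (for example, a distribution that spreads out around the same mean), so it works best as a cheap first alarm that triggers the per-site accuracy audit rather than as a replacement for it.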

Where to start

Start with the use case, not the technology. Define the alert that matters, define the false-positive cost of that alert, scope the watch-list for that alert, and pick the smallest architecture that delivers the four constraints together. Most FR programs that failed did so not because the technology did not work, but because the use case was never defined with enough discipline.
