More than 70% of enterprise video analytics deployments underperform — not because the detection technology fails, but because of operational and architectural gaps between the platform and the security workflows it is meant to support (IFSEC Global, 2025). According to Axis Intelligence (2025), 67% of enterprise deployments fail to meet basic performance expectations.
That gap is not a product problem. It is an architecture and process problem. Most vendor evaluation frameworks focus on AI capability, detection accuracy benchmarks, and feature lists. The failures happen upstream of all of that: in how the event bus is structured, whether analytics decisions survive a network outage, and whether the operational workflow exists to act on what the system detects.
This guide covers what most vendor materials do not: the architectural questions that determine whether a video analytics deployment succeeds, the operational gaps that cause the majority to fall short, and a structured framework for evaluation.
A video analytics platform is an AI-powered software solution that processes raw video feeds to extract structured, actionable data. It enables security teams to automate threat detection, streamline operational workflows, and conduct rapid forensic searches, transforming standard video surveillance into proactive, enterprise-grade intelligence.
A video analytics platform is software that processes video feeds to extract structured, actionable data. Where a CCTV system records what happens, a video analytics platform analyses it — detecting objects, classifying behaviours, generating time-stamped events, and triggering automated responses or human escalation workflows.
The technology operates across three categories of use: security (intrusion detection, access control verification, perimeter monitoring), operational intelligence (occupancy counting, queue management, asset tracking), and compliance (PPE detection, restricted zone enforcement, incident documentation).
The global video analytics market was valued at approximately USD 12.39 billion in 2025 and is projected to reach USD 14.65–15.04 billion in 2026. At a compound annual growth rate of 22–23%, forecasts put it at USD 33–41 billion by 2030–2031 (MarketsandMarkets; Mordor Intelligence). Government remains the largest vertical by market share; enterprise adoption is accelerating across critical infrastructure, healthcare, and transport.
The most persistent misconception among first-time buyers is that AI video analytics determines intent. Buyers expect the system to answer: "Is that person stealing?" or "Is that behaviour threatening?"
It does not work that way. AI video analytics detects patterns and objects: "person in a restricted zone after 22:00," "vehicle present for more than 15 minutes in a no-stopping area," "individual matching a clothing description entering via Door 3." The determination of significance — whether a detected event constitutes a threat, a policy violation, or a false positive — happens in the Rule Engine that operators build on top of the AI's outputs.
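The split between detection and significance can be sketched in a few lines. The example below is a minimal illustration, not any vendor's rule engine: the `Detection` schema, the rule names, and the 06:00 end of the overnight window are all assumptions made for the sketch. The model emits facts; the rules decide which facts matter.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Callable, List


@dataclass(frozen=True)
class Detection:
    """One AI output: what was seen, where, and when (hypothetical schema)."""
    label: str                 # e.g. "person", "vehicle"
    zone: str                  # e.g. "restricted", "no_stopping"
    timestamp: datetime
    dwell_seconds: float = 0.0


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Detection], bool]


def after_hours(ts: datetime, start: time = time(22, 0), end: time = time(6, 0)) -> bool:
    """True inside an overnight window that wraps past midnight.

    The 22:00 start comes from the text; the 06:00 end is an assumption.
    """
    t = ts.time()
    return t >= start or t < end


# Operator-authored rules layered on top of the model's outputs.
RULES: List[Rule] = [
    Rule(
        "person_in_restricted_zone_after_22",
        lambda d: d.label == "person" and d.zone == "restricted" and after_hours(d.timestamp),
    ),
    Rule(
        "vehicle_overstay_no_stopping",
        lambda d: d.label == "vehicle" and d.zone == "no_stopping" and d.dwell_seconds > 15 * 60,
    ),
]


def evaluate(detection: Detection, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of every rule the detection satisfies."""
    return [rule.name for rule in rules if rule.matches(detection)]
```

The same "person in restricted zone" detection produces an alert at 23:30 and nothing at 14:00. The model's output is identical in both cases; only the rule changes the outcome.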
This distinction matters for deployment planning. Organisations that configure a platform expecting the AI to make judgement calls, and then build no escalation workflow on top of it, are effectively running an expensive log file. The operational intelligence is in the rules, not the model.
Evaluating a video analytics platform requires looking at more than cameras, servers, and detection algorithms. The full architecture spans four layers, and the failure modes that account for the majority of underperforming deployments are concentrated in the layers that most procurement processes never examine.
The standard vendor evaluation interrogates Layers 1 and 2: what cameras does it support, what does the detection accuracy look like, edge or cloud? Enterprise deployments succeed or fail at Layers 3 and 4.
The hardware layer includes cameras, cabling, NVRs/DVRs, video management software, and storage. One factor is consistently underestimated at this layer: camera placement is an analytics pre-condition, not a retrofit assumption.
Existing surveillance cameras are typically mounted high and wide to maximise general coverage. Precision analytics applications — licence plate recognition, facial recognition, queue measurement — require specific angles, focal lengths, and lighting conditions that general surveillance installations do not provide. The algorithm cannot see what the lens cannot resolve. Retrofit deployments that skip a camera placement audit at the specification stage routinely discover this problem six months into deployment, at significantly higher remediation cost.
Storage is also underestimated. A single 4K camera running H.265 encoding at 6 Mbps generates approximately 65 GB of data per day. A 100-camera fleet with 30-day retention requires roughly 195 TB of storage — a figure that frequently exceeds the cost of the AI detection system itself. Storage planning is a first-layer requirement, not a procurement afterthought.
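The storage figures above follow from simple arithmetic, sketched here so the calculation can be rerun for other fleets. The sketch assumes constant-bitrate, continuous recording and decimal units (1 GB = 10^9 bytes), and it ignores RAID or redundancy overhead; the function names are illustrative.

```python
def daily_gb_per_camera(bitrate_mbps: float) -> float:
    """Convert a constant stream bitrate (megabits/s) to decimal gigabytes per day."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8
    return bytes_per_second * 86_400 / 1e9


def fleet_storage_tb(cameras: int, bitrate_mbps: float, retention_days: int) -> float:
    """Raw storage in decimal terabytes for a fleet, before any redundancy overhead."""
    return cameras * retention_days * daily_gb_per_camera(bitrate_mbps) / 1000


per_camera_gb = daily_gb_per_camera(6)        # about 64.8 GB per camera per day
fleet_tb = fleet_storage_tb(100, 6, 30)       # about 194.4 TB for 100 cameras, 30 days
```

The unrounded result is roughly 194.4 TB; rounding the per-camera figure to 65 GB first gives the 195 TB quoted above. Real fleets using variable bitrate or motion-triggered recording will land below this ceiling.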
Processing location determines latency, bandwidth, data sovereignty, and WAN dependency. Edge processing keeps computation local: sub-100ms response times, no internet dependency, and video that stays on-site. Cloud processing offers scalable compute and multi-site access but requires consistent WAN connectivity and introduces data residency considerations. Hybrid architectures run real-time detection at the edge and route management, reporting, and long-tail forensic analytics to the cloud. The core Layer 2 decision, where computation happens, determines what fails when connectivity is lost. The dedicated section below covers this in full.
The event bus is the architectural layer that determines whether a video analytics platform genuinely integrates with physical access control, sensor inputs, and third-party systems — or simply sits alongside them.
Adding analytics to an existing VMS server introduces the Metadata Tax: the CPU and GPU load required to process analytics metadata competes with the VMS's own processing requirements. At scale, the VMS starts dropping frames or the database locks. This is the most common operational surprise in retrofit deployments, and it is an architecture problem, not a hardware spec problem. Upgrading the existing server addresses only the symptom. The correct resolution is to offload analytics processing entirely onto a purpose-built analytics server: dedicated hardware with a GPU designed specifically for deep learning inference, working in conjunction with Wavestore VMS, so that analytics compute stays physically separate from VMS recording operations.
The top layer is where the majority of deployment value is either captured or lost. Calibration, escalation protocol design, forensic investigation workflow, and post-incident analytics are Layer 4 concerns, yet they are routinely treated as configuration tasks rather than architecture decisions.
60% of security teams lack the analytics training to operate effectively at this layer (Agrex AI, 2026). Deploying analytics without a defined Detect → Verify → Dispatch chain does not produce a safety system. It produces an expensive source of logged events that no one acts on.
A video management system manages video: recording, playback, device configuration, live monitoring, and user access. Modern VMS platforms support analytics through plugins and API integrations — but this is Layer 3 middleware, not native event-level integration. The distinction is not semantic. It is structural.
When a third-party analytics bridge fails to synchronise with the VMS, the result is operationally damaging: the analytics platform generates an alert, but the VMS has no corresponding video. Two systems see different realities. The operator receives a notification with no footage to verify it.
At scale, the latency problem compounds. Middleware API bridges between VMS and access control platforms become operationally visible at approximately 50 cameras, or when three or more operators are active concurrently. At that threshold, "Door Forced" alarms arrive 2–3 seconds after the event occurred. For access control in a life-safety environment, a 2–3 second alert lag makes real-time response impossible.
The cross-system investigation cost is equally significant. In middleware-integrated environments, operators toggle between three to five different applications during an incident review. A 10-minute video review becomes a one-hour manual exercise because events are logged separately in each system with no cross-camera event-based linking.
This is also where the integrator questions diverge from the marketing materials. Experienced systems integrators ask: "How does the database handle a schema update during a version jump?" and "What is the actual packet overhead of the VMS-to-controller heartbeat on a shared VLAN?" Marketing talks about integration depth; integrators care about what breaks and when.
Edge analytics processes video and makes access decisions locally — at or near the camera, on a local appliance, or within an edge controller — without dependency on a cloud connection or WAN link.
For life-safety applications, edge independence is not a preference. It is an architectural requirement. Consider the operational reality: a logistics hub experiences a fibre cut that takes the site offline for six hours.
For organisations where WAN outage is a "when" rather than "if" — government facilities, remote sites, transport hubs — the audit trail integrity question is a procurement-grade requirement, not a technical footnote.
At the edge controller level, a sovereign decision architecture holds the full local credential database in memory: up to 2,000,000 cardholder records on the WS-MP4502, and 240,000–600,000 on smaller controller models. It makes access decisions independently of cloud connectivity, buffers up to 500,000 transaction events locally across all controller models, and pushes the complete audit trail to the central server with original timestamps intact when the WAN restores. A super capacitor provides 10 hours of memory and clock backup; an optional lithium battery extends this further. Within those buffer and backup limits, the forensic integrity of the audit trail is preserved through the outage.
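The store-and-forward behaviour described here can be illustrated with a short sketch. This is not controller firmware: the `EdgeBuffer` class, the `Transaction` fields, and the choice to refuse new events when the buffer is full are assumptions made for the example. The documented behaviour of a real controller at capacity should be confirmed with the vendor.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass(frozen=True)
class Transaction:
    event_time: float   # stamped locally at decision time, never rewritten on upload
    door: str
    granted: bool


class EdgeBuffer:
    """Hypothetical local buffer that holds transactions while the WAN is down."""

    def __init__(self, capacity: int = 500_000) -> None:
        self._events: Deque[Transaction] = deque()
        self._capacity = capacity

    def __len__(self) -> int:
        return len(self._events)

    def record(self, tx: Transaction) -> bool:
        """Store a transaction; return False if the buffer is full (assumed policy)."""
        if len(self._events) >= self._capacity:
            return False
        self._events.append(tx)
        return True

    def flush(self, send: Callable[[Transaction], bool]) -> int:
        """Upload oldest-first once the WAN is back; stop at the first failed send.

        An event is removed only after the server confirms receipt, so a
        connection that drops mid-flush loses nothing.
        """
        sent = 0
        while self._events:
            if not send(self._events[0]):
                break
            self._events.popleft()
            sent += 1
        return sent
```

Two design points carry the forensic value: the timestamp is fixed when the decision is made rather than when the event is uploaded, and events leave the buffer only after a confirmed send.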
Live alerting at the edge targets a p99 latency of under 250ms. Once latency crosses 500ms, incident routing is operationally broken — even if the dashboard appears functional.
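A latency target like this is only useful if it is measured. The sketch below computes a nearest-rank p99 from alert latency samples and compares it with the two thresholds in the text. Applying the 500ms figure to the p99 value, and the "degraded" label for the band in between, are assumptions made for illustration.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of data at or below it."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def classify_alert_latency(
    samples_ms: Sequence[float],
    target_ms: float = 250.0,   # p99 target from the text
    broken_ms: float = 500.0,   # threshold from the text, applied to p99 by assumption
) -> str:
    """Label a window of alert latencies against the target and failure thresholds."""
    p99 = percentile_nearest_rank(samples_ms, 99)
    if p99 < target_ms:
        return "within_target"
    if p99 <= broken_ms:
        return "degraded"
    return "broken"
```

Measuring the tail rather than the average matters here: a system can show a healthy mean while the slowest one percent of alerts, often the ones raised under load, arrive too late to act on.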
Cloud deployment offers genuine advantages: scalable compute without CapEx, multi-site access through a single interface, and processing power for analytics that would be prohibitive to run at the edge (cross-site forensic search, multi-camera re-identification).
The structural constraint is WAN dependency. Cloud-only platforms require consistent, low-latency connectivity for real-time video processing. High-bandwidth consumption — streaming multiple camera feeds to a remote processing environment — saturates internet connections at multi-camera sites. For latency-critical applications (intrusion response, access control at entry points, slip-and-fall detection), cloud-only processing introduces risk that edge architecture eliminates.
The relevant question for procurement is not "is cloud better?" — it is "which system functions require cloud connectivity, and what is the documented behaviour when that connectivity is unavailable?"
Hybrid architecture uses each layer for what it does best. Edge handles the time-critical path: detection, access decisions, local alarms, and audit trail integrity. Cloud handles the management and intelligence layer: metadata synchronisation, cross-site analytics, reporting, and remote access.
This is the dominant enterprise adoption model for 2025–2026. Sub-100ms response requirements for industrial inspection, healthcare monitoring, and life-safety access control are only achievable at the edge; cloud processing alone cannot satisfy this threshold. Long-tail analytics — multi-site forensic search, behavioural trend analysis across locations — benefit from centralised cloud processing where latency is not a constraint.
Over 70% of analytics deployments underperform due to operational gaps rather than technology failures (IFSEC Global, 2025). The detection accuracy may be exactly as specified. The operational value is still not captured.
Factory default sensitivity settings are configured for demonstration conditions: consistent lighting, controlled environments, predictable subject movement. Real deployments have glare, shadows, wind-triggered motion, seasonal lighting variation, and unpredictable traffic patterns.
The consequences are measurable. Operators begin ignoring alerts within two weeks of consistent false alarms (Security Industry Association research). Once alert fatigue sets in, the average response delay climbs to 45 minutes (Agrex AI, 2026). At that point, the system is generating events that no one acts on.
The calibration reality: initial detection can be configured in approximately one hour. Operational accuracy — 98%+ with tuned zones and calibrated sensitivity thresholds — requires two full weeks of environmental cycles to account for lighting changes, weather variation, and traffic pattern differences across hours and days. "Anyone promising 'instant AI' is selling a support nightmare." Deployments that skip this calibration period carry the cost in false positives, alert fatigue, and ultimately in a security operations team that has stopped trusting the system.
Additionally, an average of 10–15% of an enterprise camera fleet is degraded or offline at any given time (industry data, Agrex AI). Without dedicated camera health monitoring, these failures are typically discovered only after a critical incident — when the footage required for an investigation does not exist.
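Camera health monitoring can start from something as simple as tracking when each camera last delivered a frame. The following sketch assumes a hypothetical mapping of camera ID to last-frame time and a 60-second silence threshold; both are illustrative choices, not values from any product.

```python
from typing import Dict, List


def stale_cameras(
    last_frame_at: Dict[str, float],
    now: float,
    max_silence_s: float = 60.0,   # illustrative threshold
) -> List[str]:
    """Return IDs of cameras that have not delivered a frame within the threshold."""
    return sorted(
        cam for cam, seen in last_frame_at.items() if now - seen > max_silence_s
    )


def offline_fraction(
    last_frame_at: Dict[str, float],
    now: float,
    max_silence_s: float = 60.0,
) -> float:
    """Share of the fleet currently silent; 0.0 for an empty fleet."""
    if not last_frame_at:
        return 0.0
    return len(stale_cameras(last_frame_at, now, max_silence_s)) / len(last_frame_at)
```

Run on a schedule, a check like this surfaces a dead camera within minutes rather than at the moment its footage is requested. Detecting degraded image quality (blur, obstruction, misalignment) needs more than a heartbeat and is out of scope for this sketch.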
Detection without a defined response chain is not a safety system. It is a log file with a dashboard.
The operational requirement is a Detect → Verify → Dispatch chain with named owners at each stage: who receives the alert, who verifies the event against video, who dispatches a response, and within what time threshold. Organisations that deploy analytics without designing this chain typically find that alerts route to a shared inbox, get acknowledged without verification, and close without action.
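One way to make the chain concrete is to write it down as data: each stage with a named owner and a time limit, plus a check that flags stages that were skipped or ran late. The role names and time limits below are hypothetical placeholders for whatever an organisation actually agrees.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    owner: str          # a named role, not a shared inbox
    max_seconds: float  # time allowed for this stage


# Hypothetical owners and limits; real values come from the site's response plan.
CHAIN: List[Stage] = [
    Stage("detect", "soc_operator", 30),
    Stage("verify", "shift_supervisor", 120),
    Stage("dispatch", "response_team_lead", 300),
]


def breached_stages(
    elapsed: Dict[str, Optional[float]],
    chain: List[Stage] = CHAIN,
) -> List[str]:
    """Return 'stage:owner' for every stage that never completed or exceeded its limit.

    A missing or None entry means the stage was never completed.
    """
    breaches = []
    for stage in chain:
        took = elapsed.get(stage.name)
        if took is None or took > stage.max_seconds:
            breaches.append(f"{stage.name}:{stage.owner}")
    return breaches
```

An alert that is acknowledged but never verified shows up here as a breach against a named owner, which is the accountability a shared inbox cannot provide.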
Adding analytics processing to an existing VMS server introduces CPU and GPU load competition. At modest camera counts, the competition is manageable. At scale — typically above 50 cameras, or in environments with high-motion scenes requiring continuous processing — the VMS begins dropping frames or the database locks under combined load.
This is an architecture problem. The correct resolution is not a faster general-purpose server — it is acquiring a purpose-built analytics server: dedicated hardware with a GPU designed specifically for deep learning inference, which streams video directly from IP cameras or the VMS via RTSP and processes analytics completely independently of VMS operations. Wavestore's purpose-built analytics servers are available in configurations from 8 channels up to 115 channels, with optional forensic search capability, and are designed to work in conjunction with Wavestore VMS — keeping the analytics processing layer architecturally separate from the VMS recording layer regardless of scale.
Organisations with genuinely integrated security systems resolve incidents 40% faster than those using standalone tools (ASIS International, 2024 State of Security Convergence). The critical distinction is between native event-level integration and dashboard overlay.
Dashboard overlay means an operator can see a camera feed and an access event on the same screen. Native event-level integration means a third-party analytics detection can trigger an access rule, an automated response, and a cross-system audit log entry — without manual operator intervention.
The procurement question that separates the two: "Can the system process a third-party analytics event as a native trigger on the shared event bus, or is it just a bookmark in the video?"
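The difference between a native trigger and a bookmark can be shown with a toy publish/subscribe bus. This is a generic in-process sketch, not Wavestore's event bus or any vendor's API; the event type names and handlers are invented for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber; return how many acted on it."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


locked_doors: List[str] = []
audit_log: List[str] = []

bus = EventBus()
# Access control and audit logging both subscribe to the analytics event type.
bus.subscribe("analytics.intrusion", lambda e: locked_doors.append(e["door"]))
bus.subscribe(
    "analytics.intrusion",
    lambda e: audit_log.append(f"{e['time']} {e['type']} {e['door']}"),
)

# A native trigger: one detection drives a door action and an audit entry.
acted = bus.publish({"type": "analytics.intrusion", "door": "Door 3", "time": "22:14:05"})
# No subscribers: the event is recorded nowhere and triggers nothing.
ignored = bus.publish({"type": "analytics.loitering", "door": "Door 1", "time": "22:20:00"})
```

In this model, a bookmark-only integration behaves like the second publish call: the event exists, but no other system can act on it without an operator in the loop.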
Manual video scrubbing is the default post-incident workflow at most organisations: take the approximate time of an incident, pull the relevant camera or cameras, review footage in real time or fast-forward until the relevant moment is located. At multi-camera sites, with multiple suspects or vehicles, this process is measured in hours.
AI forensic search reduces this to minutes: search by attribute (clothing colour, vehicle type, direction of travel, time window, zone), reconstruct a multi-camera timeline, and surface the relevant frames without manual scrubbing.
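Attribute search works because the analytics layer has already turned video into indexed metadata. The sketch below filters a small in-memory list of sightings and returns a time-ordered, multi-camera timeline. The `Sighting` fields and the sample data are hypothetical; a production system would query an indexed database, not a Python list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Sighting:
    """One indexed detection (hypothetical metadata schema)."""
    camera: str
    timestamp: float   # seconds since an arbitrary reference point
    object_type: str
    colour: str
    zone: str


def search(
    index: Iterable[Sighting],
    *,
    object_type: Optional[str] = None,
    colour: Optional[str] = None,
    zone: Optional[str] = None,
    start: Optional[float] = None,
    end: Optional[float] = None,
) -> List[Sighting]:
    """Return sightings matching every supplied attribute, ordered by time."""

    def keep(s: Sighting) -> bool:
        return (
            (object_type is None or s.object_type == object_type)
            and (colour is None or s.colour == colour)
            and (zone is None or s.zone == zone)
            and (start is None or s.timestamp >= start)
            and (end is None or s.timestamp <= end)
        )

    return sorted((s for s in index if keep(s)), key=lambda s: s.timestamp)


INDEX = [
    Sighting("cam-07", 130.0, "person", "red", "loading_bay"),
    Sighting("cam-02", 100.0, "person", "red", "gate"),
    Sighting("cam-02", 105.0, "vehicle", "white", "gate"),
    Sighting("cam-11", 400.0, "person", "red", "car_park"),
    Sighting("cam-05", 115.0, "person", "blue", "gate"),
]

timeline = search(INDEX, object_type="person", colour="red", start=90.0, end=200.0)
```

The query returns the red-clothed person at the gate and then at the loading bay, in order, without anyone reviewing footage from the other cameras.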
The value of this capability depends entirely on the integrity of the underlying storage architecture. A forensic-grade file system writes data atomically — each block is either committed in full or not at all, with no partial writes and no check-disk phase required after a power interruption. Every frame is individually indexed, making playback from any point in a large dataset effectively instantaneous. In the event of a power cut, recording continues to the exact second of power loss and resumes immediately on restoration. The worst-case degradation scenario — a block-level disk fault — results in the loss of a few seconds of footage, not a corrupted archive. This is the storage architecture that makes AI forensic search operationally reliable at scale, not just technically possible.
Section 889 of the FY2019 National Defense Authorization Act (NDAA) restricts any entity receiving federal funding, loans, or grants from procuring or using covered telecommunications equipment. Restricted manufacturers include Hikvision, Dahua, Huawei, Hytera, ZTE, and their subsidiaries and affiliates. Analogous regulations are being adopted in the UK, Canada, and across the EU.
The procurement error that repeatedly surfaces in government and critical infrastructure evaluations: platform-level NDAA compliance does not guarantee supply chain compliance. A VMS platform that is NDAA-compliant at the software layer may have a certified hardware ecosystem that includes cameras from restricted manufacturers. Compliance documentation must cover every layer — VMS, access control, and hardware — not just the primary software component.
When requesting compliance verification, ask for supply chain documentation, not a compliance statement. A statement asserts compliance; documentation demonstrates it.
Cloud-first platforms frequently cannot guarantee that video data remains within a specific jurisdiction. For government agencies, critical infrastructure operators, and healthcare organisations operating under data residency requirements, this is a disqualifying characteristic — not a configuration preference.
Buyers in regulated environments should require: hardened operating system documentation, local-first storage architecture proof, and direct access to ISO 27001, ISO 27017, ISO 27018, and SOC 2 audit reports. Vendor assurances summarising compliance posture are not equivalent to the underlying audit documentation. Wavestore's cybersecurity architecture documentation covers OS hardening and local-first storage architecture.
The three most widely deployed enterprise VMS platforms run on Windows Server. This is rarely listed as a procurement criterion. It should be.
Windows-based security infrastructure carries the full commodity exploit attack surface: ransomware targeting Windows deployment patterns, zero-day exploits against the Windows Server attack surface, and IoT-to-IT lateral movement vectors — a documented and escalating threat in 2025–2026 threat intelligence (CM3 Building Solutions, 2026; Perkins Coie, 2025). Physical security devices are documented entry points for IT network compromise. The platform protecting the building can itself become a network security liability.
Hardened Linux reduces the attack surface by architectural design — not through patch frequency or endpoint protection, but through OS choice. Specific hardening measures include:
This is a security architecture decision, not a configuration option. It belongs in the RFP. See Wavestore's VMS cybersecurity and hardening overview for full methodology.
Standard evaluation criteria: detection accuracy, camera and protocol compatibility (ONVIF, RTSP), edge and cloud deployment options, integration compatibility list, licensing model, and vendor support SLA.
These are necessary criteria. They are not sufficient. They interrogate Layers 1 and 2 of the Physical Security Analytics Stack. The questions that determine whether a deployment succeeds are at Layers 3 and 4 — and they are almost never included in standard RFPs.
Per-operator seat pricing penalises team growth: every additional security operator or investigator added to the platform represents an incremental licensing cost. This model creates a structural tension between operational best practice (broad access to the forensic and analytics tools) and cost control.
Per-module licensing adds a line item for every capability beyond the base platform: analytics, access control, forensic search, multi-site management. In practice, the total cost of a fully-capable deployment can differ significantly from the initial platform cost.
A per-camera, per-reader model — with all capabilities included and no seat costs — scales predictably. As headcount grows, licensing costs do not. For a 100-door Mercury-based site, the logical migration (database import and controller configuration) can be completed over a weekend. Because the architecture is hardware-independent and supports existing Mercury controllers natively, there is no rip-and-replace of existing hardware boards. The migration effort is approximately 90% software configuration and 10% validation.
What is the difference between a video analytics platform and a VMS?
A VMS (video management system) manages video: recording, playback, live monitoring, and device configuration. A video analytics platform processes video to extract structured intelligence — detecting objects, recognising behaviours, and generating actionable event data. In modern deployments, the distinction is increasingly architectural: a platform with a native shared event bus integrates video and access control as first-class data types, while a VMS with added analytics relies on a middleware layer that introduces latency, sync dependencies, and integration failure risk.
How long does it take to deploy a video analytics platform?
Initial detection can be configured in approximately one hour. Operational accuracy (98%+ with tuned detection zones and calibrated sensitivity thresholds) requires two weeks of environmental cycles to account for lighting variation, weather, and traffic pattern differences across hours and days. Deployments that claim instant accuracy typically require significant post-installation tuning or carry higher false positive rates that degrade operational value within weeks.
What happens to a video analytics platform during a network outage?
This depends entirely on the deployment architecture. Edge-deployed analytics with local edge controllers continue to process video, make access control decisions, and maintain a full audit trail with original timestamps — independently of WAN connectivity. Mercury-based controllers buffer up to 500,000 transaction events locally and hold up to 2,000,000 cardholder records on-device, with 10-hour super capacitor memory backup. The complete audit trail is pushed to the central server with original timestamps intact on WAN restoration. Cloud-dependent architectures lose real-time analytics capability during outages with no equivalent local buffer. For life-safety applications, WAN independence is an architectural requirement, not a preference.
What does NDAA compliance mean for a video analytics platform?
NDAA Section 889 restricts use of equipment from covered manufacturers (including Hikvision, Dahua, Huawei, Hytera, ZTE) by entities receiving federal funding. Compliance at the platform software layer does not guarantee compliance at the hardware layer if the vendor's certified camera ecosystem includes restricted manufacturers. Procurement teams should request supply chain documentation — not just a compliance declaration — covering VMS, access control, and hardware layers independently.
How do you reduce false positives in video analytics?
Factory default sensitivity settings generate high false positive rates in real-world environments. Operational accuracy requires: scene-specific calibration of detection zones, time-based sensitivity thresholds (separate profiles for peak hours, off-hours, and weekends), and a defined escalation workflow that prevents alert fatigue from degrading operator response. Research indicates operators typically begin dismissing alerts within two weeks of consistent false alarms (SIA). Expect a 14-day calibration period before production-grade accuracy is achievable.
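Time-based sensitivity profiles can be expressed as a small lookup. In the sketch below, the profile boundaries (weekdays 07:00 to 19:00 as peak) and the confidence thresholds are hypothetical values chosen for illustration; real values come out of the calibration period, not from a default.

```python
from datetime import datetime

# Hypothetical minimum model confidence required to raise an alert, per profile.
PROFILES = {
    "peak": 0.80,       # busy scenes: demand higher confidence to limit false alarms
    "off_hours": 0.60,  # quiet scenes: lower the bar so less is missed
    "weekend": 0.70,
}


def profile_for(ts: datetime) -> str:
    """Pick a sensitivity profile from day of week and hour (assumed boundaries)."""
    if ts.weekday() >= 5:          # Saturday or Sunday
        return "weekend"
    if 7 <= ts.hour < 19:
        return "peak"
    return "off_hours"


def should_alert(confidence: float, ts: datetime) -> bool:
    """Alert only if the detection clears the threshold for the active profile."""
    return confidence >= PROFILES[profile_for(ts)]
```

The same detection at 0.72 confidence is suppressed on a Monday morning and raised on a Monday night. Tuning these thresholds per scene over the calibration period is what moves a deployment from default sensitivity to production-grade accuracy.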
The architecture of a video analytics platform determines its operational reliability more than any capability listed in a feature comparison. Most procurement processes evaluate what a platform detects. The questions that determine whether a deployment succeeds or fails are architectural: whether analytics decisions survive a WAN outage, whether the event bus is native or middleware-dependent, and whether the operating system carries a commodity exploit attack surface into a life-safety environment.
These questions are almost never asked during procurement. The Physical Security Analytics Stack framework gives security leads, systems integrators, and procurement teams the structure to ask them systematically — and to evaluate platform architecture at Layers 3 and 4, where the deployment outcomes are decided.
For organisations at the evaluation stage, request a technical evaluation to review WAN failover specification and edge buffer capacity, observe native event bus integration versus middleware overlay in a live environment, and review OS hardening and patch validation documentation directly.

