
AI in Motion: Real-Time Smart City Intelligence at the Edge

Cities are becoming increasingly dynamic, connected, and data-driven. Yet many urban monitoring systems still rely on centralised infrastructure, manual video observation, or analytics that react too late. Traffic congestion, crowded public spaces, safety incidents, and environmental changes all unfold in real time — and city administrators need to understand what is happening as it happens.

Figure: Smart City AI Monitor hero image, showing a connected smart city powered by the Enclustra Pluto XZU20 SoM for edge AI applications.

Together with MakarenaLabs, Enclustra shows how this challenge can be addressed directly with edge AI. Our joint initiative, the Smart City AI Monitor, is a real-time digital twin simulation of an urban street environment, developed as a technology project. It shows what becomes possible when AI-based video analytics, embedded AI acceleration, and compact FPGA SoC-based edge computing come together in a single system.

In the Smart City AI Monitor, rendered video streams are processed live on the Enclustra Pluto XZU20 SoM with the Pluto ST11 Base Board. Hailo AI acceleration, enabled via H4Z integration on the Enclustra Pluto platform, powers concurrent multi-model edge inference pipelines, while MuseBox streamlines the deployment and orchestration of AI-powered video analytics for smart city safety and autonomous surveillance scenarios. The system transforms raw video into high-value semantic information: people counting, crowd analytics, traffic level monitoring, known individual recognition, event detection, and automated alerts for critical situations such as altercations, falls, or medical incidents.
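
As a minimal sketch of how per-frame scene labels might be turned into automated alerts, the example below fires an alert only once a critical label has persisted for several consecutive frames, which suppresses single-frame misclassifications. The label names, confidence threshold, and frame count are illustrative assumptions; none of this is the MuseBox or Hailo API.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical label names; the article does not list the real label set.
CRITICAL_LABELS = {"altercation", "fall", "medical_incident"}


@dataclass
class AlertDebouncer:
    """Fire an alert once a critical label persists for `min_frames` frames."""

    min_confidence: float = 0.6  # assumed confidence threshold
    min_frames: int = 3          # assumed persistence window, in frames
    _label: Optional[str] = field(default=None, init=False)
    _streak: int = field(default=0, init=False)

    def update(self, label: str, confidence: float) -> Optional[str]:
        """Feed one per-frame result; return the label on the frame an alert fires."""
        critical = label in CRITICAL_LABELS and confidence >= self.min_confidence
        if not critical:
            # Any non-critical or low-confidence frame resets the streak.
            self._label, self._streak = None, 0
            return None
        if label == self._label:
            self._streak += 1
        else:
            self._label, self._streak = label, 1
        # Fire exactly once, on the frame where the streak reaches the window.
        return label if self._streak == self.min_frames else None
```

With the default settings, three consecutive "fall" frames produce one alert on the third frame, and further "fall" frames in the same streak stay silent until the streak is broken.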

The Challenge

Modern cities generate enormous amounts of visual information. Cameras are already present at intersections, public transport hubs, pedestrian areas, campuses, industrial sites, and critical infrastructure locations. However, turning these video streams into timely, reliable, and actionable intelligence remains difficult.

Traditional approaches often depend on sending data to central servers or the cloud. This creates latency, consumes bandwidth, and can increase infrastructure complexity. In safety-critical scenarios, even small delays matter: a fight, a fall, a blocked road, or a suspicious object must be detected quickly enough for operators to act before the situation escalates.
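
The bandwidth argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below compares streaming every camera to a central server with sending only event metadata from the edge. All figures (50 cameras, 4 Mbit/s per stream, 2 events per second, 500 bytes per event) are assumptions chosen for illustration, not measurements from the project.

```python
def uplink_mbps(cameras: int, per_camera_mbps: float) -> float:
    """Total uplink bandwidth if every camera streams video to a central server."""
    return cameras * per_camera_mbps


def metadata_mbps(cameras: int, events_per_second: float, bytes_per_event: int) -> float:
    """Total uplink bandwidth if each edge node sends only event metadata."""
    return cameras * events_per_second * bytes_per_event * 8 / 1_000_000


# Assumed figures, for illustration only.
video = uplink_mbps(cameras=50, per_camera_mbps=4.0)
meta = metadata_mbps(cameras=50, events_per_second=2.0, bytes_per_event=500)
print(f"video: {video:.1f} Mbit/s, metadata: {meta:.2f} Mbit/s")
# prints "video: 200.0 Mbit/s, metadata: 0.40 Mbit/s"
```

Under these assumptions the metadata-only uplink is several hundred times smaller than the video uplink, which is the effect the edge approach relies on.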

Validation is another key challenge. Urban incidents are not always frequent, predictable, or repeatable. A system may perform well in a controlled environment but fail when exposed to crowds, occlusions, weather changes, low light, or unusual traffic behaviour. For AI camera vendors, NVR vendors, system integrators, and city technology providers, the system must also remain compact, efficient, and suitable for low-latency edge deployment.

The Solution

The Smart City AI Monitor addresses these challenges with a complete edge AI project built around the Enclustra Pluto XZU20 SoM and Pluto ST11 Base Board. The solution combines high-performance edge computing with a digital twin of a city environment, where virtual roads, intersections, sidewalks, pedestrian areas, traffic lights, vehicles, people, surveillance cameras, and environmental variables such as traffic, air quality, and climate are reproduced in real time.

Using its ALOE framework, MakarenaLabs can generate realistic and repeatable scenarios, from regular traffic flow and pedestrian movement to congestion, assemblies, fights, falls, and other anomalous behaviours. Video streams from the configured virtual cameras are transmitted directly to the Pluto platform, where Hailo-accelerated AI performs real-time inference.

An important advantage of the ALOE digital twin is that it can support the training and validation of neural networks before deployment with real cameras. Although the simulated city may appear visually simplified, it contains structured information about objects, people, vehicles, camera views, and events. This allows the Smart City AI Monitor to generate realistic scenarios for object detection and scene classification without relying on hours of manually collected image streams. Once the neural networks are trained and validated in this controlled environment, the same edge AI pipeline can be applied to real smart city camera feeds, where the Pluto platform processes live video for AI detection, dashboard visualisation, and automated alerts.
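
Because a simulated scenario knows which event is happening in each frame, model output can be scored against that ground truth. The following is a minimal sketch of such a validation step, assuming the simulator can export one event label per frame. The label names and the per-frame format are hypothetical, and the function is not part of ALOE.

```python
from collections import Counter


def precision_recall(ground_truth: list[str], predicted: list[str], event: str) -> tuple[float, float]:
    """Per-frame precision and recall for a single event label."""
    if len(ground_truth) != len(predicted):
        raise ValueError("label lists must have the same length")
    counts = Counter()
    for truth, pred in zip(ground_truth, predicted):
        if pred == event and truth == event:
            counts["tp"] += 1  # event predicted and actually simulated
        elif pred == event:
            counts["fp"] += 1  # event predicted but not simulated
        elif truth == event:
            counts["fn"] += 1  # event simulated but missed
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical six-frame scenario: simulator labels vs. model predictions.
truth = ["normal", "fall", "fall", "normal", "fight", "fall"]
pred = ["normal", "fall", "normal", "fall", "fight", "fall"]
fall_precision, fall_recall = precision_recall(truth, pred, "fall")
```

In this toy scenario the model finds two of the three simulated "fall" frames and raises one false "fall", so precision and recall are both 2/3. Replaying the same scenario after a model change gives a directly comparable score, which is the practical benefit of repeatable simulation.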

Additionally, the system combines CNN-based face recognition with an indexed facial embedding database for known-individual identification, and Transformer-based zero-shot scene classification for contextual understanding of activities and anomalies from a single frame. This enables the system to recognise identities, interpret scenes, and trigger automated alerts without requiring task-specific retraining for every possible scenario. MuseBox provides the integrated video analytics layer, allowing the project to run multiple AI functions in the same pipeline.
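
Both techniques come down to comparing embedding vectors. The sketch below shows the general idea in plain Python, under these assumptions: embeddings are already computed by some model, identification uses cosine similarity with a fixed threshold, and zero-shot classification compares an image embedding with text-label embeddings in a shared space (a CLIP-style setup, which the article does not confirm). The linear scan stands in for a real index, and all names, thresholds, and vectors are illustrative.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(query: Sequence[float], database: dict, threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching known name, or None if no match reaches the threshold."""
    best_name, best_score = None, -1.0
    for name, embedding in database.items():  # linear scan; a real system would use an index
        score = cosine(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def zero_shot(image_embedding: Sequence[float], label_embeddings: dict, temperature: float = 0.1) -> dict:
    """Softmax over cosine similarities between one image and each candidate label."""
    logits = {label: cosine(image_embedding, emb) / temperature
              for label, emb in label_embeddings.items()}
    peak = max(logits.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}
```

Adding a new person or a new scene label in this scheme means adding one more embedding to a dictionary rather than retraining a classifier, which is why the approach suits scenarios that cannot all be enumerated in advance.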

Figure: Smart City AI Monitor dashboard showing live AI camera analysis, recognized subjects, scene classification, environmental data, and real-time inference on the Pluto platform.

Why Are Enclustra Pluto XZU20 and ST11 the Perfect Fit?

Smart city AI requires compact, low-power, and deterministic edge computing. Built around the AMD Zynq™ UltraScale+™ MPSoC, the Enclustra Pluto XZU20 SoM combines embedded processing with FPGA fabric, enabling real-time image and data processing close to the point of capture. For AI cameras, edge vision systems, and mobile sensing systems, size, power, and latency are decisive. The Pluto XZU20 provides the embedded foundation for portable AI systems where raw video must be transformed into intelligence locally, without relying on continuous cloud connectivity.

The Pluto ST11 Base Board complements the SoM by providing a practical development and integration platform for proof-of-concept projects, prototyping, and future product-oriented evaluations. With interfaces such as Gigabit Ethernet, USB 3.0, M.2, Mini DisplayPort, and multiple MIPI interfaces, the Pluto ST11 is well suited to video-centric edge AI design, embedded vision applications, and camera-oriented system integration.

Together, the Pluto XZU20 and Pluto ST11 give developers a compact and flexible platform to evaluate smart camera workloads, integrate AI acceleration, connect video interfaces, and demonstrate real-world use cases quickly. For camera vendors and integrators, the project serves as a reference architecture that can help de-risk new product lines and shorten the path from idea to validated edge AI architecture.

What Are the Target Applications?

  • Smart city safety: Public safety teams can use structured alerts for fights, falls, suspicious objects, crowd formation, and other anomalies instead of watching dozens of live feeds manually.
  • Smart infrastructure and traffic intelligence: The Smart City AI Monitor can detect vehicles, evaluate traffic density, identify congestion indicators, and support better decisions for road management, public transport planning, and incident response.
  • Autonomous surveillance: The same architecture is relevant for autonomous surveillance scenarios in campuses, industrial sites, transport hubs, stadiums, airports, logistics facilities, and critical infrastructure.
  • Mobile and drone-based monitoring: Mobile and airborne platforms open up additional use cases, including temporary event surveillance, emergency response, infrastructure inspection, and multi-angle situational awareness.
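
As an illustration of the traffic intelligence use case above, the following minimal sketch maps per-frame vehicle counts from an object detector to a coarse traffic level using a moving average. The window size, the thresholds, and the level names are assumptions made for this example, not values from the project.

```python
from collections import deque


class TrafficLevelEstimator:
    """Smooth per-frame vehicle counts and map them to a coarse traffic level."""

    def __init__(self, window: int = 30, thresholds: tuple = (5, 15)):
        # Keep only the most recent `window` counts (assumed window size).
        self.counts = deque(maxlen=window)
        # Assumed boundaries: mean below `low` is "low", below `high` is "medium".
        self.low, self.high = thresholds

    def update(self, vehicle_count: int) -> str:
        """Add the vehicle count for one frame and return the current level."""
        self.counts.append(vehicle_count)
        mean = sum(self.counts) / len(self.counts)
        if mean < self.low:
            return "low"
        if mean < self.high:
            return "medium"
        return "high"
```

Averaging over a window keeps the reported level stable when individual detections flicker between frames. A deployed system would likely calibrate the thresholds per camera view, since the same count means different things at a side street and at a main intersection.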

Figure: Drone-view perspective of the Smart City AI Monitor, showing traffic intelligence, object detection, and scene classification in a simulated urban environment.

Results and Outlook

The Smart City AI Monitor demonstrates how embedded AI can support safer and more responsive urban environments. By combining the ALOE digital twin, Hailo-accelerated inference, H4Z, MuseBox video analytics, and the Enclustra Pluto XZU20 with the Pluto ST11 Base Board, MakarenaLabs and Enclustra show how AI vision workloads can move from simulated testing environments to real-world smart camera deployments.

The project highlights how edge AI can process live video close to the data source, enabling real-time traffic monitoring, event detection, and automated alerts while reducing latency and infrastructure complexity.

Are you planning an advanced AI camera or embedded vision solution? Contact our team to learn how our SoMs and base boards can help accelerate your next smart vision project.