Architecture | NVIDIA AI Hub

🏗️ Five-Layer Architecture

Read bottom-up: data flows from cameras → edge AI → streaming → fog AI → central AI → dashboards

📡 Layer 0 — Data Sources | Cameras · IoT · GPS · SCATS Traffic Systems

📷

HD/4K Traffic Cameras

18,000+ units. RTSP streaming. H.264/H.265 encoded. 25–30 fps.

🌦️

IoT Weather Sensors

Temperature, rain, flood, road condition sensors. MQTT protocol. Low-bandwidth telemetry.

🛰️

GPS Telemetry

Buses, trains, fleet vehicles. Real-time route tracking and ETA data streams.

🚦

SCATS Traffic Systems

Loop detectors, radar, signal controllers. Vehicle counts per lane per cycle.

🛣️

Road Sensors

LiDAR, radar, speed sensors. Congestion detectors at key intersections.

🔐

BlueField-3 DPU

Hardware security at network edge. MACsec, IPSec, Zero Trust enforcement.

⬇ RTSP · MQTT · H.264/H.265 → Jetson AGX Orin Clusters

🖥️ Layer 1 — Edge AI | NVIDIA Jetson AGX Orin · Sub-3ms · Privacy-First

🎬

NVDEC Hardware Decode

Dedicated H.264/H.265 decode engine. Zero CPU load. 4K @ 60fps per stream.

⚡

CUDA Preprocessing

GPU resize, normalize, colour-space conversion via NPP library before inference.

🎯

TensorRT YOLOv8

Object detection: vehicles, pedestrians, incidents. FP16 engine. <2ms per frame.

🔍

DeepSORT + NvDCF

Multi-object tracking. Maintains vehicle IDs across frames for speed estimation.

🔒

PII Anonymisation

Faces and licence plates blurred on-device by a separate TensorRT model BEFORE metadata leaves.

📊

Metadata Generation

Vehicle count, speed estimate, congestion index, incident flag. Only metadata forwarded — never video.

🖥️ NVIDIA Fleet Command

OTA software updates to all Jetson edge nodes. Health monitoring, model version management, remote debugging. Centralised control of 18,000 edge units.

⬇ Protobuf Metadata · TLS 1.3 · RoCE v2 RDMA — 80–90% bandwidth saving vs raw video

📨 Layer 2 — Pub/Sub Streaming | Apache Kafka · MQTT · DeepStream Message Broker

🟠

Apache Kafka Cluster

High-throughput event backbone. Topics: traffic.events, incidents.alerts, weather.sensor.data, bus.telemetry. Millions of events/sec, fault-tolerant.

🟢

MQTT Broker

Lightweight IoT telemetry for weather sensors and GPS devices. QoS levels 0–2. Low-bandwidth edge device support.

🔵

DeepStream Message Broker

Direct AI event push from Jetson to cloud. AMQP, Redis Streams. Inference results streamed in real-time.

⚡

RoCE v2 RDMA Fabric

HPE Aruba CX + BlueField-3 DPU. Direct GPU memory transfer between nodes. Zero CPU overhead, sub-3ms.

🔐

Stream Security

TLS 1.3, mTLS, OAuth tokens, Kafka ACLs, SASL/SCRAM. AES-256 encryption at rest and in transit.

⬇ HPE FlexFabric 400GbE · WAN / MPLS / SD-WAN

🌫️ Layer 3 — Fog / Regional AI | HPE ProLiant + NVIDIA A100 + Kubernetes

🧠

Triton Inference Server

ST-GCN Graph Neural Network for traffic flow prediction. Anomaly detection engine. Regional congestion forecasting per zone.

☸️

Kubernetes Orchestration

Triton containers, Kafka consumers, analytics microservices. Auto-scaling GPU workloads. Rolling model deployments.

⚡

RAPIDS + CUDA Analytics

Large-scale stream analytics. Feature engineering. Multi-zone aggregation. GPU-accelerated ETL on live data.

🔀

Data Fusion Engine

Combines camera analytics + weather telemetry + SCATS signals + GPS transport streams → unified regional model.

📊

Observability Stack

Prometheus, Grafana, NVIDIA DCGM, OpenTelemetry, HPE InfoSight, Datadog APM. Full GPU and service telemetry.

🔐

NVIDIA Morpheus

AI-powered cybersecurity analytics. Anomalous network traffic detection. SIEM integration. IPSec tunnels.

⬇ InfiniBand 400 Gb/s · NVLink

🧠 Layer 4 — Central AI Cluster | NVIDIA DGX H100 + HPE GreenLake + Kubernetes

🤖

DGX H100 Cluster

8× H100 SXM5 80GB per node. NVLink 900 GB/s. InfiniBand fabric. TensorRT-LLM. Petaflop-scale inference.

🎭

Multimodal Transformer

Vision + Weather + GPS + Events + History. City-wide traffic prediction. ST-GCN at scale. Sensor fusion model.

📚

RAG Pipeline

FAISS / Milvus vector DB. Historical incidents, transport policies, maintenance records, compliance docs. Semantic search.

🗣️

NeMo Fine-tuned LLM

Llama 3 fine-tuned on transport domain data via LoRA. TensorRT-LLM optimised. Incident summarisation, NL reporting.

🔮

Prediction Engine

LSTM + ARIMA time-series forecasting. Congestion, delay, incident probability, demand planning up to 2 hours ahead.

⚙️

NeMo Guardrails

Safety rails on LLM outputs. Ensures only factual, policy-compliant advisories are issued to operators and public.

⬇ REST API · gRPC · WebSocket → Operator Dashboards · Public Apps · Signal Controllers

🌐 Layer 5 — Applications | Dashboards · Public APIs · Signal Control · Alerts

📺

Operator Dashboard

Real-time city traffic map. Incident alerts. Congestion heat maps. AI-generated advisories from NeMo LLM.

📱

Public Journey Planner

Predicted travel times, route recommendations, disruption alerts pushed to commuter apps via REST API.

🚦

Adaptive Signal Control

AI predictions feed back to SCATS controllers. Green-wave optimisation. Emergency vehicle pre-emption.

🚨

Incident Response

Auto-dispatch alerts with AI-generated incident summaries. Severity scoring. Nearest resource recommendation.

📊

Analytics & Reporting

Daily/monthly transport KPI reports. Planning data. Infrastructure investment justification. NL report generation via LLM.

🔌 Protocols & Networking Reference

Protocol / Technology	Layer	Purpose	Speed
RTSP	L0→L1	Real-time video streaming from IP cameras to Jetson	Up to 4K/60fps
MQTT	L0→L2	Lightweight IoT sensor telemetry (weather, GPS)	Low-latency
CUDA / TensorRT	L1	GPU-accelerated inference engine on Jetson & DGX	<3ms inference
Apache Kafka	L2	High-throughput event streaming backbone	Millions events/sec
RoCE v2 RDMA	L1→L2	GPU direct memory transfer, bypassing CPU	<3ms, zero CPU
NVLink	L4	GPU-to-GPU interconnect within DGX node	900 GB/s
InfiniBand 400Gb/s	L3→L4	Inter-node networking in DGX cluster	400 Gb/s
gRPC / HTTP/2	L3→L5	Triton Inference Server client protocol (KServe v2)	Low-latency
OpenAI REST API	L4→L5	NIM LLM endpoint — OpenAI-compatible /v1/chat/completions	<100ms TTFT
TLS 1.3 / mTLS	All	End-to-end encryption and mutual authentication	~0.5ms overhead
Protobuf / MessagePack	L1→L2	Compact binary serialisation of metadata events	~5× smaller than JSON

☁️ AWS Cloud Deployment (nvidia.viswanext.com)

📦 S3 Bucket: nvidiaviswanext

Static website hosting (index.html, learn.html, demos)
Event archive: events/{camera_id}/{timestamp}.json
Model artefacts: models/tensorrt/*.trt
Reports: reports/daily/YYYY-MM-DD.pdf
CloudFront CDN for low-latency global delivery

⚡ AWS Lambda Functions

POST /analyze-traffic → calls NIM LLM, saves to S3
POST /fdiplogin → user authentication JWT
GET /events/{camera} → query S3 event archive
POST /generate-report → NeMo LLM daily report
Trigger: API Gateway + Cognito for auth

🖥️ EC2 GPU Instances

g5.12xlarge (4× A10G) — NIM LLM serving
g5.48xlarge (8× A10G) — Triton Inference Server
p4d.24xlarge (8× A100) — NeMo training jobs
Auto-scaling group with GPU Operator on EKS
Spot instances for batch training workloads

🔐 Security & Networking

API Gateway → Lambda → VPC (NIM endpoints)
WAF rules on API Gateway
Secrets Manager for NGC API keys
IAM roles: least-privilege per Lambda function
CloudWatch logs + X-Ray tracing

Intelligent Transport MonitoringEnd-to-End NVIDIA AI Architecture

18,000+

<3ms

85%

6.4 TB/hr

5 Layers