STANAG 4609
KLV METADATA
STANAG 4609 KLV OVERVIEW
Why It Matters — Stanag Metadata
If Lisa 26 produces video in a proprietary format, no allied system can read it. STANAG 4609 is the NATO standard for embedding metadata in video. Any NATO-compatible C2 system — from a Swedish battalion HQ to a NATO AWACS — can ingest Lisa 26 video and immediately see the metadata: where the drone was, where it was looking, what it detected. This is interoperability. Without it, Lisa 26 is an island.
KLV Packet Structure
Each video frame carries a KLV metadata packet. The packet is a series of tag-length-value triplets. Key tags defined by MISB ST 0601:
| Tag | Name | Lisa 26 Source | Note |
|---|---|---|---|
| 2 | Precision Time Stamp | System clock (NTP via Starlink) | Microseconds since Unix epoch |
| 5 | Platform Heading Angle | EKF3 AHRS (gyro) | Degrees true |
| 6 | Platform Pitch Angle | EKF3 AHRS | Degrees |
| 7 | Platform Roll Angle | EKF3 AHRS | Degrees |
| 10 | Platform Designation | Drone ID string | e.g. "FPV-ALFA-01" |
| 13 | Sensor Latitude | EKF3 or operator estimate | Degrees, WGS84 |
| 14 | Sensor Longitude | EKF3 or operator estimate | Degrees, WGS84 |
| 15 | Sensor True Altitude | Barometer (BMP390) | Meters MSL |
| 40 | Target Location Latitude | Pixel-to-ground projection | Detection center |
| 41 | Target Location Longitude | Pixel-to-ground projection | Detection center |
The metadata is not visible in the picture: it travels in the transport stream alongside each video frame, encoded according to MISB ST 0601. The minimum required tags include sensor position, sensor attitude, target position, and timestamp. Optional tags provide additional context: wind speed, platform designation, security classification, and mission identification. Any NATO system capable of reading STANAG 4609 can decode and display these tags without custom software or format conversion.
The encoding process adds approximately 200 bytes per frame to the video transport stream — negligible compared to the video data itself but carrying critical operational context. Without metadata, a recorded video clip is just pixels: an analyst reviewing footage hours later cannot determine where the drone was, what direction the camera pointed, or what altitude it flew at. With embedded metadata, every single frame becomes a self-contained geospatial record that any NATO system can decode and display on a map without manual georeferencing.
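The tag-length-value layout described above can be sketched for a single item. This is a minimal illustration, not a full ST 0601 encoder: the helper names `encode_item`, `scale_latitude` and `unscale_latitude` are hypothetical, the latitude is an arbitrary example value, and only the short form (one tag byte, one length byte) is handled.

```python
import struct

INT32_MAX = 2**31 - 1  # full-scale value of the int32 latitude mapping


def encode_item(tag: int, value: bytes) -> bytes:
    """Build one tag-length-value item (short form only: tag < 128, length < 128)."""
    if not (0 < tag < 128 and len(value) < 128):
        raise ValueError("this sketch handles single-byte tags and lengths only")
    return bytes([tag, len(value)]) + value


def scale_latitude(lat_deg: float) -> int:
    """Map -90..+90 degrees onto the signed 32-bit range."""
    return round(lat_deg / 90.0 * INT32_MAX)


def unscale_latitude(raw: int) -> float:
    """Inverse mapping, as a receiver would apply it."""
    return raw * 90.0 / INT32_MAX


lat = 59.3293  # illustrative latitude, not taken from any real flight
item = encode_item(13, struct.pack(">i", scale_latitude(lat)))  # Tag 13
recovered = unscale_latitude(struct.unpack(">i", item[2:])[0])

print(item.hex(" "))  # tag byte, length byte, then 4 value bytes
print(f"round-trip error: {abs(recovered - lat):.2e} degrees")
```

The item costs 6 bytes on the wire: 2 bytes of framing plus a 4-byte scaled integer. The round-trip error is bounded by half of one quantization step of the 32-bit mapping.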
Interoperability Through Metadata
Every NATO nation that fields ISR drones uses different video formats, different telemetry protocols, and different ground station software. Without a common metadata standard, sharing video between nations requires manual coordinate entry — the Swedish operator reads coordinates from their screen and types them into the Norwegian system. This process introduces transcription errors (transposing digits in an MGRS grid can place the target 1 km from its actual position) and takes 30-60 seconds per target. KLV metadata eliminates both problems.
With STANAG 4609 KLV embedded in the video stream, a Norwegian TAK operator receiving Fischer 26 video sees the target coordinates automatically overlaid on their map — no manual entry, no transcription errors, zero delay. The metadata travels with the video frames through any distribution system: satellite link, MANET relay, recorded SD card, or IP network. Even if the video is archived and reviewed days later, the coordinates remain embedded and accurate. This interoperability is the foundation of JEF combined drone operations — without KLV, multi-national drone coordination requires voice-relayed coordinates that are slow and error-prone.
Implementation on Jetson Orin Nano
KLV metadata encoding runs as a lightweight process on the Jetson Orin Nano alongside YOLOv8 and ORB-SLAM3. The encoder reads drone position and attitude from the MAVLink telemetry stream (30 Hz), reads target coordinates from the YOLOv8 detection output, and constructs a MISB ST 0601 compliant KLV packet for each video frame. Packet size is approximately 200 bytes per frame. At 30 FPS that is 6 kB/s (48 kbps) of metadata, negligible next to the compressed video stream. The KLV data is multiplexed into the MPEG-2 transport stream as its own PES (Packetized Elementary Stream) on a separate PID, alongside the H.264 video elementary stream. A STANAG 4609 compliant video player can then demultiplex and decode both video and metadata without modification.
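One step the description above implies is choosing which telemetry sample belongs to each video frame. The sketch below shows one plausible way to do that, picking the newest sample at or before the frame timestamp. The `Telemetry` record and the `latest_sample` helper are hypothetical names for illustration; the actual MAVLink message handling is not shown, and the sample values are made up.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Telemetry(NamedTuple):
    """Hypothetical, already-parsed telemetry record (not a MAVLink type)."""
    t_us: int      # sample time, microseconds since Unix epoch
    lat: float     # degrees
    lon: float     # degrees
    alt_m: float   # meters MSL


def latest_sample(samples: Sequence[Telemetry], frame_t_us: int) -> Optional[Telemetry]:
    """Return the newest sample at or before the frame time (samples sorted by time)."""
    times = [s.t_us for s in samples]
    i = bisect_right(times, frame_t_us)
    return samples[i - 1] if i else None


# Three samples roughly 33 ms apart, as a 30 Hz stream would deliver them
samples = [
    Telemetry(0, 59.0000, 18.0, 100.0),
    Telemetry(33_333, 59.0001, 18.0, 101.0),
    Telemetry(66_667, 59.0002, 18.0, 102.0),
]
frame = latest_sample(samples, 40_000)
print(frame)
```

Using the last known sample rather than interpolating keeps the sketch simple; at matched 30 Hz rates the position used for a frame is at most about one sample period old.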
Implementation
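# STANAG 4609 KLV Metadata Encoder
import struct

# 16-byte universal key of the MISB ST 0601 UAS Datalink Local Set
UAS_LS_KEY = bytes([0x06, 0x0E, 0x2B, 0x34, 0x02, 0x0B, 0x01, 0x01,
                    0x0E, 0x01, 0x03, 0x01, 0x01, 0x00, 0x00, 0x00])
INT32_MAX = 2147483647  # 2**31 - 1, full scale for latitude and longitude

def ber_length(n):
    """BER length: one byte below 128, else 0x80 | byte count, then the length."""
    if n < 128:
        return bytes([n])
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(raw)]) + raw

def encode_tag(tag, value):
    """Encode one tag-length-value item (tags below 128 fit in one byte)."""
    return bytes([tag]) + ber_length(len(value)) + value

def encode_klv_packet(drone_lat, drone_lon, drone_alt,
                      cam_pitch, cam_roll, cam_yaw,
                      target_lat, target_lon, timestamp_us):
    """Encode MISB ST 0601 KLV metadata for drone video."""
    # cam_pitch, cam_roll and cam_yaw are accepted but not encoded in this minimal set.
    items = bytearray()
    # Tag 2: Precision Time Stamp (microseconds since Unix epoch)
    items += encode_tag(2, struct.pack(">Q", timestamp_us))
    # Tag 13: Sensor Latitude (scaled int32, ±90°)
    items += encode_tag(13, struct.pack(">i", round(drone_lat / 90.0 * INT32_MAX)))
    # Tag 14: Sensor Longitude (scaled int32, ±180°)
    items += encode_tag(14, struct.pack(">i", round(drone_lon / 180.0 * INT32_MAX)))
    # Tag 15: Sensor True Altitude (uint16, 0..65535 maps to -900..19000 m)
    alt_scaled = round((drone_alt + 900.0) / 19900.0 * 65535)
    alt_scaled = max(0, min(65535, alt_scaled))  # clamp to the encodable range
    items += encode_tag(15, struct.pack(">H", alt_scaled))
    # Tags 40/41: Target Location Latitude/Longitude (same scaling as 13/14)
    items += encode_tag(40, struct.pack(">i", round(target_lat / 90.0 * INT32_MAX)))
    items += encode_tag(41, struct.pack(">i", round(target_lon / 180.0 * INT32_MAX)))
    # Packet = key + BER length of the value field + items.
    # The Tag 1 checksum that ST 0601 requires as the last item is not appended here.
    return UAS_LS_KEY + ber_length(len(items)) + bytes(items)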
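MISB ST 0601 closes every packet with a Tag 1 checksum, a running 16-bit sum over all bytes from the start of the key through the checksum item's own length byte. The sketch below shows how already-encoded items could be wrapped and checksummed. The helper names `checksum16`, `finish_packet` and `verify_packet` are hypothetical, and the single timestamp item is only a stand-in payload.

```python
UAS_LS_KEY = bytes.fromhex("060e2b34020b01010e01030101000000")


def checksum16(data: bytes) -> int:
    """Running 16-bit sum: even-indexed bytes add to the high byte, odd-indexed to the low byte."""
    total = 0
    for i, b in enumerate(data):
        total += b << (8 * ((i + 1) % 2))
    return total & 0xFFFF


def ber_length(n: int) -> bytes:
    """BER length field: short form below 128, long form otherwise."""
    if n < 128:
        return bytes([n])
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(raw)]) + raw


def finish_packet(items: bytes) -> bytes:
    """Wrap encoded items in key + length and append the Tag 1 checksum item."""
    value_len = len(items) + 4  # items + checksum tag (1) + length (1) + sum (2)
    head = UAS_LS_KEY + ber_length(value_len) + items + bytes([0x01, 0x02])
    return head + checksum16(head).to_bytes(2, "big")


def verify_packet(packet: bytes) -> bool:
    """Recompute the sum over everything except the final two checksum bytes."""
    return checksum16(packet[:-2]) == int.from_bytes(packet[-2:], "big")


# Stand-in payload: one Tag 2 item carrying an all-zero 8-byte timestamp
items = bytes([2, 8]) + (0).to_bytes(8, "big")
packet = finish_packet(items)
print(packet.hex(" "), verify_packet(packet))
```

Because the length field must count the checksum item, the checksum has to be planned for before the length is written, which is why the wrapping and the sum live in the same helper here.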
Derivation — KLV Bandwidth Budget
A KLV packet's size can be derived from the tags it carries. Each item adds 2 bytes of framing (one tag byte and one length byte, for tags below 128 and values shorter than 128 bytes) plus its payload, on top of the 16-byte universal key and the length field of the set itself. The MISB ST 0601 mandatory tags (for a UAS Datalink Local Set) sum to approximately 150 bytes of payload; adding optional tags brings the typical footprint to 200 bytes. At a standard video frame rate of 30 fps, this yields:
Bandwidth = 200 bytes × 30 fps = 6,000 B/s = 48 kbps overhead per video stream. On a tactical link carrying 2 Mbps H.264, this is 2.4% overhead, which is negligible. The relationship is linear in frame rate: halving the frame rate to 15 fps halves the KLV bandwidth to 24 kbps. The metadata is not compressed, since at this size compression would gain little; it is transported as its own elementary stream on a separate PID in the MPEG-TS multiplex rather than embedded in the video elementary stream.
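The budget above can be reproduced in a few lines. This is a sketch of the arithmetic only: `packet_size` and `klv_overhead` are hypothetical helper names, the 2 Mbps link is the figure used in the text, and the six-item example assumes a one-byte set length and no checksum item.

```python
KEY_BYTES = 16  # universal key of the local set


def packet_size(payload_sizes, set_length_bytes=1):
    """Key + set length + (1 tag byte + 1 length byte + payload) per item."""
    return KEY_BYTES + set_length_bytes + sum(2 + n for n in payload_sizes)


def klv_overhead(packet_bytes, fps, link_bps):
    """Return (bytes per second, bits per second, fraction of the link)."""
    bytes_per_s = packet_bytes * fps
    bits_per_s = bytes_per_s * 8
    return bytes_per_s, bits_per_s, bits_per_s / link_bps


# Six items with payloads of 8, 4, 4, 2, 4 and 4 bytes:
# timestamp, two sensor coordinates, altitude, two target coordinates
six_items = packet_size([8, 4, 4, 2, 4, 4])
print(f"six-item packet: {six_items} bytes")

for fps in (30, 15):
    b, bits, frac = klv_overhead(200, fps, 2_000_000)
    print(f"{fps} fps: {b} B/s, {bits / 1000:.0f} kbps, {frac:.1%} of a 2 Mbps link")
```

The six-item packet is well under the 200-byte figure, which leaves room for the optional tags the estimate assumes.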
Worked Example — Decoding a Captured Stream
The following minimal decoder demonstrates how a NATO-certified receiver or ground station would parse an FSG-A Fischer 26 video stream. The example shows that no FSG-A-proprietary decoding is required — any tool that reads MISB ST 0601 can consume the stream identically.
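import struct

# 16-byte universal key of the MISB ST 0601 UAS Datalink Local Set
UAS_LS_KEY = bytes.fromhex("060e2b34020b01010e01030101000000")
INT32_MAX = 2147483647  # 2**31 - 1, full scale for latitude and longitude

def read_ber_length(buf, idx):
    """Return (length, index of the first byte after the BER length field)."""
    first = buf[idx]
    if first < 128:                       # short form: the byte is the length
        return first, idx + 1
    count = first & 0x7F                  # long form: next `count` bytes hold it
    return int.from_bytes(buf[idx + 1: idx + 1 + count], "big"), idx + 1 + count

def decode_klv_packet(klv_bytes):
    """Decode MISB ST 0601 KLV metadata back into readable fields."""
    if klv_bytes[:16] != UAS_LS_KEY:
        raise ValueError("not a UAS Datalink Local Set packet")
    # Skip the 16-byte universal label, then read the length of the set
    set_len, idx = read_ber_length(klv_bytes, 16)
    end = min(idx + set_len, len(klv_bytes))
    fields = {}
    while idx < end:
        tag = klv_bytes[idx]              # tags below 128 occupy one byte
        length, idx = read_ber_length(klv_bytes, idx + 1)
        value = klv_bytes[idx: idx + length]
        idx += length
        if tag == 2:    # Precision timestamp (microseconds since Unix epoch)
            fields["timestamp_us"] = struct.unpack(">Q", value)[0]
        elif tag == 13:  # Sensor latitude (scaled int32, ±90°)
            raw = struct.unpack(">i", value)[0]
            fields["sensor_lat"] = raw * 90.0 / INT32_MAX
        elif tag == 14:  # Sensor longitude (scaled int32, ±180°)
            raw = struct.unpack(">i", value)[0]
            fields["sensor_lon"] = raw * 180.0 / INT32_MAX
        elif tag == 15:  # Sensor true altitude (uint16, 0..65535 maps to -900..19000 m)
            raw = struct.unpack(">H", value)[0]
            fields["sensor_alt_m"] = raw * 19900.0 / 65535 - 900.0
        elif tag == 40:  # Target location latitude
            raw = struct.unpack(">i", value)[0]
            fields["target_lat"] = raw * 90.0 / INT32_MAX
        elif tag == 41:  # Target location longitude
            raw = struct.unpack(">i", value)[0]
            fields["target_lon"] = raw * 180.0 / INT32_MAX
        # Other tags (including the Tag 1 checksum) are skipped unverified
    return fields

# Example decoding a captured packet
# The literal below is a placeholder, not a valid packet: substitute real bytes before running.
captured = b"..."  # 200-byte KLV blob extracted from TS stream
decoded = decode_klv_packet(captured)
print(f"Drone at {decoded['sensor_lat']:.5f}°, {decoded['sensor_lon']:.5f}°")
print(f"Altitude: {decoded['sensor_alt_m']:.0f} m MSL")
print(f"Target at {decoded['target_lat']:.5f}°, {decoded['target_lon']:.5f}°")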
Sources
Normative sources. NATO STANAG 4609 Ed.4: NATO Digital Motion Imagery Standard, published by the NATO Standardization Office. MISB ST 0601: UAS Datalink Local Set, Motion Imagery Standards Board (motion-imagery.org). MPEG-TS container structure: ISO/IEC 13818-1. The KLV tags used on this page and their definitions, including the scaled-integer encodings, are specified by MISB ST 0601. Formal verification: the numerical claim on this page is verified in provable_claims.py (proof STANAG_4609_METADATA_SIZE, STANAG 4609 KLV metadata size calculation).
Mathematically verifiable estimates. 200 bytes per frame × 30 fps = 6 kB/s — simple arithmetic. Scaled-integer encoding (lat × 2147483647 / 90) — MISB standard.
Operational estimates — not validated in the field. The "less than 1 ms added latency" for KLV integration is a calculated estimate based on the algorithmic cost of struct.pack operations, not measured on Jetson Orin Nano. The "30–60 seconds per target" coordinate transcription time and the "1 km MGRS transposition error" are FSG-A estimates based on published analysis of command procedures. "Any NATO system can decode without modification" is true for NATO-certified systems, but not for experimental or national systems that may have divergent implementations.
External standards and references. NATO STANAG 4609 Ed.4: NATO Digital Motion Imagery Standard. MISB ST 0601: UAS Datalink Local Set (motion-imagery.org). GStreamer KLV muxer documentation. FFmpeg MISB metadata injection guide. FSG-A has not tested the KLV encoder on real video from a combat drone — the hardware is commercially available but the integration is conceptual.