How SVT-AV1 Handles Asynchronous Frame Submission
This article explains how the Scalable Video Technology for AV1 (SVT-AV1) encoder library handles asynchronous input frame submission through its programming API. It details the non-blocking design of the API, the relationship between the input and output queues, the core functions used for thread-safe execution, and how the pipeline is managed from initial frame submission to final stream flushing.
The Asynchronous Pipeline Architecture
SVT-AV1 is designed as a highly parallelized, multi-threaded software encoder. To maximize CPU utilization, its programming API decouples the input of raw video frames from the retrieval of compressed bitstream packets. This decoupling is achieved through an asynchronous, queue-based pipeline.
Instead of waiting for a single frame to be fully encoded before accepting the next one, the encoder allows the host application to continuously feed raw frames into an internal input queue while simultaneously pulling compressed packets from an output queue.
Core API Functions for Frame Handling
The asynchronous flow is controlled primarily by two key function calls:
- svt_av1_enc_send_picture(): This function submits a raw input frame (in YUV format, wrapped in an EbBufferHeaderType structure) to the encoder’s input queue.
- svt_av1_enc_get_packet(): This function retrieves a compressed bitstream packet (wrapped in a similar buffer header) from the encoder’s output queue.
1. Non-Blocking Input Submission
When the application calls svt_av1_enc_send_picture(),
the SVT-AV1 library does not perform the actual video encoding on the
calling thread. Instead, it performs basic validation, associates the
frame with its metadata (such as timestamps and picture numbers), and
places the frame into an internal input FIFO (First-In, First-Out)
queue.
Once the frame is successfully queued, the function immediately
returns with a status code (typically EB_ErrorNone). This
allows the host application to continue decoding or generating the next
video frame without waiting for the heavy encoding computations to
finish.
If the internal queue is full, which happens when the encoder cannot
keep up with the rate of input submission, the function will return a
status indicating that the queue is full (such as
EB_NoErrorFifoFull). The application must then wait or call
svt_av1_enc_get_packet() to free up encoder resources
before attempting to send the frame again.
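The submit-and-retry pattern described above can be sketched with a small bounded FIFO. This is a self-contained mock, not the SVT-AV1 API: MockInputQueue, mock_send_picture(), mock_consume_one(), submit_with_retry(), and the status names are all hypothetical stand-ins, and the capacity of 4 is an arbitrary assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; these are NOT SVT-AV1 symbols. */
enum { MOCK_QUEUE_CAPACITY = 4 };
typedef enum { MOCK_OK = 0, MOCK_FIFO_FULL = 1 } MockStatus;

typedef struct {
    int    slots[MOCK_QUEUE_CAPACITY]; /* queued frame numbers */
    size_t head;                       /* index of the oldest entry */
    size_t count;                      /* entries currently queued */
} MockInputQueue;

/* Enqueue without blocking: report "full" instead of waiting. */
static MockStatus mock_send_picture(MockInputQueue *q, int frame_number) {
    assert(q != NULL);
    if (q->count == MOCK_QUEUE_CAPACITY)
        return MOCK_FIFO_FULL;
    q->slots[(q->head + q->count) % MOCK_QUEUE_CAPACITY] = frame_number;
    q->count++;
    return MOCK_OK;
}

/* Stand-in for a worker thread consuming the oldest queued frame. */
static int mock_consume_one(MockInputQueue *q) {
    assert(q != NULL && q->count > 0);
    int frame_number = q->slots[q->head];
    q->head = (q->head + 1) % MOCK_QUEUE_CAPACITY;
    q->count--;
    return frame_number;
}

/* Submit frames 0..total-1, freeing a slot and retrying whenever the queue
 * is full. Returns how many times a submission had to be retried. */
static int submit_with_retry(MockInputQueue *q, int total) {
    int retries = 0;
    for (int n = 0; n < total; n++) {
        while (mock_send_picture(q, n) == MOCK_FIFO_FULL) {
            (void)mock_consume_one(q); /* real code: wait or drain packets */
            retries++;
        }
    }
    return retries;
}
```

In a real application the retry branch would wait or pull output packets rather than consume the queue directly; the mock does the latter only so the example can run in a single thread.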
2. Internal Thread Pool Processing
Behind the API, SVT-AV1 manages a pool of worker threads. These threads constantly monitor the input queue. When frames become available, the worker threads pull them into the encoding pipeline.
Because AV1 utilizes complex Group of Pictures (GOP) structures, frames are often encoded out of display order (e.g., to process reference B-frames). The asynchronous architecture allows the encoder to ingest frames in display order, buffer them internally to analyze temporal dependencies, rearrange them for encoding, and finally output them in decoding order.
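To make the display-order versus decoding-order distinction concrete, here is a toy reordering routine. It assumes a deliberately simple structure in which every 'B' frame depends on the next non-'B' frame; SVT-AV1's actual prediction structures are hierarchical and considerably more elaborate, and the function name display_to_decode_order() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reorder frames from display order to decode order for a simple,
 * hypothetical structure: every 'B' frame waits for the next non-'B'
 * frame (its forward reference), which must be coded first.
 * types[i] is 'I', 'P', or 'B' for display index i; decode_order receives
 * display indices in coding order. Returns the number of entries written. */
static size_t display_to_decode_order(const char *types, size_t n,
                                      size_t *decode_order) {
    assert(types != NULL && decode_order != NULL);
    size_t out = 0;
    size_t first_pending_b = 0; /* display index of the first held-back B */
    size_t pending = 0;         /* number of held-back B frames */
    for (size_t i = 0; i < n; i++) {
        if (types[i] == 'B') {
            if (pending == 0)
                first_pending_b = i;
            pending++;
            continue;
        }
        decode_order[out++] = i; /* the reference frame is coded first */
        for (size_t b = 0; b < pending; b++)
            decode_order[out++] = first_pending_b + b;
        pending = 0;
    }
    /* Trailing B frames with no forward reference are emitted as-is. */
    for (size_t b = 0; b < pending; b++)
        decode_order[out++] = first_pending_b + b;
    return out;
}
```

For the display sequence I B B P B B P, this routine yields the decode order 0, 3, 1, 2, 6, 4, 5, which is why the encoder must buffer several input frames before it can emit the first packets.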
3. Asynchronous Packet Retrieval
The host application retrieves compressed AV1 data by calling
svt_av1_enc_get_packet():
- If a compressed packet is ready in the output queue, the function hands it back through the caller's buffer-header pointer and returns immediately.
- If no packet is ready yet (for example, if the encoder is still buffering frames to build a GOP), the function returns a status indicating that the queue is empty (such as EB_NoErrorEmptyQueue).
This non-blocking return allows the application to perform other tasks, sleep briefly, or continue sending more raw frames to the encoder.
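The interleaving of sends and non-blocking polls can be illustrated with another small mock. Everything here is an assumption made for the sketch: MockEncoder, mock_send_picture(), mock_get_packet(), and interleave_send_and_poll() are hypothetical names, and the three-frame lookahead is an arbitrary stand-in for the encoder's internal buffering delay.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock, not the SVT-AV1 API: a packet for frame k becomes
 * available only after frame k + MOCK_LOOKAHEAD has been submitted. */
enum { MOCK_LOOKAHEAD = 3 };
typedef enum { MOCK_OK = 0, MOCK_EMPTY_QUEUE = 1 } MockStatus;

typedef struct {
    int frames_sent;      /* frames submitted so far */
    int packets_returned; /* packets already handed to the caller */
} MockEncoder;

static void mock_send_picture(MockEncoder *enc) {
    assert(enc != NULL);
    enc->frames_sent++;
}

/* Non-blocking retrieval: either yields the next packet number or reports
 * that nothing is ready yet. */
static MockStatus mock_get_packet(MockEncoder *enc, int *packet_number) {
    assert(enc != NULL && packet_number != NULL);
    if (enc->packets_returned + MOCK_LOOKAHEAD >= enc->frames_sent)
        return MOCK_EMPTY_QUEUE;
    *packet_number = enc->packets_returned++;
    return MOCK_OK;
}

/* Send `total` frames, polling once after each send. Returns the number of
 * polls that found the output queue empty. */
static int interleave_send_and_poll(MockEncoder *enc, int total,
                                    int *received) {
    assert(received != NULL);
    int empty_polls = 0;
    *received = 0;
    for (int n = 0; n < total; n++) {
        int packet_number;
        mock_send_picture(enc);
        if (mock_get_packet(enc, &packet_number) == MOCK_OK)
            (*received)++;  /* real code: write the packet to the output */
        else
            empty_polls++;  /* nothing ready: keep feeding frames */
    }
    return empty_polls;
}
```

With eight frames and a lookahead of three, the first three polls come back empty and the remaining five each return a packet, leaving three frames still buffered inside the mock encoder.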
Managing the End of Stream (Flushing)
Because of the asynchronous buffering, frames remain inside the encoder’s pipeline when the host application reaches the end of the source video. To resolve this, SVT-AV1 handles the teardown phase through a systematic flush process:
- Sending the EOS Flag: The application submits a final frame with the flags field in the buffer header set to include EB_BUFFERFLAG_EOS (End of Stream).
- Emptying the Pipeline: Once the EOS flag is received, the encoder stops accepting new input frames. The internal worker threads continue processing all remaining buffered frames until the pipeline is empty.
- Retrieving Final Packets: The application continues to call svt_av1_enc_get_packet() in a loop. The final compressed packet returned by the encoder will also contain the EB_BUFFERFLAG_EOS flag, signaling to the application that all input frames have been successfully processed, encoded, and retrieved.
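The three flush steps can be sketched as a drain loop that stops on the EOS-flagged packet. This is a self-contained mock under stated assumptions: MockEncoder, MockPacket, mock_send_picture(), mock_get_packet(), flush_encoder(), and MOCK_FLAG_EOS are hypothetical names with arbitrary values, and the mock holds back every packet until EOS is seen purely to keep the behavior easy to follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock, not the SVT-AV1 API. The flag value is arbitrary. */
#define MOCK_FLAG_EOS 0x1u
typedef enum { MOCK_OK = 0, MOCK_EMPTY_QUEUE = 1 } MockStatus;

typedef struct {
    int buffered; /* frames still inside the pipeline */
    int eos_seen; /* set once an EOS-flagged frame has been submitted */
} MockEncoder;

typedef struct {
    uint32_t flags;
} MockPacket;

static void mock_send_picture(MockEncoder *enc, uint32_t flags) {
    assert(enc != NULL && !enc->eos_seen); /* no input accepted after EOS */
    enc->buffered++;
    if (flags & MOCK_FLAG_EOS)
        enc->eos_seen = 1;
}

/* Before EOS the mock holds everything back; after EOS it releases one
 * packet per call and tags the last one with the EOS flag. */
static MockStatus mock_get_packet(MockEncoder *enc, MockPacket *pkt) {
    assert(enc != NULL && pkt != NULL);
    if (!enc->eos_seen || enc->buffered == 0)
        return MOCK_EMPTY_QUEUE;
    enc->buffered--;
    pkt->flags = (enc->buffered == 0) ? MOCK_FLAG_EOS : 0u;
    return MOCK_OK;
}

/* Drain until the EOS-flagged packet arrives; returns packets retrieved. */
static int flush_encoder(MockEncoder *enc) {
    assert(enc != NULL && enc->eos_seen); /* flush only after sending EOS */
    int retrieved = 0;
    MockPacket pkt = {0};
    do {
        if (mock_get_packet(enc, &pkt) != MOCK_OK)
            continue; /* real code: sleep briefly, then retry */
        retrieved++;
    } while (!(pkt.flags & MOCK_FLAG_EOS));
    return retrieved;
}
```

The loop's exit condition is the flag on the output packet, not a count of submitted frames, which mirrors the point that the application learns the pipeline is empty only when the encoder echoes EOS back.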