How SVT-AV1 Handles Asynchronous Frame Submission

This article explains how the Scalable Video Technology for AV1 (SVT-AV1) encoder library handles asynchronous input frame submission through its programming API. It details the non-blocking design of the API, the relationship between the input and output queues, the core functions used for thread-safe execution, and how the pipeline is managed from initial frame submission to final stream flushing.

The Asynchronous Pipeline Architecture

SVT-AV1 is designed as a highly parallelized, multi-threaded software encoder. To maximize CPU utilization, its programming API decouples the input of raw video frames from the retrieval of compressed bitstream packets. This decoupling is achieved through an asynchronous, queue-based pipeline.

Instead of waiting for a single frame to be fully encoded before accepting the next one, the encoder allows the host application to continuously feed raw frames into an internal input queue while simultaneously pulling compressed packets from an output queue.
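The decoupling can be pictured with a toy model. The following C sketch is not SVT-AV1 code: ToyFifo, ToyEncoder, and the toy_* functions are hypothetical names invented for illustration, and the "worker" is a plain function call rather than a thread pool. It only shows the idea that the input and output sides are separate bounded queues that the application touches independently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FIFO_CAP 4

/* Hypothetical fixed-capacity FIFO of frame numbers; not an SVT-AV1 type. */
typedef struct {
    int    items[TOY_FIFO_CAP];
    size_t head;  /* index of the oldest item */
    size_t count; /* number of stored items   */
} ToyFifo;

/* Append a value; returns false (without blocking) when the FIFO is full. */
static bool toy_fifo_push(ToyFifo *q, int value) {
    assert(q != NULL);
    if (q->count == TOY_FIFO_CAP) return false;
    q->items[(q->head + q->count) % TOY_FIFO_CAP] = value;
    q->count++;
    return true;
}

/* Remove the oldest value; returns false when the FIFO is empty. */
static bool toy_fifo_pop(ToyFifo *q, int *out) {
    assert(q != NULL && out != NULL);
    if (q->count == 0) return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % TOY_FIFO_CAP;
    q->count--;
    return true;
}

/* The application pushes into `input` and pops from `output`. */
typedef struct {
    ToyFifo input;
    ToyFifo output;
} ToyEncoder;

/* Stand-in for the worker threads: moves one frame from input to output.
 * Returns false if there is nothing to do or no room for the result. */
static bool toy_encoder_work(ToyEncoder *e) {
    int frame;
    if (e->output.count == TOY_FIFO_CAP) return false;
    if (!toy_fifo_pop(&e->input, &frame)) return false;
    return toy_fifo_push(&e->output, frame);
}
```

In this model the application can push several frames before any output exists, and a pop on the output side simply reports "empty" until the worker has produced something.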

Core API Functions for Frame Handling

The asynchronous flow is controlled primarily by two key function calls:

  1. svt_av1_enc_send_picture(): This function submits a raw input frame (in YUV format, wrapped in an EbBufferHeaderType structure) to the encoder’s input queue.
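  2. svt_av1_enc_get_packet(): This function retrieves a compressed bitstream packet (returned as a pointer to an EbBufferHeaderType) from the encoder’s output queue. Its pic_send_done argument selects between a non-blocking call (0) and a blocking call (nonzero).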

1. Non-Blocking Input Submission

When the application calls svt_av1_enc_send_picture(), the SVT-AV1 library does not perform the actual video encoding on the calling thread. Instead, it performs basic validation, associates the frame with its metadata (such as timestamps and picture numbers), and places the frame into an internal input FIFO (First-In, First-Out) queue.

Once the frame is successfully queued, the function immediately returns with a status code (typically EB_ErrorNone). This allows the host application to continue decoding or generating the next video frame without waiting for the heavy encoding computations to finish.

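The input queue has a finite capacity. If it is full, which happens when the encoder cannot keep up with the rate of input submission, svt_av1_enc_send_picture() cannot queue the frame right away. The public headers do not appear to define a dedicated "queue full" status code; in practice the call waits until an input buffer becomes free, so submission returns immediately only while the queue has room. Retrieving packets with svt_av1_enc_get_packet() and releasing them promptly keeps encoder resources available and the pipeline moving.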
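The backpressure idea can be sketched with a small single-threaded model. Everything here is hypothetical (ToyInput, ToySendStatus, and the toy_* functions are not part of SVT-AV1), and the try-then-retry shape models a generic bounded producer rather than the library's internal behavior. It assumes that retrieving one packet frees exactly one input slot.

```c
#include <assert.h>

/* Hypothetical status codes for illustration; not SVT-AV1 definitions. */
typedef enum { TOY_SENT, TOY_INPUT_FULL } ToySendStatus;

typedef struct {
    int capacity;  /* total input slots                       */
    int in_flight; /* frames submitted but not yet retrieved  */
} ToyInput;

/* Try to queue one frame without blocking. */
static ToySendStatus toy_try_send(ToyInput *in) {
    if (in->in_flight == in->capacity) return TOY_INPUT_FULL;
    in->in_flight++;
    return TOY_SENT;
}

/* Retrieving a finished packet frees one slot.
 * Returns 1 if a packet was taken, 0 if nothing was in flight. */
static int toy_take_packet(ToyInput *in) {
    if (in->in_flight == 0) return 0;
    in->in_flight--;
    return 1;
}

/* Submit n frames, draining one packet whenever the input side is full.
 * Returns how many times the producer had to drain before it could send. */
static int toy_submit_all(ToyInput *in, int n) {
    int stalls = 0;
    assert(in->capacity > 0); /* a zero-capacity queue could never accept a frame */
    for (int i = 0; i < n; i++) {
        while (toy_try_send(in) == TOY_INPUT_FULL) {
            toy_take_packet(in); /* make room, then retry the same frame */
            stalls++;
        }
    }
    return stalls;
}
```

With a capacity of 2 and 5 frames to submit, the producer in this model has to drain three times, once for each frame beyond the first two.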

2. Internal Thread Pool Processing

Behind the API, SVT-AV1 manages a pool of worker threads organized into pipeline stages. These threads wait on the input queue, and when frames become available they pull them into the encoding pipeline.

Because AV1 encoders typically use hierarchical Group of Pictures (GOP) structures, frames are often encoded out of display order (e.g., a later frame is coded first so that the frames displayed before it can reference it, much like the reference B-frames of earlier codecs). The asynchronous architecture allows the encoder to ingest frames in display order, buffer them internally to analyze temporal dependencies, rearrange them for encoding, and finally output them in decoding order.
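To make the reordering concrete, the sketch below computes a coding order for one small group of frames using a simple dyadic hierarchy: the far anchor is coded first, then the midpoint of each remaining span. This is a simplified illustration under stated assumptions, not SVT-AV1's actual picture decision logic, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Append the frames strictly between two already-coded anchors (lo, hi),
 * midpoint first, so both neighbours of a frame are coded before it. */
static void toy_order_between(int lo, int hi, int *out, size_t *n) {
    if (hi - lo < 2) return; /* no frame lies between the anchors */
    int mid = lo + (hi - lo) / 2;
    out[(*n)++] = mid;
    toy_order_between(lo, mid, out, n);
    toy_order_between(mid, hi, out, n);
}

/* Fill out[] with a coding order for display frames 0..gop_len, where
 * frame 0 is the previous anchor. out[] must hold gop_len + 1 entries.
 * Returns the number of entries written. */
static size_t toy_coding_order(int gop_len, int *out) {
    assert(gop_len >= 1);
    size_t n = 0;
    out[n++] = 0;       /* leading anchor */
    out[n++] = gop_len; /* far anchor, coded before the frames that reference it */
    toy_order_between(0, gop_len, out, &n);
    return n;
}
```

For gop_len = 4, the frames arrive in display order 0, 1, 2, 3, 4 and this model codes them as 0, 4, 2, 1, 3. The encoder therefore cannot emit anything for frames 1 to 3 until frame 4 has been submitted, which is why buffering is unavoidable.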

3. Asynchronous Packet Retrieval

The host application retrieves compressed AV1 data by calling svt_av1_enc_get_packet(). With the pic_send_done argument set to 0, the call is non-blocking:

  * If a compressed packet is ready in the output queue, the function hands back a pointer to its buffer header and returns immediately. The application returns the buffer to the library with svt_av1_enc_release_out_buffer() once it has consumed the data.
  * If no packet is ready yet (for example, if the encoder is still buffering frames to build a GOP), the function returns a status indicating that the queue is empty (EB_NoErrorEmptyQueue).

This non-blocking return allows the application to perform other tasks, sleep briefly, or continue sending more raw frames to the encoder.
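The resulting send-then-poll pattern can be sketched with a toy encoder that holds back a fixed number of frames before emitting anything. The types and functions below (ToyPollEnc, toy_poll_send, toy_poll_get, toy_encode_loop) are hypothetical stand-ins, not the SVT-AV1 API, and the fixed lookahead is an assumption chosen to keep the model deterministic.

```c
#include <assert.h>

/* Hypothetical result of a non-blocking retrieval; not an SVT-AV1 type. */
typedef enum { TOY_PACKET, TOY_EMPTY } ToyGetStatus;

typedef struct {
    int lookahead; /* frames the toy encoder holds back before emitting */
    int sent;      /* frames submitted so far                           */
    int emitted;   /* packets handed out so far                         */
} ToyPollEnc;

/* Submission never waits in this model. */
static void toy_poll_send(ToyPollEnc *e) { e->sent++; }

/* Non-blocking retrieval: reports TOY_EMPTY instead of waiting. */
static ToyGetStatus toy_poll_get(ToyPollEnc *e, int *packet_index) {
    if (e->sent - e->emitted <= e->lookahead) return TOY_EMPTY;
    *packet_index = e->emitted++;
    return TOY_PACKET;
}

/* Submit one frame, then collect whatever is ready, and repeat.
 * Returns the number of packets collected while frames were being sent. */
static int toy_encode_loop(ToyPollEnc *e, int frames) {
    int collected = 0;
    int idx;
    assert(e->lookahead >= 0);
    for (int i = 0; i < frames; i++) {
        toy_poll_send(e);
        while (toy_poll_get(e, &idx) == TOY_PACKET) {
            collected++; /* a real application would write the packet out here */
        }
        /* TOY_EMPTY is not an error: nothing is ready yet, so keep sending. */
    }
    return collected;
}
```

With a lookahead of 3 and 10 submitted frames, this loop collects 7 packets, and 3 frames are still held inside the toy encoder when the input runs out.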

Managing the End of Stream (Flushing)

Because of the asynchronous buffering, frames remain inside the encoder’s pipeline when the host application reaches the end of the source video. To resolve this, SVT-AV1 handles the teardown phase through a systematic flush process:

  1. Sending the EOS Flag: The application submits a final buffer header whose flags field includes EB_BUFFERFLAG_EOS (End of Stream). This is commonly an empty buffer that carries no picture data.
  2. Emptying the Pipeline: Once the EOS flag is received, the encoder stops accepting new input frames. The internal worker threads continue processing all remaining buffered frames until the pipeline is empty.
  3. Retrieving Final Packets: The application continues to call svt_av1_enc_get_packet() in a loop, typically with pic_send_done set to a nonzero value so that each call waits for the next packet. The final compressed packet returned by the encoder also carries the EB_BUFFERFLAG_EOS flag, signaling to the application that all input frames have been processed, encoded, and retrieved.
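The three steps above can be modelled with a toy encoder that releases its held-back frames once it sees an end-of-stream marker. This is a minimal sketch, not SVT-AV1 code: TOY_FLAG_EOS stands in for EB_BUFFERFLAG_EOS, the ToyFlushEnc and ToyPacket types and the toy_flush_* functions are hypothetical, and the model assumes that the EOS marker travels on an empty buffer and that every held frame is instantly ready after EOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_FLAG_EOS 0x1u /* hypothetical bit; stands in for EB_BUFFERFLAG_EOS */

typedef struct {
    int  lookahead; /* frames held back while the stream is still open */
    int  sent;      /* real frames submitted                           */
    int  emitted;   /* packets handed out                              */
    bool eos_seen;  /* set once the EOS marker has been submitted      */
} ToyFlushEnc;

typedef struct {
    int      index; /* position of the packet in the output sequence */
    uint32_t flags; /* TOY_FLAG_EOS on the last packet, else 0       */
} ToyPacket;

/* Submit a frame or the EOS marker; refuses input after EOS (step 2). */
static bool toy_flush_send(ToyFlushEnc *e, uint32_t flags) {
    if (e->eos_seen) return false;
    if (flags & TOY_FLAG_EOS) e->eos_seen = true; /* empty buffer, no picture */
    else e->sent++;
    return true;
}

/* Non-blocking retrieval; after EOS nothing is held back any longer. */
static bool toy_flush_get(ToyFlushEnc *e, ToyPacket *p) {
    int held = e->eos_seen ? 0 : e->lookahead;
    if (e->sent - e->emitted <= held) return false;
    p->index = e->emitted++;
    p->flags = (e->eos_seen && e->emitted == e->sent) ? TOY_FLAG_EOS : 0u;
    return true;
}

/* Step 3: pull packets until one carries the EOS flag.
 * Returns the number of packets drained. */
static int toy_flush_drain(ToyFlushEnc *e) {
    ToyPacket p;
    int n = 0;
    assert(e->eos_seen); /* draining only makes sense after EOS was sent */
    while (toy_flush_get(e, &p)) {
        n++;
        if (p.flags & TOY_FLAG_EOS) break; /* last packet reached */
    }
    return n;
}
```

In this model, an application that has sent 5 frames with a lookahead of 3 and already retrieved one packet drains the remaining 4 after submitting EOS, and only the last of them carries the flag. A real drain loop differs in one respect: a non-blocking call can report an empty queue before the final packet exists, so the loop must keep going until the EOS flag actually appears.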