SVT-AV1 1D vs 2D Scalability Differences Explained

This article explains the difference between the 1D (one-dimensional) and 2D (two-dimensional) scalability approaches in the SVT-AV1 encoder (libsvtav1). It covers how each approach partitions the encoding workload across threads, how that affects CPU utilization, and how well each one scales as the processor core count grows.

What is Scalability in SVT-AV1?

In the context of the Scalable Video Technology for AV1 (SVT-AV1) encoder, “scalability” refers to the encoder’s ability to split the encoding workload into parallel tasks that run simultaneously across multiple CPU cores and threads. This article distinguishes two ways of scaling that workload: 1D scalability and 2D scalability.

1D Scalability: Frame-Level Parallelism

1D scalability parallelizes the encoding process along a single dimension, which is primarily temporal (frame-level).

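In this mode, the encoder processes multiple video frames at the same time: while Frame 1 is being processed, Frame 2 and Frame 3 are also being encoded on separate threads. The limitation is that frames are not independent of each other. A frame that predicts from other frames cannot finish before the data it references is available, so once every ready frame has a thread, any additional threads sit idle.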
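The effect of reference dependencies on frame-level parallelism can be sketched with a toy scheduler. This is a minimal illustration under stated assumptions, not SVT-AV1's actual scheduling: every frame costs one time step, the function name `frame_parallel_steps` is made up for this example, and `HYPOTHETICAL_REFS` is an invented 9-frame hierarchical reference structure rather than the encoder's real prediction structure.

```python
from typing import Dict, List


def frame_parallel_steps(refs: Dict[int, List[int]], workers: int) -> int:
    """Count unit time steps needed to encode every frame.

    Toy model: each frame costs one step, one worker handles one frame per
    step, and a frame may start only after all of its references are done.
    """
    done = set()
    steps = 0
    while len(done) < len(refs):
        # Frames whose references have all been encoded already.
        ready = [f for f in sorted(refs)
                 if f not in done and all(r in done for r in refs[f])]
        if not ready:
            raise ValueError("reference structure contains a cycle")
        # Only as many frames as there are workers can run in this step.
        done.update(ready[:workers])
        steps += 1
    return steps


# Hypothetical structure (an assumption): frame -> frames it references.
HYPOTHETICAL_REFS = {
    0: [], 8: [0], 4: [0, 8], 2: [0, 4], 6: [4, 8],
    1: [0, 2], 3: [2, 4], 5: [4, 6], 7: [6, 8],
}

if __name__ == "__main__":
    for n in (1, 2, 4, 16):
        print(n, "workers:", frame_parallel_steps(HYPOTHETICAL_REFS, n), "steps")
```

In this toy structure the longest dependency chain is five frames long (0, 8, 4, 2, 1), so the step count drops from 9 with one worker to 5 with four workers and stays at 5 with sixteen. The extra twelve workers have nothing to do.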

2D Scalability: Joint Frame and Row-Level Parallelism

2D scalability overcomes the limitations of 1D scalability by parallelizing the workload along two dimensions simultaneously: temporally (across frames) and spatially (within a single frame).

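Instead of assigning a single frame to a single thread, 2D scalability breaks individual frames down into smaller spatial segments, specifically tiles or rows of superblocks (AV1's equivalent of the largest coding unit, or LCU, in HEVC). Several threads can then work on the same frame at once, so a thread that would otherwise wait for a reference frame can pick up a row instead. Rows within a frame are not fully independent of one another, which is why this mode has to track sub-frame dependencies.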
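Row-level parallelism inside one frame can be sketched in the same toy style. The dependency rule used here is a generic wavefront pattern, chosen as an assumption for illustration: a superblock may start once its left neighbor and the superblock `lag` columns further along in the row above are done. It is not necessarily the rule SVT-AV1 applies to its segments, and the function name `wavefront_steps` is hypothetical.

```python
def wavefront_steps(rows: int, cols: int, workers: int, lag: int = 1) -> int:
    """Count unit time steps needed to process a rows x cols superblock grid.

    Toy model: each superblock costs one step and each row is processed left
    to right. Row r may process its next block c only when row r-1 has
    finished blocks 0..c+lag (or the whole row, near the right edge).
    """
    progress = [0] * rows  # progress[r] = superblocks finished in row r
    steps = 0
    while any(p < cols for p in progress):
        # Rows that can advance, judged from the state at the start of the step.
        ready = [r for r in range(rows)
                 if progress[r] < cols
                 and (r == 0 or progress[r - 1] >= min(progress[r] + 1 + lag, cols))]
        # One worker advances one row by one superblock per step.
        for r in ready[:workers]:
            progress[r] += 1
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (1, 2, 4, 64):
        print(n, "workers:", wavefront_steps(rows=4, cols=8, workers=n), "steps")
```

For a frame of 4 rows by 8 superblocks, one worker needs 32 steps, while four or more workers need 14 (8 columns plus a 2-step delay for each of the 3 later rows). In this model a single frame can therefore keep up to four workers busy on top of whatever frame-level parallelism is available, at the cost of tracking each row's progress.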

Summary of Differences

Feature                         1D Scalability                                        2D Scalability
Dimension of Parallelism        1 Dimension (Temporal / Frame-level)                  2 Dimensions (Temporal + Spatial / Row-level)
Target Hardware                 Consumer CPUs (4 to 16 cores)                         Workstations and Servers (32+ cores)
Efficiency at High Core Counts  Poor (threads sit idle waiting for frame references)  Excellent (threads work on sub-frame rows)
Memory Overhead                 Lower                                                 Slightly higher due to tracking sub-frame dependencies