SVT-AV1 1D vs 2D Scalability Differences Explained

This article explains the difference between 1D (one-dimensional) and 2D (two-dimensional) scalability in the SVT-AV1 encoder (libsvtav1). It covers how each threading approach partitions the encoding workload, how that affects CPU utilization, and how well each one scales as the processor core count grows.

What is Scalability in SVT-AV1?

In the context of the Scalable Video Technology for AV1 (SVT-AV1) encoder, “scalability” refers to the software’s ability to split the encoding workload into parallel tasks that run simultaneously across multiple CPU cores and threads. The more independent tasks the encoder can expose at any moment, the more cores it can keep busy. This article distinguishes two ways of scaling the workload: 1D scalability and 2D scalability.

1D Scalability: Frame-Level Parallelism

1D scalability parallelizes the encoding process along a single dimension: time. The unit of parallel work is a whole frame.

In this mode, the encoder processes multiple video frames at the same time. While Frame 1 is being processed, Frame 2 and Frame 3 are also being encoded on separate threads.

How it works: The workload is distributed linearly across the video timeline. Each thread is assigned a full frame to encode.
Limitations: 1D scalability depends heavily on the look-ahead window and the Group of Pictures (GOP) structure. Inter-coded frames depend on previously encoded reference frames (in a hierarchical GOP, frames in higher temporal layers reference frames in lower layers), so a thread cannot start a frame until that frame's references have finished encoding, and threads often sit waiting.
Best use case: This mode works well on consumer-grade CPUs with lower core counts (typically 4 to 16 cores). Beyond roughly that point, adding more threads yields diminishing returns, because the number of frames that can be encoded concurrently is capped by these dependencies rather than by the number of threads.
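
The dependency bottleneck described above can be illustrated with a toy scheduler. The sketch below is not SVT-AV1 code: it assumes a hypothetical 9-frame hierarchical mini-GOP (the DEPS table), treats every frame as one unit of work, and counts how many steps a given number of worker threads needs when a frame may start only after all of its references are done.

```python
from typing import Dict, List

# Hypothetical hierarchical mini-GOP (an assumption for illustration):
# frame index -> list of frames it references.
DEPS: Dict[int, List[int]] = {
    0: [],        # key frame, no references
    8: [0],
    4: [0, 8],
    2: [0, 4],
    6: [4, 8],
    1: [0, 2],
    3: [2, 4],
    5: [4, 6],
    7: [6, 8],
}


def frame_parallel_steps(deps: Dict[int, List[int]], workers: int) -> int:
    """Count unit-time steps needed to encode all frames with `workers` threads.

    In each step, up to `workers` frames whose references are all finished
    are encoded; every frame is assumed to cost exactly one step.
    """
    done = set()
    steps = 0
    while len(done) < len(deps):
        ready = sorted(
            f for f in deps
            if f not in done and all(ref in done for ref in deps[f])
        )
        done.update(ready[:workers])  # extra workers stay idle this step
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "workers ->", frame_parallel_steps(DEPS, n), "steps")
```

Under these assumptions, one worker needs 9 steps, two need 6, and four or eight both need 5. The chain of references (0, 8, 4, 2, 1) sets the floor, not the thread count.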

2D Scalability: Joint Frame and Row-Level Parallelism

2D scalability overcomes the limitations of 1D scalability by parallelizing the workload along two dimensions simultaneously: temporally (across frames) and spatially (within a single frame).

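Instead of assigning a single frame to a single thread, 2D scalability breaks individual frames down into smaller spatial segments: rows of superblocks (64x64 or 128x128 pixel blocks in AV1, the counterpart of the LCU in HEVC) or tiles. Tiles can be encoded independently of one another, while superblock rows depend on their neighbors and must be scheduled with that in mind.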
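
To get a feel for how much spatial work a single frame exposes, the number of superblock rows and columns can be computed from the frame size. This is a minimal sketch; the function name superblock_grid is made up for illustration, and the only fact it relies on is that AV1 superblocks are 64x64 or 128x128 pixels, with partial blocks at the frame edge still counting as a block.

```python
from typing import Tuple


def superblock_grid(width: int, height: int, sb_size: int = 64) -> Tuple[int, int]:
    """Return (rows, cols) of superblocks covering a width x height frame."""
    if sb_size not in (64, 128):
        raise ValueError("AV1 superblocks are 64x64 or 128x128")
    cols = (width + sb_size - 1) // sb_size   # ceiling division
    rows = (height + sb_size - 1) // sb_size
    return rows, cols


if __name__ == "__main__":
    print(superblock_grid(1920, 1080))        # 64x64 superblocks
    print(superblock_grid(1920, 1080, 128))   # 128x128 superblocks
    print(superblock_grid(3840, 2160))
```

A 1920x1080 frame has 17 rows of 30 superblocks at 64x64, and 9 rows of 15 at 128x128, so the row count bounds how many threads can usefully share one frame.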

How it works: Multiple frames are still processed in parallel (the first dimension), and each of those frames is also divided into superblock rows or tiles that several threads encode at the same time (the second dimension). A wavefront-style schedule, as in Wavefront Parallel Processing (WPP), lets each row start slightly behind the row above it, as soon as the neighboring blocks it depends on are finished.
Advantages: This approach reduces thread idle time. A thread that would otherwise wait for a new frame to become available can instead help encode a row of a frame that is already in progress.
Best use case: 2D scalability is aimed at high-end workstation and server processors with high core counts (roughly 32 to 128 or more threads). It helps such systems keep most of their cores busy instead of stalling on the dependency bottlenecks inherent to pure frame-level parallelization.
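
The row offset mentioned above can be sketched with the textbook wavefront rule. The model below is an assumption, not SVT-AV1's actual scheduler: each block costs one step and may start once its left neighbor and the block above and to the right are done, with an unlimited number of threads. It reports how many steps a frame takes and how many blocks are ever in flight at once.

```python
from collections import Counter
from typing import Dict, List


def wavefront_schedule(rows: int, cols: int) -> List[List[int]]:
    """Earliest finish step of each block under a left + above-right dependency."""
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = finish[r][c - 1] if c > 0 else 0
            # The last column has no above-right block, so it waits on the one above.
            above_right = finish[r - 1][min(c + 1, cols - 1)] if r > 0 else 0
            finish[r][c] = 1 + max(left, above_right)
    return finish


def summarize(rows: int, cols: int) -> Dict[str, int]:
    """Compare one-thread encoding with the wavefront schedule."""
    finish = wavefront_schedule(rows, cols)
    blocks_per_step = Counter(t for row in finish for t in row)
    return {
        "serial_steps": rows * cols,
        "wavefront_steps": max(blocks_per_step),               # last finish step
        "peak_parallel_blocks": max(blocks_per_step.values()),
    }


if __name__ == "__main__":
    print(summarize(4, 8))
```

For a 4x8 grid this gives 14 wavefront steps instead of 32 serial ones, with at most 4 blocks in flight, one per row. Rows add parallel work inside a frame, but the ramp-up and ramp-down of the wavefront mean the gain is smaller than the row count alone suggests.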

Summary of Differences

Feature	1D Scalability	2D Scalability
Dimension of Parallelism	1 Dimension (Temporal / Frame-level)	2 Dimensions (Temporal + Spatial / Row-level)
Target Hardware	Consumer CPUs (4 to 16 cores)	Workstations and Servers (32+ cores)
Efficiency at High Core Counts	Limited (threads sit idle waiting for reference frames)	Better (idle threads pick up sub-frame rows)
Memory Overhead	Lower	Slightly higher due to tracking sub-frame dependencies