How SVT-AV1 Presets Dynamically Optimize AV1 Encoding
This article explains how the SVT-AV1 encoder
(libsvtav1) dynamically manages its extensive array of AV1
coding tools across different preset levels. We will examine the
mechanics of preset-based configuration, how the encoder uses runtime
content analysis to bypass or enable specific tools, and the trade-offs
made between computational complexity and compression efficiency from
Preset 0 through Preset 13.
The Role of Presets in SVT-AV1
SVT-AV1 uses a numerical preset system ranging from 0 (maximum efficiency, slowest encoding) to 13 (fastest encoding, lowest efficiency). Rather than requiring users to manually toggle dozens of individual AV1 coding tools—such as block partitioning, motion estimation search patterns, and in-loop filters—the encoder groups these tools into predefined configurations.
As you move from Preset 0 to Preset 13, libsvtav1
progressively scales back the search depth, disables complex prediction
modes, and swaps exhaustive rate-distortion evaluations for faster
heuristic approximations.
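As a rough illustration of how a preset is chosen in practice, the following Python sketch assembles an ffmpeg command line for libsvtav1. The `-c:v libsvtav1`, `-preset`, and `-crf` options are provided by ffmpeg's libsvtav1 wrapper; the helper name `build_svtav1_command`, the file names, and the default CRF value are placeholders invented for this example.

```python
import shlex


def build_svtav1_command(src, dst, preset, crf=35):
    """Return an ffmpeg argv list that encodes `src` with libsvtav1.

    The function name, default CRF, and file names are illustrative
    assumptions; only the ffmpeg options themselves are real.
    """
    if not 0 <= preset <= 13:
        raise ValueError("preset must be in the range 0..13")
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libsvtav1",
        "-preset", str(preset),
        "-crf", str(crf),
        dst,
    ]


if __name__ == "__main__":
    # Print one command per preset so the encodes can be timed and compared.
    for preset in (2, 6, 10):
        cmd = build_svtav1_command("input.y4m", f"out_p{preset}.mkv", preset)
        print(shlex.join(cmd))
```

Running the printed commands on the same clip and comparing encode time against output size is a simple way to observe the speed and efficiency trade-off for a given source.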
Runtime Content Analysis and Dynamic Bypassing
A defining characteristic of SVT-AV1 is its ability to adjust coding tool usage on the fly, even within a single preset. It achieves this through a multi-stage pipeline that uses the Resource Coordination and Picture Decision modules:
- Spatial and Temporal Variance Analysis: Before actual encoding begins, the encoder analyzes the input frames to determine spatial complexity (flat vs. textured areas) and temporal complexity (low vs. high motion).
- Dynamic Thresholding: If a frame or a block of pixels has low temporal variance (minimal motion), the encoder dynamically disables advanced inter-prediction tools for that block, regardless of the preset.
- RDO (Rate-Distortion Optimization) Simplification: RDO is the most CPU-intensive part of encoding, so SVT-AV1 uses early-termination algorithms. If a fast approximation such as SATD (Sum of Absolute Transformed Differences) indicates that a certain block partition or prediction mode will not yield a significant compression benefit, the encoder skips the full RDO calculation for that candidate.
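The early-termination idea can be sketched in a few lines of Python. The block below computes a 4x4 SATD with an unnormalized Hadamard transform and uses it to decide whether a candidate deserves a full RDO pass. The function names and the `margin` value are assumptions made for illustration; they are not taken from the SVT-AV1 source, which uses its own cost models and thresholds.

```python
def hadamard4(vec):
    """Unnormalized 4-point Hadamard transform of a length-4 sequence."""
    a, b, c, d = vec
    s0, s1 = a + b, c + d
    d0, d1 = a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd4x4(source, prediction):
    """Sum of absolute Hadamard-transformed differences for a 4x4 block."""
    residual = [[s - p for s, p in zip(src_row, pred_row)]
                for src_row, pred_row in zip(source, prediction)]
    rows = [hadamard4(row) for row in residual]        # transform each row
    cols = [hadamard4(col) for col in zip(*rows)]      # then each column
    return sum(abs(value) for col in cols for value in col)


def should_run_full_rdo(source, prediction, best_satd, margin=1.5):
    """Return False when the cheap estimate is far worse than the best so far.

    `margin` is a hypothetical tuning knob: a candidate whose SATD exceeds
    margin * best_satd is skipped without a full rate-distortion evaluation.
    """
    return satd4x4(source, prediction) <= margin * best_satd
```

A larger `margin` keeps more candidates alive (slower, closer to an exhaustive search), while a smaller one prunes more aggressively, which is the kind of knob a faster preset would tighten.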
How Specific Coding Tools Scale Across Presets
To understand how libsvtav1 balances speed and quality,
we can look at how key AV1 coding tools are dynamically throttled across
different preset levels:
1. Block Partitioning (NSQ Search)
AV1 allows blocks to be partitioned from \(128\times128\) down to \(4\times4\) pixels, including non-square (NSQ) partitions with aspect ratios of \(1:2\), \(2:1\), \(1:4\), and \(4:1\).
- Low Presets (0–3): SVT-AV1 performs an exhaustive search of all square and non-square partition sizes to find the structure with the lowest rate-distortion cost.
- Medium Presets (4–8): The encoder dynamically restricts NSQ searches based on parent-block depth and spatial variance. If a parent block is relatively homogeneous, the encoder skips smaller non-square partitions.
- High Presets (9–13): NSQ partitioning is completely disabled. The encoder searches only square partitions (\(1:1\)) and aggressively limits the maximum search depth to speed up the partitioning decision.
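The three tiers above can be expressed as a small gating function. This is a minimal sketch under stated assumptions: the preset boundaries follow the text, but `nsq_search_enabled`, `block_variance`, and the `variance_threshold` value are hypothetical names and numbers, not SVT-AV1 constants.

```python
def block_variance(block):
    """Population variance of the samples in a 2-D block."""
    samples = [value for row in block for value in row]
    mean = sum(samples) / len(samples)
    return sum((value - mean) ** 2 for value in samples) / len(samples)


def nsq_search_enabled(preset, parent_block, variance_threshold=25.0):
    """Decide whether non-square partitions are worth evaluating.

    The threshold is a made-up placeholder used only to show the shape
    of the decision.
    """
    if preset <= 3:
        return True       # low presets: always search NSQ shapes
    if preset >= 9:
        return False      # high presets: square partitions only
    # Medium presets: skip NSQ when the parent block is homogeneous.
    return block_variance(parent_block) > variance_threshold
```

With this shape, a flat region such as sky is never tested for non-square splits at medium presets, while a textured region still is.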
2. Motion Estimation (ME) and Search Patterns
Determining how blocks move between frames is highly resource-intensive.
- Search Range Scaling: Lower presets use wide search ranges and complex search patterns (like Exhaustive or Diamond search) to find the best motion vectors. Higher presets reduce the pixel search radius and switch to highly localized projection search patterns.
- Compound Prediction: AV1 supports compound inter-prediction (using two reference frames simultaneously). libsvtav1 dynamically limits the selection of compound reference pairs at higher presets, choosing to evaluate only the most likely candidates based on temporal distance.
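One way to picture the compound-pair pruning is to rank every possible reference pair by how far it sits from the current frame in display order and keep only the closest few. The sketch below does exactly that. It assumes frames are identified by picture order count (POC), and the function name and the `max_pairs` budget are hypothetical; the real encoder's selection logic is more involved.

```python
from itertools import combinations


def select_compound_pairs(current_poc, reference_pocs, max_pairs):
    """Keep the reference pairs that are temporally closest to the frame.

    Each pair is scored by the sum of its two temporal distances; the
    `max_pairs` budget stands in for a preset-dependent limit.
    """
    def pair_distance(pair):
        return sum(abs(current_poc - poc) for poc in pair)

    pairs = combinations(sorted(reference_pocs), 2)
    return sorted(pairs, key=pair_distance)[:max_pairs]


if __name__ == "__main__":
    # Five references give ten possible pairs; a fast preset might test three.
    print(select_compound_pairs(8, [0, 4, 6, 12, 16], max_pairs=3))
```

Because the number of pairs grows quadratically with the number of references, even a small budget removes most of the compound candidates from the search.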
3. In-Loop Filtering
AV1 features three in-loop filters: the Deblocking Filter (DBF), the Constrained Directional Enhancement Filter (CDEF), and the Loop Restoration (LR) filter.
- CDEF and Loop Restoration: In high-quality presets, the encoder tests the full set of CDEF filter strengths and Loop Restoration modes (Wiener and Self-Guided) to find the optimal settings. At faster presets, SVT-AV1 dynamically reduces the search space by choosing filter parameters based on frame-level statistics, or it may disable Loop Restoration entirely while keeping a simplified CDEF pass active.
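The effect of shrinking a filter search space can be shown with a toy example. The code below is not CDEF or Loop Restoration: it uses a simple 3-tap smoothing filter with a single strength parameter as a stand-in, and picks the strength that minimizes the squared error against the source. All names and the candidate lists are assumptions made for illustration.

```python
def smooth(signal, strength):
    """Toy 3-tap smoothing filter; a stand-in for a real in-loop filter."""
    last = len(signal) - 1
    out = []
    for i, centre in enumerate(signal):
        left = signal[max(i - 1, 0)]        # clamp at the left edge
        right = signal[min(i + 1, last)]    # clamp at the right edge
        out.append(centre + strength * (left + right - 2 * centre) / 8)
    return out


def sse(a, b):
    """Sum of squared errors between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_strength(source, reconstructed, candidates):
    """Return (best_strength, distortion) over the given candidates."""
    scored = ((s, sse(source, smooth(reconstructed, s))) for s in candidates)
    return min(scored, key=lambda item: item[1])


if __name__ == "__main__":
    source = [10, 10, 10, 10, 10, 10]
    reconstructed = [10, 12, 8, 12, 8, 10]
    print("full search:", pick_strength(source, reconstructed, range(5)))
    print("fast search:", pick_strength(source, reconstructed, [0, 4]))
```

The fast search evaluates fewer candidates and therefore can never find a lower distortion than the full search; the encoder accepts that possible loss in exchange for fewer filter passes.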
4. Quantization and Transform Decisions
- TX Search (Transform Size and Type): AV1 supports multiple transform types (DCT, ADST, Flip-ADST, Identity). Low presets perform a full search across all transform types and sizes. Medium to high presets dynamically restrict the transform sizes to match the partition size directly, skipping alternative transform type evaluations entirely.
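A simplified 1-D version of the transform-type search is sketched below. It applies each candidate transform to a residual, quantizes the coefficients, and keeps the transform with the smallest total coefficient magnitude as a crude rate proxy. Several things here are assumptions: the orthonormal DST-VII is used as a stand-in for AV1's ADST, the cost function and `qstep` are invented for the example, and real encoders work on 2-D blocks with separate row and column transforms.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out


def adst(x):
    """Orthonormal DST-VII, used here as a stand-in for AV1's ADST."""
    n = len(x)
    scale = 2 / math.sqrt(2 * n + 1)
    return [scale * sum(v * math.sin(math.pi * (2 * k + 1) * (i + 1) / (2 * n + 1))
                        for i, v in enumerate(x))
            for k in range(n)]


TRANSFORMS = {
    "DCT": dct2,
    "ADST": adst,
    "FLIPADST": lambda x: adst(x[::-1]),   # ADST applied to the reversed input
    "IDTX": lambda x: list(x),             # identity: no transform
}


def coefficient_cost(coeffs, qstep=2.0):
    """Crude rate proxy: total magnitude of the quantized coefficients."""
    return sum(abs(round(c / qstep)) for c in coeffs)


def best_transform(residual, allowed):
    """Return (name, cost) of the cheapest transform among `allowed`."""
    scored = ((name, coefficient_cost(TRANSFORMS[name](residual)))
              for name in allowed)
    return min(scored, key=lambda item: item[1])


if __name__ == "__main__":
    residual = [1, 3, 6, 9]
    print("full search:", best_transform(residual, TRANSFORMS))
    print("DCT only   :", best_transform(residual, ["DCT"]))
```

Restricting `allowed` to a single type removes three of the four transform evaluations per block, at the risk of missing a type that represents the residual more compactly.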