How SVT-AV1 Implements AV1 In-Loop Filters
This article provides an overview of how the SVT-AV1 (libsvtav1) encoder implements the three critical in-loop restorative filters defined in the AV1 specification: the Deblocking Filter (DBF), the Constrained Directional Enhancement Filter (CDEF), and the Loop Restoration (LR) filter. We will examine the sequential execution of these filters, the architectural design SVT-AV1 uses to parallelize their computation, and the algorithmic heuristics employed to balance encoding speed with visual quality.
The In-Loop Filtering Pipeline
In the AV1 specification, in-loop filtering is applied to reconstructed frames before they are placed into the Reference Frame Buffer to be used for predicting future frames. This process consists of three distinct filtering stages applied in a strict sequential order:
- Deblocking Filter (DBF): Attenuates blocking artifacts at transform unit and prediction unit boundaries.
- Constrained Directional Enhancement Filter (CDEF): Eliminates ringing and mosquito artifacts around sharp edges without blurring the edges themselves.
- Loop Restoration (LR) Filter: Uses Wiener filtering or Self-Guided restoration to restore fine details and high-frequency information lost during lossy compression.
SVT-AV1 implements this pipeline using a highly threaded, tile-based, and row-based parallel processing framework to maximize CPU utilization across modern multi-core processors.
1. Deblocking Filter (DBF) Implementation
The deblocking filter in SVT-AV1 is designed to smooth out artificial discontinuities created by block-based coding. It operates on both horizontal and vertical block boundaries, adjusting pixel values based on the transform size, quantization parameter (QP), and boundary strength.
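The boundary decision described above can be illustrated with a minimal one-dimensional sketch. It is a simplification for exposition, not the normative AV1 filter or SVT-AV1 source: the function name `deblock_edge`, the `level` parameter, and both threshold formulas are assumptions made for this example.

```python
def deblock_edge(p1, p0, q0, q1, level):
    """Smooth the two pixels adjacent to a block edge: p1 p0 | q0 q1.

    `level` stands in for a filter strength derived from QP. The threshold
    formulas below are illustrative assumptions, not the AV1 tables.
    """
    limit = max(1, level // 2)         # max variation allowed on each side
    blimit = 2 * (level + 2) + limit   # max step allowed across the edge

    sides_are_smooth = abs(p1 - p0) <= limit and abs(q1 - q0) <= limit
    step_is_small = 2 * abs(p0 - q0) + abs(p1 - q1) // 2 <= blimit
    if not (sides_are_smooth and step_is_small):
        return p0, q0  # large step: probably a real edge, leave it alone

    # 3-tap low-pass [1 2 1] / 4 centred on each boundary pixel, with rounding
    new_p0 = (p1 + 2 * p0 + q0 + 2) >> 2
    new_q0 = (p0 + 2 * q0 + q1 + 2) >> 2
    return new_p0, new_q0


print(deblock_edge(100, 100, 104, 104, level=16))  # small step: (101, 103)
print(deblock_edge(50, 50, 200, 200, level=16))    # strong edge: (50, 200)
```

A higher `level` widens both thresholds, so coarser quantization leads to more boundaries being smoothed, while steps that exceed the threshold are treated as genuine image edges and preserved.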
Optimization and Parallelism
- SIMD Acceleration: SVT-AV1 relies heavily on hand-optimized SIMD kernels for AVX2, AVX-512, and Arm Neon to accelerate boundary filtering, which is highly repetitive and computationally expensive at the pixel level.
- Row-based Deblocking: To avoid waiting for an entire frame to finish reconstructing, SVT-AV1 processes deblocking on a row-by-row basis (often aligned with Superblock boundaries) as soon as the reconstruction dependencies for those rows are resolved.
2. Constrained Directional Enhancement Filter (CDEF) Implementation
CDEF identifies the dominant texture direction in each 8x8 block and applies a low-pass filter along that direction, smoothing ringing artifacts while leaving the edge itself intact.
Search and Parameter Selection
Evaluating every possible combination of CDEF damping and strength parameters across an entire frame is computationally prohibitive. SVT-AV1 optimizes this search through:
- Fast Heuristics: The encoder estimates the optimal CDEF parameters based on frame-level and block-level variance, skip-mode flags, and quantization parameters.
- Early Termination: If a block has low spatial variance or is determined to be flat, SVT-AV1 skips the directional search entirely.
- SIMD Direction Search: The search over the 8 candidate directions is highly vectorized, allowing the encoder to quickly calculate the sum of squared differences (SSD) for each direction and pick the best match.
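The direction search can be sketched in scalar form as follows. For each direction, the pixels of the 8x8 block are grouped into lines; replacing every pixel by the mean of its line gives an SSD of sum(x^2) - sum_k(S_k^2 / N_k), where S_k and N_k are the sum and pixel count of line k. Minimizing the SSD is therefore the same as maximizing the second term. The function name and the line-index formulas below are assumptions chosen for illustration (they mirror the general shape of reference CDEF implementations), and this is not SVT-AV1's vectorized code.

```python
def cdef_find_direction(block):
    """Return (best_direction, costs) for an 8x8 block of integer pixels."""
    # Line index of pixel (i = row, j = column) for each of the 8 directions.
    line_index = [
        lambda i, j: i + j,           # 45-degree diagonal
        lambda i, j: i + j // 2,
        lambda i, j: i,               # horizontal lines
        lambda i, j: 3 + i - j // 2,
        lambda i, j: 7 + i - j,       # opposite diagonal
        lambda i, j: 3 - i // 2 + j,
        lambda i, j: j,               # vertical lines
        lambda i, j: i // 2 + j,
    ]
    costs = []
    for index_of in line_index:
        sums, counts = {}, {}
        for i in range(8):
            for j in range(8):
                k = index_of(i, j)
                sums[k] = sums.get(k, 0) + block[i][j]
                counts[k] = counts.get(k, 0) + 1
        # Larger value means smaller SSD against the per-line means.
        costs.append(sum(sums[k] ** 2 / counts[k] for k in sums))
    best = max(range(8), key=lambda d: costs[d])
    return best, costs


horizontal = [[10 * i] * 8 for i in range(8)]                # rows are constant
vertical = [[10 * j for j in range(8)] for _ in range(8)]    # columns are constant
print(cdef_find_direction(horizontal)[0])  # index of the horizontal mapping
print(cdef_find_direction(vertical)[0])    # index of the vertical mapping
```

On a perfectly flat block all eight costs are equal, which is one way to see why the early-termination heuristic can skip the search for low-variance blocks without losing anything.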
3. Loop Restoration (LR) Filter Implementation
The Loop Restoration filter is the final and most complex stage of the AV1 in-loop filtering pipeline. It selects between two main filter types per Restoration Unit (RU), typically sized at 64x64, 128x128, or 256x256 pixels:
- Wiener Filter: A separable, symmetric 2D filter.
- Self-Guided Restoration (SGR-Proj): A filter based on guided filtering that combines two differently low-pass-filtered versions of the image.
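To show what "separable" and "symmetric" mean in practice, here is a minimal sketch of a Wiener-style filter. It assumes, in line with the AV1 design, a 7-tap kernel per direction built from three coded taps, with the centre tap derived so that the kernel sums to 128. The helper names are hypothetical, and the sketch simplifies the real filter: it rounds after each pass, replicates edge samples, and does not clip to the pixel range.

```python
def wiener_kernel(c0, c1, c2):
    """Build a symmetric 7-tap kernel whose taps sum to 128 (unity gain)."""
    center = 128 - 2 * (c0 + c1 + c2)
    return [c0, c1, c2, center, c2, c1, c0]


def filter_1d(line, kernel):
    """Apply a 7-tap kernel to a list of ints, replicating edge samples."""
    n = len(line)
    out = []
    for x in range(n):
        acc = 0
        for t, w in enumerate(kernel):
            pos = min(max(x + t - 3, 0), n - 1)  # clamp to the valid range
            acc += w * line[pos]
        out.append((acc + 64) >> 7)  # round, then divide by 128
    return out


def wiener_separable(img, h_taps, v_taps):
    """Horizontal pass over rows, then vertical pass over columns."""
    kh, kv = wiener_kernel(*h_taps), wiener_kernel(*v_taps)
    rows = [filter_1d(row, kh) for row in img]
    cols = [filter_1d(list(col), kv) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


img = [[10, 20, 30], [40, 50, 60]]
print(wiener_separable(img, (0, 0, 0), (0, 0, 0)))  # identity taps keep img
```

Because the filter is separable, each pixel costs 7 + 7 multiplications instead of 49, and symmetry means only three taps per direction need to be signalled in the bitstream.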
SVT-AV1 Heuristics and Speed Presets
Because Loop Restoration is computationally intensive, libsvtav1 scales its search effort aggressively according to the user-selected speed preset:
- High-Quality Presets (Low Speed): SVT-AV1 performs an exhaustive RDO (Rate-Distortion Optimization) search across both Wiener and SGR-Proj filters for every Restoration Unit and keeps the configuration with the lowest rate-distortion cost.
- Medium to Fast Presets (High Speed): The encoder uses pre-analysis data, such as block-level distortion and temporal activity, to bypass the evaluation of one or both filters. It may restrict the search to only Wiener filtering or disable Loop Restoration entirely for specific frames (e.g., non-reference frames) or spatial regions with low detail.
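The trade-off can be sketched as a per-RU decision that minimizes J = D + lambda * R over whichever filters the preset allows. Everything concrete here is hypothetical: the preset thresholds, the function names, and the distortion and bit figures are made up for illustration and do not reflect SVT-AV1's actual preset tables.

```python
def candidates_for_preset(preset, is_reference):
    """Hypothetical mapping from speed preset to the LR filters searched."""
    if preset <= 3:                       # slow presets: search everything
        return ["none", "wiener", "sgrproj"]
    if preset <= 8:                       # mid presets: Wiener only
        return ["none", "wiener"]
    # fast presets: skip LR entirely on non-reference frames
    return ["none", "wiener"] if is_reference else ["none"]


def pick_restoration(distortion, bits, lam, allowed):
    """Choose the filter with the lowest cost J = D + lambda * R."""
    return min(allowed, key=lambda f: distortion[f] + lam * bits[f])


# Made-up measurements for a single Restoration Unit.
distortion = {"none": 1000, "wiener": 700, "sgrproj": 600}
bits = {"none": 1, "wiener": 40, "sgrproj": 60}

for preset, is_ref in [(2, True), (6, True), (10, False)]:
    allowed = candidates_for_preset(preset, is_ref)
    print(preset, pick_restoration(distortion, bits, 2.0, allowed))
```

Shrinking the candidate list removes the cost of computing `distortion` for the excluded filters, which is where the speed-up comes from; the price is that a better filter may go untested.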
Threading and Pipelining in SVT-AV1
SVT-AV1 uses a multi-stage, multithreaded architecture in which the in-loop filters are decoupled from the main encoding loop wherever possible.
Once a row of Superblocks completes reconstruction, it is handed off to the DBF, followed by CDEF, and then LR. This “pipelined” design lets different threads work on different parts of the frame concurrently instead of waiting for the entire frame to be reconstructed, which shortens latency and improves CPU cache locality because rows are filtered soon after they are produced.
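The hand-off between stages can be modelled with one worker thread per filter and a queue between each pair of stages. This is a toy model under stated assumptions, not SVT-AV1's scheduler: the function `run_pipeline` is hypothetical, each "filter" only records that it ran, and the sketch ignores the fact that real filters need pixels from neighbouring rows, which in practice requires extra dependency tracking.

```python
import queue
import threading


def run_pipeline(num_rows):
    """Push superblock rows through DBF -> CDEF -> LR, one thread per stage."""
    stages = ["dbf", "cdef", "lr"]
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    history = {row: [] for row in range(num_rows)}

    def worker(idx):
        while True:
            row = queues[idx].get()
            if row is None:                   # sentinel: pass shutdown along
                queues[idx + 1].put(None)
                return
            history[row].append(stages[idx])  # stand-in for real filtering
            queues[idx + 1].put(row)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(stages))]
    for t in threads:
        t.start()
    for row in range(num_rows):               # rows "finish reconstruction"
        queues[0].put(row)
    queues[0].put(None)
    for t in threads:
        t.join()

    finished = []
    while True:
        row = queues[-1].get()
        if row is None:
            break
        finished.append(row)
    return history, finished


history, finished = run_pipeline(4)
print(history[0])   # stages applied to row 0, in order
print(finished)     # rows in the order they left the LR stage
```

While the LR worker is busy with row 0, the CDEF worker can already process row 1 and the DBF worker row 2, which is the overlap the pipelined design is meant to achieve.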