2021 Video Processing Pipeline

In this article, we'll be discussing the video pipeline used for the site's major LP projects beginning around quarter 4 of 2020. This article therefore deprecates all previous articles, but they will be preserved for archival and educational purposes.

Editor's Note (2025) - I recall this article being nearly finished, but it seems to be missing some context and possibly some updated notations from that era, as well as screenshots of the content I was using for comparisons and demonstrations. Since the source of some of the information and the testing material was originally the Drunk Swede who's long offed himself, and I otherwise still use this pipeline to the present day, I am opting to publish it. I may write a modern pipeline article to standardize the publication in the future; for now this is being placed under the incomplete section as-is.

Premise

Although the years of 2019 and 2020 were tumultuous and tragic, efforts were still placed into the production of video content and the betterment of that production. Drunk Swedish people pursued research in the background, perhaps for tism reasons more than anything, and the results of their research were brought to the forefront and put under scrutiny several times throughout the years until finally, in late 2020, I decided to test the current rendition of an experimental x265 pipeline in the Shadow Warrior run. The results were significant when compared to prototype x264 proposals for the run at the time, and so I decided to adopt the x265 pipeline for now and see how it fares under more extensive workloads. That said, as of this writing, I have not been doing any extensive workloads, and my days of video casting are behind me, at least for a very long time. I'll be referencing mostly third party material for this article.

The following article covers my current pipeline and configurations, and details the observed advantages and disadvantages of the setup.

x265

x265 is a relatively new codec that hasn't seen much use for the simple fact that, out of the box, it offers no advantages over x264, is heavier on encode times, and isn't supported by many trendy devices or services. When tested throughout its early life we never found any particular reason to use it over x264: though it offered some benefits when compared to its ancient ancestor, it often had significant warping and quality loss elsewhere. To make matters worse, newer versions of ffmpeg use dumber and dumber default settings, requiring more and more configuration to get acceptable results.

However, Sweedtism began to finally make headway on producing more reliable results with the codec. Currently, we're using x265 for LP productions, changing settings on a per-project basis, and observing the results.

Caveats

Before we get into actually discussing my current pipeline and showing comparison screenshots, there are a few upfront caveats we need to bring up right away.

Speed

Make no mistake, x265's encode speed is significantly worse than x264's, usually by a factor of two at the minimum. This isn't an issue on my modern system, but on the system I was using two years ago, where encode speeds averaged 2-3fps, x265 would have been ill-advised for big projects.

Size

x265 can make use of lower quality CRF settings than x264 but the bitrate will often be larger, especially when we start making use of aq to push more bitrate into darker colors for data preservation.

Accuracy

x264 is superior for accuracy in very high quality CRF environments (such as those used in sprite based games at ~CRF12). x265 is not very scalable.

Introduction

When thinking in the context of producing large-scale video projects like those on GP, it's best to think of the comparison of x264 and x265 as a comparison of compression algorithms. Since videos are composed of two layers of data - Chroma and Luma - the compression of both is something we need to pay attention to. We're preserving the full resolution of our colors by disabling the extremely destructive, asinine behavior of Chroma Subsampling, but the Chroma still gets compressed by the actual codec and that behavior depends greatly on many factors, not just bitrate.
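To put a number on what Chroma Subsampling throws away before the codec even starts compressing, here's a minimal sketch that counts the raw bytes in one frame for the common layouts. The 1920x1080 resolution and 8-bit depth are assumptions for illustration, not figures from any specific project.

```python
# Raw (uncompressed) frame size for common chroma layouts.
# Each layout maps to the (horizontal, vertical) divisor applied
# to each of the two chroma planes.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # chroma kept at full resolution
    "4:2:2": (2, 1),  # chroma halved horizontally
    "4:2:0": (2, 2),  # chroma halved in both directions
}

def frame_bytes(width, height, layout, bytes_per_sample=1):
    """Size of one raw frame: one luma plane plus two chroma planes."""
    div_x, div_y = CHROMA_DIVISORS[layout]
    luma = width * height
    chroma = 2 * (width // div_x) * (height // div_y)
    return (luma + chroma) * bytes_per_sample

# Assumed example frame: 1920x1080 at 8 bits per sample.
for layout in CHROMA_DIVISORS:
    size = frame_bytes(1920, 1080, layout)
    print(f"{layout}: {size / 1e6:.2f} MB per raw frame")
```

At 4:2:0 the frame carries exactly half the raw samples of 4:4:4, and every discarded sample is color information. That is the data we keep by recording and encoding in i444.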

The Tradeoff of Compression

As discussed in earlier articles, one of the goals of our work in setting up our recording environment was to provide the encoder with as unpolluted a signal as possible. That meant using i444 recordings (easily obtained by changing a few lines in the OBS source code), using full-range, taking advantage of HDCP-stripping splitters for consoles, and encoding to an extremely low CRF of 11 or lower for virtually 1:1 quality with source. All of this is done to ensure that the encoder is working with the correct information.

More accurate data resulted in massive quality jumps with our existing quality settings. By changing some settings like psy-rd, we can adjust how bitrate budget is distributed afterwards. But, compression is still a tradeoff, and by compressing motion picture we're always going to be losing data - that's how compression works.
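As a rough illustration of where settings like psy-rd and aq live, here's a sketch that builds an ffmpeg command line for a 4:4:4 x265 encode. Every value in it (CRF 18, the slow preset, psy-rd 2.0, aq-mode 3, the file names) is a placeholder assumption, not the settings used for any project on this site, and 10-bit 4:4:4 output assumes an ffmpeg build whose libx265 supports it.

```python
import shlex

def x265_command(src, dst, crf=18, preset="slow", psy_rd=2.0, aq_mode=3):
    """Build an ffmpeg argument list for a 4:4:4 10-bit x265 encode.

    All defaults are illustrative placeholders; tune them per project.
    """
    # x265-specific options are passed as one colon-separated string.
    x265_params = f"psy-rd={psy_rd}:aq-mode={aq_mode}"
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx265",
        "-preset", preset,
        "-crf", str(crf),
        "-pix_fmt", "yuv444p10le",  # no chroma subsampling, 10-bit
        "-x265-params", x265_params,
        "-c:a", "copy",             # leave the audio stream untouched
        dst,
    ]

# Hypothetical file names, printed rather than executed.
cmd = x265_command("input.mkv", "output.mkv")
print(shlex.join(cmd))
```

The function only assembles and prints the command, so nothing is encoded until the list is handed to something like `subprocess.run`. Keeping the tunables as arguments makes it easy to change settings on a per-project basis.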

Now that our source is accurate, we can focus more on the actual compression itself. This is where the x264 and x265 comparisons can be best made, and the comparisons can sometimes be tricky.

We're balancing the load of both Chroma and Luma, and when we increase the bandwidth demands for one, either by improving accuracy with 10bit or i444, or by using a noisier scene in general (like titles with hurrdef post processing), the codec will be strained. Either our bitrate will skyrocket, and with it the file size, or we'll start seeing significant viewer-facing distortion. More often than not, both things happen with shitty services that utilize defaults across the board except for turning off features that cost processing time - like youtube. For the projects on this site, I've been steadily making more and more compromises, offering up bitrate in exchange for preserving the more grievous offenders. In an age where most household PCs are burning untold terabytes of bandwidth on running twitch-based camwhore streams 24/7, adding a few gb onto a DDL release has become less and less of a concern for me.

Even so, preserving the intense noise of some titles like Shadow Warrior in the existing setup was becoming rather difficult. The nauseating camera shaking, tone mapping and noise-based texturing across intensely contrasted colors made for worst-case compression scenarios and I was starting to look at upwards of 3-4gb per hour of video with the resulting quality not being that great. I contemplated using a Spline resize, but I didn't want to compromise this time around. x265 ultimately rescued this project; though the file sizes remained very lofty, the quality between the two encodes was substantially different.
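For a sense of scale, the 3-4gb per hour figure can be converted into an average bitrate with some quick arithmetic. This sketch assumes decimal units (1 GB = 1000 MB) and treats the whole file as one stream, so the result includes audio and container overhead.

```python
def average_bitrate_mbps(size_gb, duration_hours=1.0):
    """Average bitrate in megabits per second for a file of the given size.

    Uses decimal units: 1 GB = 1000 MB, 1 MB = 8 megabits.
    """
    megabits = size_gb * 1000 * 8
    seconds = duration_hours * 3600
    return megabits / seconds

for size_gb in (3, 4):
    rate = average_bitrate_mbps(size_gb)
    print(f"{size_gb} GB per hour is about {rate:.1f} Mbps")
```

That works out to roughly 6.7 to 8.9 Mbps on average.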

Thanks to years of work on behalf of Canada's better half, Olaf Bearblower composited some javascript for us to copypaste to help better illustrate comparisons between images. Some worst-case scenario frames have been abducted from various sources and strung up in a fiesta for perusal.

Comparison A - DDDA

> Dragon's Dongma Comparison

Both codecs perform poorly in this scene filled with sharp contrasts at 27 CRF. Comparing either x264 or x265 to the source shows the extreme price of compression.

Highlights on armor and the character's staff disappear.
The rightmost NPC's face blurs out to near indistinguishability.
Most of the texture detail on ground bricks and the table is lost.
Neither codec can handle the wall on the left, warping bricks; both Luma and Chroma suffer.

However, comparing x264 to x265 yields some very queer results.

The circular headpiece for the staff is more muddied in x264.
The wall is much darker and more desaturated in x264.
Dark noise introduced by x264 is much less apparent in x265, and heavily compressed sections overall are smoothed out more gracefully.
x264 preserves the table more faithfully than x265, but x265 preserves the box on the table more faithfully.
x264 preserves the face of the rightmost NPC more accurately and sharply, but at the cost of introducing significantly more artifacting.
While the background objects behind him get washed out badly by both codecs, x265 preserves the dark grunge details on the bathtub more accurately.
Teal and the dark staircase in front of him have more sharp highlights preserved in x264 but the noise is enormous compared to x265.

The comparison suggests that, in this single scenario, x265 is trading some noise and some smaller details for more color accuracy. Parts of the frame get smoothed out that we wish didn't, but the parts that get affected by huge shifts in color or brightness in x264 aren't nearly as fucked in x265. Both codecs still exhibit an extremely destructive warping behavior on dark lines and sharp contrasts, either annihilating highlights or causing masonry to turn into sticky spaghetti caught between two Ram Ranch cowboys in a barn.

It's worth noting that, in motion, I've noticed more active distortion caused by x265 in very fringe cases - like character weapons moving against bright backgrounds filled with high contrast darks and bright spots. A difficult scenario for both codecs, and x265 is far from a "solution" for many edge cases only visible with actual video.

Comparison B - DDDA 00_00_50

> Dragon's Dongma Comparison 2

A simple scene that features a prominent foreground Luma gradient from the character's lantern. Compared to the Original, we see both codecs perform the same things poorly.

Preservation of detail on the floorboards above the stairs is ineffective.
Both x264 and x265 struggle with the lighting shift from bright to dark, both on the left wall and floor.
Both codecs drop highlights on the character's weapon and the wooden beam to the right.

Again, comparing x264 to x265 yields more curry to the corn.

While both codecs introduce artifacting around the minimap, x264's is much more severe.
Both codecs introduce light pollution to the area surrounding the minimap, but again x264's is much more severe.
x264 introduces warping on the Star and Crescent minimap icon. x265 preserves it. Inshallah!
All of the compression in the forefront of the scene is significantly noisier in x264, not to speak of the Chroma shifting being worse as well. x265, at worst, introduces compression-related blurring that softens the wood, though x265 isn't able to retain the colors on the top step any better than x264, likely due to a lot of it being from higher contrast areas.
While both codecs soften the already low-resolution steps, x265 preserves texture detail better.
x264 loses details on many wood panels, especially on the vertical beams, including the illuminated sharp angled surface on the right one, either due to artifacting or due to its algorithm being unable to handle the motion. x265 blurs them at worst, resulting in visibly sharper and more accurate details compared to the original.
The wooden bench warps due to artifacting from x264 but only softens in x265.
The books on the bench make for an interesting comparison. x264 loses the open book's text and shadows for the bend in the upwards-facing pages, and the two stacked books warp considerably. x265 handles these more gracefully.
The crates next to the bench become grainy with noise in x264 but only soften in x265.

This comparison heavily favors x265. Overall the image irreversibly changes, but x265 handles the change more smoothly, literally and figuratively. The things that neither codec can handle, like the top step, don't really make much sense in the context of the rest of the image. As complex a scene as it is, though, x265 seems to be able to handle the challenge favorably.

It's important to add that the video sizes for these comparisons are equivalent, if the x265 files aren't actually smaller. What's really making the difference is how x265 is allocating and distributing that bandwidth; x265 is showing a superior approach to noise management, especially with color and brightness ranges that traditionally give encoding a very rough time in the rear. That said, there are still plenty of cases neither codec can handle, which ultimately demand more bandwidth/size or a better codec to be developed.
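The comparisons above were made by eye. If you wanted a number to go alongside them, one option is to measure Luma and Chroma error separately. The sketch below is an assumption on my part rather than anything used for this article: it converts 8-bit RGB pixels to full-range BT.709 Y'CbCr and reports PSNR per plane. PSNR says nothing about warping or perceived noise, so it can only supplement the visual comparison.

```python
import math

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.709 conversion of one 8-bit RGB pixel to (Y, Cb, Cr)."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (b - y) / 1.8556 + 128
    cr = (r - y) / 1.5748 + 128
    return y, cb, cr

def plane_psnr(source, encoded):
    """PSNR in dB for Y, Cb and Cr.

    Both arguments are equal-length lists of (r, g, b) tuples,
    for example the same frame taken from the source and an encode.
    """
    sums = [0.0, 0.0, 0.0]
    for src_px, enc_px in zip(source, encoded):
        src_planes = rgb_to_ycbcr(*src_px)
        enc_planes = rgb_to_ycbcr(*enc_px)
        for i, (s, e) in enumerate(zip(src_planes, enc_planes)):
            sums[i] += (s - e) ** 2
    result = {}
    for name, total in zip(("Y", "Cb", "Cr"), sums):
        mse = total / len(source)
        # Identical planes have zero error, so PSNR is unbounded.
        result[name] = math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)
    return result

# Tiny synthetic example: the "encode" is uniformly brighter.
source = [(100, 100, 100), (40, 40, 40)]
encoded = [(110, 110, 110), (50, 50, 50)]
print(plane_psnr(source, encoded))
```

A higher PSNR means the plane is closer to the source. In the synthetic example only brightness changes, so the Y figure drops while Cb and Cr stay near-perfect.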

2021 Audio Editing Pipeline

To take advantage of the new Rode M2 XLR setup, I began recording in 32-bit and 24-bit formats and spent some time experimenting between different programs and configurations looking for the ideal setup. Ultimately, I ended up ditching Audition entirely for processing audio, over 20 years after I started using its first versions (Cool Edit).

I record at a low volume to discourage pops, but not so low that I need to seriously muck with settings on Ventrilo. Once the audio is extracted from the videos for editing, it enters a Vegas project where I run three (3) sets of compressors on it.

The compressors, in order, perform the following tasks:

Vegas Compressor - A simple stock compressor that levels out the volumes a little bit. Since the Vegas stock compressor is not very good, we don't use it for anything more than sort of a foundation for the heavy lifters. The slow attack and release settings are to discourage bouncing.

Ozone Compressor 1 - This handles the volume rebalancing, using a Loudness Maximizer coupled with a single band compressor. This one uses very aggressive attack settings to discourage clipping, and the Loudness Maximizer actually reduces the final volume a bit.

Ozone Compressor 2 - De-Esser. It's modulated specifically to target my voice.
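To show what attack and release settings actually do in a chain like this, here's a minimal sketch of a single feed-forward compressor. It is not how Vegas or Ozone implement theirs, and the threshold, ratio and timing defaults are assumptions picked for illustration, not my project settings.

```python
import math

def compress(samples, sample_rate, threshold_db=-18.0, ratio=4.0,
             attack_ms=10.0, release_ms=200.0):
    """Feed-forward compressor over float samples in the range [-1.0, 1.0].

    All defaults are illustrative, not taken from any real plugin preset.
    """
    # One-pole smoothing coefficients derived from the time constants.
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Follow rising levels at the attack rate, falling ones at the release rate.
        coeff = attack if level > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * level
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        # Only the portion above the threshold is reduced, by the given ratio.
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1.0 - 1.0 / ratio)
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out

# One second of a steady loud signal at an assumed 48 kHz sample rate.
loud = compress([0.5] * 48000, 48000)
print(f"first sample {loud[0]:.3f}, last sample {loud[-1]:.3f}")
```

A short attack clamps down on loud passages almost immediately, which is the idea behind using aggressive attack settings to discourage clipping. A long release lets the gain recover slowly, which is what keeps the volume from bouncing between words.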

The results are sort of ho hum. While the audio is a lot smoother and the dynamic range is more consistent, it's still not perfect. Depending on where I sit, the audio easily gets overdriven too much for my liking, but even small setting changes can result in breaking the entire setup.