One of the big problems with video, especially when watching it online, is the amount of bandwidth it takes up. Bitrate is a particular concern when it comes to things like live streaming and video conferencing. Researchers at Nvidia think they've found a way around the limitations of existing video codecs, though, with the development of a new neural network engine.
The new engine sidesteps traditional video codecs entirely, bringing the amount of bandwidth required for video streaming down to a fraction of what it would normally use with something like H.264.
The new technology is built on Nvidia Maxine, the company's cloud-AI video streaming platform for developers. Nvidia posted a video explaining the new technology to YouTube, which you can see above.
It essentially works by sending the usual keyframes you'd expect with H.264, but instead of generating whole in-between frames and sending those down the pipe, it creates a sort of mask of the subject, focusing on key parts of their face. The movements of these key locations are then sent instead of image data for each frame, and the recipient sees an AI-reconstructed image.
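The idea can be sketched in a few lines of Python. This is purely illustrative and not Nvidia's actual Maxine API: the keypoint count, the payload format, and the stubbed-out reconstruction step are all assumptions made for the sake of the example.

```python
import json

# Illustrative sketch of keypoint-based video transmission.
# Nothing here reflects Nvidia's real implementation.

NUM_KEYPOINTS = 68  # a common face-landmark count, assumed here


def encode_frame(keypoints):
    """Sender side: pack a frame as facial keypoint coordinates
    instead of pixel data. The payload is tiny compared to an image."""
    return json.dumps({"kp": keypoints}).encode("utf-8")


def decode_frame(payload, reference_keyframe):
    """Receiver side: in the real system, a generative model warps the
    reference keyframe to match the received keypoints. The AI
    reconstruction step is stubbed out here."""
    kp = json.loads(payload.decode("utf-8"))["kp"]
    return {"reference": reference_keyframe, "keypoints": kp}


# One frame's worth of (x, y) keypoints, with dummy coordinates.
keypoints = [(float(i), float(i)) for i in range(NUM_KEYPOINTS)]
payload = encode_frame(keypoints)

# For comparison: a raw 720p frame is 1280 * 720 * 3 bytes of pixel data.
raw_frame_bytes = 1280 * 720 * 3
print(len(payload), "bytes vs", raw_frame_bytes, "bytes raw")
```

The point of the sketch is the asymmetry: the per-frame payload is a few kilobytes of coordinates at most, while even a compressed pixel-based frame is far larger.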
This drastically reduces bandwidth, as only a relatively tiny data set is sent to the recipient instead of entire images or blocky chunks of changed pixel data.
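Some back-of-the-envelope arithmetic shows why the savings are so large. All of the numbers below are illustrative assumptions, not figures published by Nvidia: 68 keypoints, two 32-bit floats each, 30 frames per second, against a rough 1 Mbps H.264 video-call stream.

```python
# Rough bitrate comparison with assumed, illustrative numbers.

fps = 30
keypoints = 68              # assumed face-landmark count
bytes_per_keypoint = 8      # two 32-bit floats (x, y)

# Bits per second for the keypoint stream.
keypoint_bps = keypoints * bytes_per_keypoint * fps * 8

# ~1 Mbps, a ballpark figure for an H.264 video call.
h264_call_bps = 1_000_000

print(f"keypoint stream: {keypoint_bps / 1000:.0f} kbps")
print(f"H.264 stream:    {h264_call_bps / 1000:.0f} kbps")
print(f"reduction:       ~{h264_call_bps / keypoint_bps:.0f}x")
```

Even with uncompressed coordinates and before accounting for the occasional keyframe, the keypoint stream comes in at a small fraction of the video-call bitrate.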
Nvidia says this technology also offers other advantages for video conferencing, such as being able to reposition the subject's head to face the camera. For most of us, looking directly at the camera isn't practical: our cameras usually sit on top of the monitor that's displaying the other side of the conversation, or off to the side, so we're never looking straight into the lens. The new Nvidia technology solves that problem by helping to make sure subjects appear to be looking at each other while they talk.
And, naturally, because it's able to understand the shape and structure of your face, you're able to swap yourself out for things like virtual characters – although that isn't really anything new. My phone's been able to do that for a couple of years now.
The technology is still in its early days, and you can see some issues with this technique in the video up top. When you compare the quality against the amount of bandwidth used, though, there's really no contest for something like video conferencing.
As the technology matures and becomes more reliable, and as cameras and screens keep increasing in resolution, this or a similar approach could become the key to streaming super-high-resolution, high-frame-rate data of all types into our homes without major hardware upgrades to increase the available bandwidth.