
Deep Learning for Immersive Media Delivery

08 May
Wednesday, 05/08/2024 12:00pm to 2:00pm
CS 203 & Zoom
PhD Dissertation Proposal Defense
Speaker: Lingdong Wang

Immersive media refers to digital content that provides immersive and interactive experiences in Extended Reality (XR) scenarios. Although immersive media has the potential to revolutionize future communication and entertainment, delivering such content remains an unsolved challenge. Compared with regular 2D videos, 3D volumetric videos can incur orders of magnitude higher bandwidth consumption and processing latency. As a result, traditional streaming systems that rely solely on networking resources are unable to distribute immersive media over existing network infrastructure. In this thesis, we aim to overcome this communication bottleneck with additional computational resources, specifically through the application of deep learning (DL) techniques.
 
Recent advances in DL have made it possible to enhance and represent media content with superior visual quality. For example, neural enhancement methods such as super-resolution can improve the quality of a degraded video, while neural compression can achieve a better rate-distortion trade-off than traditional codecs. Building on these techniques, we propose a practical immersive media delivery system that transmits videos from the server to the client at low quality to improve efficiency, then recovers the video quality with DL at the client.
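
To make the transmit-low-quality-then-enhance idea concrete, here is a minimal, hypothetical sketch (PyTorch assumed; the model and function names are illustrative placeholders, not the proposal's actual system): the server downsamples a frame before transmission, and the client restores it with a small super-resolution network.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySR(nn.Module):
    # Toy 2x super-resolution network standing in for a real neural enhancer.
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 3 * 4, 3, padding=1),  # 3 channels * 2^2 for pixel shuffle
        )
        self.upsample = nn.PixelShuffle(2)

    def forward(self, lr_frame):
        return self.upsample(self.body(lr_frame))

def server_encode(frame, scale=2):
    # Server side: downsample before transmission to cut bandwidth.
    return F.interpolate(frame, scale_factor=1.0 / scale,
                         mode="bilinear", align_corners=False)

def client_enhance(lr_frame, model):
    # Client side: recover visual quality with the neural enhancer.
    with torch.no_grad():
        return model(lr_frame)

hr = torch.rand(1, 3, 256, 256)              # original high-resolution frame
lr = server_encode(hr)                        # what actually crosses the network
restored = client_enhance(lr, TinySR().eval())
print(lr.shape, restored.shape)               # (1, 3, 128, 128) -> (1, 3, 256, 256)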
 
We organize the thesis as follows. First, we investigate which DL methods to apply to immersive media. We propose an efficient video super-resolution model that leverages eye tracking, a functionality commonly available in XR devices. Our method adapts its DL operations to human vision and allocates computational resources optimally via convex optimization. Next, we study how to apply DL in immersive media delivery. We propose a control algorithm that jointly utilizes networking and computational resources to maximize the user's Quality of Experience (QoE). Our algorithm achieves a theoretical performance guarantee based on Lyapunov optimization, and its performance is verified in large-scale simulations and real-world systems. In the last part of the proposal, we present our plan to implement a volumetric video streaming system. We will begin by creating DL-based volumetric representations and compression techniques, and will then address the resource allocation problems associated with these representations, such as adaptive streaming and foveated rendering.
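
For intuition on the Lyapunov-based guarantee mentioned above, a generic drift-plus-penalty formulation (illustrative notation, not necessarily the exact one used in the proposal) chooses the control action in each time slot by minimizing

\[ \Delta(\Theta(t)) \;-\; V \,\mathbb{E}\big[\,\mathrm{QoE}(\alpha(t)) \mid \Theta(t)\,\big], \]

where \(\Theta(t)\) collects virtual queues tracking networking and computational resource constraints, \(\Delta(\Theta(t))\) is the one-slot Lyapunov drift, \(\alpha(t)\) is the control action (e.g., bitrate and enhancement decisions), and \(V\) trades off resource stability against QoE. Standard Lyapunov optimization results then give a gap of \(O(1/V)\) from the optimal QoE at the cost of \(O(V)\) queue backlog.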
 

Advisor: Ramesh Sitaraman

Join via Zoom