Video Processing on Quantum Computers - $15
Date: April 3, 2024
Topics: 2024 BEITC Proceedings, Striving for Efficiency in Video Technology
Quantum computing is a multidisciplinary field spanning computer science, physics, and mathematics that uses quantum mechanics to solve certain complex problems faster than classical computers can. Quantum computers are available today on the cloud, although they are considered “Noisy Intermediate-Scale Quantum” (NISQ) machines: they offer a small number of quantum bits (qubits), and their performance is limited by short coherence times and noisy gates. However, quantum computers are improving all the time, and in the future they could provide acceleration to video processing workflows. This presentation will give a short overview of quantum computing basics and some methods for representing images in qubits, and will describe some of the research on potential video applications of quantum computing.
Thomas Edwards | Amazon Web Services | Seattle, Wash., United States
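A common method for representing images in qubits, which an overview like this one may touch on, is the Flexible Representation of Quantum Images (FRQI). The sketch below encodes a 2x2 grayscale image with Qiskit; it is an illustrative example, not necessarily the representation the presentation uses, and the pixel values are placeholders.

```python
# Illustrative FRQI encoding of a 2x2 grayscale image (a sketch, not the
# presenter's method): 2 position qubits index the pixels, 1 color qubit
# carries each intensity as a rotation angle.
import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit.library import RYGate

# Placeholder pixel intensities in [0, 1], row-major order.
pixels = np.array([0.0, 0.25, 0.5, 1.0])
thetas = pixels * (np.pi / 2)  # FRQI maps intensity to an angle in [0, pi/2]

qc = QuantumCircuit(3)   # qubits 0-1: position, qubit 2: color
qc.h([0, 1])             # uniform superposition over the 4 pixel positions

for pos, theta in enumerate(thetas):
    # Flip the position qubits that should be |0> so the doubly-controlled
    # rotation fires only for this pixel's position index.
    zero_bits = [q for q in (0, 1) if not (pos >> q) & 1]
    if zero_bits:
        qc.x(zero_bits)
    qc.append(RYGate(2 * theta).control(2), [0, 1, 2])
    if zero_bits:
        qc.x(zero_bits)

print(qc.draw())
```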
Vision and Language Models for Enhanced Archive Video Management - $15
Date: March 21, 2025
Topics: 2025 BEITC Proceedings, AI Applications: Sports, Newsrooms and Archives
Archival video collections contain a wealth of historical and cultural information. Managing and analyzing this data can be challenging due to the lack of metadata and inconsistent formatting across sources. In particular, identifying and separating individual stories within a single archived tape is critical for efficient indexing, analysis and retrieval, yet manual segmentation is time-consuming and prone to human error. To address this challenge, we propose a novel approach that combines vision and language models to automatically detect transition frames and segment archive videos into distinct stories. A vision model is used to cluster the frames of the video. Using recent robust automatic speech recognition and large language models, a transcript, a summary and a title are generated for each story. By reusing features computed during transition-frame detection, we also propose a fine-grained chaptering of the segmented stories. We conducted experiments on a dataset of 50 hours of archival video footage. The results demonstrate a high level of accuracy in detecting and segmenting videos into distinct stories: specifically, we achieved a precision of 93% at an Intersection over Union threshold of 90%. Furthermore, our approach offers significant sustainability benefits, as it was able to filter out approximately 20% of the content from the 50 hours of video tested. This reduction in the amount of data that needs to be managed, analyzed and stored can lead to substantial cost savings and environmental benefits by reducing the energy consumption and carbon emissions associated with data processing and storage.
Khalil Guetari, Yannis Tevissen, Frederic Petitpont | Moments Lab Research | Boulogne-Billancourt, France
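To make the reported figure concrete, the sketch below computes temporal Intersection over Union between predicted and ground-truth story segments and precision at an IoU threshold of 0.9, matching the metric named in the abstract. The segment values and the greedy matching are assumptions for illustration, not the authors' exact evaluation protocol.

```python
# A minimal sketch of precision at a temporal-IoU threshold for story
# segmentation (assumed matching scheme, not the paper's exact protocol).
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)

def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over Union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def precision_at_iou(pred: List[Segment], gt: List[Segment],
                     threshold: float = 0.9) -> float:
    """Fraction of predicted segments that match some ground-truth
    segment with IoU >= threshold (each ground truth used at most once)."""
    unused = list(gt)
    hits = 0
    for p in pred:
        best: Optional[Segment] = max(
            unused, key=lambda g: temporal_iou(p, g), default=None)
        if best is not None and temporal_iou(p, best) >= threshold:
            unused.remove(best)
            hits += 1
    return hits / len(pred) if pred else 0.0

# Placeholder segments (seconds): two predicted stories vs. ground truth.
pred = [(0.0, 62.0), (62.0, 110.0)]
gt = [(0.0, 60.0), (61.0, 112.0)]
print(precision_at_iou(pred, gt, threshold=0.9))
```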
VVC Broadcast Deployment Update - $15
Date: March 21, 2025
Topics: 2025 BEITC Proceedings, Making ATSC 3.0 Better than Ever
The VVC video coding standard was finalized in July 2020 and is thus almost five years old. In light of this approaching anniversary, VVC’s deployment status is reviewed. Major milestones in VVC deployment are detailed, and a comparison with the preceding HEVC codec is provided at an equivalent point in time.
Justin Ridge | Nokia | Dallas, Texas, United States
Lukasz Litwic | Ericsson | Warsaw, Poland
Watson Captioning Live: Leveraging AI for Smarter, More Accessible Closed Captioning - $15
Date: April 26, 2020
Topics: 2020 BEITC Proceedings, Using Artificial Intelligence for Closed Captioning
The requirements for closed captioning were established more than two decades ago, but many broadcasters still struggle to deliver accurate, timely, and contextually relevant captions. Breaking news, weather, and entertainment programming often feature delayed or incorrect captions, further demonstrating that there is great room for improvement. These shortcomings lead to a confusing viewing experience for the nearly 48 million Americans with hearing loss and for any other viewers who need captioning to fully digest content. Committed to transforming broadcasters’ ability to provide all audiences with more impactful viewing experiences, IBM Watson Media launched Watson Captioning Live, a trainable, cloud-based solution that produces accurate captions in real time to ensure audiences have equal access to timely and vital information. Combining breakthrough AI technology such as machine learning models and speech recognition, Watson Captioning Live redefines industry captioning standards. The solution uses the IBM Watson Speech to Text API to automatically ingest and transcribe spoken words and audio within a video. Watson Captioning Live is trained to automatically recognize and learn from data updates to ensure timely delivery of factually accurate captions, and the product is designed to keep learning over time, increasing its long-term value for broadcast producers.
This paper will explore how IBM Watson Captioning Live leverages AI and machine learning technology to deliver accurate closed captions at scale, in real time, to make programming more accessible for all.
Brandon Sullivan | The Weather Company Solutions | Austin, Texas, United States
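The abstract names the IBM Watson Speech to Text API as the transcription backend. The sketch below shows a minimal batch transcription call with the ibm-watson Python SDK; the credentials, service URL, audio file, and model choice are placeholders, and Watson Captioning Live's actual live pipeline (streaming input, caption timing and formatting) is considerably more involved.

```python
# A minimal sketch of transcription via IBM Watson Speech to Text,
# the service the abstract names. All credentials and file names are
# placeholders; this is batch recognition, not the live captioning path.
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('YOUR_API_KEY')   # placeholder credential
stt = SpeechToTextV1(authenticator=authenticator)
stt.set_service_url('YOUR_SERVICE_URL')            # placeholder endpoint

with open('broadcast_clip.wav', 'rb') as audio:    # placeholder audio file
    result = stt.recognize(
        audio=audio,
        content_type='audio/wav',
        model='en-US_BroadbandModel',
    ).get_result()

# Print the best transcript hypothesis for each recognized chunk.
for chunk in result['results']:
    print(chunk['alternatives'][0]['transcript'])
```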
What the Future Holds for Content Protection with CDN Edge - $15
Date: April 23, 2022
Topics: 2022 BEITC Proceedings, OTT 2: Open Caching and the Network Edge
OTT/D2C services seek to enable fast and effective watermarking at a lower cost to better prevent piracy. This session will outline how to tackle this need by encrypting content at the CDN edge.
Gwendal Simon | Synamedia | Rennes, France
Lionel Carminati | Synamedia | Rennes, France
Gwenaël Doërr | Synamedia | Rennes, France
Alain Durand | Synamedia | Rennes, France
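One widely used technique for forensic watermarking at the CDN edge is A/B variant switching: each media segment is prepared in two imperceptibly different watermarked versions, and the edge serves a per-session sequence of A/B picks that encodes a session identifier recoverable from a pirated copy. The sketch below illustrates that general idea; it is not necessarily the scheme this session presents, and the hashing and segment-naming conventions are assumptions.

```python
# A sketch of A/B edge watermarking (a common industry technique, not
# necessarily the authors' scheme): derive a deterministic per-segment
# A/B choice from a session ID so the delivered stream encodes that ID.
import hashlib

def variant_for(session_id: str, segment_index: int) -> str:
    """Pick variant 'A' or 'B' for one segment from the session's hash bits."""
    digest = hashlib.sha256(session_id.encode()).digest()
    byte = digest[(segment_index // 8) % len(digest)]
    bit = (byte >> (segment_index % 8)) & 1
    return 'B' if bit else 'A'

def segment_url(session_id: str, segment_index: int) -> str:
    """Map a segment request to the A or B watermarked variant's URL."""
    v = variant_for(session_id, segment_index)
    return f"/video/seg_{segment_index:05d}_{v}.m4s"  # hypothetical layout

# The A/B pattern recovered from a leaked copy maps back to the session.
print([segment_url('viewer-123', i) for i in range(4)])
```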