SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING https://revistas.set.org.br/ijbe <p>The <strong>SET-IJBE, the SET International Journal of Broadcast Engineering</strong>, is an open-access, peer-reviewed, article-at-a-time international scientific journal covering communications engineering in the field of broadcasting. The SET-IJBE seeks the latest and most compelling research articles and state-of-the-art technologies.</p> <p><strong>ISSN (Print): 2446-9246<br /></strong><strong>ISSN (On-line): 2446-9432</strong></p> <p>Impact factor: under attribution</p> <p><strong>On-line version</strong> - Once an article is accepted and its final version approved by the Editorial Board, it will be published immediately on-line, on a one-article-at-a-time basis.</p> <p><strong>Printed version</strong> - Once a year, all articles accepted and published on-line over the previous twelve months will be compiled for publication in a printed version.</p> SET - Brazilian Society of Television Engineering, or, in Portuguese, SET - Sociedade Brasileira de Engenharia de Televisão. en-US SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING 2446-9246 <p>Copyright Transfer Agreement – Cover Letter<br>The Copyright Transfer Agreement – Cover Letter must be submitted together with the article.<br>The Corresponding Author must, on behalf of all co-authors, complete all the required information, check the boxes, print, SIGN and scan the signed document.<br>The Copyright Transfer Agreement – Cover Letter must also be forwarded in PDF format. Template available at:</p> <p><a href="http://www.set.org.br/wp-content/uploads/2016/03/Copyright-Transfer-Agreement.docx">download template</a></p> Audience and Complexity Aware Orchestration https://revistas.set.org.br/ijbe/article/view/239 <p>Video encoding services are known to be computationally intensive [2]. In a software environment, it is desirable to be able to adapt to the available computing resources. Therefore, modern live video encoders have an “elasticity” feature: their complexity adapts automatically to the number and capabilities of the available CPU cores. In other words, the more CPU cores are allocated to a live video encoder, the higher the encoding performance. Until recently, the elasticity feature was used as an ad-hoc adaptation to uncontrollably varying conditions. In [1], the authors go a step further by developing a mechanism that allows seamless CPU reallocation in a Kubernetes cluster pod. It therefore becomes possible to control the trade-off between coding efficiency and CPU power consumption by deciding how much CPU to allocate to a given live channel. This process is called dynamic orchestration.</p> <p>Another noteworthy feature demonstrated in [1] is that not all content needs the same amount of CPU. There is a relationship between content characteristics and encoder algorithm convergence time: some content needs more CPU than other content to achieve equivalent coding performance.
Given a finite number of CPU cores and several video channels, the content-adapted dynamic orchestration process allocates a variable number of CPU cores to each channel, depending on each channel's content characteristics. The content characteristics are computed in real time in the lookahead stage of each channel's encoder and transmitted to the orchestrator. The CPU allocation is updated continuously, thus keeping CPU usage optimal. An experiment is provided in which a set of live channels with the same encoder configuration runs on a server. The total number of CPU cores to be allocated to the channels is fixed. The content complexity of every channel varies over time, and the channels generally differ from one another. A bitrate reduction of up to 13% at the same video quality was demonstrated in the dynamic allocation mode compared to a static mode in which every channel gets the same CPU allocation. Another setup was implemented with the same channels but with the target of reducing CPU usage: the dynamic allocation mode used 30% fewer CPU cores than the static mode to reach the same video quality and channel bitrate.</p> <p>Although optimizing the computing resource consumption of the video headend is a great success, it is only part of the picture. In an OTT live streaming scenario, the live channels are conveyed from the content provider's origin server to the end users, often through Content Delivery Networks (CDNs), using a unicast HTTP streaming protocol (Apple HLS, MPEG-DASH, …). Thus, depending on the network topology and the number of viewers of each channel, an overall network load is generated. This network load has several consequences. In addition to network occupancy and energy consumption, the streaming cost increases with the load generated to deliver the content to the viewers; the cost may come from the CDN provider or, for public-cloud-based applications, from cloud bandwidth charges.</p> <p>This paper proposes to complement content-complexity-aware orchestration by also considering audience measurements. Thus, a channel's allocation update depends not only on its content complexity, but also on its number of viewers.</p> <p>First, the interest of the audience-aware model is demonstrated theoretically. Consider two channels with about the same content complexity, but with one having more viewers than the other. The orchestration algorithm allocates more resources to the popular channel in order to increase encoding performance, hence reducing the stream bitrate while keeping the same video quality, so viewers can fetch lighter video profiles and experience a better quality of experience (QoE). The second channel is allocated fewer CPU cores, so its bitrate increases slightly, but since it is watched by fewer viewers than the first, the overall network load decreases.</p> <p>As already mentioned, the different channels generally exhibit different characteristics or complexities. Moreover, a channel's complexity may change significantly over time. It is advocated in this paper that the optimal CPU allocation is a function of both content characteristics and network load, the network load being a byproduct of the number of viewers per channel and the network topology. A cost function is proposed, as well as an algorithm to perform cost minimization.</p>
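<p>To make the idea concrete, the following is a minimal, hypothetical Python sketch of such a viewer-weighted allocation loop. It is not the cost function or the algorithm proposed in the paper: the bitrate model, the complexity and viewer figures, and the greedy marginal-gain strategy are illustrative assumptions. The sketch only shows how a viewer-weighted network-load cost can drive the split of a fixed CPU budget across channels.</p>
<pre><code>
# Hypothetical sketch: viewer-weighted CPU allocation for live channels.
# The bitrate model and the greedy strategy are illustrative assumptions,
# not the cost function or the algorithm of the paper.
from dataclasses import dataclass

@dataclass
class Channel:
    name: str
    complexity: float   # e.g. a normalized lookahead complexity measure
    viewers: int        # current audience measurement

def estimated_bitrate(ch: Channel, cores: int) -> float:
    """Toy model: more cores means better compression, hence lower bitrate."""
    base = 5.0 * ch.complexity               # Mbps with the minimum allocation
    return base / (1.0 + 0.15 * (cores - 1))

def network_cost(channels, alloc) -> float:
    """Cost = total delivered load, i.e. bitrate weighted by audience."""
    return sum(estimated_bitrate(ch, alloc[ch.name]) * ch.viewers for ch in channels)

def allocate(channels, total_cores: int) -> dict:
    """Greedy minimization: give every channel one core, then repeatedly add
    the core that reduces the viewer-weighted cost the most."""
    alloc = {ch.name: 1 for ch in channels}
    for _ in range(total_cores - len(channels)):
        best = max(
            channels,
            key=lambda ch: estimated_bitrate(ch, alloc[ch.name]) * ch.viewers
                           - estimated_bitrate(ch, alloc[ch.name] + 1) * ch.viewers,
        )
        alloc[best.name] += 1
    return alloc

if __name__ == "__main__":
    chans = [Channel("news", 0.6, 120_000), Channel("sports", 1.0, 800_000),
             Channel("docu", 0.4, 20_000)]
    print(allocate(chans, total_cores=24))
</code></pre>
<p>In a real deployment, the toy bitrate model would be replaced by the encoders' own real-time complexity and rate feedback, and the allocation would be recomputed whenever the complexity or audience measurements change.</p>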
<p>A second set of experiments considers a realistic use case. A set of live channels with different content runs on a server, each channel having a different number of viewers. The orchestrator gets measurements in real time, both from the network, through a real-time analytics service [3], and from the encoders, then computes and applies the optimal allocation. A reduction of the overall bitrate compared to a uniform static allocation is demonstrated, while preserving the video quality. Finally, a discussion concludes the paper.</p> Abdelmajid Moussaoui Thomas Guionnet Mickael Raulet Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-08-17 2022-08-17 8 8 8 Advances in video compression: a glimpse of the long-awaited disruption https://revistas.set.org.br/ijbe/article/view/238 <p>The consumption of video content on the internet is increasing at a constant pace, along with an increase in video quality. Cisco [1] estimates that by 2023, two-thirds (66 percent) of the installed flat-panel TV sets will be UHD, up from 33 percent in 2018. The bitrate for 4K video is more than double the HD video bitrate, and about nine times the SD bitrate. As an answer to the ever-growing demand for high-quality video, compression technology improves steadily. Video compression is a highly competitive and successful field of research and industrial applications. Billions of people are impacted, from TV viewers and streaming addicts to professionals, from gamers to families. Video compression is used for contribution, broadcasting, streaming, cinema, gaming, video surveillance, social networks, videoconferencing, military applications, you name it.</p> <p>The video compression field dates back to the early 80s. Since then, it has seen continuous improvement and strong attention from the business side - the video encoder market size is expected to exceed USD 2.2 billion by 2025 [2]. Among the many well-known milestones of video compression, MPEG-2 was a tremendous success in the 90s and the enabler of digital TV; it has been present on cable TV, satellite TV and DVD. In the early 2000s, AVC/H.264 became a key component of HD TV, on traditional networks as well as on internet and mobile networks. AVC/H.264 is also used on HD Blu-ray discs. Ten years later, in the 2010s, HEVC (H.265) was the enabler of 4K/UHD, HDR and WCG. Finally, VVC (H.266) was issued in 2020. Although it is a young codec, not yet widely deployed, it is perceived as an enabler for 8K [3] and as strong support for the ever-growing demand for high-quality video over the internet.</p> <p>Each codec generation roughly halves the bitrate. This comes, however, at the cost of increased complexity: the reference VVC encoder is about 10 times more complex than the reference HEVC encoder. Interestingly, the technology does not change radically between codec generations. Instead, the same principles and ideas are re-used and pushed further. Of course, there are new coding tools, but the overall structure remains the same. Let us consider a simple example: intra prediction, which consists in encoding a block of a frame independently of previous frames. In MPEG-2, intra block coding is performed without prediction from neighboring blocks. In AVC/H.264, intra blocks are predicted from neighboring blocks, with 9 possible modes. In HEVC, the same prediction principle is retained, with 35 possible modes, while VVC pushes further to 67 possible prediction modes. Having more prediction modes allows better predictions, hence better compression (even though the mode signaling cost increases), at the cost of more complexity for the encoder, which must decide among a larger set of possibilities.</p>
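<p>As an aside, this trade-off can be illustrated with a toy rate-distortion mode decision. The sketch below is not taken from any of the standards: the candidate predictors, the distortion measure and the Lagrangian weighting are generic illustrative choices. It only shows why a larger mode set can improve prediction quality while increasing both the signaling rate (more bits to code the chosen mode) and the encoder's search effort.</p>
<pre><code>
# Illustrative rate-distortion (RD) intra mode decision on a toy block.
# Candidate predictors, distortion and lambda are generic assumptions,
# not the actual mode sets of MPEG-2, AVC, HEVC or VVC.
import math
import numpy as np

def predictors(block_size: int, n_modes: int):
    """Toy directional predictors: planar gradients at n_modes angles."""
    ys, xs = np.mgrid[0:block_size, 0:block_size]
    for k in range(n_modes):
        angle = math.pi * k / n_modes
        yield (math.cos(angle) * xs + math.sin(angle) * ys).astype(np.float64)

def rd_mode_decision(block: np.ndarray, n_modes: int, lam: float = 10.0):
    """Pick the mode minimizing J = D + lambda * R, where the rate term grows
    with the size of the mode set (log2(n_modes) bits to signal the choice)."""
    rate_bits = math.log2(n_modes) if n_modes > 1 else 0.0
    best_mode, best_cost = 0, float("inf")
    for mode, pred in enumerate(predictors(block.shape[0], n_modes)):
        distortion = float(np.sum((block - pred) ** 2))   # sum of squared errors
        cost = distortion + lam * rate_bits
        if best_cost > cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blk = np.cumsum(rng.normal(size=(8, 8)), axis=1)   # smooth-ish toy block
    for n in (9, 35, 67):   # AVC-, HEVC- and VVC-sized mode sets
        mode, cost = rd_mode_decision(blk, n)
        print(f"{n:2d} modes -> best mode {mode:2d}, RD cost {cost:.1f}")
</code></pre>
<p>In real encoders the rate term also covers the residual coefficients and the Lagrange multiplier is tied to the quantization parameter, but the exhaustive search over a growing candidate set is exactly what drives encoder complexity up from generation to generation.</p>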
<p>The encoding structure we are dealing with is the block-based hybrid video coding scheme. One natural question which arises is how far this model can be pushed. In other words, can we steadily improve the compression performance of this model decade after decade, by pushing the parameters and adding more local coding tools, or are we converging to a limit? At each codec generation, the question has been raised, and answered by the next generation. People have tried to propose new competing models. For example, in the early 2000s, 3D motion-compensated wavelet filtering was studied as a means of efficiently compacting video energy [4]. The technology was promising, but never surpassed the emerging AVC/H.264 at that time.</p> <p>Nowadays, the recognized industry benchmark in terms of video compression performance is VVC. Can we go beyond VVC performance? Well, the answer is already known, and it is yes. Indeed, the JVET standardization group, which is responsible for VVC, is currently conducting explorations. The Ad-Hoc Group 12 (AHG12) is dedicated to the enhancement of VVC. Around 15% coding-efficiency gains have been observed, only two years after VVC finalization [5]. So, we may continue the process for at least another decade.</p> <p>However, there is a new contender arising: artificial intelligence, or more precisely machine learning and deep learning. In another Ad-Hoc Group, AHG11, JVET is exploring how machine learning can be the basis of new coding tools. This also brings coding-efficiency gains of about 12% [6]. Hence the question: will the future of video compression include machine learning? At this stage, we would like to point out two new facts.</p> <p>First, considering the “traditional” methods explored by AHG12, there is a coding tool which seems to have stopped bringing gains: frame partitioning. Partitioning is a fundamental tool for video compression: it defines how precisely the encoder can adapt to local content characteristics. The more flexible it is, the better the coding efficiency. All the subsequent coding tools depend on the ability to partition the frame efficiently. AVC has 16x16-pixel blocks, with some limited sub-partitioning. HEVC implements a much more flexible quadtree-based partitioning starting from 64x64-pixel blocks. VVC combines quadtree partitioning with binary- and ternary-tree partitioning, starting from 128x128-pixel blocks, for even more flexibility. During the exploration following HEVC standardization, enhancing the partitioning alone brought up to 15% coding-efficiency gains. Similarly, in the AHG12 context, new extended partitioning strategies were proposed. However, only marginal gains were reported [7]. Does that mean we are finally approaching a limit?</p> <p>The second fact is the development of end-to-end deep learning video compression. This strategy is highly disruptive. In short, the whole block-based hybrid coding scheme is replaced by a set of deep learning networks, such as auto-encoders. These schemes are competing with state-of-the-art still-image coders [8]. For video applications, they are matching HEVC performance [9][10]. This level of performance has been reached in only five years, an unprecedentedly fast progression. One may easily extrapolate, even if the progression slows down, that state-of-the-art video compression performance will soon belong to the end-to-end strategy. Therefore, we may very well be at a turning point in video codec history.</p>
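<p>For readers unfamiliar with this line of work, the skeleton below sketches, in PyTorch, the generic structure shared by many end-to-end learned codecs: an analysis transform, quantization, and a synthesis transform, trained with a rate-distortion loss. It is a simplified illustration of the general approach referred to in [8]-[10], not the architecture of any specific published scheme and not the method proposed in this paper; the layer sizes, the additive-noise quantization proxy and the loss weighting are placeholder assumptions.</p>
<pre><code>
# Minimal skeleton of an end-to-end learned frame codec:
# analysis transform, quantization, synthesis transform, trained with a
# rate-distortion loss. All sizes and weights are placeholder choices.
import torch
import torch.nn as nn

class TinyEndToEndCodec(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Analysis transform: downsample the frame into a compact latent.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=5, stride=2, padding=2), nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=5, stride=2, padding=2),
        )
        # Synthesis transform: reconstruct the frame from the latent.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 5, stride=2, padding=2, output_padding=1),
            nn.GELU(),
            nn.ConvTranspose2d(channels, 3, 5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, frame: torch.Tensor):
        latent = self.encoder(frame)
        # Training-time quantization proxy: additive uniform noise stands in
        # for rounding, keeping the model differentiable.
        latent_hat = latent + torch.empty_like(latent).uniform_(-0.5, 0.5)
        recon = self.decoder(latent_hat)
        # Crude rate proxy (a real codec uses a learned entropy model instead).
        rate_proxy = latent_hat.abs().mean()
        distortion = nn.functional.mse_loss(recon, frame)
        return recon, distortion + 0.01 * rate_proxy   # rate-distortion loss

if __name__ == "__main__":
    codec = TinyEndToEndCodec()
    frame = torch.rand(1, 3, 128, 128)   # dummy frame in [0, 1]
    recon, rd_loss = codec(frame)
    print(recon.shape, float(rd_loss))
</code></pre>
<p>A deployable codec would replace the noise proxy with actual rounding plus an entropy coder at inference time, and add temporal prediction for video; those elements are where the practical concerns examined in this paper (rate control, delay, memory and power consumption) arise.</p>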
<p>The goal of this paper is to analyze the benefits and limitations of deep learning-based video compression methods, and to investigate practical aspects such as rate control, delay, memory consumption and power consumption. In the first part, the deep-learning strategies are analyzed, with a focus on tool-based, end-to-end, and super-resolution-based strategies. In the second part, the practical limitations for industrial applications are studied. In the third part, a technology is proposed, namely overlapping patch-based end-to-end video compression, to overcome memory consumption limitations. Finally, experimental results are provided and discussed.</p> Thomas Guionnet Marwa Tarchouli Sebastien Pelurson Mickael Raulet Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-07-08 2022-07-08 8 11 11 Neural Network-Like LDPC Decoder for Mobile Applications https://revistas.set.org.br/ijbe/article/view/246 <p>This paper presents a low-complexity iterative decoder for Low-Density Parity-Check (LDPC) codes for mobile applications, using a Neural Network-like (NNL) structure and a modified Single-Layer Perceptron (SLP) training algorithm. The proposed approach allows for midrange decoding performance, with a minimum gap to the Shannon limit of 3.19 dB at a frame error rate of 10^-4 for the short frame and code rate 13/15 of the next-generation Digital Terrestrial Television Broadcasting (DTTB) standard of the Advanced Television Systems Committee (ATSC), the "ATSC 3.0". The NNL decoder has a low decoding time and is thus suitable for low-power embedded systems, software-defined radio implementation tools, and software-based DTTB receivers.</p> Fadi Jerji Leandro Silva Cristiano Akamine Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-12-21 2022-12-21 8 11 11 Comparison of the Physical-layer Performance between ATSC 3.0 and 5G Broadcast https://revistas.set.org.br/ijbe/article/view/237 <p>This paper compares the physical-layer performance of ATSC 3.0 and 5G Broadcast in a scenario providing mobile broadcasting services. The physical-layer differences between ATSC 3.0 and 5G Broadcast are discussed in terms of transmission efficiency, overheads, and BICM performance over mobile environments. Through computer simulations, it is shown that ATSC 3.0 can provide more robust and better physical-layer performance than 5G Broadcast over mobile environments.</p> Seok-Ki Ahn Sung-Ik Park Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2023-03-30 2023-03-30 8 Spectrum Availability for the Deployment of TV 3.0 https://revistas.set.org.br/ijbe/article/view/244 <p>In this paper, we study the current and future spectrum availability of the VHF and UHF bands in Brazil for the deployment of Next-Generation Digital Terrestrial Television Systems, which are being studied under the “TV 3.0 Project” initiative, coordinated by the Brazilian Digital Terrestrial Television System Forum (SBTVD Forum).
Coverage simulations of all expected operating television stations were computed in different scenarios to estimate the spectrum usage over the Brazilian territory. Results indicate that, even after the analog TV switch-off, there will be no spectrum availability in the main metropolitan regions for simulcast transmissions between the current ISDB-Tb system and the future TV 3.0. Hence, hybrid approaches should be implemented to smoothly introduce a new digital television system in Brazil.</p> Thiago Soares Paulo Eduardo dos Reis Cardoso Ugo Silva Dias Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-07-25 2022-07-25 8 11 11 Contributions to TV 3.0 using 5G-MAG Reference Tools https://revistas.set.org.br/ijbe/article/view/241 <p>The evolution of the evolved Multimedia Broadcast Multicast Service (eMBMS) to the Further evolved Multimedia Broadcast Multicast Service (FeMBMS) in Release 14 of 3GPP enabled broadcast transmission in a format 100% dedicated to user devices. As a result, the 5G standard for cellular networks expanded and began to be referred to in the broadcasting sector as 5G Broadcast. This standard is one of the candidates to integrate the TV 3.0 architecture in Brazil, being responsible for the physical layer. For this to be possible, the technology must meet some requirements, such as operation at a negative carrier-to-noise ratio, MIMO antennas and channel bonding. In order to complement the tests carried out by the SBTVD Forum, this paper aims to evaluate and discuss the SNR and minimum-signal-level tests using an open-source receiver, the 5G-MAG Reference Tools, managed by the group of the same name. The tests were carried out using a Universal Software Radio Peripheral (USRP) Software-Defined Radio (SDR) playing back an I/Q file containing 5G Broadcast data with a bandwidth of 6 MHz, the bandwidth of interest to the Forum, to transmit the signal via the GNU Radio software, and another USRP SDR as a receiver, interfaced with the 5G-MAG Reference Tools.</p> Wesley Souza Cristiano Akamine Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-08-29 2022-08-29 8 6 6 CONVOLUTIONAL NEURAL NETWORK IMPLEMENTATION IN FPGA FOR IMAGE RECOGNITION https://revistas.set.org.br/ijbe/article/view/243 <p>Neural Networks (NNs) have been studied and improved so that machines increasingly simulate the ability to perform complex tasks once carried out only by intelligent living beings. Vision for recognizing and interpreting the environment is one of the tasks being studied with a view to efficient implementation in new technologies, for use in autonomous vehicles, for driver convenience, accident reduction and even autonomous product delivery. For the complex task of image recognition and interpretation, Convolutional Neural Networks, inspired by the way living beings see, have been developed and used. Field-Programmable Gate Arrays (FPGAs) have been evolving and acquiring ever more parallel processing power and speed, which makes them perfect candidates for the efficient implementation of NNs, with superior processing capacity and low response time compared with the alternatives.
The objective of this work is to evaluate the performance and feasibility of an FPGA implementation of an image-classification NN.</p> Fadi Jerji Victor Mendonça Aguirre Copyright (c) 2023 SET INTERNATIONAL JOURNAL OF BROADCAST ENGINEERING http://creativecommons.org/licenses/by-nc-nd/4.0 2022-08-20 2022-08-20 8 9 9