After years of trial and error aimed at reducing operating costs and (more recently) keeping crews safely distanced, IP-based remote production has found its niche in live production and will remain the de facto method for producing events over a distributed network infrastructure. However, the big hurdle left to overcome for successful deployment of such networked workflows is latency.
In live production, video latency refers to the amount of time it takes for a single frame of video to transfer from the camera to a processing location (on premise or in the cloud) and back to the display – wherever that display might be.
Latency is inherent in the broadcast chain due to the processing of video for distribution both internally and (eventually) to the public, but many other factors add latency as well: encoding and decoding, compression, distance and switch hops. In short, latency is everywhere, across all infrastructures, whether audio, video, IP or copper; the issue is being aware of it and knowing how to manage it.
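As a back-of-the-envelope illustration, those contributions can be tallied into a simple one-way latency budget. The figures in the sketch below are purely illustrative placeholders, not measurements from any particular system.

```python
# Illustrative one-way latency budget for a remote-production path.
# Every figure here is a hypothetical placeholder; measure your own chain.
latency_budget_ms = {
    "camera/CCU processing": 20.0,
    "encode (compression)":  15.0,
    "network propagation":   25.0,  # grows with distance and routing
    "switch hops (5 x 0.5)":  2.5,
    "decode":                15.0,
    "production switching":  33.0,  # roughly two frames at 59.94 fps
}

for stage, ms in latency_budget_ms.items():
    print(f"{stage:<24}{ms:6.1f} ms")
print(f"{'total (one way)':<24}{sum(latency_budget_ms.values()):6.1f} ms")
```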
The longer signals have to travel, the greater the chance of latency creeping into the distribution path, because they often must pass over several “hops” of video processing to reach their destinations in real time. When frame delay (latency) is introduced, it makes the job of the announcers, video switcher and replay operators, camera shaders and other technical crew who rely on a program monitor much harder. Audio typically runs ahead of the video, but the two must be synchronized at the receive end or lip-sync errors and other problems occur.
There is inherent latency in the speed of light and, while light moves very quickly, when signals travel over long distances, especially across continents, speed-of-light latency becomes a problem. Depending on the complexity of the telco routing in place, there may also be several points along the path that add latency, so it is an issue system designers have to be aware of. Aggregated together, distance, the speed of light and those required hops create latency. Depending on the operator position and the work being done, anywhere between 200 milliseconds (ms) and 500 ms (approximately half a second) has been deemed acceptable.
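To see how much of that budget distance alone can consume, a quick propagation-delay calculation helps. The sketch below assumes the common rule of thumb that light in optical fiber covers roughly 200,000 km per second (about two-thirds of its vacuum speed); the distances are illustrative.

```python
# Rough one-way propagation delay over fiber, ignoring equipment and routing.
# Assumes ~200,000 km/s for light in fiber (about 2/3 of c), a common rule of thumb.
KM_PER_MS_IN_FIBER = 200.0  # 200,000 km/s equals 200 km per millisecond

def propagation_delay_ms(distance_km: float) -> float:
    return distance_km / KM_PER_MS_IN_FIBER

for label, km in [("metro circuit", 100), ("cross-country", 4000), ("trans-oceanic", 8000)]:
    one_way = propagation_delay_ms(km)
    print(f"{label:<14}{km:>6} km -> {one_way:5.1f} ms one way, {2 * one_way:5.1f} ms round trip")
```

Even a trans-oceanic link accounts for only tens of milliseconds each way; it is the accumulation of propagation, processing and hops that pushes a path toward the 200-500 ms ceiling.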
Due to the unique nature of each production and where it is located, latency is not an easy problem to solve, but equipment manufacturers are developing tools to deal with it. For example, Blackmagic Design offers its line of ATEM live production switchers with built-in features to address latency, such as adjustable audio delay on the analog inputs.
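The arithmetic behind such an adjustable audio delay is straightforward. The sketch below is generic and not tied to any vendor's control interface: if the video path is N frames late, the analog audio input needs roughly N frame-times of delay.

```python
# Convert a video path delay (in frames) into the audio delay (in ms) to dial
# into a switcher's analog audio input. Generic arithmetic, not a vendor API.
def audio_delay_ms(video_delay_frames: int, frame_rate: float) -> float:
    return video_delay_frames * 1000.0 / frame_rate

print(audio_delay_ms(2, 50.0))    # 40.0 ms for a two-frame delay at 50p
print(audio_delay_ms(3, 59.94))   # ~50.1 ms for a three-frame delay at 59.94p
```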
Another way to tackle latency is to locate physical hardware at either end of the workflow: hardware at the main production facility and hardware at the venue, the latter used to mix the content on site so the talent can see and react to the action in real time. This way, whatever is being done locally, the talent and crew can see it immediately. It does create some duplication of compute power, which can mean extra cost, but it solves the latency problem.
From an audio perspective, the biggest challenge is in-ear monitoring and the requirement for people to hear themselves on set in real time; it is essential for IFB monitoring to have a low-latency signal path. There are many ways to minimize latency in the signal path, and each one has a cumulative effect. Balancing related paths to have the same latency is a good start. Many audio consoles have delay built in across input delay (256 legs of 2.73 ms), path delay (all paths have 2.73 ms) and output delay (256 legs of 2.73 ms). For remote production and remote working, broadcast mixing systems like Calrec’s RP1 minimize IFB and monitoring delay by processing all the audio for IFB monitoring on-site.

On an IP network, a lower packet time such as 125 µs will increase bandwidth (because more packets are created, each carrying its own header overhead) but will lower latency, as it takes less time to packetize; the sketch below puts numbers on that trade-off. Switch hops increase latency and create more data buffering, so keeping these to a minimum helps. Distance obviously makes a big difference, so the backhaul links also have a big effect: there is a big difference in speed (and cost) between dark fiber and copper links. When sending camera feeds, a large amount of bandwidth is required for the video content and there are multiple video streams between the camera and the base station. If industry-standard wrappers such as SMPTE ST 2022-6 are employed, the video content can be easily extracted and different kinds of compression can be applied.
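To quantify the packet-time trade-off, the sketch below assumes a 48 kHz, 24-bit, eight-channel AES67-style stream and roughly 78 bytes of Ethernet/IP/UDP/RTP overhead per packet; the exact overhead varies with the network, so treat the figures as indicative.

```python
# Bit rate vs. packetization delay for an AES67-style audio stream.
# Assumptions (typical but illustrative): 48 kHz, 24-bit PCM, 8 channels,
# ~78 bytes of Ethernet + IP + UDP + RTP overhead per packet.
SAMPLE_RATE = 48_000
BYTES_PER_SAMPLE = 3
CHANNELS = 8
OVERHEAD_BYTES = 78

def bitrate_mbps(packet_time_s: float) -> float:
    packets_per_sec = 1.0 / packet_time_s
    payload = int(SAMPLE_RATE * packet_time_s) * BYTES_PER_SAMPLE * CHANNELS
    return packets_per_sec * (payload + OVERHEAD_BYTES) * 8 / 1e6

for packet_time in (0.001, 0.000125):  # 1 ms vs. 125 microseconds
    print(f"packet time {packet_time * 1e6:6.0f} us -> "
          f"{bitrate_mbps(packet_time):5.2f} Mb/s, "
          f"packetization delay ~{packet_time * 1000:.3f} ms")
```

In this example the 125 µs packets cost roughly 45% more bandwidth than 1 ms packets, but cut the packetization delay from 1 ms to 0.125 ms.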
When JPEG 2000 or JPEG XS is used, the bandwidth requirement for visually lossless signal transmission is typically only 10% of an uncompressed transmission, according to product engineers at Net Insight. An even higher compression rate can be applied if the available IP bandwidth requires it, but signal delay and timing might be increased.
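As a rough illustration of that figure, the sketch below applies a 10:1 ratio to the nominal SDI payload rates; the results are indicative, not vendor specifications, and ST 2022-6 encapsulation adds a small overhead on top of the raw SDI rate.

```python
# Indicative bandwidth saving from ~10:1 mezzanine compression (JPEG 2000 / JPEG XS).
# SDI rates are nominal payload rates; ST 2022-6 adds some encapsulation overhead.
sdi_rates_gbps = {
    "HD-SDI 1080i59.94": 1.485,
    "3G-SDI 1080p59.94": 2.970,
}
COMPRESSION_FRACTION = 0.10  # ~10% of uncompressed, per the figure quoted above

for name, gbps in sdi_rates_gbps.items():
    compressed_mbps = gbps * 1000 * COMPRESSION_FRACTION
    print(f"{name}: {gbps:.3f} Gb/s uncompressed -> ~{compressed_mbps:.0f} Mb/s compressed")
```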
The added delay comes from the compression itself, which introduces latency to the signal transmission. First, the audio signals embedded into the camera transmission protocol need to be delayed by the same amount as the video signals. In addition, all of the other audio sources from the production site need to be synchronized with the camera audio and video signals.
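In practice that means inserting a matching delay into every audio path that does not pass through the video codec. A minimal sketch is shown below, assuming the codec's end-to-end latency is known; the 40 ms figure and the function name are hypothetical examples.

```python
import numpy as np

# Delay an external audio source by the video codec's latency so it lines up
# with the embedded camera audio. CODEC_DELAY_MS is a hypothetical example;
# use the measured delay of your own encode/decode chain.
SAMPLE_RATE = 48_000
CODEC_DELAY_MS = 40.0

def align_to_video(audio: np.ndarray, delay_ms: float = CODEC_DELAY_MS) -> np.ndarray:
    """Prepend delay_ms of silence so this source matches the delayed video path."""
    delay_samples = int(round(delay_ms * SAMPLE_RATE / 1000.0))
    return np.concatenate([np.zeros(delay_samples, dtype=audio.dtype), audio])

# One second of a venue microphone feed, delayed to match the camera path.
mic = np.random.default_rng(0).standard_normal(SAMPLE_RATE).astype(np.float32)
print(len(mic), len(align_to_video(mic)))  # 48000 -> 49920 samples
```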
One general rule of thumb: if you can produce your high-quality content in one location on premises, and then go up to the cloud only once for distribution, latency will be reduced significantly.