Thoughts about 3D Technology

22/12/12 13:55

I have been talking about the various merits and faults of 3D with my colleagues going back to a demonstration I saw in April of 2011 at the NAB convention in Las Vegas. I thought I would try and elucidate some of these thoughts and see if anyone had interest in replying.

I would like to start off with describing a simple experiment that illustrates the basic idea behind modern 3D video:
With both eyes open, point your finger at the place where the ceiling meets two walls. Now close your right eye. If your left eye is dominant, your finger will not appear to move, or will move only slightly. Open both eyes and point again. Now close your left eye. This time your finger will appear to move quite a bit. (If your right eye is dominant, the results are reversed.) Now you know whether you are left eye dominant or right eye dominant. Note that this does not necessarily correlate with your dominant hand.
What is important for this discussion is the difference between the views from your two eyes.

This is the mechanism that enables one’s depth perception, and it is what a modern 3D camera seeks to emulate. These cameras use two lenses, and the three-dimensional signal is recorded by determining the differences between the two images, which is analogous to how your brain decodes your own three-dimensional vision. But these images are displayed on two-dimensional screens, so glasses are used to give the image depth.
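As a rough illustration of how the difference between two images carries depth, here is a minimal sketch of the standard triangulation relation for two parallel, aligned lenses: depth = focal length x lens spacing / disparity. The rig values (1000 pixel focal length, 6.5cm lens spacing) are hypothetical and not taken from any particular camera.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance to a point seen by two parallel, aligned lenses: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero would mean a point at infinity)")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 1000 px focal length, lenses 6.5 cm apart.
for disparity in (10.0, 20.0, 40.0):
    depth = depth_from_disparity(1000.0, 0.065, disparity)
    print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

The larger the shift of an object between the two images, the closer it is, which is the same effect as the finger experiment above.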

What is interesting for me is that while watching a live broadcast of a sporting event, I found the image presented to be very disconcerting. I felt a little nauseous, in fact! A possible explanation for my discomfiture is that when I am moving through space, my brain is being fed information from all of my senses, not just my eyes. So my orientation is affected by my inner ear (for balance), by my hearing and also by the interrelation with my movements, gravity and so forth. When viewing a live 3D broadcast, I am affected only by the visual content, which is akin to looking through someone else’s eyes without being connected to the rest of their sensory input. This is particularly apparent when the images from handheld cameras are on screen. A realistic surround sound image would be helpful, but of course the soundfield is different from every visual perspective, and a live surround mix cannot possibly take all of the various perspectives into account. What is generally done is that one perspective is chosen, and the surround matrix is built to be coherent with it.

It is possible that stabilizing the cameras with gyroscopically controlled panheads will alleviate some of the discomfiture issues, but that still leaves the rest of the sensory inputs to be concerned with. To my knowledge, there have been no definitive studies published on the effect of surround audio in combination with 3D imagery in terms of image stability. There have certainly been papers presented regarding localization of sound in a surround matrix, especially in respect to video game production. Studies indicate that audio information presented in 3D improves reaction time, as noted in certain high stress environments such as airplane cockpits.[1] Will accurate 3D audio reproduction have a positive effect on viewer reaction to 3D disorientation? It has yet to be seen.
3D technology is still relatively new, and is being studied and improved upon constantly. Perhaps the answer may be to place the viewer in the center of a holographic matrix where one could possibly have a better connection between motor control and perception.
[1] Psyko Audio Labs, Surround Technologies for headphones (PDF).
I more or less cited this article when talking about the correlation between 3D audio and reaction time. I have noticed that when mixing for TV, having the various "instructions" (director, producer, etc.) come from different speakers in different locations lets me react faster and understand better.
Daniel Littwin, director, New York Digital.
contact: daniel.nydigital@gmail.com
São Paulo, Brazil

Mentioned:
NAB convention in Las Vegas

Tags: 3D

Mixing and Live Audio Acquisition For Television Sports

10/10/12 01:45

Live audio coverage for television sports presents several challenges:

Maintaining a consistent sound level while presenting a dynamic, fast-paced event.
Keeping the audio coherent with the on-screen action.
Signal routing for transmission, replay, resale and audio.
Maintaining signal integrity, continuity and lip sync, whether in mono, stereo or surround.
Creating and monitoring multiple mixes simultaneously: Surround, Stereo, Mono, router feeds, alternate mixes for international clients, etc.

Depending on the complexity of the production, an engineer has between 5 hours and 5 weeks to set up the show.

For a typical NBA game one generally has 4 hours. This type of show will: be broadcast on 1 network, involve 1 field of play, have 1 announce position and possibly an “effects feed” for an associated second language or radio broadcast.

For the US Open Tennis tournament, the total equipment set up time is more than 3 weeks. This show involves over 125 networks, 5 fields of play, separate EFX feeds from each tennis court in surround, stereo, mono as well as ambience only, over 100 different announce cabins, multiple interview positions, internet, internal stadium webcasts, and an all-encompassing intercom system, as well as many editing facilities, etc.

Live Mix techniques, in general:
Stems – audio subgroups of the broadcast mix.

Stems facilitate ease of operation for live acquisition. Establishing international sound sub-mixes, creating IFB mixes and router feeds are all made easier through the use of stems. For editing and rebroadcast they are essential to recreate the sound of the show quickly and easily while allowing the editor to change segment lengths. Operationally, the subgroups are sent to DAs.
The signals are then returned to the desk as inputs, to allow access to auxiliary sends and to allow level manipulation without affecting the group master gain. This is very important for enabling isolated recordings for editing purposes, signal distribution to downstream clients and off line recording for replays with audio during the live broadcast.

1 EFX - sound effects or action. US television refers to action mics as EFX mics.
2 IFB - interrupted foldback
3 DA - distribution amplifier

Grouping and processing

The Announce Group:
Here, I generally use 2 levels of dynamics.

At the group level, I insert a limiter, generally at a ratio of 10:1, with a threshold of -1dB, a medium slow attack time and a very fast release. This limiter acts as a safety, to prevent the overall submix from over-modulating. The insert point should be prefader.
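To make the numbers concrete, here is a minimal sketch of the static (steady state) curve of a hard knee limiter at 10:1 with a -1dB threshold. It ignores attack and release behaviour, and the levels are simply dB relative to the console's 0; that reference is an assumption, since it is not stated above.

```python
def limiter_output_db(input_db: float, threshold_db: float = -1.0, ratio: float = 10.0) -> float:
    """Static hard knee curve: above threshold, output rises 1 dB for every `ratio` dB of input."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


for level in (-12.0, -1.0, 4.0, 9.0):
    print(f"in {level:+5.1f} dB -> out {limiter_output_db(level):+5.2f} dB")
```

With these settings, a submix would have to overshoot the threshold by a full 10dB before its output reached 0, which is why the limiter works as a safety rather than as an audible effect.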

Individual channels have “soft knee” compressors at ratios of 1.8:1 with medium attack and release times. The insert point should be prefader, post filter, pre-eq, if the console allows. These compressors have the effect of enhancing speech intelligibility and of keeping the announcers more “present” in the mix. Additionally, they provide an extra gain stage, if necessary.
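For comparison, a soft knee compressor eases into its ratio gradually instead of switching at the threshold. The sketch below uses a commonly published quadratic knee formula. Only the 1.8:1 ratio comes from the text; the -18dB threshold and 6dB knee width are hypothetical values chosen for illustration.

```python
def soft_knee_output_db(input_db: float, threshold_db: float = -18.0,
                        ratio: float = 1.8, knee_db: float = 6.0) -> float:
    """Static soft knee curve: unity below the knee, quadratic blend inside it, 1/ratio slope above."""
    over = input_db - threshold_db
    if 2.0 * over < -knee_db:
        return input_db  # below the knee: signal untouched
    if 2.0 * abs(over) <= knee_db:
        # inside the knee: the slope changes smoothly from 1 to 1/ratio
        return input_db + (1.0 / ratio - 1.0) * (over + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    return threshold_db + over / ratio  # above the knee: full ratio applies


for level in (-30.0, -21.0, -18.0, -15.0, -6.0, 0.0):
    print(f"in {level:+6.1f} dB -> out {soft_knee_output_db(level):+6.2f} dB")
```

At such a gentle ratio the gain reduction is small even on loud speech, which is consistent with using these compressors for presence rather than protection.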

Equalization normally entails a high pass filter set at 75Hz, a small, narrow “bump” in the low midrange somewhere between 450Hz and 800Hz to enhance the individual voice, and a somewhat wider presence peak somewhere between 2200Hz and 3200Hz for clarity. A low pass filter is used in the event of high frequency noise problems that arise during broadcast or that cannot be solved in the allotted set up time.
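To get a feel for what a 75Hz high pass filter removes, the sketch below evaluates the textbook Butterworth high pass magnitude response. The filter order is an assumption (2nd order, 12dB per octave); console filters vary, so treat the figures as indicative only.

```python
import math


def highpass_gain_db(freq_hz: float, cutoff_hz: float = 75.0, order: int = 2) -> float:
    """Magnitude response of a Butterworth high pass filter, in dB."""
    return -10.0 * math.log10(1.0 + (cutoff_hz / freq_hz) ** (2 * order))


# 50 Hz and 60 Hz are common mains hum frequencies; the rest are reference points.
for freq in (50.0, 60.0, 75.0, 150.0, 300.0):
    print(f"{freq:6.1f} Hz -> {highpass_gain_db(freq):6.2f} dB")
```

The response is about 3dB down at the cutoff itself and approaches unity an octave or two above it, so the body of the voice is left largely alone.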

Most of the processing is done to enhance intelligibility. Care must be taken not to allow harshness in the announcer voices, however.
The announce group (or groups) is then sent to a DA and returned to the desk. It is also routed to various listening positions throughout the mobile unit and to the video router or recorders as needed.
To fix the physical location of the announcers in the stadium, I place the main stereo ambient pair in the announce cabin (or just outside) so that the stadium noise arrives at the ambient mics at the same time as it arrives in the announce mics. This gives the home viewer the illusion that they are sitting in the announce cabin, watching the game with the announcers. (See section Ambience)

Ambience:
I normally build a separate ambience or crowd group. This allows me to limit or compress the crowd mics separately for routing purposes as well as allowing the compressors for the action mics group to be affected by just the action and not be affected by crowd response.

TV sports are generally shown from one perspective point, with varying views added to enhance coverage. (Golf, track and field and gymnastics are notable exceptions to this generality.) The perspective is generally the announcer’s viewpoint. To place the announcers in the stadium for the home viewer, one should use a stereo coincident or near-coincident pair. My preference is a matched pair of cardioid condenser mics in an ORTF configuration (2 matched cardioid capsules set 17cm apart, at an angle of 110 degrees). After experimenting with X/Y and M/S pairs, I have found that the ORTF seems to best mimic human hearing. I place the mics so that they do not “hear” the announcers, but the arrival time of the ambient noise at the crowd mics is the same (or almost the same) as at the announce mics. This is, of course, limited to the announce cabin.
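An ORTF pair gives the listener two cues at once: a small arrival time difference from the 17cm spacing and a level difference from the 110 degree angle. The sketch below estimates both for a distant source, assuming ideal cardioid capsules and a speed of sound of 343 m/s. Both are simplifications, and real capsules deviate from the ideal pattern.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C
SPACING_M = 0.17            # ORTF capsule spacing
HALF_ANGLE_DEG = 55.0       # each capsule is turned 55 degrees off centre (110 in total)


def cardioid_gain(off_axis_deg: float) -> float:
    """Ideal cardioid pickup: 1.0 on axis, 0.0 directly behind."""
    return 0.5 * (1.0 + math.cos(math.radians(off_axis_deg)))


def ortf_cues(source_deg: float) -> tuple[float, float]:
    """Return (time difference in ms, level difference in dB) for a distant source.

    source_deg is 0 straight ahead and positive toward the right capsule; keep it
    within -90..90. Positive results mean the right channel is earlier and louder.
    """
    dt_ms = 1000.0 * SPACING_M * math.sin(math.radians(source_deg)) / SPEED_OF_SOUND_M_S
    g_right = cardioid_gain(source_deg - HALF_ANGLE_DEG)
    g_left = cardioid_gain(source_deg + HALF_ANGLE_DEG)
    return dt_ms, 20.0 * math.log10(g_right / g_left)


for angle in (0.0, 30.0, 60.0, 90.0):
    dt, level = ortf_cues(angle)
    print(f"source at {angle:4.0f} deg: right leads by {dt:.3f} ms, louder by {level:.1f} dB")
```

The time differences stay below about half a millisecond, which is the same order of magnitude as the differences between human ears.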

Depending on the event I use a hard knee compressor or limiter with a ratio of 4:1, a slow attack time and a fast release. The threshold is usually 2 or 3 dB before 0.
It is important to keep a consistent sound field relative to both level and phase (position). If this perspective is changed repeatedly the sonic image presented to the home viewer will be confused.
The ambient mics also serve to mask any sudden changes made to the mix.

Action Group
All sports have specific areas of concentrated action, where points are scored, where plays transition from offense to defence and back, where coaches shout instruction and where players communicate amongst themselves. It is relatively easy to aim microphones at these areas of interest. The audio mixer must choose the correct microphone for each specific area of play appropriate to the event and the setting.

I normally compress this group at a ratio of 4:1, a slight soft knee curve, slow attack, fast release times and a threshold of 3dB before 0.
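The "slow attack, fast release" behaviour can be pictured with a simple one pole level follower that uses one time constant when the level rises and another when it falls. This is a generic sketch, not the algorithm of any particular console; the 50ms attack, 5ms release and 1kHz control rate are hypothetical values.

```python
import math


def smooth_envelope(levels, sample_rate_hz: float, attack_s: float, release_s: float):
    """One pole follower: rises with the attack time constant, falls with the release one."""
    a_attack = math.exp(-1.0 / (attack_s * sample_rate_hz))
    a_release = math.exp(-1.0 / (release_s * sample_rate_hz))
    env, out = 0.0, []
    for x in levels:
        coeff = a_attack if x > env else a_release
        env = coeff * env + (1.0 - coeff) * x
        out.append(env)
    return out


# A burst of action (level 1.0 for 100 ms) followed by 100 ms of silence.
burst = [1.0] * 100 + [0.0] * 100
envelope = smooth_envelope(burst, sample_rate_hz=1000.0, attack_s=0.050, release_s=0.005)
print(f"after 50 ms of signal:  {envelope[49]:.3f}")
print(f"after 5 ms of silence:  {envelope[104]:.3f}")
```

A slow attack lets the initial transient of a thump or bounce through before the gain is pulled down, while a fast release restores the gain before the next sound arrives.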
I tend to classify the specific desirable sounds heard in most sports into 2 categories:

1) Thumps – low midrange (somewhere between 400Hz and 750Hz) ground contact, ball sounds, and physical contact.

and

2) Presence – midrange (somewhere between 1.25kHz and 4.5kHz) definition, voices, squeaks, pops and clicks.

By choosing the correct microphone, one can minimize the amount of equalization needed. In practice, however, the perfect microphone is often not available and eq needs to be added to the channel signal. Too much emphasis or de-emphasis of a particular frequency can indicate problems elsewhere: the monitoring in the audio mix room, a recurrent frequency in the arena, PA system equalization or other issues.

Following the action:
Most changes in the mix balance need to be fairly abrupt as the play moves around the field. To minimize the effect of rapidly opening and closing microphones, keeping a consistent tonal and level balance from channel to channel is essential. Select the primary or most important single source; optimize the sound and then balance the other sources to match. When mixing, I transition by leaving a mic open until the next source is also open, then the previous source can be faded out. This must be done almost instantaneously. When combined with the sound field established by the main ambient pair, the transitions are no longer apparent. The home viewers are unaware that anything is being altered in the mix balance; they simply hear the sounds that match the pictures.
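The overlapped transition described above can be sketched as a pair of gain schedules: the incoming mic opens while the outgoing one is still at full level, and only then does the outgoing mic fade. The step count and the linear ramps are illustrative assumptions; on a desk this is done by hand and very quickly.

```python
def transition_gains(steps: int) -> list[tuple[float, float]]:
    """Gain pairs (outgoing, incoming) for an overlapped switch between two mics.

    First half: the incoming mic opens while the outgoing mic stays at unity.
    Second half: the outgoing mic fades once the incoming mic is fully open.
    `steps` should be an even number.
    """
    half = steps // 2
    pairs = [(1.0, i / half) for i in range(1, half + 1)]
    pairs += [(1.0 - i / half, 1.0) for i in range(1, half + 1)]
    return pairs


for outgoing, incoming in transition_gains(8):
    print(f"outgoing {outgoing:.2f}   incoming {incoming:.2f}")
```

At every step at least one of the two mics is fully open, so the action never drops out during the switch.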

Certain mics can be compressed for extra emphasis: coach mics, for example, or any other sources that will isolate vocal responses. Other primary source mics may be suitably enhanced by compression. Compression may also be a necessity for microphones that will be isolated to the router and/or to tape machines. (See section The Router.)

Caution and restraint should be exercised in equalization and processing, however. Otherwise, the mix could become strident and unpleasant and/or the announcers will be masked by the game sounds. Overemphasis of any particular frequency band can also lead to transmission issues such as overlimiting and distortion. Overprocessing will lead to listener fatigue, and make the presentation unpleasant to listen to.

The Router:
The routing switcher is one of the most important tools in the mobile unit. Many different elements can be added to the show to enhance production choices and capabilities. By isolating reporter and interview positions, camera mics, and other possible replay sources it is possible to give the home viewer a unique perspective using replays with audio.

Depending on the routing switcher and control panel, a variety of signals are available to the video tape operator. These are usually organized by video source. Take, for example, a stereo show utilizing VTR’s with 4 audio channels: all the camera source buttons are programmed with an action/ambient stereo mix on channels 1 - 2 and a full program mix on channels 3 - 4. Another row of buttons presents video for the handheld cameras matched with an iso of the associated microphone on both channels 1 & 2. Additionally, there are several other mixes generated from the console’s aux busses for prefade reporter mics, various interview positions and at least 1 extra stereo aux mix just in case it is needed. Router audio signals can be generated from groups, auxes, console direct outputs, satellite return feeds, and perhaps a Betacam to access ENG or EFP footage gathered during the event. (see image below)

Router Panel
In the example shown above, assuming a 4 channel audio configuration, the top level of (blue) buttons is programmed so that channels 1-2 are a mix of the action and the ambience mics and channels 3-4 are a full mix of stereo program. The lower level of (green) buttons is programmed so that channels 1-2 receive the individual camera’s shotgun mic and channels 3-4 are a full mix of stereo program. The (yellow) button for camera 6 is an audio-only button programmed so that channels 1-2 are the pre-fade reporter’s mic.
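One way to document a panel like this before the show is as a simple lookup table. The sketch below is only an illustration of that idea: the key names and the "CAM 1" label are hypothetical, and a real router is programmed through its own control system rather than through code like this.

```python
# Button -> audio assignment, following the colour scheme described above.
ROUTER_BUTTONS = {
    # Top (blue) row: one entry per camera source; "CAM 1" is a placeholder label.
    "blue/CAM 1": {"ch1-2": "action + ambience mix", "ch3-4": "full stereo program"},
    # Lower (green) row: the camera's own shotgun mic instead of the action mix.
    "green/CAM 1": {"ch1-2": "CAM 1 shotgun mic", "ch3-4": "full stereo program"},
    # Audio-only (yellow) button; the text gives no assignment for channels 3-4.
    "yellow/CAM 6": {"ch1-2": "pre-fade reporter mic"},
}


def audio_for(button: str, channels: str) -> str:
    """Look up what a VTR would record on a channel pair when a button is selected."""
    return ROUTER_BUTTONS[button].get(channels, "not specified")


print(audio_for("green/CAM 1", "ch1-2"))
print(audio_for("yellow/CAM 6", "ch3-4"))
```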

For surround shows, router configurations are, of course, more complicated. In practice, audio elements are configured as stereo pairs and recombined in the mix desk prior to transmission. A surround synthesizer can be inserted across a stereo channel or mix bus to process stereo sources for surround transmission.

The routing switcher is fed audio signals from various outputs on the desk. Group sends, aux sends, direct outputs and multitrack outputs are fed to various inputs of the router for distribution and attribution to the video equipment, transmission, etc.

Many VTR’s are limited to 4 channels of audio. Hard disc video recorders like the EVS are often configured for only 4 audio channels as well.

When working in surround reproduction and transmission it is possible to use a surround decoder/encoder set to route a surround imaging matrix to a stereo pair. This signal is then routed through a decoder and played back as a surround source through the audio desk. This would enable a machine with a four channel configuration to be used to reproduce surround material. Of course only one playback machine would be available at any given time. (see image below)
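The general idea of folding a surround image into a stereo pair and unfolding it again can be shown with a very simplified passive matrix. This is not the algorithm of any specific commercial encoder or decoder: real units add phase shifts in the surround path and use steering logic on decode, and the function names here are hypothetical.

```python
import math

K = math.sqrt(0.5)  # -3 dB mixing coefficient


def encode(left: float, centre: float, right: float, surround: float) -> tuple[float, float]:
    """Fold four channels into a stereo pair (Lt, Rt); surround goes in with opposite polarity."""
    lt = left + K * centre + K * surround
    rt = right + K * centre - K * surround
    return lt, rt


def decode(lt: float, rt: float) -> dict[str, float]:
    """Passive decode: centre is the sum of the pair, surround is the difference."""
    return {"left": lt, "right": rt, "centre": K * (lt + rt), "surround": K * (lt - rt)}


# A centre-only source comes back at full level in the centre and not at all in the surround,
# but it also leaks into left and right, which is the main weakness of a passive matrix.
print(decode(*encode(0.0, 1.0, 0.0, 0.0)))
```

Because the recorder only ever sees the two encoded channels, a machine with a four channel configuration has room for them, at the cost of reduced separation on playback.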
VTR + EVS
More sophisticated VTR’s and EVS recorders allow 8 channels of discrete audio recording.
It is therefore much simpler to maintain surround signal integrity in a live broadcast situation.
(see image below)

Microphone Positions for Specific Sports:

Basketball
Basketball is most often played and telecast from an indoor arena with a hardwood floor, concrete walls and (hopefully) thousands of animated fans. Indoor sports arenas are generally very reverberant and often any acoustic problems are exacerbated by using PA systems that are much louder than they need to be.

Lapel mics (Sony ECM 77’s or similar) are mounted behind the baskets, on the backboards, in the rubber frame that surrounds the edge. A short shotgun mic (Sennheiser 416 or similar) is mounted on the stanchion pointed at the free-throw line and a long shotgun (Sennheiser 816) is mounted on the handle of the handheld camera behind the end zone. If there is a handheld camera at midcourt it also would have a long shotgun. An ORTF pair is mounted at midcourt.

Tennis
Tennis is unusual in that although the action is oriented side to side like most other sports, the perspective presented by television is from one end. Therefore the stereo perspective is perpendicular to the net rather than parallel.

A stereo pair is set in the announce booth and another pair is mounted on the umpire’s chair to capture crowd and ambience sounds. A lapel mic is laced into the net. 4 short shotguns are mounted on short mic stands behind each baseline. Shotguns are also mounted on courtside cameras and beneath the umpire’s chair.

US Open Tennis Championship, 2007 Louis Armstrong Stadium, USA

Soccer
To cover soccer (football), a combination of operated and stationary mics is used. I place an ORTF pair in the announce cabin and use long shotguns mounted on tall poles for the response of the cheering sections.

Opposite the team benches are two mobile operated mics. All cameras on the field have shotgun mics. The difficulty lies in trying to capture sounds from a great distance. Compression can help accentuate individual mics. It is extremely important for the mic operators to wear headphones to monitor their equipment, to be very active and to anticipate the direction of play. Exaggerating sounds that are close to mics, such as corner kicks, will make the other, more distant sounds seem inaudible by comparison. A careful balance must be maintained.

Golf
One of the most challenging sports to cover is golf. The arena for the sport is huge and there are only 2 areas where sounds are easy to capture: the tees and greens. Every microphone (or pair of mics) must be available prefader with processing as a router source. For a typical golf course, this would be 18 stereo tee mics, 18 stereo green mics, 10 wireless handheld cameras with mics, 8 wireless operated shotgun mics as well as ambience and crowd mics, about 62 stereo and surround microphone sources in all. Additionally there are between 1 and 6 announce cabins, 2 to 4 wireless reporter units, a trophy presentation area, 18 VTR’s, 4 EVS and edit suites as well.
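A quick tally of the itemized microphone sources shows how the total of about 62 is made up. The ambience and crowd figure is not given explicitly above; the sketch simply takes it as whatever remains after the named items.

```python
# Source counts as listed in the paragraph above.
named_sources = {
    "stereo tee mics": 18,
    "stereo green mics": 18,
    "wireless handheld camera mics": 10,
    "wireless operated shotgun mics": 8,
}
quoted_total = 62  # "about 62" in the text

named_total = sum(named_sources.values())
implied_ambience_and_crowd = quoted_total - named_total  # inferred, not stated

print(f"named sources: {named_total}")
print(f"implied ambience and crowd sources: {implied_ambience_and_crowd}")
```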

The author verifying the installations at the PGA Championship, Medinah, IL, USA, August 2006

This level of complexity is also common in track and field, gymnastics and Formula 1 racing. Each of these sports has specific peculiarities and challenges including:

Widely varying sound pressure levels
Difficulties with microphone placement
Difficulties with signal transport and cable paths
Difficulties with weatherproofing
Multiple events occurring simultaneously
Many channels of wired and wireless mics and electric points
Many channels of wired and wireless intercommunications
Interconnections between analog, digital, copper and fiber optics
Facilities and crews that are shared between different broadcasters and production teams.

The process is never simple. Audio production for television requires significant planning in advance, flexibility on site, troubleshooting and rapid decision making while confronted with many variables. However, with careful organization each broadcast can be presented in its entirety with the accuracy and creativity that will create a sense of realism and excitement for the home viewer.

Video to just hear the Sennheiser MKH 816 at work

Daniel Littwin, director, New York Digital.
contact: daniel.nydigital@gmail.com
São Paulo, Brazil

Mentioned:
ORTF configuration
EVS
Sony ECM 77
Sennheiser MKH 416
Sennheiser MKH 816

Tags: Router

Using Bidirectional or Figure 8 Microphones

09/06/12 14:37