3D Microphone Array Comparison (3D-MARCo)

In June 2019, I conducted a rather ambitious 3D recording project with my postdoctoral researcher Dale Johnson and students at the APL. We used a total of 71 microphones (51 of which were DPA d:dicate mics) to configure eight different 3D main microphone arrays as well as a dummy head and additional supporting microphones. Obviously, this project required a lot of equipment to achieve high-quality recordings, which were realised with kind support from DPA, Merging Technologies and Genelec (warm thanks to everyone involved). The purpose of the project was to create an open-access database of musical recordings and room impulse responses for researching the perceptual differences that different 3D microphone techniques produce in recordings made in an acoustic environment. The database has been named “3D-MARCo” (3D Microphone Array Comparison), and is available for free download. The main aim was not to find out which technique is the winner, as all of these techniques have pros and cons depending on the context. Rather, it was to elicit salient attributes and develop scales for future subjective investigations of 3D audio quality.

To make sure this kind of research is ecologically valid, it is important to record various types of real musical performances in a good concert hall. So we hired professional musicians to form a string quartet and a piano trio, and also had kind performance contributions from a colleague who is a pipe organist and from a student a cappella ensemble. We also managed to capture room impulse responses (RIRs) for 13 different source positions using all 71 mics. The recording venue was St. Paul’s concert hall at the University of Huddersfield, which has been a crucial venue for my 3D audio research over the last 10 years. It used to be a Victorian-style church until it was converted into a concert hall several decades ago, and it is a main venue for the Huddersfield Contemporary Music Festival. St. Paul’s has a reverberation time of 2.1 seconds, much of which comes from the high ceiling, making it ideal for capturing diffuse ambience for the height channels in 3D reproduction.

The microphone arrays included in this project are as follows:

  • OCT-3D
  • PCMA-3D
  • 2L-Cube
  • Decca Cuboid (Decca Tree with surround and height channels)
  • Hamasaki Square with two types of height configurations
  • mhAcoustics Eigenmike EM32 (Higher-Order Ambisonics up to 4th order)
  • Sennheiser Ambeo VR
  • Neumann KU100 Dummy head
  • Additional microphones for side/height channels, floor channels, overhead channel and spot microphones for individual instruments.

The details of the recording setup can be found in a related AES paper and an article I wrote for Resolution Magazine. There is also a YouTube video that DPA filmed on site during the recording session.

The database is now being used by a number of academics, researchers and developers for spatial audio research, critical listening, recording education, etc. At the time of this writing, the 3D-MARCo database has been downloaded 3975 times from the Zenodo repository.

The original plan was to conduct a subjective listening test earlier this year to elicit salient perceptual differences between all of these microphone arrays, but due to Covid this has unfortunately been postponed indefinitely (hopefully to sometime next year). However, during the lockdown period, I’ve worked on “objective” analyses of the arrays using the RIRs instead, covering interchannel crosstalk, interchannel cross-correlation, interaural cross-correlation, fluctuations of interaural level and time differences (ILD and ITD), direct-to-reverberant (D/R) ratio and spectral distortion caused by the height microphones. The results will hopefully be published as a journal paper soon. For now, I can share some interesting insights obtained from the analyses.

  • There were substantial differences among the arrays in the amount of both horizontal and vertical interchannel crosstalk, and this was found to be related to the considerable differences in the amount of spectral distortion in the ear signals as well as in the magnitude of ILD and ITD fluctuations over time. From this, it is expected that the arrays would differ audibly in perceived timbral characteristics as well as in the localisation stability and spread of phantom images.
  • The arrays are expected to differ considerably in the perceived magnitudes of horizontal spatial impression (e.g., apparent source width (ASW) and listener envelopment (LEV)) and in the size of the listening area, due to their different degrees of interchannel decorrelation. Considerable differences in vertical decorrelation were also observed, but based on previous research, this is hypothesised to have a minimal effect on perceived vertical image spread.
  • The analysis of interaural cross-correlation suggests that the addition of the height layer to the base layer would have a minor effect on ASW and LEV regardless of the array, even though the base and height layers might have audible differences independently.
  • The differences in the D/R ratios of ear-input signals resulting from the 9-channel playback were around or below the just noticeable difference of perceived auditory distance, even though individual microphones had larger differences especially in the rear channels. This raises an interesting question as to whether it would be the channel-dependent balance of D/R ratio or the D/R ratio of the final ear signal that affects perceived auditory distance.
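
For readers who want to try similar measurements on the RIRs in the database, here is a minimal Python sketch of two of the parameters mentioned above: the D/R ratio of a single impulse response and a cross-correlation coefficient between two signals (two channels, or the two ear signals). This is not the analysis code behind the results above. The function names, the 2.5 ms direct-sound window and the ±1 ms lag range are assumptions chosen for illustration, and the published analysis may use different definitions (for example, band-limited or time-segmented versions).

```python
import numpy as np


def direct_to_reverberant_db(rir, fs, direct_window_ms=2.5):
    """D/R ratio in dB for one impulse response.

    Assumption: the direct sound is the energy within +/- direct_window_ms
    of the largest absolute peak; everything after that window is treated
    as reverberant energy.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    end = min(peak + half + 1, len(rir))
    direct = np.sum(rir[start:end] ** 2)
    reverb = np.sum(rir[end:] ** 2)
    return 10.0 * np.log10(direct / reverb)


def max_cross_correlation(x, y, fs, max_lag_ms=1.0):
    """Maximum absolute normalised cross-correlation within +/- max_lag_ms.

    Assumption: a +/- 1 ms lag range, as commonly used for interaural
    cross-correlation; values near 1 mean highly correlated signals,
    values near 0 mean decorrelated signals.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    max_lag = min(int(round(max_lag_ms * 1e-3 * fs)), n - 1)
    norm = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(x[lag:], y[:n - lag])
        else:
            c = np.dot(x[:n + lag], y[-lag:])
        best = max(best, abs(c))
    return best / norm


# Synthetic check: a unit direct spike plus one reflection at -20 dB,
# so the D/R ratio is approximately 20 dB by construction.
fs = 48000
rir = np.zeros(4800)
rir[100] = 1.0
rir[1000] = 0.1
print(direct_to_reverberant_db(rir, fs))

# Two independent noise signals should give a coefficient close to 0.
rng = np.random.default_rng(0)
print(max_cross_correlation(rng.standard_normal(4800),
                            rng.standard_normal(4800), fs))
```

To use this with real data, load two channels of an RIR as NumPy arrays at their native sample rate and pass them in place of the synthetic signals. Looping over only the lags of interest keeps the cost low even for impulse responses that are several seconds long.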

More details with graphs will be available in a paper soon. In the meantime, check out 3D-MARCo if you haven’t yet. The download includes a Reaper session template with a binaural playback configuration for easily comparing all the recordings back to back.
