The audio-video alignment nightmare in compressed audio formats

Introduction

While researching automatic editing systems, I found strange timing behavior in compressed audio streams, so I decided to take some measurements. Here they are:

Audio tests – Audacity

  • Using Audacity 2.0.5 (a free audio editor), I generated a test timeline (see Figure 1).

Figure 1: Audacity audio test timeline

  • I then exported the test timeline to 3 different formats:
    • PCM
    • MP3 (using the integrated coder)
    • AAC (using the integrated ffmpeg)
  • I imported the exported files back into Audacity (see Figures 2 and 3). The results are in the following table:

    Audacity
    Test no.  Export coder  Delay relative to original [ms]  Obs.
    1         PCM             0
    2         MP3           +51
    3         AAC           +24

Figure 3: Original and exported->imported tracks in Audacity

Figure 2: Zoom of the original and exported->imported tracks in Audacity

Audio tests – Adobe Audition CC

  • I repeated the same test with Adobe Audition CC (version: July 2014).
  • The results are the following (see Figures 3 and 4):

    Audition
    Test no.  Export coder  Delay relative to original [ms]  Obs.
    1         PCM             0
    2         MP3             0
    3         AAC           -21                              The first 21 ms were lost

Figure 3: Original and exported->imported tracks in Audition

Figure 4: Zoom of original and exported->imported tracks in Audition
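
One thing that may help when reading the tables above: compressed audio codecs work on fixed-size frames (1024 samples for AAC-LC, 1152 for MP3), and encoders usually add some padding at the start of the stream. The following minimal Python sketch only computes how long one frame lasts. The sample rates used in the tests are not stated above, so 44.1 kHz and 48 kHz are assumptions, and the names FRAME_SAMPLES and frame_duration_ms are made up for this example.

```python
# Samples per coded frame, fixed by the codec design.
FRAME_SAMPLES = {"AAC-LC": 1024, "MP3": 1152}


def frame_duration_ms(codec, sample_rate_hz):
    """Return the duration of one coded audio frame in milliseconds."""
    return 1000.0 * FRAME_SAMPLES[codec] / sample_rate_hz


if __name__ == "__main__":
    # Assumed sample rates: the tests above do not state which one was used.
    for codec in FRAME_SAMPLES:
        for rate in (44100, 48000):
            duration = frame_duration_ms(codec, rate)
            print(f"{codec} @ {rate} Hz: {duration:.2f} ms per frame")
```

Under those assumptions, one AAC-LC frame lasts about 23.2 ms at 44.1 kHz and about 21.3 ms at 48 kHz, which is close to the AAC offsets measured (+24 ms and -21 ms). This is only a hint, not proof that a one-frame offset is the cause.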

 

Audio-video test – Adobe Premiere

Figure 5: Audio-Video alignment test timeline

  • I exported the test timeline (Figure 5) to 3 different formats:
    • MXF OP1a (Video: AVC-Intra 50 720p50; Audio: PCM 48 kHz 16 bit)
    • MP4-AAC (Video: H.264 Main 4.1 VBR 10 Mbps 720p50; Audio: AAC 320 kbps 48 kHz)
    • MP4-MPEG (Video: H.264 Main 4.1 VBR 10 Mbps 720p50; Audio: MPEG-1 Layer II 320 kbps 48 kHz)
  • Finally, I imported the exported media files back into Premiere. The results are in the following table:

    Premiere
    Test no.  Export coder  Audio delay relative to video (or original) [ms]  Obs.
    1         MXF OP1a        0                                               Uncompressed audio format
    2         MP4-AAC       +10
    3         MP4-MPEG      +10
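
Since these exports use 48 kHz audio and 50 fps video, a delay in milliseconds can be translated into audio samples and video frames. Here is a small sketch of that arithmetic in Python (the function names are made up for this example):

```python
def delay_ms_to_samples(delay_ms, sample_rate_hz=48000):
    """Convert a delay in milliseconds to a number of audio samples."""
    return delay_ms * sample_rate_hz / 1000.0


def delay_ms_to_video_frames(delay_ms, fps=50):
    """Convert a delay in milliseconds to a (fractional) number of video frames."""
    return delay_ms * fps / 1000.0


if __name__ == "__main__":
    delay_ms = 10  # audio delay measured for MP4-AAC and MP4-MPEG
    print(f"{delay_ms} ms = {delay_ms_to_samples(delay_ms):.0f} samples at 48 kHz")
    print(f"{delay_ms} ms = {delay_ms_to_video_frames(delay_ms):.2f} frames at 50 fps")
```

With those settings, +10 ms corresponds to 480 audio samples, or half a video frame.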

 

Figure 6: Adobe Premiere CC timelines comparison (original and exported -> imported media: MP4-AAC and MP4-MPEG)

Audio-video test – Final Cut 7

  • I exported the test timeline to:
    • MOV (Video: XDCAM EX 720p50 35 Mbps VBR; Audio: PCM 48 kHz 16 bit)
    • MOV (Video: MPEG-4 6.4 Mbps 720p50; Audio: AAC-LC 320 kbps 48 kHz)
  • The results are:

    Final Cut 7
    Test no.  Export coder       Audio delay relative to video (or original) [ms]  Obs.
    1         MOV-XDCAM, PCM     0                                                 Uncompressed audio format
    2         MOV-MPEG4, AAC-LC  0

Figure 7: Final Cut 7 timelines comparison (original and exported -> Imported media: MOV XDCAM-PCM, and MOV MPEG4-AAC-LC)

Future work

  • Extend the experiment to other audio/video editors.
  • To keep the experiment simple, I measured only within the same software (i.e. export from Premiere -> import into Premiere). It would be a good idea to build an import/export matrix between applications.
  • Play the exported files with a broadcast player and measure the audio-video delay over SDI.
  • Establish a reliable software method to measure the audio-video delay in media files.
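
As a starting point for the last item, one possible approach is to cross-correlate the original audio with the re-imported audio and take the lag with the highest correlation as the delay. The following is a minimal pure-Python sketch, assuming both signals are already decoded to lists of samples at the same sample rate. The names estimate_delay_samples and samples_to_ms are hypothetical, and decoding the media files is left out.

```python
def estimate_delay_samples(reference, test, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation.

    A positive lag means `test` is delayed relative to `reference`;
    a negative lag means `test` starts early (samples were lost).
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(test):
                score += ref_value * test[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def samples_to_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag_samples / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 48000  # assumed sample rate for this synthetic example
    # Synthetic test signal: a short click surrounded by silence.
    click = [0.0] * 100 + [1.0, -0.5, 0.25] + [0.0] * 100
    # Simulate a re-imported file whose audio is 480 samples late.
    delayed = [0.0] * 480 + click
    lag = estimate_delay_samples(click, delayed, max_lag=600)
    print(f"Estimated delay: {samples_to_ms(lag, sample_rate_hz):+.1f} ms")
```

This brute-force version needs on the order of N x max_lag multiplications, so for real files an FFT-based correlation would be much faster. It also compares audio against audio only; measuring against the video would need a visual reference as well.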

Preliminary conclusions

  • With uncompressed formats everything works as expected. With compressed audio formats (AAC, MP3), however, it seems that in some cases the audio and video editors do NOT properly compensate for the delay introduced by the audio compression.
  • If you work only with audio, you can perhaps tolerate a small audio delay. If you work with audio and video, though, tens of ms of audio advance or delay will be noticeable to the audience according to EBU R37.
  • Recommendation: if you can work with uncompressed audio formats in the video processing chain, do it. You will increase the quality and avoid a lot of headaches. Use compressed audio formats only at the last step, just for distribution.

Loudness meters [CAT]

These slides are from a conference that we gave about loudness (link). The talks covered loudness from different points of view:

  • Introduction by Llorenç Gómez (TVC)
  • Standards by Josep Ramon Casas (UPC)
  • Loudness meters by Jordi Cenzano (8TV)
  • Loudness in production chain by Emili Planas (Mediapro)

The conference was organized by the Consell de l'Audiovisual de Catalunya.

Download / view PDF

Loudness meters slides PDF

Here you can see a video about the conference:

Design and implementation of a loudness monitoring system – MERIT master thesis

This paper was written in 07/2013 as a MERIT master's thesis (at UPC).

The main objective is to explore the normalization of audio levels in the media industry, which is a relevant issue nowadays. Many broadcast organizations around the world are concerned about this problem and have published papers and standards, which are analyzed in this project. Some of those organizations are the European Broadcasting Union (EBU), the International Telecommunication Union (ITU), and the Advanced Television Systems Committee (ATSC).

Download / view document

MERIT master thesis PDF

IDMAC presentation [CAT]

Here you can find the presentation about loudness that I gave at the MAC (Mercat Audiovisual de Catalunya).

Slides IDMAC presentation

Slides IDMAC presentation PDF

RTP Protocol

This paper describes the RTP protocol and proposes C code to implement and test it.

Download / View paper

Paper PDF