The audio video alignment nightmare in audio compressed formats

Introduction

While researching automatic editing systems I found strange timing behavior in compressed audio streams, so I decided to take some measurements. Here they are:

Audio tests – Audacity

  • Using Audacity 2.0.5 (a free audio editor) I generated a test timeline (see Figure 1).

Figure 1: Audacity audio test time line

  • After that I exported the test timeline to 3 different formats:
    • PCM
    • MP3 (using the integrated encoder)
    • AAC (using the integrated FFmpeg)
  • I then imported the exported files back into Audacity (see Figures 2 and 3). The results are in the following table:

    Audacity
    Test no.   Export codec   Delay relative to original [ms]   Observations
    1          PCM            0
    2          MP3            +51
    3          AAC            +24

 

Figure 3: Original and exported->imported tracks in Audacity

Figure 2: Zoom of the original and exported->imported tracks in Audacity

Audio tests – Adobe Audition CC

  • I repeated the same test that I did with Audacity using Adobe Audition CC (July 2014 version).
  • The results are the following (see Figures 3 and 4):

    Audition
    Test no.   Export codec   Delay relative to original [ms]   Observations
    1          PCM            0
    2          MP3            0
    3          AAC            -21                               The first 21 ms were lost
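To put these delays in context it can help to convert them to samples. The sketch below is only illustrative: it assumes a 48 kHz sample rate (the audio tests above do not state the rate; 48 kHz is the one used in the video tests) and uses the 1024-sample frame size of AAC-LC. The function names are hypothetical helpers, not part of any tool used here.

```python
# Assumption: 48 kHz audio. The sample rate of the audio tests is not stated.
SAMPLE_RATE_HZ = 48000
AAC_LC_FRAME_SAMPLES = 1024  # samples per AAC-LC frame


def samples_to_ms(samples: int, sample_rate_hz: int) -> float:
    """Duration in milliseconds of a given number of samples."""
    return 1000.0 * samples / sample_rate_hz


def ms_to_samples(delay_ms: float, sample_rate_hz: int) -> int:
    """Number of samples (rounded) covered by a delay in milliseconds."""
    return round(delay_ms * sample_rate_hz / 1000.0)


frame_ms = samples_to_ms(AAC_LC_FRAME_SAMPLES, SAMPLE_RATE_HZ)
print(f"One AAC-LC frame at 48 kHz: {frame_ms:.2f} ms")
print(f"21 ms at 48 kHz: {ms_to_samples(21, SAMPLE_RATE_HZ)} samples")
```

Under that assumption one AAC-LC frame lasts about 21.33 ms, which is close to the 21 ms lost in the Audition AAC test. The match is suggestive, but on its own it does not prove the cause.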

     


Figure 3: Original and exported->imported tracks in Audition


Figure 4: Zoom of original and exported->imported tracks in Audition

 

Audio-video test – Adobe Premiere


Figure 5: Audio-Video alignment test timeline

  • I exported the audio-video test timeline (see Figure 5) to 3 different formats:
    • MXF OP1a (Video: AVCI50 720p50, Audio: PCM 48 kHz 16-bit)
    • MP4-AAC (Video: H.264 Main 4.1 VBR 10 Mbps 720p50, Audio: AAC 320 kbps 48 kHz)
    • MP4-MPEG (Video: H.264 Main 4.1 VBR 10 Mbps 720p50, Audio: MPEG-1 Layer II 320 kbps 48 kHz)
  • Finally I imported the exported media files back into Premiere. The results are in the following table:

    Premiere
    Test no.   Export codec   Audio delay relative to video (or original) [ms]   Observations
    1          MXF OP1a       0                                                  Uncompressed audio format
    2          MP4-AAC        +10
    3          MP4-MPEG       +10
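For video work it is often easier to read these delays as a fraction of a frame. This is a minimal sketch using the 50 fps frame rate of the 720p50 test material; the helper name is hypothetical.

```python
def delay_in_frames(delay_ms: float, fps: float) -> float:
    """Express an audio delay in video frames at the given frame rate."""
    frame_duration_ms = 1000.0 / fps  # 20 ms per frame at 50 fps
    return delay_ms / frame_duration_ms


# The +10 ms measured for MP4-AAC and MP4-MPEG, on 720p50 material
print(delay_in_frames(10, 50))  # 0.5
```

At 50 fps the measured +10 ms is half a frame, so an editor that can only slip audio in whole-frame steps cannot remove it.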

 

Figure 6: Adobe Premiere CC timelines comparison (original and exported -> imported media: MP4-AAC and MP4-MPEG)

 

Audio-video test – Final Cut 7

  • I exported the test timeline to:
    • MOV (Video: XDCAM EX 720p50 35 Mbps VBR, Audio: PCM 48 kHz 16-bit)
    • MOV (Video: MPEG-4 6.4 Mbps 720p50, Audio: AAC-LC 320 kbps 48 kHz)
  • The results are:

    Final Cut 7
    Test no.   Export codec         Audio delay relative to video (or original) [ms]   Observations
    1          MOV-XDCAM, PCM       0                                                  Uncompressed audio format
    2          MOV-MPEG4, AAC-LC    0

 


Figure 7: Final Cut 7 timelines comparison (original and exported -> Imported media: MOV XDCAM-PCM, and MOV MPEG4-AAC-LC)

Future work

  • Extend the experiment to other audio / video editors
  • To simplify the experiment I only measured within the same application (i.e. export from Premiere -> import into Premiere). It would be a good idea to build an import/export matrix between applications.
  • Play the exported files with a broadcast player and measure the audio-video delay over SDI
  • Establish a reliable software method to measure the audio video delay in media files
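One possible starting point for such a software method is cross-correlation between the original and the re-imported audio. The sketch below is not the method used for the measurements above. It assumes both tracks are already decoded to lists of PCM samples at the same sample rate, and `estimate_delay_ms` is a hypothetical name. It is a brute-force version meant for short test signals only.

```python
import math


def estimate_delay_ms(reference, test, sample_rate_hz, max_lag):
    """Estimate how much `test` is delayed with respect to `reference`.

    Tries every lag in [-max_lag, max_lag] samples and keeps the one with
    the highest cross-correlation. A positive result means `test` is late.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(test):
                score += ref_sample * test[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return 1000.0 * best_lag / sample_rate_hz


# Synthetic check: a short tone burst and a copy delayed by 24 samples.
rate = 1000  # 1 kHz sample rate, so 1 sample = 1 ms (keeps the demo small)
burst = [math.sin(2 * math.pi * 50 * n / rate) for n in range(40)]
reference = [0.0] * 50 + burst + [0.0] * 110
delayed = [0.0] * 74 + burst + [0.0] * 86

print(estimate_delay_ms(reference, delayed, rate, max_lag=60))  # 24.0
```

For real media the tracks would first have to be decoded to PCM, and a faster FFT-based correlation would be more practical than this double loop. Measuring audio against video additionally needs a visual reference event, which this sketch does not cover.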

Preliminary conclusions

  • With uncompressed formats everything works as expected, but with a compressed audio format (AAC, MP3) it seems that in some cases the audio and video editors do NOT properly compensate for the delay introduced by the audio compression.
  • If you work only with audio you can perhaps tolerate a small audio delay, but if you work with audio and video, tens of ms of audio advance or delay will be noticeable to the audience according to EBU R37.
  • Recommendation: if you can work with uncompressed audio formats in the video processing chain, do it. You will increase the quality and avoid a lot of headaches. Use compressed audio formats only at the last step, just for distribution.
