MPEG DASH multiple video implementation over a single manifest

by Sergi Màrquez Andreu


The aim of this article is to explain how we created an MPD (Media Presentation Description) containing multiple videos in a single asset. To achieve this, we used the tools available at TVC to correctly produce and deliver multi-video MPEG DASH streams. This work was developed as part of TVC's participation in the TV-RING European project. It is closely related to the HBB4All European project, where we carried out a similar task by incorporating multiple audios in a single asset. More information about these projects and the tools we used is available at the links at the end of this article.

Introduction to MPEG DASH

DASH (Dynamic Adaptive Streaming over HTTP) is an international standard for adaptive bitrate streaming that can be used for both live streaming and on-demand content.

The client player requests the stream that best suits the available bitrate, depending on the network status. The server is responsible for cutting each flow into segments, although the content can also be segmented into chunks beforehand by processing tools. We use segments of 960 milliseconds because previous tests showed that this chunk size avoids synchronisation problems between video and audio. MPEG DASH uses an MPD, referred to as the manifest: an XML document that describes time periods, languages, codecs and other metadata for each video or audio component present in the stream.
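To make this concrete, a minimal MPD skeleton looks roughly like the following (an illustrative sketch using standard DASH elements; the values are placeholders matching our encoding parameters, not our actual manifest):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: values are placeholders -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="urn:mpeg:dash:profile:isoff-live:2011"
     mediaPresentationDuration="PT10M" minBufferTime="PT1S">
  <Period>
    <AdaptationSet mimeType="video/mp4" lang="via">
      <!-- 960 ms segments: duration=960 with timescale=1000 -->
      <SegmentTemplate initialization="init_$RepresentationID$.mp4"
                       media="seg_$RepresentationID$_$Number$.m4s"
                       duration="960" timescale="1000" startNumber="1"/>
      <Representation id="v0" codecs="avc1.64001f" bandwidth="3000000"
                      width="720" height="576" frameRate="25"/>
    </AdaptationSet>
    <AdaptationSet mimeType="audio/mp4" lang="cat">
      <Representation id="a0" codecs="mp4a.40.2" bandwidth="128000"
                      audioSamplingRate="48000"/>
    </AdaptationSet>
  </Period>
</MPD>
```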

Preparing the content

In order to make the different segmentation tools and servers tested work properly, the contents we want to deliver as MPEG DASH streams have to be encoded beforehand. Depending on specific parameters of the content encoding, such as the video/audio codec, pixel resolution, sampling frequency and so on, these DASH generation software solutions can malfunction.

For that reason, each of the video and audio components of the MPEG DASH stream we want to generate is encoded separately, and they are joined at the end of the encoding process. The final encoding that worked correctly in our tests is the following:

  • Video: constant GOP size of 24 frames (960 ms); the bitrate can be chosen freely (3 Mbps in our case); 720×576 (or 1920×1080), 16:9, 25 fps, H.264.
  • Each of the audios: language code (cat, eng, qad for audio description, qca for clean audio), 128 Kbps, 48 kHz, 2 channels, AAC-LC.
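The 960 ms segment size follows directly from the GOP settings above: with a constant GOP of 24 frames at 25 fps, every segment boundary falls exactly on a keyframe. A quick arithmetic check:

```python
# Sanity check: a constant GOP of 24 frames at 25 fps lasts exactly
# 960 ms, so each 960 ms segment starts on a keyframe.
GOP_FRAMES = 24
FPS = 25
SEGMENT_MS = 960

gop_ms = GOP_FRAMES * 1000 / FPS  # duration of one GOP in milliseconds
print(gop_ms)  # 960.0
assert gop_ms == SEGMENT_MS
```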


GPAC

GPAC is an open-source multimedia framework. The project covers different aspects of multimedia, with a focus on presentation technologies and on multimedia packaging formats such as MP4.

GPAC provides three sets of tools based on a core library called libgpac:

  • A multimedia Player: Osmo4 / MP4Client
  • A multimedia packager: MP4Box
  • And some server tools included in MP4Box and MP42TS applications.

Of interest to us, MP4Box allows the preparation of HTTP Adaptive Streaming content: it performs the segmentation of the content into MPEG DASH chunks and generates the initialization segments and the MPD, ready to be delivered.


With the different MPEG DASH generation tools, we took advantage of the ‘language’ attribute of the videos. We use this tag to differentiate each video component from the others; from the encoding point of view, all the videos look the same except for the ‘language’ tag.

To set the ‘language’ of the videos we use the MP4Box tool. To identify each component, we use its corresponding ISO 639-2 three-letter language code for the audio, while for the videos we invented the tags (names must contain three letters). Here are examples of the MP4Box commands we execute to add a ‘language’ to the video and audio components.

  • Adding the ‘language’ tag to the first video:
"I:\GPAC\mp4box" -add I:\OHD\25\OHD_25_3M.mp4#video -lang 1=via I:\OHD\25\OHD_25_3M_v01.mp4
  • Adding the ‘language’ tag to the second video:
"I:\GPAC\mp4box" -add I:\OHD\25\OHD_25_ALTERNATIU_3M.mp4#video -lang 1=vib I:\OHD\25\OHD_25_ALTERNATIU_3M_v02.mp4
  • Adding the ‘language’ tag to the third video:
"I:\GPAC\mp4box" -add I:\OHD\25\OHD_25_MOSAIC_3M.mp4#video -lang 1=msc I:\OHD\25\OHD_25_MOSAIC_3M_msc.mp4
  • Adding the ‘language’ tag to the audio:
"I:\GPAC\mp4box" -add I:\OHD\25\OHD_25_3M.mp4#audio -lang 1=cat I:\OHD\25\OHD_25_3M_aud.mp4
  • Finally, we use MP4Box to multiplex the definitive encoded videos and audio into a single MP4 file:
"I:\GPAC\mp4box" -add I:\OHD\25\OHD_25_3M_v01.mp4 -add I:\OHD\25\OHD_25_ALTERNATIU_3M_v02.mp4 -add I:\OHD\25\OHD_25_MOSAIC_3M_msc.mp4 -add I:\OHD\25\OHD_25_3M_aud.mp4 I:\OHD\25\OHD_25_MULTIPLEX.mp4
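If many assets have to be prepared, the tagging commands above can be generated programmatically. A small sketch (the mp4box path and file names are taken from the commands above; the helper function itself is ours and purely illustrative):

```python
# Sketch: build the MP4Box '-lang' tagging command lines shown above.
# The MP4BOX path and file names mirror the article; adjust for your setup.
MP4BOX = r"I:\GPAC\mp4box"

def lang_command(source, track, lang, output):
    """One MP4Box invocation tagging <track> of <source> with <lang>."""
    return [MP4BOX, "-add", f"{source}#{track}", "-lang", f"1={lang}", output]

components = [
    (r"I:\OHD\25\OHD_25_3M.mp4",            "video", "via", r"I:\OHD\25\OHD_25_3M_v01.mp4"),
    (r"I:\OHD\25\OHD_25_ALTERNATIU_3M.mp4", "video", "vib", r"I:\OHD\25\OHD_25_ALTERNATIU_3M_v02.mp4"),
    (r"I:\OHD\25\OHD_25_MOSAIC_3M.mp4",     "video", "msc", r"I:\OHD\25\OHD_25_MOSAIC_3M_msc.mp4"),
    (r"I:\OHD\25\OHD_25_3M.mp4",            "audio", "cat", r"I:\OHD\25\OHD_25_3M_aud.mp4"),
]

commands = [lang_command(*c) for c in components]
for cmd in commands:
    print(" ".join(cmd))  # pass each list to subprocess.run() to execute
```

Each list can be handed directly to subprocess.run(), which avoids quoting problems with the Windows paths.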

Using MP4Box for VOD

To generate a multi-video MPEG DASH stream for Video On Demand content, we need to segment each video/audio (VA) component of the file into chunks of a specific duration (960 ms in our case) and generate its initialization segments and the corresponding MPD. In order to deliver a truly multi-video DASH stream and let the client HbbTV device select between the different videos offered, the resulting MPD must place each VA component in a separate ‘Adaptation Set’.

The MP4Box command we use is like the one shown below:

"I:\GPAC\mp4box" -dash 960 -profile dashavc264:live -bs-switching no -segment-name vod_seg_3video_rid$RepresentationID$_cn$Number$ -url-template -mpd-title multivideo_vod_mpd "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=1:id=v0:role=v0" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=2:id=v1:role=v1" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=3:id=v2:role=v2" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=4:id=a0" -out "I:\OHD\25\OHD25_multivideo\OHD_25_multivideo.mpd"

Description of the MP4Box options used:

  • -dash 960: Sets the desired duration of the output chunks, in milliseconds (960 ms in this case).
  • -profile dashavc264:live: Sets the MPEG DASH profile used for the MPD. To serve multiple videos and switch between them we use the ‘dashavc264:live’ profile.
  • -bs-switching no: Disables bitstream switching between the different representations of each VA component, so each representation keeps its own initialization segment. This is enough for our tests because we use only one representation and quality per component.
  • -segment-name vod_seg_3video_rid$RepresentationID$_cn$Number$: Sets the name of the output chunks, identifying which VA component a chunk belongs to with ‘RepresentationID’ and which part of the content it covers with ‘Number’.
  • -url-template: Uses the number-based SegmentTemplate to address the chunks.
  • -mpd-title multivideo_vod_mpd: Sets the title of the MPD.
  • "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=1:id=v0:role=v0" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=2:id=v1:role=v1" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=3:id=v2:role=v2" "I:\OHD\25\OHD_25_MULTIPLEX.mp4#trackID=4:id=a0": Indicates the four inputs to segment (3 videos + 1 audio), taking them from the multiplexed MP4 by their ‘trackID’. The most important part is the ‘:role=’ of each input: assigning a different value to each VA component makes the Via video, Vib video, Msc video and Catalan audio end up in separate ‘Adaptation Sets’.
  • -out "I:\OHD\25\OHD25_multivideo\OHD_25_multivideo.mpd": Sets the path of the generated MPD.

To make it clearer, this command generates the following files in the output directory:

  • MPD: OHD_25_multivideo.mpd.
  • 4 initialization segments, one per segmented VA component, identified by its ‘RepresentationID’: vod_seg_ridvid0_cn.mp4 (video via), vod_seg_ridvid1_cn.mp4 (video vib), vod_seg_ridvid2_cn.mp4 (video msc) and vod_seg_ridaud0_cn.mp4 (Catalan audio).
  • As many chunks (960 ms long) as needed to cover the total duration of the content, also identified by their ‘RepresentationID’: vod_seg_ridvid0_cn1.m4s… (video via), vod_seg_ridvid1_cn1.m4s… (video vib), vod_seg_ridvid2_cn1.m4s… (video msc), vod_seg_ridaud0_cn1.m4s… (Catalan audio).
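The key property of the MPD generated this way is that every component gets its own ‘Adaptation Set’. The relevant structure looks roughly like this (an illustrative excerpt based on the ids and roles given in the command, not the literal MP4Box output):

```xml
<!-- Illustrative excerpt: one Adaptation Set per VA component -->
<Period>
  <AdaptationSet mimeType="video/mp4" lang="via">
    <Role schemeIdUri="urn:mpeg:dash:role:2011" value="v0"/>
    <Representation id="v0" bandwidth="3000000"/>
  </AdaptationSet>
  <AdaptationSet mimeType="video/mp4" lang="vib">
    <Role schemeIdUri="urn:mpeg:dash:role:2011" value="v1"/>
    <Representation id="v1" bandwidth="3000000"/>
  </AdaptationSet>
  <AdaptationSet mimeType="video/mp4" lang="msc">
    <Role schemeIdUri="urn:mpeg:dash:role:2011" value="v2"/>
    <Representation id="v2" bandwidth="3000000"/>
  </AdaptationSet>
  <AdaptationSet mimeType="audio/mp4" lang="cat">
    <Representation id="a0" bandwidth="128000"/>
  </AdaptationSet>
</Period>
```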

Once all these MPEG DASH files are generated, to deliver the corresponding VOD stream you only have to copy them to a web server content folder, configure the MIME type for MPD files as ‘application/dash+xml’, and request the generated MPD via its HTTP URL.
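On Apache, for example, the MIME type can be configured with a single directive (this assumes Apache; other web servers have equivalent settings):

```apache
# Serve .mpd manifests with the DASH MIME type (Apache example)
AddType application/dash+xml .mpd
```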


Unified Streaming Platform – USP

Unified Streaming Platform is a provider of streaming technologies, with a set of products designed to cover different functional requirements. Unified Streaming delivers media content from one unified source to multiple clients and devices, avoiding the need for separate infrastructure: it wraps the solutions available from Apple (HTTP Live Streaming), Adobe (HTTP Dynamic Streaming), Microsoft (Smooth Streaming) and MPEG DASH into one.

USP is cross-platform and can run on Windows, Linux and Unix systems, integrating with the corresponding web server (Apache, IIS, Nginx, Lighttpd). It supports MPEG DASH streaming of both VOD and live content.

Using USP for Live

The tests of the Unified Streaming software were performed using their evaluation virtual machine (VM). This section explains how to use this virtual machine, which runs Ubuntu and contains the necessary USP components, to generate and serve a multi-video MPEG DASH live stream.

On this VM, we have to execute two bash scripts for each DASH multi-video live stream we want to serve: one script creates the publishing point, and the other ingests and encodes the original live stream we want to deliver in compliance with the MPEG DASH standard. These scripts have to be executed with root permissions from the terminal.

Simulating a multi video Live stream

For the initial tests, we simulated a live multi-video stream using VLC; a real live stream would require equipment that generates an IP stream with the required encoding. From a different computer on the same network, we execute a VLC command that delivers a multi-video content in a loop as a unicast stream to the server hosting the VM. The VA components of the content were previously encoded as described in the section ‘Preparing the content’. This simplifies the tests by reducing the processing power needed at the USP server, since the ‘original’ live stream ingested is already properly encoded.

The simulated stream consists of an MP4 file encapsulated in a Transport Stream delivered via UDP. The VLC command executed to generate that unicast stream is the following:

START vlc -vvv c:\hbb4all\OHD_25_MULTIPLEX.mp4 --sout-all --sout "#std{dst=,mux=ts,access=udp}" --ttl 12 --loop

Check Apache server and USP License Key

Before launching the MPEG DASH live stream by executing the scripts discussed above, we should verify that the Apache HTTP Server of the USP VM is running and has correctly recognized the Unified Streaming license key; otherwise we will not be able to use Unified Streaming properly.

The Apache server starts when the virtual machine boots. If there is any error or the license is not correct, we can restart the Apache service with the command ‘sudo service apache2 restart’.

If we take a look at the log messages in /var/log/apache2/error.log and find the text “License key found” followed by the key, it means the key is valid, has been detected, and USP will work. We can check that the license key is correct in the /etc/apache2/conf-enabled/usp-license.conf file, where the keyword ‘UspLicenseKey’ is followed by a space and the key itself.

Publishing point

The first script we run to offer an MPEG DASH live stream creates a publishing point in a directory under the Apache server content folder; in our example, ‘/var/www/usp-evaluation/live/channel1/channel1.isml’. This script is executed once (it finishes) and uses the mp4split tool to create the USP server live manifest file (ISML). mp4split also needs the USP license key to work properly. The relevant parameters of that command are described below:


--archiving=1: stores the last fragments of the encoded live stream on disk, up to the maximum ‘archive_length’. 1 = enabled; 0 = disabled, in which case only the last two segments are kept.

--archive_segment_length=2: the ingested live stream is archived in fragments of the length indicated by this parameter, in seconds (2-second fragments).

--archive_length=24: the total amount of live stream kept on disk, in seconds (24 seconds, i.e. 12 fragments).

--dvr_window_length=2: length of the sliding DVR window, in seconds (a 2-second DVR window).

--time_shift=3: time-shift offset, in seconds.

--restart_on_encoder_reconnect: if the encoder stops for any reason, allows it to reconnect to the same publishing point and resume.

--mpd.minimum_update_period=1: sets the minimum update period of the MPD of the MPEG DASH stream, in seconds (minimumUpdatePeriod = 1 second).

--mpd.min_buffer_time=0: sets the minimum time a DASH client should keep a chunk buffered before playing it, in seconds (0 seconds of buffer).
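The relationship between the archiving parameters can be checked with simple arithmetic (values taken from the options above):

```python
# Check the archiving parameters used above: 24 s of archive stored in
# 2 s fragments means 12 fragments on disk, and the DVR window must
# fit inside the archive.
archive_length = 24          # seconds of live stream kept on disk
archive_segment_length = 2   # seconds per archived fragment
dvr_window_length = 2        # seconds of sliding DVR window

fragments_on_disk = archive_length // archive_segment_length
print(fragments_on_disk)  # 12
assert fragments_on_disk == 12
assert dvr_window_length <= archive_length
```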

Ingesting the ‘original’ Live stream

Once the publishing point has been created, the second script can be executed. We use FFmpeg as the encoder to capture the ‘original’ live stream (simulated with VLC) over time, re-encode it if necessary, and generate the ISMV fragments in the publishing-point directory. Once started, this script runs continuously as long as there is no error processing the stream. It can be stopped with ‘Ctrl+C’, which will stop the live stream.


Note that the stream sent to udp://@ already has the desired video and audio encoding (bitrates, aspect ratio, fps, GOP size, sampling frequency…), so the only processing FFmpeg has to perform is copying the encoded content.

The query parameter ‘fifo_size=100000’ appended to the UDP URL should be included to avoid “circular input buffer overrun” problems.

Using multiple videos with different ‘language’ tags, Unified Streaming can separate each video component of the stream and generate an MPD with independent ‘Adaptation Sets’. The ‘language’ tag is the only difference between the videos streamed.
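A debugging script (or a client) can verify that the videos really end up in separate ‘Adaptation Sets’ by parsing the manifest. A minimal sketch using Python's standard XML parser (the MPD here is an inline sample with our tags; in practice it would be downloaded from the publishing point):

```python
# Sketch: list the Adaptation Sets of an MPD and their 'lang' tags.
# SAMPLE_MPD is an inline stand-in for the manifest served by USP.
import xml.etree.ElementTree as ET

SAMPLE_MPD = """<?xml version="1.0"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
  <Period>
    <AdaptationSet mimeType="video/mp4" lang="via"/>
    <AdaptationSet mimeType="video/mp4" lang="vib"/>
    <AdaptationSet mimeType="video/mp4" lang="msc"/>
    <AdaptationSet mimeType="audio/mp4" lang="cat"/>
  </Period>
</MPD>"""

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}
root = ET.fromstring(SAMPLE_MPD)
langs = [aset.get("lang")
         for aset in root.findall(".//dash:AdaptationSet", NS)]
print(langs)  # ['via', 'vib', 'msc', 'cat']
```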

Once the ISMV fragments and the ISML manifest are present in the server content directory, the MPEG DASH live stream can be requested by an HbbTV client using a URL like this one: ‘’

With this URL we point to the publishing point (channel1.isml) and request a file with the .mpd extension. Note that the DASH chunks and the MPD are not stored on the server: when it receives a request for the MPEG DASH live stream, USP generates the MPD and the requested DASH chunks dynamically, on the fly, from the information stored in the ISML and the ISMV fragments.


Setting up the HbbTV Application for testing

In order to test the generated MPEG DASH content, we modified the getComponents function of the HBB4All application, which expects an integer: the value ‘1’ refers to the audio component, and in our case we use the value ‘0’ to refer to the video components. The following tables (extracted from the OIPF-DAE document) show the equivalences used by the function:


The visual interface of the application is divided into two lists: one for live content and the other for VoD content:

If we play a content, we will see the first video, tagged ‘Via’:

If we want to switch the video stream, we go to the ‘video tags’ selector (Via, Vib and Msc) and choose one of them:

Tests of the application:

We have tested the application with live and VoD content. The conclusions are the following:

  • VoD: we have tested with five manufacturers and it worked on one of them. With this manufacturer everything works correctly, and switching between video streams takes less than one second, with a black screen during the transition.
  • Live: we observed that the television does not change the video stream being played when switching in the test application. Our logs show that the application receives the order and calls the correct switching function; we are still investigating the cause in order to solve it.

Links of interest

TVRING project:
Unified Streaming Platform:
OIPF-DAE document: OIPF-T1-R1-Specification-Volume-5-Declarative-Application-Environment
HBB4All project:
HBB4All multiple audios:
