MPEG-DASH Content Generation with MP4Box and x264

The Situation: Your pre-MP4Box DASH file

A video is given in some container format, with a certain codec, probably including one or more audio tracks. Let’s call this file inputvideo.mkv. This video should be prepared for MPEG-DASH playout.

H.264/AVC for video will be used within segmented mp4 containers.

The tools: X264 and MP4Box

Two tools will be used. x264 to prepare the video content, and MP4Box to segment the file and create a Media Presentation Description (MPD). Alternatively it is possible to generate MPEG-DASH & HLS content out of this mkv with our Bitmovin Encoding Service, which perfectly integrates with our Bitmovin Player.

Preparing the video file

If the source video is already in the correct format, this step can be skipped. However, the odds are long for this being the case.

The following command (re-) encodes the video in H.264/AVC with the properties we will need. All the command line parameters are explained after the code.

1
x264 --output intermediate_2400k.264 --fps 24 --preset slow --bitrate 2400 --vbv-maxrate 4800 --vbv-bufsize 9600 --min-keyint 48 --keyint 48 --scenecut 0 --no-scenecut --pass 1 --video-filter "resize:width=1280,height=720" inputvideo.mkv
Parameter Explanation
--output intermediate_2400k.264 Specifies the output filename. File extension is .264 as it is a raw H.264/AVC stream.
--fps 24 Specifies the framerate which shall be used, here 24 frames per second.
--preset slow Presets can be used to easily tell x264 if it should try to be fast to enhance compression/quality. Slow is a good default.
--bitrate 2400 The bitrate this representation should achieve in kbps.
--vbv-maxrate 4800 Rule of thumb: set this value to the double of --bitrate.
--vbv-bufsize 9600 Rule of thumb: set this value to the double of --vbv-maxrate.
--keyint 96 Sets the maximum interval between keyframes. This setting is important as we will later split the video into segments and at the beginning of each segment should be a keyframe. Therefore, --keyint should match the desired segment length in seconds mulitplied with the frame rate. Here: 4 seconds * 24 frames/seconds = 96 frames.
--min-keyint 96 Sets the minimum interval between keyframes. See --keyint for more information.We achieve a constant segment length by setting minimum and maximum keyframe interval to the same value and furthermore by disabling scenecut detection with the --no-scenecut parameter.
--no-scenecut Completely disables adaptive keyframe decision.
--pass 1 Only one pass encoding is used. Can be set to 2 to further improve quality, but takes a long time.
--video-filter "resize:width=1280,height=720" Is used to change the resolution. Can be omitted if the resolution should stay the same as in the source video.
inputvideo.mkv The source video

Note that these are only example values. Depending on the use case, you might need to use totally different options. For more details and options consult x264’s documentation.

Segmenting

Now we add the previously created h264 raw video to an mp4 container as this is our container format of choice.

1
MP4Box -add intermediate.264 -fps 24 output_2400k.mp4
Parameter Explanation
intermediate_2400k.264 The H.264/AVC raw video we want to put in a mp4.
-fps 24 Specifies the framerate. H.264 doesn’t provide meta information about the framerate so it’s recommended to specify it. The number (in this example 24 frames per second) must match the framerate used in the x264 command.
output_2400k.mp4 The output file name.

What follows is the step to actual create the segments and the corresponding MPD.

1
MP4Box -dash 4000 -frag 4000 -rap -segment-name segment_ output_2400k.mp4
Parameter Explanation
-dash 4000 Segments the given file into 4000ms chunks.
-frag 4000 Creates subsegments within segments and the duration therefore must be longer than the duration given to -dash. By setting it to the same value, there will only one subsegment per segment. Please see refer to this GPAC post for more information on fragmentation, segmentation, splitting and interleaving.
-rap Forces segments to start random access points, i.e. keyframes. Segment duration may vary due to where keyframes are in the video – that’s why we (re-) encoded the video before with the appropriate settings!
-segment-name segment_ The name of the segments. An increasing number and the file extension is added automatically. So in this case, the segments will be named like this: segment_1.m4s, segment_2.m4s, …
output_2400k.mp4 The video we have created just before which should be segmented.
mp4box-v11

Fore more details please refer to the MP4Box documentation.

The output is one video representation, in form of segments. Additionally, there is one initialization segment, called output_2400k_dash.mp4. Finally, there is a MPD.

And that’s it.

What’s next?

Just put the segments, the initialization segment, and the MPD onto a web server. Then point the Bitmovin Player config to the MPD on the web server and enjoy your content.

In case of problems with the player, please refer to the FAQ..

What about more representations?

The steps explained in this post can be repeated over and over again, just pass another bitrate to x264. And make sure previously created files are not overwritten.

For each representation another MPD will be created. As it is just a XML file, it is possible to open it with a text editor and copy & paste the representation into the video AdaptationSet of another MPD file until all needed representations are in the same MPD.

What about audio?

The same MP4Box command as for video can be used for audio. If you have audio already in a separate file, just use this as input.

MP4Box -dash 4000 -frag 4000 -rap -segment-name segment_ audiofile.mp4

If audio is in the video file, you can use the #audio selector:

MP4Box -dash 4000 -frag 4000 -rap -segment-name segment_ video.mp4#audio

To have the MPD content in the same MPD as video, copy & paste the whole Adaptation Set.

Encoding MPEG-DASH & HLS?

Learn more about MPEG-DASH & HLS encoding!

All the best,
Daniel & the Bitmovin Team
Follow us on Twitter: @bitmovin

How to encode multi-bitrate videos in MPEG-DASH for MSE based media players

INTRODUCTION

Online video streaming best-practices have evolved significantly since the introduction of the html5 <video> tag in 2008. To overcome the limitations of progressive download streaming, online video industry leaders created proprietary Adaptive Bitrate Streaming formats like Microsoft Smooth Streaming, Apple HLS and Adobe HDS. In recent years, a new standard has emerged to replace these legacy formats and unify the video delivery workflow: MPEG-DASH.

In this article, we’d like to talk about why Adaptive Bitrate Streaming technology is a must-have for any VOD or Live online publisher, and how to encode Multi-bitrate videos mp4 files with ffmpeg to be compatible with MPEG-DASH streaming. In a subsequent post, we will show you how to package videos and generate the MPEG-DASH .mpd manifests that will allow you to deliver high quality adaptive streaming to your users.

ADVANTAGES OF ADAPTIVE STREAMING

Adaptive streaming technologies have become so popular because they greatly enhance the end-user experience: the video quality dynamically adjusts to the viewer’s network conditions, to deliver the best possible quality the user can receive at any given moment. It reduces buffering and optimizes delivery across a wide range of devices.

adaptive-streaming-schema

Adaptive streaming schema

But Adaptive Bitrate Streaming is trickier on the back-end side: distributers need to re-encode videos into multiple qualities and create a Manifest file containing the information on the location of the different quality segments that the user’s player will then use to obtain the segments he or she needs. Good news, this guide will show you how to do it the right way for MPEG-DASH!

TOOLS

As we saw before, there are two steps to providing adaptive streaming. First, encode your file into different qualities, and second, divide those files into segments and create the manifest that will link to the files.

Today we’ll take a look at how to  properly re-encode your files into different qualities. To do so, we’ll use the very well-known video tool FFmpeg. This library is the Swiss army knife of video encoding and packaging. Download FFmpeg

For the second step, you can use GPAC’s MP4Box, or do it on-the-fly with a Media Streaming Server such as Wowza, USP or Mist Server. We’ll provide an overview of the different solutions in the second post on this subject.

VIDEO ENCODING: THE IMPORTANCE OF I-FRAMES

Even if the MPEG-DASH standard is codec agnostic, we will encode our videos with h264/AAC codecs with fMP4 packaging, which is the most commonly supported format in today’s browsers.

The biggest trick in encoding is to align the I-frames between all the qualities. In the encoding language, I-frames are the frames that can be reconstructed without having any reference to other frames of the video. As they incorporate all of the information about the pixels in each image, I-Frames take up a lot more space and are much less frequent than the other types of encoding frames. Plus, they are not necessarily at the same place across different qualities if we try to encode with the default parameters. The issue is that after the encoding, we will need to divide the videos into short segments, and each segment has to start with an I-frame. If the I-frames are not aligned between different qualities, the lengths of the segments will not match, rendering quality switching impossible.

To ensure that users will be able to switch between the different qualities without issues, we need to force regular I-Frames in our file while we encode the video into different qualities.

FFMPEG COMMAND LINE

The cleanest way to force I-frame positions using FFmpeg is to use the  x264opts ‘keyint=24:min-keyint=24:no-scenecut’  argument.

  • -x264opts allow you to use additional options for the x264 encoding lib.
  • keyint sets the maximum GOP (Group of Pictures) size, which is the group of frames contained between two I-Frames.More info on GOPs.
  • min-keyint sets the minimum GOP size.
  • no-scenecut removes key-frames on scenecuts.

Let’s look at an example of encoding a 720p file with FFmpeg, while forcing I-Frames.

As you can see, we use a framerate of 24 images per second and then a GOP size of 24 images, which means that our I-Frames are located every second. Using the same command with a different bitrate, we can create files of three different qualities with the same I-Frames position:

To choose the value of the GOP size, you’ll need to take in account the length of the segments you want to generate: segment length * framerate has to be a multiple of the GOP size.

Example: if the framerate is 24 and you want 2-seconds segments, the GOP size needs to be either 48 or 24). Know that if your GOP size is too big, the seek might not work properly in some players: as for quality switching, the player has to seek out an I-frame to resume the streaming.

to learn more about h264 encoding with ffmpeg, check out their guide.

CONCLUSION

Congratulations! You’ve successfully encoded our video into different qualities with aligned I-frames. Now, simply fragment them into video segments and generate the MPEG-DASH Manifest file. We’ll show you how to do this in our next blog post, so stay tuned!


PART 2


In our previous article How to encode Multi-bitrate videos in MPEG-DASH for MSE based media players (1/2), we examined how to encode a video file in different qualities with FFmpeg encoder. If all has gone well, you now have different files with different bitrates for your video. Because we have chosen an adequate framerate and GOP size, we can now create a functional Multibitrate Stream with several video qualities for the player of your choice, according to different end-user parameters (network connection, CPU power, etc.).

As we saw before, there are several Adaptive Bitrate Streaming technologies out there. For this tutorial, we chose to focus on MPEG-DASH, which we strongly believe will become a ubiquitous format in upcoming years. We are not alone in this belief. The DASH working group has the support of a range of companies such as Apple, Adobe, Microsoft, Netflix, Qualcomm, and many others. Unlike other RTMP-based flash streaming technologies, MPEG-DASH uses the usual HTTP protocol.

The advantage of HTTP is that it doesn’t need an expensive, near-continuous connection with a streaming server (as is the case for RMTP) and that it is firewall friendly and can take advantage of HTTP caching mechanisms on the web. It is also interesting to compare the DASH format with existing streaming formats like Apple HTTP Live Streaming or Microsoft Smooth Streaming. All of these implementations use different manifest and segment format; therefore, to receive content from each server, a device must support its corresponding proprietary client protocol. DASH has been created to become am open standard (supposedly royalty-free) format to stream content from any standard-based server to any type of client. In a nutshell, there are a host of benefits to using MPEG-DASH, and we hope you are now convinced that if you need to deliver Adaptive Bitrate Streaming on the web, MPEG-DASH is the way to go.

In this article, we’ll see :

  • first what is MPEG-DASH format and how it mainly works
  • then we’ll point out different tools you can use to generate MPEG-DASH manifests
  • and eventually we’ll explain how to play the MPEG-DASH files you will have just created.

At the end of your reading, you will be in possession of re-encoded files and an MPEG-DASH .mpd manifest generated from the different video files you packaged. This manifest is the most important file for delivering high quality adaptive streams to users, so stay focused!

 

1. LET’S DASHIFY!

MPEG-DASH, like other Adaptive Bitrate Streaming technologies, has two main components: the encoded streams that will be played for the user and the manifest file. This file contains the metadata necessary to stream the correct video file to the user. More precisely, it splits the video into several time periods, which are in turn split into adaptation sets. An adaptation set contains media content. Generally there are two adaptation sets: one contains the video and the other the audio. However, there may be more adaptation sets (if there are several languages, for instance) or only one adaptation set. In this case, the single set contains both the video and the audio; the content is said to be muxed.

11808-ozer-dash-fig1-org-1

Each adaptation set contains one or several representations, each a single stream in the adaptive streaming experience. In the figure, Representation 1 is 640×480@500Kbps and Representation 2 is 640×480@250Kbps. Each representation is divided into media segments called chunks. These data chunks can be represented by a list of their urls, or in time-based or index-based url templates. They can also be represented by byte-range in a single media file. In this case, no need to split the media file into thousands of little files for each data chunk (as is required for HLS, for example).

The DASH manifest, a .mpd file (Media Presentation Description), is an XML providing the identification and location of the above items, particularly the urls where the media files are hosted. To play the stream, the DASH player simply needs the manifest, as it fetches each part of the video needed from the information contained in the manifest. According to the network and CPU status, the player will choose the segment from the most suitable representation to deliver a stream with no buffering.

Let’s look at what tools we can use to create this manifest file from our mp4 video files, and how to use these tools!

 

2. WHAT TOOLS SHOULD I USE TO PACKAGE IN MPEG-DASH?

  • DO IT YOURSELF WITH MP4BOX:

MP4Box is a very useful multimedia packager distributed by GPAC. You can get the binary files here. Once you have installed it, you can execute the following command to Dashify your file, with the correct options and arguments. Watch out though, it is important that when you encoded your files in mp4, you entered the right parameters to get one IFrame per segment. You can refer to our previous article about encoding with ffmpeg to learn how to do it properly with ffmpeg.

  • -dash [DURATION]: enables MPEG-DASH segmentation, creating segments of the given duration (in milliseconds). We advise you to set the duration to 2 seconds for Live and short VOD files, and 5 seconds for long VOD videos.
  • -rap -frag-rap: forces segments to begin with Random Access Points. Mandatory to have a working playback.
  • profile [PROFILE]: MPEG-DASH profile. Set it to onDemand for VOD videos, and live for live streams.
  • -out [path/to/outpout.file]: output file location. This parameter is optional: by default, MP4box will create an output.mpd file and the corresponding output.mp4 files in the current directory.
  • [path/to/input1.file]…: indicates where your input mp4 files are. They can be video or audio files.
    (More options are available here.)

Let’s take a look at an example. For a set with 2 video qualities (vid1.mp4 and vid2.mp4) and 2 audio qualities (aud1.mp4 and aud2.mp4), you would need to run the following command:

Be careful: for this to work, you need vid1.mp4 and vid2.mp4 to have only video. If not, you can isolate the video track by adding #video after the file name: vid1.mp4#video and vid2.mp4#video. It might be better to generate a manifest with at least one video and one audio track, as muxed content isn’t always supported (Wowza isn’t able to manage muxed content, for instance).

After doing that, you should get one manifest file and one file by track (in the previous example, you should get four mp4 files: two audio files and two video files). That’s it –  you have just Dashified your video files! You can then put them on a server and provide the manifest url to your player. The urls of the mp4 files are contained in the manifest; just open it with a text editor if you ever wish to change them.

  • THE EASY WAY: WOWZA

Wowza is a Media Server providing easy solutions to stream your Videos On Demand or your Live streams in the format of your choice. Wowza has paid services to stream videos, but also offers a Free Trial if you’d like to try the Wowza Streaming Engine. This service has a lot of benefits if you need to get your videos online easily and quickly, and one of them is that Wowza provides DASH for your videos. Simply load your mp4 files (encoded using the methods from the first part of our article) to your server and Wowza will take care of creating the DASH manifest from the files. As simple as that. To get a url to your manifest, read this part of the Wowza documentation explaining the Wowza url syntax. Another advantage is that it is very quick to install Streamroot solutions if your streams are created using the Wowza Streaming Engine, so go for it!

You can find on their website the Wowza quick start guide and Wowza pricing. Moreover, Wowza can also be used to transcode your files (meaning to re-encode them) with a paid add-on.

  • OTHER FREE TOOLS:

Of course, other tools exist to re-package your videos in DASH format.

  • Nimble is a free media server that is extremely easy to use and totally compatible with Streamroot. Linux-based, it is a very reliant and efficient server that works on all operating systems. For easy integration with Streamroot, check out our tutorial on configuring Nimble and Streamroot.
  • The Bento4 Packager is a software tool for content packaging and parsing that works with several DRM platforms like CENC, PlayReady or Marlin. Like MP4box, it allows you to package your mp4 in DASH, generating a .mpd manifest. You can download it here and find some useful documentation here.
  • Nginx is an open-source HTTP Server that can be turned into a Media Server thanks to the nginx-rtmp-module. It takes a Live RTMP stream in input and on the other side provides a Live stream in HLS or Dash format. Nginx is free but has some constraints: it is only for live streams, your input stream has to be a RTMP stream, and the setup can be quite painful.
  • Unified Streaming Platform is a very efficient platform to encode and stream your media. With a host of output formats, it can be used to deliver MPEG-DASH and HLS for VOD or Live streaming. Among its benefits are smooth integration into your CDN and DRM management. Users must pay a license to use USP, but a Free Trial installation is available here to help you make up your mind.
  • If you’d rather use a Free Media Server solution, take a look at MistServer. This solution is said to be highly customizable and potentially very effective for many output formats (including HLS & DASH). The one trade-off: you might have to go through a complex configuration process. Be aware, however, that DASH and DRM are only available in the non-free MistServer Pro version.
  • There are also other (paid) solutions to obtain DASH streams from your videos: Zencoder and encoding.com are two cloud-encoding services which allow you to create DASH streams from any video files.

To sum it up, here is a comparative table to help you make your choice:

dash-tools-comparative-sheet-feuille-1-2-1024x286

* available with a paying add-on
** a Pro version exists including DASH and DRM management.

 

3. AND NOW THAT I HAVE MY DASH, HOW DO I PLAY IT?

If you have your manifest and mp4 files online and want to test them quickly, why not try the Streamroot demo player? Just enter the url of your manifest, click on load, and your VOD or Live stream will be played by our player. If you do this, open the demo page with your stream in several tabs: you will see the Streamroot p2p module start working with your stream and a graph will show you how much bandwidth you could save with Streamroot’s technology. ( PS: our player also supports HLS and Smooth Streaming streams !)

Be careful though, you will have to allow CORS requests, Range requests and OPTIONS requests on your webserver (quick and easy with Wowza!) if you want the p2p module to work. This is necesary as all the new HTML5 based players are asking the video segment with XHR requests ! For more information on how to configure the CORS parameters on your webserver, you can check out this website.

If you have your DASH files hosted on a server, you can also use the dash.js player to play them on a web page. Check out the documentation on github to install it. If you just want to test your stream, go to this page and enter the url of your manifest to play your media.

Lastly, if your file is stored locally, you can use a local player like the one offered by GPAC (the same guys who did MP4Box). This player can be configured in many ways and is perfect for testing the DASH files you just created. You can download it here and find some interesting configuration information here.

 

CONCLUSION

Congrats! Now that you’ve read the both parts of our article, you perfectly know how to encode your files in different bitrates, how to package them in MPEG-DASH files, and event how to test it with the Streamroot Demo Player. At this time, you’re totally ready to integrate Streamroot p2p module into your streaming delivery system !

Otherwise you are still ready to deliver high quality Multibitrate Streaming in a format that is expected to become the next international Streaming Standard.

Stay tuned for other news about MPEG-DASH and video streaming – we still have a lot to say about it. Finally, please feel free to send us feedback about this article via comments or Twitter. We’re also very curious about what hot topic you would like to read about in our next month blog post ( Dash ? HLS ?  HTML5 streaming and Media Source Extensions ? or more WebRTC tutorials ?), so don’t hesitate to send us your suggestions !