Using open source tools to generate more previews in PresSTORE

A while ago, I wrote an article about using qt_tools to generate proxy previews for PresSTORE Archive.  While that solution is relatively simple, the toolkit it relies on restricted it to QuickTime-enabled Mac systems.  It made codec support easy, but also came with its own set of problems.

As Linux and Windows become more common for PresSTORE installs, I thought it might be nice to find a cross-platform solution.  Enter ffmpeg, ImageMagick, and ffmbc.  All of these open source toolkits allow us to encode previews from a variety of sources on a variety of platforms.

I will not go into installing these tools onto your system as there are already a multitude of sites on the internet that go through the process, but if you would like to learn more about each of these tools, here are their websites:

ffmpeg – http://ffmpeg.org/

ffmbc – http://code.google.com/p/ffmbc/

ImageMagick – http://www.imagemagick.org/

ffmpeg and ffmbc are great tools for video encoding, and ImageMagick is a tremendous tool for still image manipulation.  But enough about the tools, you can read more about them in the links above.  Let’s write some PresSTORE Preview Generation scripts. I won’t show an ffmbc example here, but once you understand how ffmpeg works, you should be able to swap one for the other fairly easily and figure out which is the best fit for your environment.

One thing you need to understand first and foremost when writing PresSTORE Preview Generator scripts is how the entire engine works.  It is a very simple tool that has a great deal of power, but before we dive in we probably want to go through how scripts are called and what information is used where.

At its core, the script engine only takes one input.  This input is the full path of the file that is being archived.  Your script can carry whatever arguments you want to build into it; just know that the last argument passed on the command line is the full path of the file being archived.

Example:

$>/path/to/my/script -script_argument_1 -script_argument_2 -script_argument_n /path/to/file/being/archived

In addition to the input, there is only one output that PresSTORE itself cares about: the path of the proxy preview that you are handing to it.  That means the only output your script should EVER send to stdout is the path of a correctly generated proxy preview.  If you would like to build logging into the script (which I would recommend for troubleshooting and testing), send all of that output to an external file.  Also make sure that anything the commands in your script would normally print to stdout is either redirected to a log location or silenced.  If we run our script, we should get nothing but the path to the proxy preview.

$>/path/to/my/script -script_argument_1 -script_argument_2 -script_argument_n /path/to/file/being/archived
/path/to/proxy/preview
$>
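
To make the stdout rule concrete, here is a minimal sketch of how a script might keep its logging separate from the one line PresSTORE reads. The log path and the helper names log_line and proxy_output are my own placeholders for illustration, not anything PresSTORE defines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical log location; use any path the PresSTORE user can write to.
my $logfile = "/tmp/proxy_example.log";

# Append one timestamped line to the log file, never to stdout.
sub log_line {
    my ($message) = @_;
    # If the log cannot be opened, stay silent rather than pollute stdout.
    open(my $fh, '>>', $logfile) or return 0;
    print $fh scalar(localtime), " $message\n";
    close($fh);
    return 1;
}

# Build the single line the script is allowed to print.
sub proxy_output {
    my ($proxy_path) = @_;
    return "$proxy_path\n";
}

log_line("starting preview generation");
# ... generate the preview here ...
log_line("finished preview generation");

# The only thing that ever reaches stdout.
print proxy_output("/path/to/proxy/preview");
```

Everything diagnostic goes through log_line, so a stray debug message can never end up being read as a preview path.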

One other thing to keep in mind is that PresSTORE actually moves this proxy preview file to your target location; it does not copy it. I only mention this because your script might look up an existing proxy that another system has already generated instead of generating a new file. If you hand PresSTORE that path directly, you may inadvertently unlink the file from whatever system generated it in the first place. Copy it first to the same filesystem that your PresSTORE proxies live on, and pass the path of the copy.
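
As a rough sketch of that copy-first approach, assuming the existing proxy and a staging directory are both known to the script (the function name stage_copy and the paths in the usage note are hypothetical):

```perl
use strict;
use warnings;
use File::Basename;
use File::Copy;

# Copy an existing proxy into the staging directory and return the path
# of the copy, so that PresSTORE moves the copy and not the original.
# Returns undef if the copy fails.
sub stage_copy {
    my ($existing, $staging_dir) = @_;
    my $staged = "$staging_dir/" . basename($existing);
    copy($existing, $staged) or return;
    return $staged;
}
```

A script would then print the result of something like stage_copy("/path/to/existing/proxy.mp4", "/usr/local/proxietemp") instead of the original path, and print nothing if the result is undefined.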

Alright, enough of the boring stuff.  Let’s build an extremely simple script that uses perl and ffmpeg to generate an H.264 preview of our original media.  Below is the entire script for you to read, then we will break it down line by line. These scripts should run on Mac and Linux without much modification, but you will have to change some of the paths and formatting if you want to use them on Windows.

#!/usr/bin/perl
 
use File::Basename;
 
$infile = $ARGV[0];
$inname = basename($infile);
$outfile = "$inname.mp4";
 
system("/usr/local/bin/ffmpeg -y -i \"$infile\" -pass 1 -vcodec libx264 -pix_fmt yuv420p -vf scale=240:160:-1 -b:v 128k -passlogfile /tmp/passlog \"/usr/local/proxietemp/$outfile\" 2>/var/log/awproxy.log");
 
print "/usr/local/proxietemp/$outfile\n";

Alright, so what does all of this mean? The first line names our script interpreter; it tells the system what we are using to interpret all this junk.

#!/usr/bin/perl

After our interpreter, I load a module that ships with perl called File::Basename. This module allows me to quickly determine a file’s name when it is passed a full path rather than doing my own regexp mucking about.

use File::Basename;

The next few lines define some internal variables I want to use to make the process easier on me. The first line takes the full path that PresSTORE passed my script and names it $infile. $ARGV[0] is the first argument passed on a command line in perl. If you build more runtime options into your script, make sure you read whatever the last argument on the CLI is instead, as that is what PresSTORE passes (for example, if you had a size=50 argument for your script in PresSTORE, that would take over $ARGV[0], so your infile would move up one to $ARGV[1]).

On the next line, I take that full path to our file to be archived, chop out just the filename, and call that $inname. The following line simply says that I am going to give my output file the same name as my input file, but with .mp4 added to the end of it. It is important to note that PresSTORE will rename this file after the move, so the filename really does not matter as long as it is something we know.

$infile = $ARGV[0];
$inname = basename($infile);
$outfile = "$inname.mp4";
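
If your script does take extra options, one way to avoid counting argument positions is to always read the last element of the argument list, since perl lets you index an array from the end with -1. This is only a sketch; the source_path helper and the example argument list are stand-ins I made up for illustration.

```perl
use strict;
use warnings;
use File::Basename;

# Return the last argument, which is where PresSTORE puts the file path.
sub source_path {
    my @args = @_;
    return $args[-1];
}

# Stand-in for @ARGV as PresSTORE might pass it with extra options.
my @example_args = ("-size", "50", "/path/to/file/being/archived");

my $infile  = source_path(@example_args);
my $outfile = basename($infile) . ".mp4";
print "$outfile\n";
```

In a real script you would call source_path(@ARGV), and the same line keeps working no matter how many options you add in front of the path later.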

The next line is where we actually send our input file to ffmpeg for output. This is meant as an example and this command is by no means a catch all for every video format out there, but it catches a large amount of input formats and builds a standard H.264 mp4 file.

The system() command is used in perl to call a standard CLI command and return only the exit status of this command. This script has no error trapping, but could easily include an if/then statement that makes sure that the ffmpeg job ran and the file at output is valid. I won’t go into that here, but something to think about.

The “-y” flag tells ffmpeg to overwrite our output file if it already exists.
The “-i” flag is followed by the location of the file we are reading in as source. Up above, we called this $infile, so we pass that here. The \” is needed to ensure that files with spaces in their names are properly passed. Inside the system() call, we can’t just use ” as that would end our string, so we escape the ” with a \.
The “-pass” argument selects which pass of a multi-pass encode this run is (1 or 2), not how many passes to do. Since these are low quality proxies, we run pass 1 and simply never run a second.
The “-vcodec” argument selects the video encoder we want to use on the file. In this case, I’m using libx264 to encode H.264 video; the MP4 container comes from the .mp4 extension on the output file.
The “-pix_fmt” argument sets the pixel format, in this case yuv420p, which is planar YUV with 4:2:0 chroma subsampling (the p stands for planar, not progressive).
The “-vf” argument passes a video filter chain to ffmpeg; in this case we are scaling our source video down to 240×160. If you want the frame size itself to follow the source aspect ratio, the scale filter accepts -1 for one dimension, for example scale=240:-1, and derives that dimension from the source. libx264 with yuv420p needs even dimensions, and scale=240:-2 rounds the derived value to a multiple of 2.
The “-b:v 128k” argument sets the target video bitrate, here 128 kbit/s.
The “-passlogfile” argument sets the filename prefix for the pass statistics that ffmpeg writes during the first pass and would read back on a second pass. Because we run with “-pass 1”, the statistics are written under the /tmp/passlog prefix but never read.
Next, we pass the path that we want to output our proxy to, in this case it is going to “/usr/local/proxietemp/$outfile”
Lastly, we pass “2>/var/log/awproxy.log”. This says that anything the command writes to stderr, which is where ffmpeg sends its console messages, instead gets written into our log file at “/var/log/awproxy.log”

system("/usr/local/bin/ffmpeg -y -i \"$infile\" -pass 1 -vcodec libx264 -pix_fmt yuv420p -vf scale=240:160:-1 -b:v 128k -passlogfile /tmp/passlog \"/usr/local/proxietemp/$outfile\" 2>/var/log/awproxy.log");
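
For the error trapping mentioned above, here is one possible shape, not the only one. It passes the command to system() as a list, which skips the shell so the filename needs no escaped quotes, and it redirects the command’s stdout and stderr into the log from inside perl. The helper names run_quiet and encode_preview are hypothetical, and the ffmpeg options are a simplified variant of the command above (no -pass, and scale=240:-2) rather than the exact original.

```perl
use strict;
use warnings;

# Run a command without a shell, sending its stdout and stderr to $log.
# Returns the exit code, 128 + signal number if the command was killed
# by a signal, or -1 if it could not be started.
sub run_quiet {
    my ($log, @cmd) = @_;

    open(my $log_fh, '>>', $log) or return -1;

    # Save our own handles so they can be restored afterwards.
    open(my $saved_out, '>&', \*STDOUT) or return -1;
    open(my $saved_err, '>&', \*STDERR) or return -1;

    # Point stdout and stderr at the log while the command runs.
    open(STDOUT, '>&', $log_fh) or return -1;
    open(STDERR, '>&', $log_fh) or return -1;

    my $status = system(@cmd);    # list form: no shell involved

    # Put the original handles back.
    open(STDOUT, '>&', $saved_out);
    open(STDERR, '>&', $saved_err);
    close($log_fh);

    return -1 if $status == -1;
    return 128 + ($status & 127) if $status & 127;
    return $status >> 8;
}

# Encode a preview and report success only if ffmpeg exited cleanly
# and left a non-empty output file behind.
sub encode_preview {
    my ($infile, $outfile, $log) = @_;
    my @ffmpeg = (
        "/usr/local/bin/ffmpeg", "-y", "-i", $infile,
        "-vcodec", "libx264", "-pix_fmt", "yuv420p",
        "-vf", "scale=240:-2", "-b:v", "128k",
        $outfile,
    );
    my $code = run_quiet($log, @ffmpeg);
    return ($code == 0 && -s $outfile) ? 1 : 0;
}
```

The script would then print the output path only when encode_preview returns 1, so PresSTORE never receives the path of a preview that failed to encode.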

Now that we have our preview generated and we know where it is, we need to tell PresSTORE where it is. This is as simple as printing the only output to stdout the script needs: the full path to the proxy preview.

print "/usr/local/proxietemp/$outfile\n";

If all went well, we can test this script outside of PresSTORE to make sure it worked properly by just running it from a command line:

$>/path/to/my/script/proxy.pl /path/to/a/media/file.ext
/path/to/preview.mp4

Simply open the “/path/to/preview.mp4” file in the media player of your choice and verify that it worked. If so, congrats, your first script is in the bag for basic moving pictures.

But what if you have film sequences? What then? Luckily, ImageMagick is actually capable of decoding 10-bit LOGRGB dpx and cineon film scans, so using a very similar method we can create small jpg frames for preview in PresSTORE to determine if our sequence is the right one before restoring. The script is even simpler here.

#!/usr/bin/perl
 
use File::Basename;
 
$infile = $ARGV[0];
$outfile = basename($infile);
$tempdir = "/usr/local/proxietemp";
 
system("/usr/bin/convert \"$infile\" -resize 256 \"$tempdir/$outfile.jpg\"");
 
print "$tempdir/$outfile.jpg";

As you can see, the first few lines are nearly the same and the last line is similar as well. The only real change is that instead of using the ffmpeg program, we are now using the convert program installed with ImageMagick. This is a very simple tool that takes in our high res file, resizes it to a width of 256 pixels while keeping the aspect ratio, and writes it out with default jpeg quality. If you want to get more in depth with ImageMagick, just visit their site and read the documentation.

The last step is setting up our PresSTORE environment to actually run these scripts and load our previews. This is relatively simple as well: just add them as preview generators in your archive plan.

One more important thing about timeouts – if the timeout is reached before the script finishes because it is still encoding, PresSTORE will skip that file and move on to the next. If you are doing long form encodes or have a slower system you are encoding on, make sure you set the timeout high enough to allow for the preview generator to complete cleanly. This isn’t an exact science, but you can guess.
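
If you would rather measure than guess, a small wrapper like the one below can time a test run of your generator against one of your larger files. This is just a sketch; timed_run is a name I made up, and the demonstration at the bottom runs perl itself as a placeholder command.

```perl
use strict;
use warnings;

# Run a command and return its raw system() status and the elapsed
# wall-clock time in whole seconds.
sub timed_run {
    my @cmd = @_;
    my $start  = time();
    my $status = system(@cmd);
    return ($status, time() - $start);
}

# Placeholder command; in practice this would be something like
# timed_run("/path/to/my/script/proxy.pl", "/path/to/a/media/file.ext")
my ($status, $seconds) = timed_run($^X, "-e", "sleep 1");
print STDERR "status $status after $seconds seconds\n";
```

Run it a few times on your slowest encoding machine and set the timeout comfortably above the longest time you see.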

Hope this helps anyone looking to do this in the future.