07.13.10

Sysadmin: Automatically Compress Media Files Placed in a Directory

Posted in OS X Server, Sysadmin at 3:16 pm by cygnil

We do a lot of media conversion at my workplace, generally as part of a service for our clients. We’ve seen most formats come in, and we deliver the results on most types of media (except for 8-tracks, so far as I know). It’s usually inconvenient to compress these media on our own workstations, because most of us like to actually work on them while a conversion is going on.

To solve this, I wrote a Perl script to turn our underused OS X server into a media conversion workhorse. It’s a little underpowered by today’s standards (dual G5 processors), and more powerful computers are always better when it comes to compressing video, but the convenience of a networked drag-and-drop media compressor outweighs the speed issues.

This script requires FFMPEG to do the compressing, and works best on a *NIX or OS X system (although it was developed and initially tested on a Windows machine, so everything except the email notices should work there). It also requires the File::Find::Rule module for Perl, which can be downloaded from CPAN or the ActivePerl repository. One word of caution: the FFMPEG project isn’t composed of the user-friendliest of blokes and doesn’t do official releases, and while binaries are available for Windows, Linux and OS X users are going to have to download and compile their own versions. Novices be warned.
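
Before installing the script, it can help to confirm that both prerequisites are visible to Perl. The following is a minimal sketch and not part of the original script; the sub names `have_module` and `find_in_path` are made up for illustration, and it only checks that something named ffmpeg is on the PATH, not that it was built with the codecs you need.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return 1 if the named module can be loaded, 0 otherwise
# (hypothetical helper, not part of the converter script).
sub have_module
{
  my ( $module ) = @_;
  ( my $file = "$module.pm" ) =~ s{::}{/}g;
  return eval { require $file; 1 } ? 1 : 0;
}

# Return the full path of the first executable with this name found
# on the PATH, or nothing if there is none (hypothetical helper).
sub find_in_path
{
  my ( $name ) = @_;
  for my $dir ( File::Spec->path() )
  {
    for my $candidate ( $name, "$name.exe" )
    {
      my $full = File::Spec->catfile( $dir, $candidate );
      return $full if -f $full && -x _;
    }
  }
  return;
}

print "File::Find::Rule: ",
  ( have_module( "File::Find::Rule" ) ? "installed" : "missing" ), "\n";
my $ffmpeg = find_in_path( "ffmpeg" );
print "ffmpeg: ", ( defined $ffmpeg ? $ffmpeg : "not found on PATH" ), "\n";
```

If the module is reported missing, install it from CPAN or the ActivePerl repository before going any further.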

Before I show the code, here’s a brief feature list:

  • It can read and write a variety of codecs and formats. Some of them have to be compiled into FFMPEG separately, but almost everything is supported.
  • It can email a designated list of users when a job is completed, as well as the same list of users plus a separate list of admins if a job fails.
  • Options can be given on a per-file basis by placing a special text file alongside the media file to be converted.
  • Yes, this will work with DVDs; no, it won’t take things like subtitles or alternate language tracks into account. The default language and subtitles are what get put in the resulting media file.

Now, here’s the code:


#! /usr/bin/perl

# Media Converter 1.0 (C) 2009 Yale University
# Released under the Creative Commons-LGPL 2.1 License
# http://creativecommons.org/licenses/LGPL/2.1/

# Converts media files in a certain folder. Best when set to run automatically.
# It can email users when their media have been converted as well as sending
# an alert to those same users plus system administrators when a job fails for
# some reason. Encoding options can be given to each job by placing a text file
# with the name of the target media file plus ".txt" in the input directory
# (e.g. the file "in.mpg" would have a corresponding options file "in.mpg.txt").
# This script will continue to encode movies above a certain size threshold in
# the input directory from oldest to newest until there are no more files to
# encode. More than one copy of this script is prevented from running
# simultaneously by use of a lock file.

use strict;
use File::Copy;
use File::Find::Rule;

# You'll definitely want to change some of these to suit your own setup
#
use constant
{
  # If you have the nice command available, set the full command
  # here (e.g. "nice -n 10") to make use of it (for the sake of
  # other tasks running on the server); it is prepended to the
  # FFMPEG command line as-is
  NICE          => "",
  
  # Path to your FFMPEG binary
  FFMPEG_BINARY       => "/usr/local/bin/ffmpeg.exe",
  
  # Path to store the lockfile in
  LOCKFILE_PATH       => "/tmp/",
  
  # Name of the lockfile
  LOCKFILE         => "media_convert.lock",
  
  # Set to non-zero to delete source files after encoding
  DELETE_AFTER_DONE     => 0,
  
  # Set to non-zero to email admins on error
  EMAIL_ON_ERROR       => 1,
  
  # Addresses to email when errors occur
  EMAIL_ERROR_ADDRESSES   => 'your_email@example.com',
  
  # Directory to scan for source files
  INPUT_DIR         => "/share/media/in/",
  
  # Directory to move source files to during work
  WORKING_DIR       => "/share/media/working/",
  
  # Directory to move source files to after encoding
  # (assuming they don't get deleted)
  DONE_DIR         => "/share/media/done/",
  
  # Directory to place output in
  OUTPUT_DIR         => "/share/media/output/",
  
  # Minimum size of a file to be eligible for encoding
  SIZE_THRESHOLD      => "50Ki",
  
  # Network path for users to access the output files;
  # gets sent out in notification email
  OUT_SHARE        => "",
};

my %encopts;

# Check to make sure the environment is sane before these become issues
#
if ( ! -x FFMPEG_BINARY )
{
  error( FFMPEG_BINARY . " does not exist or is not executable." );
}
if ( ! -d INPUT_DIR || ! -r INPUT_DIR )
{
  error( "Input folder " . INPUT_DIR . " can't be read or does not exist." );
}
if ( ! -d WORKING_DIR || ! -r WORKING_DIR )
{
  error( "Working folder " . WORKING_DIR . " can't be read or does not exist." );
}
if ( ! -d WORKING_DIR || ! -w WORKING_DIR )
{
  error( "Working folder " . WORKING_DIR . " can't be written to." );
}
if ( ( ! -d DONE_DIR || ! -w DONE_DIR ) && !DELETE_AFTER_DONE )
{
  error( "Done folder " . DONE_DIR . " can't be written to or does not exist." );
}
if ( ! -d OUTPUT_DIR || ! -w OUTPUT_DIR )
{
  error( "Output folder " . OUTPUT_DIR . " can't be written to or does not exist." );
}

# Lock file; if this exists, another copy of the program is running (probably)
if ( -d LOCKFILE_PATH )
{
  if ( -e LOCKFILE_PATH . LOCKFILE )
  {
    print "Lock file detected at " . LOCKFILE_PATH . LOCKFILE . ". To get this to run, please remove the lock file.\n";
    exit;
  }
  else
  {
    open( H_LOCK, ">", LOCKFILE_PATH . LOCKFILE ) or error( "Couldn't create lock at " . LOCKFILE_PATH . LOCKFILE );
    close( H_LOCK );
  }
}
else
{
  error( "Lockfile path " . LOCKFILE_PATH . " does not exist!" );
}

# Now start searching for files and encoding them
#
# "in" rule has to come last!
my @candidate_files = File::Find::Rule->file()
  ->size( ">=" . SIZE_THRESHOLD )
  ->maxdepth( 1 )
  ->in( INPUT_DIR );

while ( @candidate_files > 0 )
{
  # Make sure customized options from one file don't persist to others
  #
  reset_encoding_defaults();
  
  # Use a FIFO method for determining which item to encode
  #
  # Start determining which file is the oldest
  my @older_files = ( $candidate_files[ 0 ], $candidate_files[ 0 ] );
  while ( @older_files > 1 )
  {
    my $first_file = $older_files[ 0 ];
    my @stat = stat( $first_file );
    # Look for files strictly older than the current candidate
    @older_files = File::Find::Rule->file()
      ->size( ">=" . SIZE_THRESHOLD )
      ->mtime( "<" . $stat[ 9 ] )
      ->maxdepth( 1 )
      ->in( INPUT_DIR );
    if ( @older_files == 0 ) { @older_files = ( $first_file ); }
  }
  my $work_file = $older_files[ 0 ];
  $work_file =~ s|\Q${\ INPUT_DIR }\E|WORKING_DIR|ei;
  print "Moving '" . $older_files[ 0 ] . "' to '" . $work_file . "'...";
  move( $older_files[ 0 ], $work_file ) or error( "Couldn't move " . $older_files[ 0 ] . " to working folder." );
  print " done.\n";

  # Check to see if there's an attached file of encoding options;
  # if so, read it in. Code from Recipe 8.16 of the Perl Cookbook
  #
  if ( -f $older_files[ 0 ] . ".txt" )
  {
    print "Found config file, reading it in...";
    open( H_CONF, "<", $older_files[ 0 ] . ".txt" ) or error( "Couldn't read options file for " . $older_files[ 0 ] );
    while( <H_CONF> )
    {
      chomp;                    # no newline
      s/#.*//;                  # no comments
      s/^\s+//;                 # no leading white
      s/\s+$//;                 # no trailing white
      s/\r//;                   # Remove Windows line endings
      next unless length;       # anything left?
      my ($var, $value) = split(/\s*=\s*/, $_, 2);
      $encopts{$var} = $value;
    }
    close( H_CONF );
    print " done.\n";
  }
  
  # Do the conversion
  #
  my $out_file = $work_file;
  $out_file =~ s|\Q${\ WORKING_DIR }\E|OUTPUT_DIR|ei;
  # Swap only the final extension, leaving any other dots in the path alone
  $out_file =~ s|\.[^./]*$|.$encopts{ 'out_extension' }|;
  my $ffmpeg_string = '"' . FFMPEG_BINARY . "\" -i \"$work_file\" -y -r " . $encopts{ 'framerate' }
    . " -acodec " . $encopts{ 'acodec' } . " -vcodec " . $encopts{ 'vcodec' }
    . " -qscale " . $encopts{ 'qscale' } . " -b " . $encopts{ 'b' }
    . ( $encopts{ "size" } ? " -s $encopts{ 'size' }" : "" )
    . ( $encopts{ "time" } ? " -t $encopts{ 'time' }" : "" )
    . ( $encopts{ "other_opts" } ? " $encopts{ 'other_opts' }" : "" )
    . " -bt " . $encopts{ 'bt' } . " \"$out_file\"";
  if ( NICE ) { $ffmpeg_string = NICE . " " . $ffmpeg_string; }
  `$ffmpeg_string`;
  
  # Check that the conversion went all right and dispose of the file appropriately
  #
  if ( ! -f $out_file ) { error( "Output file wasn't created! FFmpeg string was:\n" . $ffmpeg_string ); }
  if ( DELETE_AFTER_DONE )
  {
    unlink $work_file or error( "Couldn't delete '$work_file' after encoding." );
    unlink $older_files[ 0 ] . ".txt";
  }
  else
  {
    my $done_file = $work_file;
    $done_file =~ s|\Q${\ WORKING_DIR }\E|DONE_DIR|ei;
    move( $work_file, $done_file ) or error( "Couldn't move '$work_file' to '$done_file'." );
  }
  
  # If there's anybody who signed up to be alerted when the job is done,
  # email them now
  #
  if ( $encopts{ 'done_addresses' } )
  {
    my $filename = $out_file;
    $filename =~ s|.*/||;
    my $message = "Encoding of your file '$filename' has finished, and it can now be found at '" . OUT_SHARE . "$filename'.";
    `echo $message | mail -s "Encoding of $filename completed!" $encopts{ 'done_addresses' }`;
  }
  
  @candidate_files = File::Find::Rule->file()
    ->size( ">=" . SIZE_THRESHOLD )
    ->maxdepth( 1 )
    ->in( INPUT_DIR );
}

# Now clean up
#
unlink LOCKFILE_PATH . LOCKFILE;

#
# END OF PROGRAM
#

# Subroutines
#

# Report an error to the console and optionally through email
#
sub error
{
  my $error = shift();
  
  # Notify the people who should be alerted on error as well as the users
  #
  if ( EMAIL_ON_ERROR )
  {
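    my $message = "Encoding of an item failed. The following message was given:\n\n" . $error;
    # Combine the job's addresses with the admin addresses, then pipe the
    # message straight to mail so that newlines and quotes survive intact
    my @recipients = grep { length } split( /[,\s]+/, ( $encopts{ 'done_addresses' } || "" ) . "," . EMAIL_ERROR_ADDRESSES );
    local $SIG{ PIPE } = 'IGNORE';
    if ( @recipients && open( H_MAIL, "|-", "mail", "-s", "Encoding failure!", @recipients ) )
    {
      print H_MAIL $message . "\n";
      close( H_MAIL );
    }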
  }
  if ( -e LOCKFILE_PATH . LOCKFILE )
  {
    unlink LOCKFILE_PATH . LOCKFILE;
  }
  die( "An error occurred: '" . $error . "'" );
}

# This is where you change the defaults. In particular, if
# you have any extra libraries compiled into FFMPEG then
# you'll probably want to set those here (e.g. libx264, libfaac)
#
sub reset_encoding_defaults
{
  $encopts{ 'acodec' }       = "vorbis";
  $encopts{ 'vcodec' }       = "h263";
  $encopts{ 'framerate' }     = "30";
  $encopts{ 'qscale' }       = "8";
  $encopts{ 'b' }         = "96k";
  $encopts{ 'bt' }         = "256k";
  $encopts{ 'size' }        = "";
  $encopts{ 'time' }      = "";
  $encopts{ 'other_opts' }     = "";
  $encopts{ 'out_extension' }     = "mov";
  $encopts{ 'done_addresses' }   = "";
}
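
The oldest-first selection in the main loop can also be expressed as a single sort on modification time. The following is a minimal sketch using only core Perl rather than File::Find::Rule; the sub name `oldest_first` and the byte-count threshold are illustrative assumptions, not part of the script above.

```perl
use strict;
use warnings;

# Return the plain files directly inside $dir that are at least
# $min_bytes in size, ordered oldest first by modification time.
# (Hypothetical helper; the script above uses File::Find::Rule.)
sub oldest_first
{
  my ( $dir, $min_bytes ) = @_;
  opendir( my $dh, $dir ) or die "Can't open directory '$dir': $!";
  my @files = grep { -f $_ && ( -s _ ) >= $min_bytes }
              map  { "$dir/$_" }
              readdir( $dh );
  closedir( $dh );

  # Look up each file's mtime once, then sort on it;
  # ties are broken by name so the order is stable
  my %mtime = map { $_ => ( stat( $_ ) )[ 9 ] } @files;
  return sort { $mtime{ $a } <=> $mtime{ $b } || $a cmp $b } @files;
}

# Example: list the queue for a directory given on the command line
if ( @ARGV )
{
  print "$_\n" for oldest_first( $ARGV[ 0 ], 50 * 1024 );
}
```

Call it in list context; the first element is the file that would be encoded next.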

At runtime, the program searches a directory for media files and encodes any it finds in FIFO fashion according to file modification date, i.e. it encodes the oldest file first, then the next-oldest, and so on. If it finds a text file bearing the same name as the media file (e.g. “MyMovie.mov” would have a text file “MyMovie.mov.txt”) then it reads in options from that text file and replaces the defaults with them. This allows for granular control over formats and encoding options without having to invoke a command line. Here’s a sample text file:


 # This is a file for defining options for a particular file.
 # For example, if you wanted to use mp3 compression instead
 # of the default, you could put that here. If you're familiar
 # with the FFMPEG binary, you can also give it special
 # parameters (such as setting the size or aspect ratio)
 # from within this file.
 #
 # To use this file, simply copy it and rename it using the
 # name of the media file it's attached to, plus the extension
 # ".txt". For example, if you have a media file with the name
 # "test.mpg" then the options file would be "test.mpg.txt".
 # You can then set the below options according to your needs.

 # Audio codec to use. Valid options are libfaac, libmp3lame,
 # ac3, mp2, adpcm_ms, pcm_s8, flac, vorbis, and many others.
 # See "Valid Encoders.txt" for a full list.
 # The default is libfaac.
acodec = libfaac

 # Video codec to use. Valid options are libx264, flv, h263,
 # mjpeg, mpeg1video, mpeg2video, mpeg4, rv10, rv20, wmv2,
 # and many others. See "Valid Encoders.txt" for a full list.
 # The default is libx264.
vcodec = libx264

 # Extension to give the output file, e.g. MOV, WMV, AVI.
 # The default is mov.
out_extension = mov

 # Addresses to email when the video is finished encoding.
 # Separate them with a comma, and it's up to the user to
 # make sure the addresses are valid. No default
done_addresses =

 # Framerate of the output video. Default is 30
framerate = 30

 # Qscale factor. This has nothing to do with the physical
 # dimensions; rather, it has to do with PSNR quality.
 # The default is 8.
qscale = 8

 # Video bitrate in bits/second. Use "k" for thousands.
 # The default is 96k.
b = 96k

 # Video bitrate tolerance. Use "k" for thousands. Too
 # low and quality will suffer dramatically. The default
 # is 256k.
bt = 256k

 # Other options to pass to FFMPEG. These should be given
 # just as they would be on the command line. The full list
 # is at http://www.ffmpeg.org/ffmpeg-doc.html
 # Some useful ones:
 # -s WxH       Defines size in width by height, e.g. 640x480
 # -t seconds   Encodes only the first part of a video,
 #                e.g. 30 for the first thirty seconds
 # -aspect x:y  Sets the aspect ratio, e.g. 16:9
 # -ac X        Sets the number of audio channels
 # Others can be found at http://www.ffmpeg.org/ffmpeg-doc.html
 #
 # As an example, to encode the first minute of a video at
 # 320x240 resolution, the options would be set to
 # "-s 320x240 -t 60"
 #
 # There is no default
other_opts =
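
To make the mapping from an options file to an FFMPEG invocation concrete, here is a minimal sketch, assuming the file's contents are already in a string. The sub names `parse_options` and `ffmpeg_args` are hypothetical, and splitting other_opts with the core Text::ParseWords module is a choice made for this sketch; the script above builds a single command string instead.

```perl
use strict;
use warnings;
use Text::ParseWords qw( shellwords );

# Parse "key = value" lines, ignoring comments and blank lines.
# Values from the text override the supplied defaults.
sub parse_options
{
  my ( $text, %opts ) = @_;
  for my $line ( split /\r?\n/, $text )
  {
    $line =~ s/#.*//;          # strip comments
    $line =~ s/^\s+|\s+$//g;   # trim leading and trailing whitespace
    next unless length $line;
    my ( $key, $value ) = split /\s*=\s*/, $line, 2;
    $opts{ $key } = defined $value ? $value : "";
  }
  return %opts;
}

# Turn the options into an argument list, with other_opts split
# into words the way a shell would split them.
sub ffmpeg_args
{
  my ( $in, $out, %opts ) = @_;
  my @args = ( "-i", $in, "-y", "-r", $opts{ framerate },
               "-acodec", $opts{ acodec }, "-vcodec", $opts{ vcodec },
               "-qscale", $opts{ qscale }, "-b", $opts{ b }, "-bt", $opts{ bt } );
  push @args, shellwords( $opts{ other_opts } ) if length $opts{ other_opts };
  return ( @args, $out );
}

# Defaults copied from reset_encoding_defaults() in the script
my %defaults = ( acodec => "vorbis", vcodec => "h263", framerate => "30",
                 qscale => "8", b => "96k", bt => "256k", other_opts => "" );
my %opts = parse_options(
  "vcodec = mpeg4   # override\nother_opts = -s 320x240 -t 60\n", %defaults );
print join( " ", ffmpeg_args( "in.mpg", "out.mov", %opts ) ), "\n";
```

Passing such a list to `system` along with the path to the binary bypasses the shell, so file names containing spaces or quotes would need no escaping.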

Now for the final part: making the script run repeatedly. Each time it’s invoked it scans the target directory, works through whatever it finds, and exits, so to have it act on anything dropped into the directory you’ll need to run it periodically. This is similar to OS X folder actions, except that it also works on files added remotely (folder actions only fire when something is dropped in from the server’s console). To do that in OS X, create a .plist file in a convenient system directory (I used /Library/LaunchDaemons):


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>Media_Convert</string>
    <key>Program</key>
    <string>/usr/local/bin/media_convert.pl</string>
    <key>StartInterval</key>
    <integer>300</integer>
  </dict>
</plist>

(This sets the script to run every five minutes, thereby constantly checking the target directory for media.)
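
Since launchd will start the script every five minutes whether or not the previous run has finished, the lock file is what keeps two encodes from overlapping. The script tests for the file and then creates it as two separate steps; the following is a minimal sketch of doing both in one atomic step with sysopen. The sub names and the demo lock path are illustrative assumptions.

```perl
use strict;
use warnings;
use Fcntl qw( O_WRONLY O_CREAT O_EXCL );
use File::Spec;

# Try to create the lock file. O_EXCL makes the call fail if the
# file already exists, so only one process can succeed.
sub acquire_lock
{
  my ( $path ) = @_;
  sysopen( my $fh, $path, O_WRONLY | O_CREAT | O_EXCL ) or return 0;
  print $fh "$$\n";    # record our process ID for troubleshooting
  close( $fh );
  return 1;
}

# Remove the lock file when the run is finished.
sub release_lock
{
  my ( $path ) = @_;
  return unlink( $path ) ? 1 : 0;
}

# Demo with a hypothetical lock name in the system temp directory
my $lock = File::Spec->catfile( File::Spec->tmpdir(), "media_convert_demo.lock" );
if ( acquire_lock( $lock ) )
{
  print "Got the lock; encoding would happen here.\n";
  release_lock( $lock );
}
else
{
  print "Another copy appears to be running; exiting.\n";
}
```

As with the original approach, a lock left behind by a crashed run still has to be removed by hand.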

Finally, set the script to launch when the OS X machine is started by issuing the command:

launchctl load -w /Library/LaunchDaemons/your_script.plist

Linux users have it easier this time around: don’t muck about with the plist file, and instead simply set the Perl script to be executed every five minutes from cron.

Windows users: sorry, you’ll either have to deal with the awkward Windows task scheduler or download something like CRONw. You’ll also be devoid of email capability unless you set up sendmail and a command-line mail program on your Windows machine (the script pipes its message to mail), or alter the lines in the script which send email out.
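
For machines without a mail command, one way to alter the email lines is to talk to an SMTP server directly with the Net::SMTP module that ships with Perl. The following is a minimal sketch; the host name, addresses and sub names are placeholders, and it assumes a server that accepts unauthenticated mail from your machine.

```perl
use strict;
use warnings;
use Net::SMTP;

# Build a minimal message with headers, a blank line, and the body
# (hypothetical helper). $to is a reference to a list of addresses.
sub build_message
{
  my ( $from, $to, $subject, $body ) = @_;
  return join( "\r\n", "From: $from", "To: " . join( ", ", @$to ),
                       "Subject: $subject", "", $body, "" );
}

# Send the message through the given SMTP host.
# Returns 1 on success and 0 otherwise.
sub send_notice
{
  my ( $host, $from, $to, $subject, $body ) = @_;
  my $smtp = Net::SMTP->new( $host, Timeout => 30 ) or return 0;
  my $ok = $smtp->mail( $from )
        && $smtp->to( @$to )
        && $smtp->data( build_message( $from, $to, $subject, $body ) );
  $smtp->quit;
  return $ok ? 1 : 0;
}

# Example call (placeholder host and addresses):
# send_notice( "smtp.example.com", 'encoder@example.com',
#              [ 'user@example.com' ], "Encoding completed",
#              "Your file is ready." );
```

A call like the commented one could stand in for the backtick lines that shell out to mail, both in the main loop and in the error subroutine.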

I’ve found this script to be very useful so far, and I hope that it can help someone else, too.
