CatDV and PresSTORE Archive Integration

UPDATES:

CatDV 9.0 Updates and changes to aw-queue.pl – (August 2011)

Original Post

CatDV has become a very popular video asset management system and, with the advent of their Enterprise Server product, can provide entire video workgroups with an asset management and automation solution at a relatively aggressive price point. I am frequently asked if Archiware has a plug-in for CatDV. I had written some basic scripts in the past that gave very rudimentary support, but I always wanted previews and metadata in the Archiware archive index just in case CatDV went away and we still needed to search our archive.

While setup of this integration is not difficult, it should not be taken on unless you are sure you know what you are doing. If this is your first time, I would recommend setting up in a test environment first so as not to impact production databases. This is by no means the authoritative method of accomplishing this integration and I welcome feedback and advances to what has already been done.

First things first, let’s look at the methods being used for the two operations at play here, archive and restore. You can grab all of these scripts from the following URL:

http://provideotech.org/files/catdv-presstore-current.zip

 

The archive process has more steps than the restore process as we are copying our previews as well as extracting metadata from CatDV.  The script package is made up of 3 primary scripts plus one small helper.

 

1) aw-queue.pl
This script was written to let any text file with one full media path per line be used to trigger and queue an archive or restore job.  We will schedule this script to run using launchd.  I would recommend running this once a day after hours so all media chosen for archival can be done in one batch, but more frequent periods are not problematic.  This script can be run against any text file, so CatDV is not necessary for it to function.

2) catdv-preview.pl
This script is run by Archiware while an archive job is going.  It must be added to the “preview generators” tab in the plan setup.  It locates and copies the CatDV preview movie into the Archiware index.  If this movie is unavailable, it can generate its own preview movie.

3) catdv-xml.pl
This script is run by Archiware while an archive job is going.  It must be added to the “metadata import” tab in the plan setup.  It parses the xml file written out by CatDV Worker and attaches that metadata to the asset in the Archiware index.

4) catdv2aw.pl
This is a very basic script that just appends the filename to the appropriate text file.  Arguably this could be done without the script at all, using longer commands in Worker Node.
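
As a rough illustration of how little the append step involves, here is a plain shell sketch.  The function name queue_append is hypothetical and is not part of the script package; the argument order (queue file first, media path second) is assumed from the way catdv2aw.pl is called later in this post.

```shell
#!/bin/sh
# Hypothetical helper, not part of the downloadable scripts.
# Appends one media path as its own line to a queue file.
queue_append() {
    queue_file=$1
    media_path=$2
    # printf is used instead of echo so backslashes and leading dashes
    # in a path are written out unchanged
    printf '%s\n' "$media_path" >> "$queue_file"
}
```

Calling queue_append /usr/local/catdv-presstore/archive-queue.txt "/Volumes/Media/My Clip.mov" would add that one path to the archive queue.  Quoting the path keeps a filename with spaces together as a single argument.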

The restore process is simpler as it only calls the aw-queue.pl file with a restore method.  As we are simply grabbing our hi-res file from the archive to return to our primary online storage, it does not need the preview or xml scripts.

Putting the pieces in place.

Some of these scripts use Perl’s XML::Simple and File::Basename modules.  If you do not have these modules installed, use cpan to install them.  (File::Basename normally ships with Perl itself.)

shell> perl -MCPAN -e shell
cpan> install XML::Simple
cpan> install File::Basename
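
Before or after running cpan, a quick check shows whether Perl can actually load each module.  This is only a convenience sketch; have_perl_module is a made-up helper name, and it relies on nothing more than perl's standard -M and -e switches.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named Perl module can be loaded.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

for module in XML::Simple File::Basename; do
    if have_perl_module "$module"; then
        echo "$module: installed"
    else
        echo "$module: missing"
    fi
done
```

Run it as the same user that will run the scripts, since that user may see a different Perl library path.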

Tying this all together isn’t all that difficult, but since there are several moving pieces it is worth pointing out where everything should go.  You don’t have to follow these paths, but I would recommend it.  First, make the following directories.  We are going to assume you already have PresSTORE installed in its default location (/usr/local/aw/).

shell> mkdir /usr/local/aw-scripts/
shell> mkdir -p /usr/local/catdv-presstore/tmp

Make sure the tmp directory is world writable, as it is where we will store temporary xml files.

shell> chmod 777 /usr/local/catdv-presstore/tmp

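Now we need to make our text queues and make sure they are world writable as well.

shell> touch /usr/local/catdv-presstore/archive-queue.txt /usr/local/catdv-presstore/restore-queue.txt
shell> chmod 666 /usr/local/catdv-presstore/archive-queue.txt /usr/local/catdv-presstore/restore-queue.txt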
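
If you expect to repeat this on a test machine and then again in production, the directory and queue setup can be collected into one repeatable step.  The sketch below assumes the same layout used in this post; setup_layout is a hypothetical name, and the root directory is passed in so you can try it against a scratch location first.

```shell
#!/bin/sh
# Sketch: build the directory and queue layout under a chosen root.
# setup_layout is a hypothetical name; pass /usr/local to match this post.
setup_layout() {
    root=$1
    mkdir -p "$root/aw-scripts" "$root/catdv-presstore/tmp" || return 1
    # tmp holds the XML files exported by Worker Node
    chmod 777 "$root/catdv-presstore/tmp" || return 1
    for queue in archive-queue.txt restore-queue.txt; do
        touch "$root/catdv-presstore/$queue" || return 1
        # queues must be writable by whichever user Worker Node runs as
        chmod 666 "$root/catdv-presstore/$queue" || return 1
    done
}
```

Running setup_layout /usr/local (with sudo if needed) reproduces the paths used throughout the rest of this post.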

Now let’s move our scripts into place and make sure they are executable.

shell> cd /path/to/downloaded/scripts
shell> mv ./aw-queue.pl /usr/local/aw-scripts/
shell> mv ./catdv-xml.pl ./catdv-preview.pl ./catdv2aw.pl /usr/local/catdv-presstore/
shell> chmod +x /usr/local/catdv-presstore/catdv*
shell> chmod +x /usr/local/aw-scripts/aw-queue.pl

Script configuration

The CatDV scripts need to be set up to match appropriate metadata from CatDV to its target in the PresSTORE archive index.  This has to be set up in the catdv-xml.pl file.  If you set up your paths the same way as recommended, you will only have to build your key:value pairs in the last two sections of the script.  I have put these sections below as a reference.  Simply modify/add/subtract key/value lines until all your metadata is mapped.

# this next section will have to be set up to match your PresSTORE/CatDV settings.
# first, place each of your PresSTORE metadata fields as array members below
# I have two metadata fields in my PresSTORE index, League and Event.  I start
# with md1key, incrementing for each metadata field I want to add.  Make sure you
# use the internal name for the field.

$md1key = "league"; #Maps to USER13 in CatDV
$md2key = "event";  #Maps to USER15 in CatDV

# in this section, I pull the USER field from CatDV that I want to map to each metadata key.
# for this example, the name of Metadata field USER13 in CatDV is "League", USER15 is "Event".
# repeat until all your md keys line up with values read from the xml file

$md1value = $data->{CLIP}->{USER13}; #extracts value of "league"
$md2value = $data->{CLIP}->{USER15}; #extracts value of "event"
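
When building this map it helps to see which USERxx elements an exported XML file actually contains.  The sketch below is one rough way to list them from the shell.  The sample XML is made up for illustration, based only on the CLIP and USERxx names used in the lookups above; a real CatDV export will contain many more elements and may be laid out differently.  The helper name list_user_fields is hypothetical, and it only handles elements that open and close on a single line.

```shell
#!/bin/sh
# Hypothetical helper: print USERnn=value for each single-line
# <USERnn>value</USERnn> element in a CatDV XML export.
list_user_fields() {
    sed -n 's/.*<\(USER[0-9][0-9]*\)>\([^<]*\)<\/\1>.*/\1=\2/p' "$1"
}

# Made-up sample for illustration only; a real export has many more elements.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<CATDV>
  <CLIP>
    <NAME>game01.mov</NAME>
    <USER13>NFL</USER13>
    <USER15>Playoffs</USER15>
  </CLIP>
</CATDV>
EOF

list_user_fields "$sample"
rm -f "$sample"
```

For the sample above this prints USER13=NFL and USER15=Playoffs, one per line.  Pointing list_user_fields at a file in /usr/local/catdv-presstore/tmp gives you the field numbers to use for each md value.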

Next, we need to modify the catdv-preview.pl file to meet the needs of our environment.  Specifically, we need to tell the script where to look for path based previews, what our preview extension is, and where we are going to temporarily copy those previews to for PresSTORE to move them into its index (I would recommend putting this path on the same filesystem as your PresSTORE index).

# where are our previews?
$previewdir = "/Users/Shared/CatDV Docs/Proxies";

# what kind of previews are we making?  (mov, mp4, m4v, etc)
$previewextension = "mp4";

# where do we want to temporarily copy the previews to before adding to AW index?
$awproxypath = "/tmp";
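
To sanity check these settings, it can help to work out by hand where a preview should be for a given media file.  The sketch below assumes that path based previews mirror the original media path beneath $previewdir, with only the extension changed.  That layout is an assumption; confirm it against your own CatDV preferences and against catdv-preview.pl itself.  The function name preview_path is hypothetical.

```shell
#!/bin/sh
# Sketch under an assumed layout: preview = previewdir + media path,
# with the media extension swapped for the preview extension.
preview_path() {
    media=$1
    previewdir=$2
    previewextension=$3
    # ${media%.*} strips the shortest trailing ".something" from the path
    printf '%s%s.%s\n' "$previewdir" "${media%.*}" "$previewextension"
}
```

With the values above, preview_path "/Volumes/Media/game01.mov" "/Users/Shared/CatDV Docs/Proxies" mp4 gives /Users/Shared/CatDV Docs/Proxies/Volumes/Media/game01.mp4.  If no file exists at the computed path, that points to either a different preview layout or a missing proxy.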

You may notice that there is logic built into the script to build a preview from scratch if the CatDV preview is not found.  I leave this choice up to you as to how you want to generate proxies.  I’ve written a basic primer on Xsanity on using qt_tools for preview generation in PresSTORE (http://www.xsanity.com/article.php/20110209085948148), but there are many other ways of performing this task.  If your facility has a dedicated encoding solution such as Episode Engine, FlipFactory, Rhozet, or similar I would recommend using it here.

The last script to set up is our aw-queue.pl script.  Its settings are pretty self-explanatory and are in the first section of the script file.  You are not going to want to change the first two variables as they grab the first and second directives from the CLI when the script is run.

##################
# User variables #
##################

#this is the path for the original media queue to be archived/restored
$full_queue_path = $ARGV[0];
#this is the method being used (archive or restore)
$method = $ARGV[1];

#this is the path to nsdchat.  You need to change this if you do not have a standard
#PresSTORE install
$nsdchat = "/usr/local/aw/bin/nsdchat -c";

#if the filesystem for the archived content lives on a different host than localhost,
#change that here
$client = "localhost";

#index ID - if using default index, change this to Default-Archive
$index = "CatDV";

#log file location.  Change this if you want this going someplace different.  Must
#change on non-MacOS X systems
$logfile = "/Library/Logs/aw-queue.log";

Once you have made the necessary changes, let’s move on to the next section.

Setting up the PresSTORE environment

Now that our scripting environment is set up, we can set up our Archive index in PresSTORE.  Log in to your PresSTORE server as an administrator, and click on Archive at the top.  Now flip down the triangle to the lower left under “Advanced Options” and select “Manage Indexes”.

First we need to create a new index that will hold our CatDV specific assets.  One of the great powers of PresSTORE is its ability to have multiple indexes, so each workgroup in a facility can have multiple systems creating their own archives with their own discrete metadata schemes and setups.

At the bottom of the window, click “New”.

At this point, we are going to create a new archive index and enable it.  Create a name for your new index (in this example we use “CatDV”) as well as a description for the index.  This is what users will see if browsing the index through the PresSTORE interface.  For this example I have called it “CatDV archive index”.  The last field is where to store this index information.  For the sake of this example, we will leave this as its default.  If you want to learn more about relocating indexes, read the PresSTORE manual or contact me.

The next step is to add metadata fields that line up to your CatDV metadata fields.  These are the keys you added as $md1key, $md2key, and so on in the catdv-xml.pl script.

Right click on your index you just created and select “Fields…”

Here we have our metadata field setup for this particular index in PresSTORE.  There are 6 fields here to fill out.

Internal Name: This is the field you will use to input your $md1key, $md2key, etc. values.  This is used for internal communication inside of the PresSTORE archive index and for API interaction.  It cannot have spaces or special characters and is meant for non-human interaction.

Type: This defines whether the metadata field is a Text field or a number.  For the sake of this integration, always select Text.

Application name: This field is what a user sees as the metadata field in the PresSTORE index when browsing through the PresSTORE interface.  I would recommend making this something human readable.

Comment: This is simply a comment field to further describe the field.

# rows:  This tells PresSTORE how many rows to display in the web UI for input and reading of metadata.  Adjust appropriately for the type of metadata being put into each field.  If it is a description field, for example, I would select 3-4 rows.  For one word or simple metadata (filetype or time of day for example), 1 row would be sufficient.

Order: This is the order that the fields are displayed to the user.  Your choice.

Close this window once you have your metadata fields set up.  We now need to allow user access to our new index so it can be selected when we set up our Archive plan.  To do this, under “Advanced Options” on the right hand, select “Access to Indexes”.

Double click your newly created index.  It will open a new login area configuration.  Select your Archive Index that you just created and add appropriate groups.  If you want to do more limited permission sets for the index, please consult the PresSTORE manual (section 7.1.4 in the P4 user guide).

Hit apply once you have allowed a user access to the new index and let’s move on to the next step.  The final PresSTORE configuration is creating our Archive plan that we want to use to work with CatDV.  Do this by selecting “Archive” from the top menu and selecting “Archive Plan” from the left pane.  Click “New” to begin setup on your new Archive plan.

First thing you will have to do is give this plan a name.  I have called mine simply “CatDV” for this example.  Make sure after selecting a name, you click “Apply”.  Set the archive plan to “Enabled” and Auto start to “Disabled”.   Select your disk or tape pool (for more info on setting up pools, look at the Archiware user guide in section 7.4) and your index you just set up.

Hit “Apply”.  Now pull down the “General Setup” menu and select “Archive Options”.  Here we set Archiware to Delete files only.  This will delete the archived files after they have been written to archive and verified.  We can also set up a script here to clean up our temp files (.xml) if we like.  This is optional but recommended.  The script I wrote simply removes all files from /usr/local/catdv-presstore/tmp after a successful archive run.
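
A cleanup step along those lines could look like the sketch below.  It is not the script from the download package; clean_tmp is a hypothetical name, and the directory is passed as an argument so the function can be tried on a scratch folder before it is pointed at /usr/local/catdv-presstore/tmp.

```shell
#!/bin/sh
# Sketch of a post-archive cleanup step; clean_tmp is a hypothetical name.
clean_tmp() {
    tmpdir=$1
    # Refuse to run against an empty or missing path
    [ -n "$tmpdir" ] && [ -d "$tmpdir" ] || return 1
    # Remove regular files beneath tmpdir, leaving directories in place
    find "$tmpdir" -type f -exec rm -f {} +
}
```

The guard on the first line of the function is there so an unset variable can never turn into a delete against the wrong location.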

The next two panes are where we point to our metadata and preview generation scripts (catdv-xml.pl and catdv-preview.pl).  Go into both the “Preview Generation” and “Metadata Import” tabs and point them to your scripts.

Now our PresSTORE environment is set up, time to move on to CatDV.

Setting up CatDV

If you are reading this, I’m going to assume you have used CatDV Enterprise server and are comfortable with setting up user metadata fields, so I won’t cover that here.  The important thing to remember is that we will be referencing the USERxx reference for each field in our metadata map above.

Instead, I will walk through setting up necessary fields in CatDV for PresSTORE and configuration of Worker Node.  First things first, open up CatDV, connect to your server, and open the preferences.

We are going to create one new User Defined Field.  Select the “Field Definitions” menu and click “User Defined Fields”.  Add a new field called “PresSTORE” with a type of “Grouping”.  It should not be Locked or Mandatory.

Hit close and create a new Pick List for whatever column you just set up PresSTORE in.  Make 3 values, one blank, one “Archive”, one “Restore”.  Make sure you select “Fixed value” and “Keep Sorted”.  Do not allow Extensible or Remember values.

Close this dialog and save your new fields to the server.  Restart CatDV and verify that the new fields are available in your Log Notes tab.  Default should be blank.

Setting up Worker Node

We are almost there!  Time to set up CatDV Worker Node to react to our new metadata field with two automations.  The first automation will trigger archival queue file writing and xml export.  The second will trigger restore queue file writing.  Go ahead and fire up Worker Node and click “Edit Config”.

First things first, let’s make sure we have enough User-defined fields visible in Worker Node to subscribe to.  We need to be able to see our PresSTORE User Defined Field we just created.  In this example, that was USER16, so I need to have this box set to at least 16 fields.

Next, let’s make sure we have our server information correct to connect to our CatDV enterprise server.  Remember, the Worker Node here does not have to be on the CatDV server itself, just on a machine that has physical access to media at the paths being provided by CatDV.

Now to set up our “archive” watch action.  We need to set up a Server Query action that monitors our “PresSTORE” User Defined Field.  We want to monitor for when the selection is set to “Archive” in the Server Query tab.  For all of these examples, my PresSTORE field is USER16.  Yours will vary depending on metadata set.

In the Parameters tab, make sure you Enable query, set the check interval to something fairly quick (10-30 seconds) and make sure Allow polling is disabled.  Also make sure Process online files only is checked.  We don’t want to pass information accidentally about a file that isn’t online.

In the Conversions tab, we actually aren’t doing any conversions, but we are going to execute one of our scripts from this location.  The script we call is catdv2aw.pl which takes two arguments.  The first argument is the location of our archive queue file (this would be at /usr/local/catdv-presstore/archive-queue.txt if you aren’t modifying locations).  The second argument is $i, which returns the full path of the file to be archived.  I put both of these arguments in quotes to eliminate escaping problems for files with spaces or certain unicode characters in their filenames.

 

The second part of this puzzle is exporting an XML file with the asset information for metadata parsing.  I place this in the /usr/local/catdv-presstore/tmp/ directory and pass the $f variable as the filename and append .xml.  The $f variable is just the basename of our file to be archived.  We use this for lookup in the catdv-xml.pl script.

/usr/local/catdv-presstore/catdv2aw.pl /usr/local/catdv-presstore/archive-queue.txt "$i"

Last but not least, we go into the Post-Processing tab to reset some metadata fields in the database.  First, we set Status to “Archived” so we have a searchable field for all archived content in CatDV.  If you are already using Status for other workflows in your environment, you could create another user defined field to accomplish the same task.  We also set our PresSTORE field back to blank so the field can be used for restore later.  Hit OK and your archive action is complete.

The process for setting up restore is very similar and actually a bit less involved.  We now want to monitor our PresSTORE field for “Restore” and Enable query, but this time leave the Process online files only checkbox unticked.  None of the content to be restored will actually be online, so with that box ticked every task would fail.  For conversions, we are using our catdv2aw.pl script again, but this time we change from our archive queue to our restore queue.  We do not need to export xml here.

/usr/local/catdv-presstore/catdv2aw.pl /usr/local/catdv-presstore/restore-queue.txt "$i"

In Post-Processing, we again want to set our PresSTORE field to blank, but change our status to “In Archive (Online)”.

We are now done setting up Worker Node.  If we want to, we can verify that things are working before putting the final piece in place.  Start Worker Node and select a file in your CatDV database.  Choose the PresSTORE field and set it to Archive or Restore and update the server.

Worker Node should pick up the changes and write them into your /usr/local/catdv-presstore/archive-queue.txt and /usr/local/catdv-presstore/restore-queue.txt files.  If these files aren’t getting updated, double check that they are writable and that Worker Node is not throwing any errors in its log.
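
A small diagnostic like the one below can speed up that check.  It is only a sketch, and check_queue is a hypothetical name.  Note that the writability test applies to the user running it, so run it as the same user Worker Node runs as.

```shell
#!/bin/sh
# Hypothetical diagnostic: report whether a queue file exists and is writable
# by the current user, then show its most recent entries.
check_queue() {
    queue_file=$1
    if [ ! -f "$queue_file" ]; then
        echo "missing: $queue_file"
        return 1
    fi
    if [ ! -w "$queue_file" ]; then
        echo "not writable by $(id -un): $queue_file"
        return 1
    fi
    echo "ok: $queue_file ($(wc -l < "$queue_file" | tr -d ' ') entries)"
    tail -n 5 "$queue_file"
}
```

Call it once per queue, for example check_queue /usr/local/catdv-presstore/archive-queue.txt, right after changing the PresSTORE field on a test clip.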

You can also now run the catdv-xml.pl and catdv-preview.pl scripts by hand against any line in your archive-queue.txt file.  They should return key/value pairs and the path to the copied preview respectively.  Don’t forget to clean up that preview file by hand if you do this a lot; PresSTORE does this for you, but the script does not.

Configuring launchd to start archive/restore jobs (Mac OS X).

The final configuration step in this process is to set up schedules to send our archive-queue.txt and restore-queue.txt files to the aw-queue.pl script.  The aw-queue.pl script takes multiple arguments and is used for both archival and restore.

shell> /usr/local/aw-scripts/aw-queue.pl
usage: aw-queue.pl queue_file archive|restore archive_plan

queue_file: full path to queue file
archive|restore: select whether to archive or restore from archive the source_file
archive_plan: which archive plan you want PresSTORE to use
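
To make the batch idea concrete, here is a rough shell sketch of one pass over a queue file.  It is not a reproduction of aw-queue.pl: drain_queue is a hypothetical name, and the echo line merely stands in for whatever actually submits each path to PresSTORE.  The one design choice worth noting is that the queue is moved aside before it is read, so paths appended while the batch is being processed are kept for the next run.

```shell
#!/bin/sh
# Sketch of a batch pass over a queue file. drain_queue is a hypothetical name,
# and the echo stands in for the real submission to PresSTORE.
drain_queue() {
    queue_file=$1
    method=$2
    batch="$queue_file.batch.$$"
    # Nothing to do if the queue is empty or missing
    [ -s "$queue_file" ] || return 0
    # Move the queue aside so later appends land in a fresh file
    mv "$queue_file" "$batch" || return 1
    # Recreate the queue without truncating anything appended in the meantime
    touch "$queue_file"
    chmod 666 "$queue_file"
    while IFS= read -r media_path; do
        [ -n "$media_path" ] || continue   # skip blank lines
        echo "$method: $media_path"
    done < "$batch"
    rm -f "$batch"
}
```

Reading with IFS= and read -r keeps leading spaces and backslashes in a path intact, which matters for media filenames that were never meant to be typed at a shell.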

We are going to formulate launchd scripts and enable them to run this script against all the information we just collected on a periodic basis.  We will have one script that runs to trigger archives and another that triggers restores.  My recommendation would be to set the periods for both low (1-5 minutes) during testing, but in production run archive only once or twice per day.  Restore should remain relatively quick.

You will place these launchd scripts in /Library/LaunchDaemons so they will be run at the system level by an administrative user.  If you don’t want to write these scripts by hand, I’d recommend picking up a copy of Lingon from the App store.  It is a GUI front end for loading/configuring launchd on Mac OS X.

My archive script looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
	<dict>
	<key>Disabled</key>
	<false/>
	<key>Label</key>
		<string>org.provideotech.aw-queue-archive</string>
	<key>ProgramArguments</key>
		<array>
			<string>/usr/local/aw-scripts/aw-queue.pl</string>
			<string>/usr/local/catdv-presstore/archive-queue.txt</string>
			<string>archive</string>
			<string>10002</string>
		</array>
	<key>StartCalendarInterval</key>
	<dict>
		<key>Hour</key>
		<integer>18</integer>
		<key>Minute</key>
		<integer>0</integer>
	</dict>
</dict>
</plist>

My first key simply says that the script is active.

Next key is the name of the script.  This should look like net/org/com.<yourcompany>.aw-queue-archive

The program arguments are individual command line parameters in order.  If we were to just run this script by hand based on the above info, it would look like this:

shell> /usr/local/aw-scripts/aw-queue.pl /usr/local/catdv-presstore/archive-queue.txt archive 10002

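The last directive is StartCalendarInterval.  As I have this configured, my script runs at hour 18 (6:00 PM).  launchd treats any key left out of StartCalendarInterval as a wildcard, so set Minute as well as Hour if the job should fire only once in that hour.  The interval can be as fine as one minute or as coarse as a particular month; for a fixed period measured in seconds, use StartInterval instead.  For more info I would recommend reading up on launchd and its plist format.

shell> man launchd.plist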
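
Only the archive job is shown above, so here is a sketch of what a matching restore job might look like, wrapped in a small shell function that writes it to a path of your choosing.  Several things here are assumptions rather than settings from this post: the Label, the use of StartInterval with a 300 second period, and the reuse of 10002 from the archive example as the plan argument.  Replace them with values that fit your setup.  write_restore_plist is a hypothetical name.

```shell
#!/bin/sh
# Sketch: write a restore job definition to the given path.
# The Label and the 300-second StartInterval are assumptions; 10002 is reused
# from the archive example and should be replaced with your own plan.
write_restore_plist() {
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>org.provideotech.aw-queue-restore</string>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/local/aw-scripts/aw-queue.pl</string>
		<string>/usr/local/catdv-presstore/restore-queue.txt</string>
		<string>restore</string>
		<string>10002</string>
	</array>
	<key>StartInterval</key>
	<integer>300</integer>
</dict>
</plist>
EOF
}
```

One way to use it is to write the file to a scratch location first, check it with plutil -lint on the Mac, and then copy it into /Library/LaunchDaemons with sudo.  StartInterval is used here rather than StartCalendarInterval because restores are meant to be picked up every few minutes rather than at a set time of day.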

One final step and we are all done.  We just need to load up our launchd plist files.  There are two ways of doing this.  The first is a simple restart and if everything is correct the plist will load on boot.  The second is using launchctl.

shell> sudo launchctl load -w /Library/LaunchDaemons/org.provideotech*

This should load your launchd jobs and installation is complete.