Automatically backing up P5 indexes

Much of the truly important index data in Archiware P5 is stored on tape with the data itself.  That said, ancillary information like system licensing, custom metadata and proxies, and configuration settings is often not saved.  While all of this is recoverable in the case of a true disaster, sometimes it is easier to back up the backup software itself to a different location for fast stand-up in the case of a true catastrophe.

Using Python 2 (written against 2.6, though subprocess.check_output as used below needs 2.7), here is a basic script that automates this process for Mac and Linux P5 servers.

Let’s break it down:

This first section simply imports the libraries we are going to use in Python and allows us to set a few simple variables, namely where P5 is installed and where we want our backups to go.

#!/usr/bin/env /usr/bin/python
 
import sys
import os
import commands
import subprocess
import shutil
import time
 
#
# where does your archiware install live (no trailing slash)
aw_path = "/usr/local/aw"
# where do you want to backup index and log files to?
backup_path = "/Users/szumlins/Desktop/backup"

Next, we want to see if nsd (the P5 server process) is running. We do this using the *nix pgrep command and looking for output. If no output is returned, we know it isn’t running, and we set the variable “nr” to 1 to record that.

# check to see if nsd is already running; if it isn't, let's throw a flag
nr = 0  # our flag: 1 means P5 is not running
if(commands.getoutput('pgrep nsd') == ""):
	print "P5 is not running"
	nr = 1
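
The commands module used above exists only in Python 2. As a rough sketch of the same check for Python 3 (not part of the original script), we can lean on pgrep's exit status instead of its output. The function name is_running and the run parameter are made up for illustration; run is only there so the check can be exercised without a real pgrep.

```python
import subprocess


def is_running(process_name, run=subprocess.call):
    """Return True if pgrep finds at least one matching process.

    `run` is a hypothetical hook (defaults to subprocess.call) so the
    check can be tested without actually launching pgrep.
    """
    # pgrep exits with status 0 when something matched, non-zero otherwise;
    # its output (the PIDs) is discarded because only the status matters here
    return run(["pgrep", process_name], stdout=subprocess.DEVNULL) == 0
```

In the script this would replace the flag logic with something like `nr = 0 if is_running("nsd") else 1`.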

Now that we know if P5 is running, we want to make sure that no jobs are active. It would be a bad idea to shut down the service in the middle of a large backup or archive!

The first line checks if we should even do this step. If P5 isn’t running, then no jobs are running.
After this, we use the subprocess module to run a CLI call to nsdchat, P5’s CLI API, asking for the names of all the jobs.
The following lines just clean up whitespace and put all the jobs into a list so we can iterate over them in our next block.

# if P5 is running, let's check and make sure no jobs are running
jobs = []
if(nr != 1):
	# get all the jobs (split() also discards surrounding whitespace)
	jobs_str = subprocess.check_output([aw_path + "/bin/nsdchat","-c","Job","names"])
	jobs = jobs_str.split()

Now that we know all the jobs, we only care about jobs that are actually running. Scheduled jobs in the future are registered as jobs even if they aren’t running.

First, we create a counter (i) so we can get some metrics when we are done with our loop.
Next we enter a loop over all of the job numbers in the list we created above.
Using the Job class in the P5 API, we run the status method against our job number. If that job returns “running”, we increment our counter (i) by one.

# figure out which are running and which are stopped. We only care about running jobs
i = 0
for job in jobs:
	status = subprocess.check_output([aw_path + "/bin/nsdchat","-c","Job",job,"status"]).rstrip()
	if status == "running":
		i = i +1
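
The two steps above (list the job names, then ask each one for its status) can also be folded into one small function. This is only a sketch for Python 3, not the original script: the names nsdchat and count_running_jobs and the chat parameter are invented here, and the install path is simply the aw_path value used earlier.

```python
import subprocess

# aw_path + "/bin/nsdchat" from the script above
NSDCHAT = "/usr/local/aw/bin/nsdchat"


def nsdchat(*args):
    """Run one nsdchat command and return its reply as stripped text."""
    out = subprocess.check_output([NSDCHAT, "-c"] + list(args))
    return out.decode().strip()


def count_running_jobs(chat=nsdchat):
    """Count the jobs whose status is exactly "running".

    `chat` is a hypothetical hook: any callable that takes nsdchat
    arguments and returns the reply as a string.
    """
    job_ids = chat("Job", "names").split()
    return sum(1 for job in job_ids if chat("Job", job, "status") == "running")
```

Passing the command runner in as an argument keeps the counting logic testable on a machine that has no P5 install at all.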

Now that we’ve looked at all the jobs, (i) should be zero if it is safe to shut down and greater than zero if there are running jobs. If (i) is greater than zero, we send some info to the console and exit without running the shutdown or backup process.

# if there is a running job, don't stop the service
if i > 0:
	print "There are " + str(i) + " job(s) running, can't shut down service!"
	exit()

If P5 is running AND there are no jobs running, we can safely stop the service. We run the stop-server command and print what it is doing to the console.

# if P5 is running and we can shut it down, let's do it
if nr != 1:
	print "Shutting down P5"
	output = subprocess.check_output([aw_path + "/stop-server"])
	print output

If we have gotten this far, we now know that our P5 service isn’t running and it is safe to make copies of the online indexes.

First things first, we timestamp the start process so we know how long all of this is going to take.

# Let's make a backup of the files now, first we make a stamp so we know how long it took
time_start = time.time()

Safety says always make a backup of your backup, so first we check to see if there is a safe previous one. If that backup directory is found, move it to a backup backup. Backup backup backup backup backup.

# Next we check to see if there is already a valid previous backup. If there is, move it to keep it safe
print "Checking if there is existing backup"
if(os.path.isdir(backup_path + "/aw")):
	print "Existing backup found, moving old backup"
	# only one old generation is kept, so clear out any previous aw-old first
	if(os.path.isdir(backup_path + "/aw-old")):
		shutil.rmtree(backup_path + "/aw-old")
	shutil.move(backup_path + "/aw",backup_path + "/aw-old")
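
For anyone who wants to try the rotation on a scratch directory before pointing it at real backups, here is the same idea as a standalone function. It is a sketch that runs on Python 3; the name rotate_backup and its return value are made up for illustration.

```python
import os
import shutil


def rotate_backup(backup_path):
    """Move backup_path/aw to backup_path/aw-old, replacing any older copy.

    Returns True if an existing backup was rotated, False if there was none.
    """
    current = os.path.join(backup_path, "aw")
    previous = os.path.join(backup_path, "aw-old")
    if not os.path.isdir(current):
        return False
    if os.path.isdir(previous):
        # only one old generation is kept
        shutil.rmtree(previous)
    shutil.move(current, previous)
    return True
```

The important detail is that the move happens whenever a current backup exists, whether or not an older one had to be cleared away first.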

Now that we are certain that we have someplace safe to make a copy to, let’s get to copying. We are copying three directories that are useful in case of a full recovery: indexes, customerconfig (server configuration), and logs.

# After we make sure we can make a new copy, we copy the proper files
print "Making backup copy"
shutil.copytree(aw_path + "/config/index",backup_path + "/aw/config/index",symlinks=True)
shutil.copytree(aw_path + "/config/customerconfig",backup_path + "/aw/config/customerconfig",symlinks=True)
shutil.copytree(aw_path + "/log",backup_path + "/aw/log",symlinks=True)
time_stop = time.time()
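
If your indexes or other data live in more than the three default directories, one way to keep the copy step manageable is to drive it from a list. The sketch below runs on Python 3 and is an assumption-laden illustration rather than the original method: copy_sources is an invented name, and the /Volumes/Indexes path is a placeholder for wherever your extra data actually lives.

```python
import os
import shutil


def copy_sources(sources, backup_root, symlinks=True):
    """Copy each (source_dir, relative_destination) pair under backup_root.

    With symlinks=True, links found inside a source tree are recreated as
    links; with symlinks=False, the files they point to are copied instead.
    """
    for src, rel_dst in sources:
        shutil.copytree(src, os.path.join(backup_root, rel_dst), symlinks=symlinks)


aw_path = "/usr/local/aw"
sources = [
    (aw_path + "/config/index", "aw/config/index"),
    (aw_path + "/config/customerconfig", "aw/config/customerconfig"),
    (aw_path + "/log", "aw/log"),
    # hypothetical extra location, e.g. an index kept on another volume
    ("/Volumes/Indexes/p5-index", "aw/external-index"),
]
```

Calling `copy_sources(sources, backup_path)` would then stand in for the three copytree lines.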

Last but not least, we calculate how long it took and start P5 back up so it can continue on its merry way.

# Then we let you know how long it took
print "Backup took " + str(time_stop - time_start) + " seconds"
 
# Last, we start up the server
output = subprocess.check_output([aw_path + "/start-server"])
print output
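
One thing the script as written doesn't guard against is a copy failing partway through (a full disk, say), which would end the script before the start-server call and leave P5 stopped. A possible way around that, sketched here with made-up names, is to wrap the copy in try/finally. The three arguments are hypothetical callables standing in for the stop-server call, the copy steps, and the start-server call.

```python
def backup_with_restart(stop_server, make_backup, start_server):
    """Stop P5, run the backup, and always try to start P5 again."""
    stop_server()
    try:
        make_backup()
    finally:
        # runs whether make_backup() finished or raised an exception
        start_server()
```

Any exception from the copy still propagates after the restart, so cron or launchd will see the failure.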

This script does need to be run as the root user, as it starts and stops the P5 service. The folders it creates default to root ownership, so you may want to noodle with the script to change permissions on your backups to make them more accessible.

Best bet would be to schedule this script to run in off hours with cron or launchd as the root user.

Update 11/6/2017

I’ve updated the script to support Python 2.6 and 2.7 as well as to provide logging. A lot of the same methods are still in place, but the updated script is now available at https://github.com/szumlins/p5_self_backup

4 thoughts on “Automatically backing up P5 indexes”

  1. Great work Mike.

    How would you modify this to add multiple paths, i.e. the index is on another volume than the install and you want to back up both somewhere else?

  2. The shutil commands copy the existing indexes, so if you have indexes that don’t live in the default locations, you could add lines that accomplish what you want there. That said, if the index directory inside your default Archiware install directory is itself a symlink to other storage, copytree copies what it points to, so it actually will get copied. Symlinks deeper inside the tree are a different story: with symlinks=True they are recreated as links rather than followed.

  3. Guys,

    I found it helpful to change the exit() to a sys.exit(1) so that we can catch the exit status for in line scripts. And I also added a sys.exit(0) to the bottom of the code for a clean/successful exit.

    So change #1 –

            # if there is a running job, don't stop the service
            if i > 0:
                    print "There are " + str(i) + " job(s) running, can't shut down service!"
                    #exit()
                    sys.exit(1)

    And change #2 –

    # Last, we start up the server
    output = subprocess.check_output([aw_path + "/start-server"])
    print output
    sys.exit(0)

  4. Hi there,

    So, thank you for the detailed Python script to back up P5 data. However, the machine I’m running P5 on doesn’t have (a proper version of?) Python installed, so trying to run this just gave me an error in my shell.

    I took the liberty of rewriting this script in bash (the logic should carry over to other shell derivatives, but the test expressions are bash-specific double brackets).

    I tried to keep everything as close to your original python code as possible, and kept the comments the same for clarity.
    This should run on every linux-like machine that P5 can be installed on.

    #!/usr/bin/env bash

    # where does your archiware install live
    aw_path="/usr/local/aw"
    # where do you want to backup index and log files to?
    backup_path="/home/admin/awbackup"

    # check to see if nsd is already running; if it isn't, let's throw a flag
    nr=0
    # our flag
    pgnsd=$(pgrep nsd)
    if [[ -z $pgnsd ]]; then
        echo "P5 is not running"
        nr=1
    fi

    # if P5 is running, let's check and make sure no jobs are running
    jobs_str=""
    if [[ $nr != 1 ]]; then
        # get all the jobs and list them on separate lines
        jobs_str=$("$aw_path"/bin/nsdchat -c Job names | tr ' ' '\n')
    fi

    i=0

    # figure out which are running and which are stopped. We only care about running jobs
    # (the loop reads from a process substitution, not a pipe, so $i survives the loop)
    while IFS='' read -r line || [[ -n "$line" ]]; do
        job="$line"
        # skip empty lines, e.g. when P5 isn't running and there are no jobs
        [[ -z $job ]] && continue
        status=$("$aw_path"/bin/nsdchat -c Job "$job" status | grep "running")
        if [[ $status != "" ]]; then
            i=$(echo "$i+1" | bc)
        fi
    done < <(printf '%s\n' "$jobs_str")

    # if there is a running job, don't stop the service
    if [[ $i -ge 1 ]]; then
        echo "There are $i jobs running, can't shut down service."
        exit 1
    #if we can shut down P5, let's do it
    elif [[ $i -eq 0 ]]; then
        echo "Shutting down p5 service"
        "$aw_path"/stop-server
    fi

    # We check to see if there is already a valid previous backup. If there is, move it to keep it safe
    echo "Checking if there is an existing backup"
    if [[ -d "$backup_path"/aw/config ]]; then
        echo "Existing backup found, moving old backup"
        # only one old generation is kept
        rm -rf "$backup_path"/aw_old
        mv "$backup_path"/aw "$backup_path"/aw_old
    fi

    # Let's make a backup of the files now, first we make a stamp so we know how long it took
    time_start=$(date +%s)
    # Then we start the actual backup copy
    # [R]ecursive to copy the full tree, [f]orced override, and symbolic [L]inks are followed
    echo "Making backup copy"
    # same layout as the Python version: aw/config/index, aw/config/customerconfig, aw/log
    mkdir -p "$backup_path"/aw/config
    cp -RfL "$aw_path"/config/index "$backup_path"/aw/config/
    cp -RfL "$aw_path"/config/customerconfig "$backup_path"/aw/config/
    cp -RfL "$aw_path"/log "$backup_path"/aw/
    time_stop=$(date +%s)
    # Then we let you know how long it took
    backup_time=$(echo "$time_stop-$time_start" | bc)
    echo "Backup complete. Copy took" $backup_time "seconds"

    # Last, we start up the server
    "$aw_path"/start-server

    The test -z expression sets $nr to 1 if pgrep nsd has no output, i.e. if it is empty, similar to how Python checks if the output is exactly "".

    jobs_str is rolled up into a single command, but actually does two things. It asks for the names of all jobs, then breaks that output up into multiple lines.
    The output of nsdchat -c Job names is always a one-liner on stdout, but for the purposes of checking each job separately (in the next step) we want to be able to test them one by one.
    That’s why we break up the one-liner by translating ' ' (space) into '\n' (new line) and fill the variable with a bunch of numbers on separate lines which we can then print one at a time, and throw into a loop.
    (reading line by line)

    The way to run these is similar, too. For this script, simply add the code to a new file with the .sh extension and make it executable (chmod +x scriptname.sh).
    Then execute it in a Bash prompt, via ssh, or set it in crontab under the root or admin user.

    I hope this helps those who don’t have a working Python on their machine, for whom Python fails to run, or whoever is better at reading and manipulating Bash scripts. If there’s anything incorrect or anyone has improvements, I’d be very interested in hearing them.

    Kind regards,

    Jay
