Backing up Hudson conf: what to archive?

Emmanuele Sordini

Backing up Hudson conf: what to archive?

Dear all,
I plan to schedule regular backups of my Hudson work folder. Since it has grown quite large, I would like to back up only the vital information, i.e. what would be needed for a 100% recovery in case anything went wrong with the HD of the box it's currently running on.

I was even thinking of keeping track of the changes by archiving configuration files in a version management repository of some kind. Not quite for build work products, because I think this would soon cause storage issues on the repository.

So here's my question: what files have to be backed up/archived on a regular basis to meet the above requirement?

Thanks in advance
Emmanuele
David Weintraub

Re: Backing up Hudson conf: what to archive?

Depending on the plugins, you could simply back up the $HUDSON_HOME/jobs/*/config.xml files, which define all of your project configurations. This is probably the smallest set to back up. You won't have any history, but all of your build jobs will be defined.

Backing up everything else but that jobs directory will allow you to back up your entire Hudson configuration, including plugins. On our system, that takes up 170 MB, which is considered quite small in today's world.

In Kornshell, it would look something like this:

$ backup $HUDSON_HOME/jobs/*/config.xml
$ backup $HUDSON_HOME/!(jobs)

Of course, you can back up the entire $HUDSON_HOME directory, and that way you'll get everything you need. It can be copied onto another server to get Hudson up and running in a few minutes. All of your build history will be preserved that way.
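
The selection described above can be sketched in Python 3. This is only an illustration and not tooling from this thread: the function names `select_config_files` and `write_archive` are made up here, and the sketch assumes the `$HUDSON_HOME/jobs/<job>/config.xml` layout mentioned above.

```python
import tarfile
from pathlib import Path


def select_config_files(hudson_home):
    """Return the paths to back up, relative to hudson_home.

    - jobs/*/config.xml (job definitions, no build history)
    - every top-level entry except the jobs directory
    """
    home = Path(hudson_home)
    job_configs = sorted(home.glob("jobs/*/config.xml"))
    others = sorted(p for p in home.iterdir() if p.name != "jobs")
    return [p.relative_to(home) for p in job_configs + others]


def write_archive(hudson_home, archive_path):
    """Write the selected paths into a gzip-compressed tar archive."""
    home = Path(hudson_home)
    # Compute the selection before the archive file is created.
    selection = select_config_files(home)
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in selection:
            # Directories such as plugins/ are added recursively.
            tar.add(str(home / rel), arcname=rel.as_posix())
```

Calling `write_archive(os.environ["HUDSON_HOME"], "hudson-config.tar.gz")` would then produce one archive containing both sets; the archive name is arbitrary.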

On Wed, Nov 4, 2009 at 8:15 AM, Emmanuele Sordini <[hidden email]> wrote:

--
View this message in context: http://n4.nabble.com/Backing-up-Hudson-conf-what-to-archive-tp413462p413462.html
Sent from the Hudson users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]




--
David Weintraub
[hidden email]
Emmanuele Sordini

Re: Backing up Hudson conf: what to archive?

David Weintraub wrote:
Depending on the plugins, you could simply backup the
$HUDSON_HOME/jobs/*/config.xml files which define all of your project
configurations. This is probably the smallest set to backup. You won't have
any history, but all of your build jobs will be defined.

Backing up everything else but that jobs directory will allow you to backup
your entire Hudson configuration including plugins.
Dear David,
thanks for your reply. I would say I need to back up the jobs AND plugin configuration, that's it. The build history is useful but not vital if one is rather short on disk space (or does not want to clog the SVN repository).

David Weintraub wrote:
On our system, backing
that up takes up 170mb which is considered quite small in today's world.
You're definitely right. Plus, if an incremental backup facility is used, one could keep track of the whole change history with minimal storage overhead.

Regards
Emmanuele
David Weintraub

Re: Backing up Hudson conf: what to archive?

On Thu, Nov 5, 2009 at 5:38 AM, Emmanuele Sordini
<[hidden email]> wrote:
> Dear David,
> thanks for your reply. I would say I need to back up the jobs AND plugin
> configuration, that's it. The build history is useful but not vital if one
> is rather short on disk space (or does not want to clog the SVN repository).

You're backing up to Subversion? I use Subversion, but I wouldn't use
it for backups. This is especially true for non-text files. Subversion
doesn't (easily) allow you to remove obsolete versions, so every time
a JAR file changes, it takes up that much more room in the repository.
In your case, I'd just back up the configuration files, and not the
plugins. Otherwise, you'll quickly run out of disk space.

--
David Weintraub
[hidden email]


Emmanuele Sordini

Re: Backing up Hudson conf: what to archive?

> You're backing up to Subversion? I use Subversion, but I wouldn't use
> it for backups.

Me neither.

We have since managed to schedule regular backups on our server, and HUDSON_HOME is one of the directory trees being backed up regularly, with both incremental and full backups. Cumbersome as it may be in terms of disk space, I believe this is the best solution if one has enough storage to afford it.

> This is especially true for non-text files. Subversion
> doesn't (easily) allow you to remove obsolete versions, so every time
> a JAR file changes, it takes up that much more room in the repository.
> In your case, I'd just backup the configuration files, and not the
> plugins. Otherwise, you'll quickly run out of diskspace.

I know that archiving binary files can be a problem when they are large. But I think this issue applies to pretty much every configuration management system, not only SVN.

I was also thinking about using SVN for text (i.e. configuration) files. Regular backups ensure a good level of safety, but they do not let one keep track of changes. The only problem is that I would have to rig up something like a script to pick the configuration files from the Hudson job tree, copy them into a separate directory and commit them to SVN.
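
The copying step could look roughly like the following Python 3 sketch. The function name `mirror_job_configs` and its two directory arguments are hypothetical. It copies only the `*.xml` files sitting directly inside each job directory and leaves the SVN add/commit to be done separately.

```python
import shutil
from pathlib import Path


def mirror_job_configs(hudson_home, mirror_dir):
    """Copy jobs/<job>/*.xml from hudson_home into mirror_dir.

    Only files directly inside each job directory are copied, so build
    records and workspaces are left out. Returns the copied paths
    relative to mirror_dir.
    """
    home = Path(hudson_home)
    mirror = Path(mirror_dir)
    copied = []
    for src in sorted(home.glob("jobs/*/*.xml")):
        rel = src.relative_to(home)
        dst = mirror / rel
        # Recreate jobs/<job>/ inside the mirror if it is missing.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(str(src), str(dst))
        copied.append(rel)
    return copied
```

The mirror directory can then be a working copy of the repository, so that only small text files ever reach SVN.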

Regards
Emmanuele

Tobias Neckel

Re: Backing up Hudson conf: what to archive?

Hi,

> I was thinking about using SVN too for text (i.e. configuration) files. In
> fact regular backups ensure a good level of safety, but do not allow to keep
> track of changes. The only problem is that I should rig up something like a
> script to pick the configuration files from the Hudson job tree, copy them
> into a separate directory and archive them into SVN.

I recently finished such a (Python) script that does this for all basic
XML files of Hudson as well as for all jobs and users. It tracks changes
in the jobs (creation, deletion, and also renaming).

We use a Hudson cron job that calls the backup script once a
day (and is itself one of the backed-up jobs :-)).

This may not be the most elegant way to do it, but it works for us. I
include the script as well as a README.txt file; perhaps it helps.
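
The idea of tracking created and deleted jobs can be illustrated by comparing directory names on both sides. The following Python 3 sketch is a simplified stand-in, not the attached script: the helper names `list_subdirs` and `diff_job_dirs` are hypothetical, and a renamed job simply shows up as one removed name plus one new name.

```python
from pathlib import Path


def list_subdirs(directory):
    """Names of the direct subdirectories, ignoring .svn metadata."""
    return {p.name for p in Path(directory).iterdir()
            if p.is_dir() and p.name != ".svn"}


def diff_job_dirs(live_dir, backup_dir):
    """Classify job directories by comparing live and backup trees."""
    live = list_subdirs(live_dir)
    backup = list_subdirs(backup_dir)
    return {
        "new": sorted(live - backup),       # needs mkdir + svn add
        "removed": sorted(backup - live),   # needs svn delete
        "existing": sorted(live & backup),  # needs copy + svn commit
    }
```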

Best regards
Tobias

--
Dr. rer. nat. Tobias Neckel

Institut für Informatik V, TU München
Boltzmannstr. 3, 85748 Garching

Tel.:   089/289-18632
Email:  [hidden email]
URL:    http://www5.in.tum.de/wiki/index.php/Tobias_Neckel

#!/usr/bin/python
# Script for handling comparison of hudson directory structure
# to support backup (via svn).
#
# The script assumes that it is called from a separate job (in
# order to know the path local to that workspace) and searches
# the full hudson directory structure. The following directories
# and files are important:
# - ./*.xml
# - jobs/.../*.xml
# - users/.../*.xml
#
# DATE:   26.10.2009
# AUTHOR: Tobias Neckel
#
# INPUT:  -
#
# OUTPUT: -



#
# Method to compare and handle the original and the backup structure.
# This method is called for jobs and users separately in the main script.
#
# INPUT:  backupDirectory Name of the backup directory inside the hudson
#                         base path (including final slash).
#         targetDirectory Name of target directory to compare inside the
#                         hudson base directory (users or jobs).
#
# OUTPUT: -
#
def compareAndHandleOriginalAndBackupStructure(backupDirectory, targetDirectory):

    if targetDirectory not in ("users", "jobs"):
        print("ERROR: only \"users\" and \"jobs\" allowed for targetDirectory!")
        sys.exit(1)

    fileNameStructure        = "tmp_structure_" + targetDirectory + ".txt"
    fileNameBackupStructure  = "tmp_backup_structure_" + targetDirectory + ".txt"
    # store all direct subdirectories of the target (one level only)
    os.system("find ./" + targetDirectory + "/ -maxdepth 1 -type d | grep -v \".svn\" > " + fileNameStructure)
    # and those of the backup structure (this listing is written inside backupDirectory)
    os.system("cd " + backupDirectory + "; find ./" + targetDirectory + "/ -type d | grep -v \".svn\" > " + fileNameBackupStructure)

    # prepare lists + debug output (skip the first entry, which is the target directory itself)
    print("entries for target " + targetDirectory + ":")
    entries = []
    with open(fileNameStructure, "r") as infile:
        for i in infile.readlines()[1:]:
            print("current entry is: " + i.strip())
            entries.append(i.strip())
    print("backup entries for target " + targetDirectory + ":")
    backupEntries = []
    with open(backupDirectory + fileNameBackupStructure, "r") as infileBackup:
        for i in infileBackup.readlines()[1:]:
            print("current backup entry is: " + i.strip())
            backupEntries.append(i.strip())

    print(".... start comparison ....")
    for i in entries:
        print("current entry is: " + i)
        if i in backupEntries:
            print("entry " + i  + " already in backup structure: copy xml-files, commit, delete row from list")
            os.system("cp " + i + "/*.xml " + backupDirectory + i + "/")
            os.system("svn commit " + backupDirectory + i + " -m \"Committing modified config file(s) (automated hudson maintenance)\"")
            backupEntries.remove(i)
        else:
            print("entry " + i  + " not in backup structure; create it therein, copy xml-files, add, commit")
            os.system("mkdir " + backupDirectory  + i)
            os.system("cp " + i + "/*.xml " + backupDirectory + i + "/")
            os.system("svn add " + backupDirectory + i )
            os.system("svn commit " + backupDirectory + i + " -m \"Committing new config file(s) (automated hudson maintenance)\"")

    # whatever is left in backupEntries no longer exists in the original structure
    for k in backupEntries:
        print("for this old entry (" + k  + ") in the backup structure: call svn delete")
        # svn delete on a working-copy path only schedules the deletion and
        # takes no log message, so the commit is a separate step
        os.system("svn delete " + backupDirectory + k)
        os.system("svn commit " + backupDirectory + k + " -m \"deleting deprecated directory and config file(s) (automated hudson maintenance)\"")



######################
# Start of main script
######################
import sys
import os

if len(sys.argv)>1:
    print("error: no arguments supported for this script")
    sys.exit(1)

#backupDirectory = "./hudsonBackupStructure/"
# backup working copy inside the hudson base directory (final slash required)
backupDirectory = "./hudson_backup/"

print("............... committing hudson backup docu ..............")
os.system("svn commit " + backupDirectory + "README.txt -m \"Committing hudson backup docu (automated hudson maintenance)\"")
os.system("svn commit " + backupDirectory + "backupScript_copy.py -m \"Committing hudson backup docu (automated hudson maintenance)\"")

print("............... committing modified base xml configs .......")
os.system("cp ./*.xml " + backupDirectory)
#ATTENTION: for initial usage: perform manual "add"!
os.system("svn commit " + backupDirectory + "*.xml -m \"Committing modified base config file (automated hudson maintenance)\"")


print("............... comparing structure for jobs ...............")
compareAndHandleOriginalAndBackupStructure(backupDirectory, "jobs")
print("............... comparing structure for users...............")
compareAndHandleOriginalAndBackupStructure(backupDirectory, "users")



author: Tobias Neckel
date  : 10.11.2009  

This file summarises the relevant information about the basic Hudson backup system.


We back up only the config files (*.xml) of Hudson (located in a directory called HUDSON_ROOT in the following), which allows a complete recovery of all jobs and users (and of the global config data). Builds, artefacts, history etc. are not considered relevant enough to back up.

Three categories are currently backed up:
- HUDSON_ROOT/*.xml (containing the file config.xml with the basic hudson data (such as relevant nodes etc.))
- HUDSON_ROOT/jobs/*/*.xml
- HUDSON_ROOT/users/*/*.xml

The backup is done by a Python script (backupScript.py) located directly in HUDSON_ROOT. It commits to an svn repository (file:///home_local/repositories/svn/hudson_backup/). Because the script lives in a different location, a copy of it (backupScript_copy.py) is kept in the repository.

Note that only one level of recursion below jobs and users, respectively, is supported, in order to avoid committing LARGE data!

The backup is triggered automatically by a job called backup_hudson (starting with "backup" in order to push it to the top position of the alphabetical list of jobs). This job is executed once a day to track changes in jobs or users (currently at 23:00).

------------------------------------
Tips and Tricks

*) There seems to be a problem when a job name contains brackets etc. Using standard letters and underscores avoids this.
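
The cause of this problem is not stated here. One plausible explanation, which is an assumption and not something confirmed above, is that the script pastes job names unquoted into shell command strings, where brackets, parentheses or spaces have a special meaning. A Python 3 sketch of one way around that is to avoid the shell altogether; the helper names `copy_xml_files`, `svn_commit_args` and `run` are hypothetical.

```python
import glob
import os
import shutil
import subprocess


def copy_xml_files(src_dir, dst_dir):
    """Copy src_dir/*.xml to dst_dir without going through a shell."""
    # glob.escape keeps characters such as [ and ] in the directory
    # name from being read as wildcard syntax.
    pattern = os.path.join(glob.escape(src_dir), "*.xml")
    copied = []
    for path in sorted(glob.glob(pattern)):
        shutil.copy2(path, dst_dir)
        copied.append(os.path.basename(path))
    return copied


def svn_commit_args(path, message):
    """Build the argument list for 'svn commit PATH -m MESSAGE'."""
    return ["svn", "commit", path, "-m", message]


def run(args):
    """Run a command given as a list; no shell parses the arguments."""
    return subprocess.run(args, check=False).returncode
```

With this approach, `run(svn_commit_args(path, message))` hands the job path to svn as a single argument, whatever characters it contains.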

------------------------------------
Setup of a New Hudson Instance

If, for some reason, you need to set up your hudson again, perform the following steps:

*) create the corresponding HUDSON_ROOT (if it does not exist yet):
mkdir HUDSON_ROOT

*) checkout the backup repository instance
cd HUDSON_ROOT
svn checkout file:///home_local/repositories/svn/hudson_backup/

*) copy the xml-files to the HUDSON_ROOT (exporting the svn to a temporary folder TMP (which will be newly created!) to avoid .svn folders!):
svn export file:///home_local/repositories/svn/hudson_backup/ TMP
cp -r TMP/* HUDSON_ROOT
rm -rf TMP
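
For illustration, the copy step above can also be written in Python (3.8 or later, because of the `dirs_exist_ok` argument). This is a sketch only; the function name `restore_configs` is hypothetical, and it assumes the export directory contains no .svn folders, as is the case after `svn export`.

```python
import shutil
from pathlib import Path


def restore_configs(export_dir, hudson_root):
    """Copy an exported backup tree over hudson_root.

    Files already present in hudson_root that are not part of the
    backup are left alone. Returns the restored files, relative to
    hudson_root.
    """
    export = Path(export_dir)
    shutil.copytree(str(export), str(hudson_root), dirs_exist_ok=True)
    return sorted(p.relative_to(export).as_posix()
                  for p in export.rglob("*") if p.is_file())
```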




------------------------------------
Initial Setup of Backup Repository

If, for some reason, the repository fails/is deleted/disappears, here is a list of all the steps needed to set it up again. Make sure you have a copy of backupScript.py from somewhere (either from your old HUDSON_ROOT, or from peano/howto/, e.g.).

*) create a repository instance:
svnadmin create hudson_backup

*) create somewhere locally a directory (called TMP in the following) with the basic hudson backup structure:
cd ~
mkdir TMP
cd TMP
mkdir jobs
mkdir users

*) import the basic hudson backup structure into the svn:
cd ~/TMP
svn import -m "Initial import" file:///home_local/repositories/svn/hudson_backup/

*) checkout the basic hudson backup structure into HUDSON_ROOT:
cd HUDSON_ROOT
svn checkout file:///home_local/repositories/svn/hudson_backup/

*) prepare the first run of the backup script by copying all base configs and adding them manually to the svn:
cd HUDSON_ROOT
cp *.xml ./hudson_backup/
cd hudson_backup/
svn add *.xml

*) copy the backup script (located somewhere (BASE_DIR) at your side; peano/howto/, e.g.) into the hudson root directory:
cp BASE_DIR/backupScript.py HUDSON_ROOT

*) put this README.txt and the backupScript.py into the repository:
cp BASE_DIR/README.txt HUDSON_ROOT/hudson_backup/
cp BASE_DIR/backupScript.py HUDSON_ROOT/hudson_backup/backupScript_copy.py
cd HUDSON_ROOT/hudson_backup/
svn add README.txt
svn add backupScript_copy.py
svn commit README.txt -m "Initial setup"
svn commit backupScript_copy.py -m "Initial setup"

*) (optional) call a backup run manually (otherwise just wait for the next automatic hudson backup realised by the backup_hudson cron-job (if still specified)):
HUDSON_ROOT/backupScript.py > logInitialBackupScript.txt 2>&1


