Installation of SystemImager

Background

The Physics Department has computer labs in Weniger 412 and 497.  These machines have Linux installed on them.  To make administration easier, the support staff looked for a tool similar to Symantec Ghost for Windows that would work with Linux machines.  After some searching, we chose the SystemImager package.

It is possible to clone machines in any operating system environment by using a Linux rescue disk and the dd command:
# dd if=/dev/sda of=/dev/sdb
(This assumes you want the first drive copied to the second drive.  If the drives are IDE, use hda and hdb instead.)
For this to work, the destination drive must be at least as large as the source drive.  Copying from a smaller drive to a larger one should work, but the extra space is wasted.
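
Because dd does not check drive sizes itself, a small guard can help before cloning.  The following is a minimal sketch, not something we used in the labs: the helper name fits_on_target is hypothetical, and the device names in the example are the same assumed /dev/sda and /dev/sdb as above.

```shell
# Hypothetical helper: succeed only if the destination is at least as
# large as the source. Both sizes are given in bytes.
fits_on_target() {
    src_bytes=$1
    dst_bytes=$2
    [ "$dst_bytes" -ge "$src_bytes" ]
}

# Example use from a rescue disk (device names are assumptions):
#   src=$(blockdev --getsize64 /dev/sda)
#   dst=$(blockdev --getsize64 /dev/sdb)
#   fits_on_target "$src" "$dst" && dd if=/dev/sda of=/dev/sdb bs=4M
```

The sizes are passed in as numbers so the check can be tried without touching any real drive.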

This approach was used for the original lab setup, but it is time consuming and requires opening up each machine and moving hard drives between machines.

I was looking for something that would do this over the network and would also allow incremental updates.

This document describes the installation and configuration of SystemImager in the labs.  There is also a section on problems that occurred during installation.  Note that this document was edited from the previous SystemImager install instructions, which were written from memory some time after the actual installation.

Installation:

The files were downloaded from SourceForge.  Version 4.1.4 was used.  This is actually an unstable version, but because of problems encountered with the boot process in an older version, it seemed to work better.

The following packages were installed on the server via the standard rpm method:

client
common
flamethrower (not used in first installation and was not downloaded for last installation)
server
x86_64boot-standard
x86_64initrd-template

The same packages were installed on a sample client.  This client had been installed with all software that was requested for a lab machine.  Basically, this machine was ready to be cloned to all other machines.

Configuration:

For our new installation, we had to edit the /etc/fstab and /boot/grub/menu.lst files before we pushed the image over the network.  The problem was that the fstab and menu.lst files referred to the hard drive by its exact name (i.e. they were using /dev/disk/by-id entries instead of the more useful /dev/sdaX notation).  When we pushed the image to the next computer, the hard drive names weren't the same, and that computer would not boot.  With the df command, we were able to see that the hard drive the machine boots from was sda.  Below are the old fstab and menu.lst files and the new versions that we edited them into.

fstab.old  
fstab  
menu.lst.old  
menu.lst  
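
We made these edits by hand.  As a rough sketch of how the same substitution could be scripted, the hypothetical helper below rewrites by-id partition references to /dev/sdaN.  It assumes GNU sed, that every by-id entry ends in -partN, and that all of them refer to partitions of sda, so check the output before using it.

```shell
# Hypothetical helper: rewrite /dev/disk/by-id/...-partN to /dev/sdaN.
# Assumes every by-id entry names a partition of sda.
# Reads the named files, or standard input if none are given.
by_id_to_sda() {
    sed -E 's#/dev/disk/by-id/[^[:space:]]+-part([0-9]+)#/dev/sda\1#g' "$@"
}

# Example (work on copies, then compare before installing them):
#   by_id_to_sda fstab.old > fstab
#   by_id_to_sda menu.lst.old > menu.lst
```

Lines that do not contain a by-id path pass through unchanged.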

For the rest, I just followed the SystemImager guide for the setup.  Although that manual was written for version 3.4.1, most of the steps are the same.

There really wasn't much special configuration needed for a basic setup.  The main step on the client is to run
# si_prepareclient --server physics-server.physics.oregonstate.edu

This starts an rsync process on the client so that the server can download all of the files from the client.

Then start the si_getimage program on the server to pull the image to the server.
# si_getimage --golden-client wngr497-pc07.physics.oregonstate.edu --image 03_12_2008 --exclude '/home/*'
Replace wngr497-pc07.physics.oregonstate.edu with the name or IP of the client, and 03_12_2008 with whatever name you want to give the image.  The quotes around /home/* keep the shell on the server from expanding the pattern before si_getimage sees it.  We had a problem a couple of times where the mounted home directories were copied along with all the other files.  Follow the instructions given by the script.  Here is the most recent cluster.xml, located at /etc/systemimager/cluster.xml on the server:

cluster.xml  
In this file, you can see what we named the image and how the computers in 412 and 497 are separated into two groups.
Running si_getimage starts downloading all of the files on the client to /var/lib/systemimager/images on the server.  There wasn't much space for these images in that partition, so I made the images directory a symlink to /space/systemimager_images, which is on a 235 GB partition, plenty of room for a few ~15 GB images.
It is a good idea to keep images from a couple of versions back in case something is wrong with an image.
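
The symlink step can be sketched as follows.  This is an illustration rather than a record of the exact commands that were run: the helper name relocate_images is hypothetical, and it assumes nothing is writing to the images directory while it is being moved.

```shell
# Hypothetical helper: move an images directory to a larger partition
# and leave a symlink behind at the old location.
relocate_images() {
    old=$1   # e.g. /var/lib/systemimager/images
    new=$2   # e.g. /space/systemimager_images
    # Only act if the old path is a real directory, not already a symlink.
    [ -d "$old" ] && [ ! -L "$old" ] || return 1
    mkdir -p "$new" || return 1
    # Copy the contents, preserving permissions and ownership.
    cp -a "$old"/. "$new"/ || return 1
    rm -rf "$old" || return 1
    ln -s "$new" "$old"
}

# Example (run as root on the image server):
#   relocate_images /var/lib/systemimager/images /space/systemimager_images
```

Running it a second time does nothing, because the old path is then already a symlink.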

You can also change files in an image directly by editing them on the server.  The image's root directory is the directory with the image's name under the images folder.

The next part was getting this image onto the other machines with a bootable USB drive.
The basic step is to issue the command:
# si_mkautoinstalldisk --device /dev/sdb --kernel /usr/share/systemimager/boot/x86_64/standard/kernel --initrd /usr/share/systemimager/boot/x86_64/standard/initrd.img --append "IMAGESERVER=128.193.97.2 IMAGENAME=03_12_2008 SKIP_LOCAL_CFG=y BITTORRENT=y"
The USB drive must have a file named local.cfg with at least the following lines:
IMAGESERVER=physics-server.physics.oregonstate.edu (replace with your image server; we had to use IMAGESERVER=128.193.97.2 the last time)
IMAGENAME=03_12_2008 (again, replace with your image name)
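
As a small sketch, the two required lines could be written to the drive with a helper like the one below.  The function name write_local_cfg and the mount point /mnt/usb are assumptions for illustration; use wherever your USB drive is actually mounted.

```shell
# Hypothetical helper: write a minimal local.cfg into a directory.
write_local_cfg() {
    dir=$1      # where the USB drive is mounted
    server=$2   # image server name or IP
    image=$3    # image name given to si_getimage
    cat > "$dir/local.cfg" <<EOF
IMAGESERVER=$server
IMAGENAME=$image
EOF
}

# Example (the mount point is an assumption):
#   write_local_cfg /mnt/usb 128.193.97.2 03_12_2008
```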


Make sure that /etc/init.d/system-imager-bittorrent is turned on on the server.
We used BitTorrent in the latest install of the images to the other computers.  We tried installing many versions of BitTorrent, but some depended on certain Python files that we were not able to find.  We used an older version and downloaded the following file:
BitTorrent-4.0.3-1.noarch.rpm
which made a directory with the following two rpm files:
BitTorrent-4.0.3-17.x86_64.rpm
BitTorrent-gtk-4.0.3-17.x86_64.rpm
We are not very sure how we got BitTorrent to work, but it saved a lot of time.
Once the USB drive is made, simply boot the client machine from it.  After it runs the boot..... and init..... the rest is done from the server and you can remove the USB stick.  Since we used BitTorrent, the Wngr412 Lab took approximately 2 hours to complete.  The old way took about an hour to do 3 machines.  When it is finished, the machine will show a message saying:
"I have been done for xx minutes now!  Reboot me already!!!"

Once rebooted, the machine should be identical to the golden client, except for whatever information the DHCP server gives it.

Repeat this for every machine you wish to image.

Updating a machine:

One of the nicest features of SystemImager is that you can make incremental changes to your clients without starting the above process over.  Follow the above steps to get the updated image onto the server.  Then all you need to do is "push" the update to the clients.  This is done with the following command:
# si_pushupdate --image 03_12_2008 --reboot --no-bootloader --hosts 412 --server physics-server.physics.oregonstate.edu --ssh-user root
An explanation of each option:
si_pushupdate (the program that does the updating)
--image (the name of the image to push)
--reboot (use if the applied updates require rebooting each machine)
--hosts (group names from cluster.xml)
--server (image server)
--ssh-user (do the updates over ssh as the given user; the user must have full access to files on the client)
(it is hard to see, but there are two dashes before image, reboot, no-bootloader, hosts, server, and ssh-user)


For --ssh-user to work without having to enter passwords, I have enabled ssh-keys so that no passwords are needed.  See here for details.

There is a file on the clients that lets you exclude certain files from being updated.  This file is image/etc/systemimager/updateclient.local.exclude.  Here is what I added to the end of this file for our machines:
-------------------
# Added by Justin Elser 10/2/2006
#  files that should not be included in image

# kconf_updaterc is needed or else kde takes forever
/local/kconf_updaterc
# otherwise NIC comes up as eth1 which breaks maple
/etc/udev/rules.d/30-net_persistent_names.rules
# default printer is already set
/etc/cups/printers.conf
/etc/cups/printcap
# don't need to change var/tmp
/var/tmp
-------------------

This file must be on the client to be effective.  If you put it in the image, it won't take effect until after the clients have been updated from that image, not before.

Problems encountered (First Installation):

The first and major problem encountered was trying to make the bootable CD work.  Since I didn't have access to the DHCP server, I had to either modify the CD or create a boot floppy.  For some reason, whenever I tried to use a floppy with 3.6.3, the kernel would complain about the floppy drive not working, when I knew the drive was working fine.  It turns out that 3.7.3 has this system called UYOK, or Use Your Own Kernel.  The kernel that was being used did not work with the floppy drive in the Dell Optiplex GX620, complaining that it was a SCSI drive or something.  When I switched to a version with UYOK, the floppy worked just fine. 

I did spend quite some time trying to get the proper information on the CD instead of the floppy, but had no luck with this.  I forgot most of the stuff I tried, but know that it involved uncompressing the kernel image on the CD, modifying it, and recompressing the kernel.  Never got it to work.

I also attempted to use Flamethrower to do multicast installs, thinking that this would save time.  However, I could never get the clients to connect to the server properly.  In the end, it would have taken less time if I had never tried this.  There is, however, an option in newer versions to use BitTorrent for the file transfer.  This would probably be faster, but I was worried that our campus watches for BitTorrent traffic.