Tuesday, 8 June 2010

Recovering your Linux server with a Knoppix rescue disk

One of the many strengths of Linux is its excellent set of recovery methods. If your server doesn't boot properly, you can still access everything on it using a rescue disk. This article shows you how to do that with Knoppix. It doesn't depend on a particular version of Knoppix, and the procedure works for servers running almost any Linux distribution.

Booting your server from a Knoppix rescue CD is easy. Just put the disk in your server's optical drive and restart the server; the Knoppix operating system then starts loading automatically. But it doesn't immediately give you access to the files on your hard drive. You have to mount all file systems on your server yourself -- assuming you can still mount them. The procedure described in this article helps you fix boot problems that are not caused by file system errors. If your server's file systems have errors that prevent them from being mounted, the procedure will still help you work toward a solution, but additional steps may be required.

Mounting the Linux file systems

To access the root file system on your server from a Knoppix rescue CD, you'll have to mount it. The same is true for the other file systems on your server. When using a rescue system, you have to mount the server's root file system on a temporary directory. Most distributions have a directory /mnt that exists for this purpose, so it's a good idea to mount your file system there. There is a potential problem, however: most utilities assume that configuration files are in a very specific location. If a tool from your distribution looks for /boot/grub/menu.lst, for instance, it may not understand that the file is now at /mnt/boot/grub/menu.lst instead. Therefore, you need to make sure that everything mounted on /mnt is presented to the operating system as if it were mounted directly on the / directory. The following procedure shows you how to do that.


  1. Boot your computer, using the Knoppix CD. You'll see the Knoppix welcome screen next. From here, press Enter to start loading Knoppix.

  2. While loading, Knoppix pauses briefly to show you all available languages. If you don't select anything, English is selected automatically. Once Knoppix has started completely, you'll get access to the Knoppix desktop.

  3. To restore access to your server, you'll need to open a terminal window from Knoppix. By default, a terminal window gives you the permissions of an ordinary user. To repair your server, you need root permissions, which you get with the sudo su command.

  4. Now use the mount command. Its output shows that no file systems from your server's hard disks are mounted yet; everything you see lives in a RAM disk.




    By default, Knoppix loads RAM disks only.


  5. In case you don't know exactly how storage in your server is organized, you'll need to check which disks and partitions are in use. The fdisk -l command is a good starting point. It shows all disks that are available on your server (including LUNs offered by a SAN), and which partitions exist on those disks. Disk names typically start with /dev/sd (although other names may be used), followed by a letter: the first disk is /dev/sda, the second disk is /dev/sdb, and so on. The partitions on each disk are numbered. For instance, /dev/sda1 is the first partition on the first disk of your server. Here is an example of what a typical disk layout may look like:

    Use fdisk -l to show the current disk layout of your server.

     
    ilulissat:/ # fdisk -l


    Disk /dev/sda: 8589 MB, 8589934592 bytes
    255 heads, 63 sectors/track, 1044 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes


       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          13      104391   83  Linux
    /dev/sda2              14          30      136552+  82  Linux swap / Solaris
    /dev/sda3              31         553     4200997+  83  Linux


  6. Now it's time to interpret what you are seeing. If it looks like the example above, it's not too hard to find out which partition holds the root file system. Two partitions use partition type 83, which means they contain a Linux file system. One of them, /dev/sda1, spans only 13 cylinders; as each cylinder is about 8 MB, that is roughly 100 MB, which is too small to contain a root file system. The partition /dev/sda2 uses partition type 82, so it contains swap space. Therefore, the only partition that can contain the root file system is /dev/sda3.

  7. Now that you know which partition contains the root file system, it's time to mount it. As mentioned before, it's a good idea to do that on the /mnt directory, because Knoppix doesn't use it for anything itself. In this case, the command to use is: mount /dev/sda3 /mnt

  8. A quick check should show you at this point that you have correctly mounted the root directory. Before you activate the chroot environment, you'll need access to some system directories as well. The most important of them are /proc and /dev. These directories are normally populated automatically when booting. That means they have contents in your Knoppix root directory, but once you've made /mnt your new root directory, you'll find them empty there. As you really need /proc and /dev to fix your problems, mount them before doing anything else. The next two commands do that.

    mount -o bind /dev /mnt/dev


    mount -t proc proc /mnt/proc


  9. Once you are at this point, your entire operating system is accessible from /mnt. You can verify this by changing to that directory (use cd /mnt). At this point your prompt looks like root@Knoppix:/mnt#. Now use the command chroot . to make the current directory (.) your new root directory. This brings you to the real root of everything that is installed on your server's hard drive.

  10. As Linux servers tend to use more than one partition, you may have to mount other partitions as well before you can fix all problems. If, for instance, the directory /usr is on another partition, you won't be able to do much until you have made it accessible too. The only task left at this point is to find out which file system is mounted where. There is an easy answer to that question: /etc/fstab. This file lists exactly what is mounted when your server boots normally. So check the contents of /etc/fstab and perform all mounts defined there manually. Or make it easy on yourself and use mount -a. This command automatically mounts all file systems listed in /etc/fstab that haven't been mounted yet.
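
Steps 5 through 10 can be sketched as one small script. This is only an illustration under stated assumptions: the variable names (ROOT_DEV, TARGET, DRY_RUN) and the function names are made up for this example, the defaults come from the sample layout above, and linux_partitions expects the older cylinder-based fdisk -l layout shown in step 5. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Illustrative sketch; names and defaults below are assumptions, not part of Knoppix.
ROOT_DEV="${ROOT_DEV:-/dev/sda3}"   # root partition from the example layout
TARGET="${TARGET:-/mnt}"            # temporary mount point
DRY_RUN="${DRY_RUN:-1}"             # 1 = print commands, 0 = execute them (needs root)

# Read "fdisk -l" output on stdin and print every type-83 (Linux) partition
# with its approximate size in MB, taken from the Blocks column (1 KB blocks).
linux_partitions() {
    awk '$1 ~ /^\/dev\// {
        sub(/\*/, "")                  # drop the boot flag so the columns line up
        if ($5 == "83") {
            blocks = $4
            sub(/\+$/, "", blocks)     # fdisk marks rounded block counts with "+"
            printf "%s %d MB\n", $1, blocks / 1024
        }
    }'
}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rescue_mount() {
    run mount "$ROOT_DEV" "$TARGET"        || return 1   # step 7: root file system
    run mount -o bind /dev "$TARGET/dev"   || return 1   # step 8: device nodes
    run mount -t proc proc "$TARGET/proc"  || return 1   # step 8: process information
    run chroot "$TARGET" mount -a          || return 1   # step 10: remaining fstab entries
    run chroot "$TARGET"                                 # step 9: shell in the new root
}

rescue_mount
```

Fed the example layout from step 5 (for instance with fdisk -l | linux_partitions), the helper reports /dev/sda1 at about 101 MB and /dev/sda3 at about 4102 MB, which matches the reasoning in step 6. Running the script as root with DRY_RUN=0 would execute the commands instead of printing them.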



Now you have full access to all utilities on your server's hard drive and, more importantly, to all files -- time to analyze what went wrong and restore access. But make sure that you start by making a backup at this point!
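
One way to act on the backup advice is to copy every file before you edit it. The following is a minimal sketch, assuming a plain timestamped copy next to the original is sufficient; backup_file is a hypothetical helper name, not a standard command.

```shell
# backup_file is a hypothetical helper, not part of Knoppix or any distribution.
# It copies a file next to the original with a timestamp suffix, preserving
# mode, ownership, and timestamps (cp -p), and prints the name of the copy.
backup_file() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "backup_file: $src is not a regular file" >&2
        return 1
    fi
    dest="$src.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example call inside the chroot (the path is the one used earlier in this
# article; yours may differ):
# backup_file /boot/grub/menu.lst
```

Keeping the copy in the same directory makes it easy to compare the two files later and to restore the original if a change makes things worse.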

To fix any problems on your computer, you first have to restore full access to your system. You do this by mounting all of its file systems and then making them accessible with the chroot command. This ensures that all tools see the server's file system as it really is, which makes it a lot easier to restore access.
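
To confirm that all file systems really are mounted, you could compare /etc/fstab with the list of current mounts. The sketch below is one possible way to do that, not a standard tool: fstab_mount_points and missing_mounts are hypothetical helper names, and the check assumes that /proc is mounted inside the chroot so that /proc/mounts is readable.

```shell
# fstab_mount_points and missing_mounts are hypothetical helper names.

# Print the mount point of every fstab entry that mount -a would handle:
# skips comments, blank lines, swap, and entries marked noauto.
fstab_mount_points() {
    awk '$1 !~ /^#/ && NF >= 3 && $3 != "swap" && $2 ~ /^\// && $4 !~ /noauto/ { print $2 }' "$1"
}

# Print fstab mount points that do not appear in the mounts table
# (the second column of /proc/mounts).
missing_mounts() {
    fstab="${1:-/etc/fstab}"
    mounts="${2:-/proc/mounts}"
    fstab_mount_points "$fstab" | while read -r mp; do
        MP="$mp" awk '$2 == ENVIRON["MP"] { found = 1 } END { exit !found }' "$mounts" \
            || echo "$mp"
    done
}

# Example call inside the chroot, using the default files:
# missing_mounts
```

If the call prints nothing, every file system that /etc/fstab mounts at boot is present. Any mount point it does print is a candidate for a manual mount command.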

ABOUT THE AUTHOR: Sander van Vugt is an author and independent technical trainer who has specialized in Linux since 1994. Van Vugt is also a technical consultant for high-availability (HA) clustering and performance optimization, as well as an expert on SLED 10 administration.

Reference link: http://searchenterpriselinux.techtarget.com/tip/0,289483,sid39_gci1358366_mem1,00.html
