Galiano
Overview
This final thesis presents how the OS deployment system has been improved to ease the management of PC labs.
Chapter 1 Introduction
This project started two years ago when I was on an internship at the EPSC’s
Computer Architecture department.
Later, Arnau Güell helped us develop the first version of the project; that first version was his final degree thesis.
In the early phases of development, the first idea was to modify the GRUB [Ref 2] code. GRUB is an open source boot loader with the power to boot several operating systems, including Linux images from the network.
GRUB has two important problems: the first is its complexity and the second is its need to be installed on the local drive. One of the most important characteristics of REMBO is its lack of local installation.
Having an entire OS dedicated to restoring the final operating system was more powerful than a software modification. Studying the existing technologies and developing a dedicated OS-restoring operating system would introduce new and more flexible ways to achieve the objective.
From that moment until now, the everyday work of this project has been to search for new ways to improve the original idea, to test them, and to decide what to use in the final version.
2 Advanced OS Deployment System
Throughout this thesis the objectives remain the same, but improvements are introduced where the last version had flaws. The particular objectives, in contrast with the previous degree thesis, are detailed in Section 6.
As software evolves, new solutions for old problems are released. The case of deployment systems is no different. In this section we review REMBO, but also some other open source tools available today.
1.4.1 REMBO
REMBO is proprietary software, now owned by IBM as part of its Tivoli suite. Since IBM acquired it, REMBO has been renamed Tivoli Provisioning Manager for OS Deployment [Ref 1].
This software is based on Bpbatch [Ref 4], a small program from the late 90's that ran as a boot loader for terminals and could restore an OS. It works using the PXE specification. REMBO has two main functionalities: selecting an OS to boot and restoring it. REMBO also has two ways of restoring an OS image: full restoration or fast restoration. The latter works using differential images, copying only the difference between the installed image and the source image.
Among other features, it has an easy-to-use graphical interface and supports multicast data transfer.
Diskless Remote Boot Linux (DRBL) provides diskless clients with a running operating system. A diskless node is nothing more than a PC whose operating system is not installed on the local disk.
This software is installed on a server and provides the configuration and scripting to boot diskless nodes on a network. These diskless clients work using the same technologies used in this project.
Once the client is turned on, it uses the Clonezilla suite to clone and restore the OS. DRBL with Clonezilla is very similar to the presented project, with three big differences:
• It does not support fast restoration.
• It cannot boot the restored OS once it is downloaded to the client's local disk.
• Its interface is ncurses based. Ncurses is a library for developing command line interfaces [Ref 5]. An example of the interface can be seen in the next image, which shows the Clonezilla options: save disk, restore disk, save partitions and restore partitions.
In the previous version of the project, booting Microsoft Windows was achieved, but the OS executed with no BIOS support and was very slow. In order not to lose performance on Microsoft Windows when booting without rebooting, this OS needs access to low-level hardware instructions.
In order to compete with REMBO it is mandatory to support this OS, which represents about 90% of the current OS market. Our objective is to find new tools to improve Microsoft Windows efficiency after its hot boot.
The previous project had only the option of full image restoration, which takes a long time.
The next chapter explains how the base system works, starting from the top layer of the system and going down through all the internals of the booting process until the restored OS is booted.
Chapter four details the other two important new features: fast syncing and the new complete restoration method, and how a file system restoration improves on byte-by-byte deployment.
The tests done using the last version of the OS deployment system broke a couple of hard disks: the system used a byte-by-byte writing scheme that could damage the device.
We therefore searched for another way to restore the systems; that is how the PartImage software was found. This software not only takes care of image creation and restoration, it also includes a dedicated server and enough options to improve the process. In this version of the project we use PartImage as an improvement for full image restoration, but also as the tool for image creation.
Finally, the conclusions show the new restoration times and how the project's implementation could affect a school like the EPSC economically. The project's environmental effects and future improvements have also been written down to complete the thesis.
2.1 Introduction
If the user selects the first option, image restoring software downloads and installs the desired OS on the computer. Once the image is deployed, the boot-without-reboot process starts.
For the user, the Linux loading and the image deployment are transparent; this improves the experience of a user who does not need to worry about the system configuration.
Figure 2.1 illustrates the basic functionality of the system. First, when the client PC boots, it asks for the OS deployment operating system to be loaded via the network. Then the user has the possibility to choose between the following options:
• Boot Linux
• Boot Microsoft Windows
• Restore Linux
• Restore Windows
If the user chooses one of the first two options, the OS installed on the local disk is booted. If the user chooses to restore an OS, the OS deployment client connects to the OS deployment server and asks for the selected operating system image to be downloaded.
Now we have a complete vision of the system. The next sections explain, step by step and technology by technology, how the OS deployment system works.
BIOS is the acronym for Basic Input/Output System [Ref 6]. It is the first piece of software that a PC loads when it boots. The BIOS is usually stored inside the motherboard's flash memory, called EPROM.
First, the BIOS tries to identify the PC hardware and initialises it using the software embedded in each hardware device, the firmware. Once the hardware is initialised, the BIOS loads the boot loader, which makes it possible to boot an OS. Once the system memory is initialised, the BIOS typically copies/decompresses itself into that memory and keeps executing from it.
All these initial steps are made in Real Mode, but the rest runs in Protected Mode. Real Mode is 16-bit and cannot map all the memory. That is why the CPU has to switch into Protected Mode, which is 32-bit, allowing access to all the memory. At this point the CPU is up and running, handling the 16-to-32-bit transitions.
Old operating systems rely on the BIOS for many input/output operations; nowadays operating systems access the hardware directly, being almost independent of the BIOS code.
From the Linux perspective, after booting, it gets data from the BIOS. In order to handle the devices directly without using the BIOS, thus improving access to these devices, Linux introduces its own hardware addressing to manage the hardware. So once Linux is loaded, the BIOS calls are unreliable. This data is loaded in system memory, so the BIOS image loaded in memory can be overwritten.
Like Linux, Microsoft Windows XP tries not to rely on the BIOS in order to improve its performance, but in fact this is not totally achieved. Microsoft Windows relies on seven BIOS interrupts. A study done by Adam Sulmicki of the Maryland Information Systems Security Lab shows that Microsoft Windows XP relies on the following BIOS services [Ref 7]:
• Video Interface
• Equipment Check
• Fixed Disk/Diskette Interface
• System Functions Interface
• Keyboard Interface
• System Timer Interface
• User System Timer Interrupt
The MBR is quite small and boot loaders have grown, so once the first stage has been loaded into memory, the software jumps to its second stage, allocated on the disk. This second stage contains more complex code to find bootable devices or boot different operating systems. When the second stage is loaded, the software is able to load the operating system and transfer execution to it.
GRUB: a boot loader from the GNU Project. GRUB allows a user to have several operating systems on his computer at once, and to choose which one to run when the computer starts. GRUB can also be used to select from different kernels.
LILO: the Linux Loader [Ref 8] is a boot loader for Linux. Like GRUB, LILO does not rely on any file system, and can boot an operating system from different sources, except the network.
NTLDR: NTLDR, the NT Loader, is the boot loader for all the NT-kernel-based Windows versions (Windows NT, Windows 2000, Windows 2003 and Windows XP).
PXELINUX is software for booting Linux off a network server, using a network ROM conforming to the Intel PXE (Preboot Execution Environment) specification. The boot loader is a file called PXELINUX.0.
2.4.1 PXE
In addition to the IP configuration, DHCP is used to send the TFTP server's IP address. TFTP is the acronym for Trivial File Transfer Protocol, and is designed to send files over the network.
The TFTP server contains the boot loader, which is sent to the client and loaded into the PC memory. PXELINUX is the boot loader downloaded by the client in this project. This boot loader requires the kernel to be in its same folder; consequently, it downloads the kernel via TFTP too.
2.4.2 DHCP
Basically, DHCP starts as a link layer protocol, broadcasting a DHCP discover packet over the local area network. The available DHCP servers answer this discover packet. The client then accepts one of the answers and waits for the configuration to be sent.
The DHCP service only has to be installed on the server; the PXE network interface card stores the DHCP client in its ROM.
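As a sketch of how the server side could be configured, the ISC DHCP daemon fragment below hands out an address range and points clients to the TFTP server and boot loader file (all addresses and the file name are illustrative assumptions, not taken from the project's actual configuration):

```
subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.100 192.168.1.200;   # pool offered to the PXE clients
  next-server 192.168.1.1;             # the TFTP server's IP address
  filename "pxelinux.0";               # boot loader file to download
}
```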
2.4.3 TFTP
Trivial File Transfer Protocol (TFTP) is a very simple file transfer protocol. It cannot list directories and has neither authentication nor encryption.
TFTP runs over UDP, so it cannot rely on the transport layer for error control; instead it uses its own simple lock-step scheme, acknowledging and, if necessary, retransmitting one block at a time.
The TFTP service only has to be installed on the server; the PXE NIC stores the TFTP client in its ROM.
After the kernel has been sent from the server to the client with TFTP, the client needs all the applications and libraries required to restore the system. There are two alternatives for the root file system: a RAM file system or a network file system.
A RAMFS is a special kind of file system which compresses a whole file system into a single file, so this file can be transmitted over the network and decompressed on the client, on top of RAM. Once it is decompressed, it runs as a whole file system in memory.
Chapter 2. OS deployment system technical details 11
The second solution is to access a remote file system over the network, so the client only accesses it when required, loading the selected executables into memory.
Inside the OS deployment server an NFS service has been installed, and it has been configured to export an entire root file system: the OS deployment client file system. The client must have the NFS client installed and running. Client and server must have their kernels configured to support NFS, so the kernel will create a device called NFS.
The boot loader has to be configured to pass the NFS device as a parameter to the kernel. This way the kernel knows that it has to look for the root file system on the server, not on its local disk.
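A minimal sketch of the two configuration pieces just described, with all paths and addresses assumed for illustration: the NFS export on the server, and the PXELINUX entry that passes the NFS root as a kernel parameter:

```
# /etc/exports on the OS deployment server (path and network are assumptions)
/export/clientroot  192.168.1.0/24(ro,no_root_squash,sync)

# /tftpboot/pxelinux.cfg/default -- boot entry handing the NFS root to the kernel
DEFAULT deploy
LABEL deploy
  KERNEL bzImage
  APPEND root=/dev/nfs nfsroot=192.168.1.1:/export/clientroot ip=dhcp
```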
In the previous version, the way of restoring an image was the conjunction of two open source programs: Device Imager [Ref 6] and netcat [Ref 7].
DeviceImager was the software selected to create and restore images in the previous version of the project. This software was based on two utilities: zsplit and unzsplit. Zsplit creates exact images of the disk without taking the file system structure into account; unzsplit restores the disk images created by zsplit.
Netcat was used to send the image through the network; netcat is an application that lets a user create UDP/TCP connections from the command line. Netcat was used to create TCP connections in order to send the OS image created by zsplit. These images were received by netcat on the client and sent to unzsplit via a pipe; unzsplit restored the received image onto the disk.
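The old pipeline can be sketched as follows (the host name, port and exact zsplit/unzsplit invocations are assumptions, since they are not documented here):

```shell
# On the server: stream the zsplit image to the client over TCP.
cat linux-image.zsplit | nc client-pc 9000

# On the client: receive the stream and pipe it into unzsplit,
# which writes the image back to the local disk byte by byte.
nc -l -p 9000 | unzsplit
```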
The Device Imager software is not optimised for this function; it is not a fully qualified image server, so for each image transfer the OS deployment system had to create a netcat connection.
In summary, the conjunction of DeviceImager plus netcat does not have enough features, due to its simplicity, and it was also quite aggressive towards the disk. It is not the best solution for an OS deployment system.
Kexec only works with ELF executable formats or their compressed forms. The Windows kernel is not an ELF executable, but we found grub.exe, an ELF executable that can hot boot a Microsoft Windows OS.
When kexec passes control to grub.exe, and grub.exe to MS Windows, the BIOS is unreliable, which hurts the overall performance of the system. The solution was to add a BIOS emulation layer inside grub.exe, before it boots Windows. The new grub.exe version incorporates this improvement: integrated BIOS emulation. This improves Microsoft Windows performance, making the hot boot almost perfect.
Kexec was designed for Linux kernels, so for Linux, given the right parameters, it works flawlessly.
Figure 2.2 shows the detailed operation of the final deployment system. In order to have a working OS deployment system, the client BIOS is configured to boot via the network card using PXE. At boot time, the client searches for a DHCP server; the server then sends an IP address, a netmask, a default gateway, the TFTP server's IP address and the name of the boot loader file (PXELINUX).
The client contacts the TFTP server and asks for the PXELINUX boot loader file. Once it is downloaded into memory, PXELINUX asks the TFTP server for the kernel image; the TFTP server sends the kernel to the client and PXELINUX boots the downloaded kernel.
This kernel is configured to find the root file system on an NFS server. It connects to the NFS server and Linux loads in the normal way. At this point a diskless system is up and running.
After the system is booted, the user can select whether to recover or to boot a resident OS. If the user decides to recover, the OS deployment system connects to the deployment services and downloads the image. If the user chooses to boot the system, or once the image has just been recovered, the next step is to boot the system. Booting without rebooting is accomplished again using the special Linux kernel call kexec.
3.1 Introduction
In the last version of the OS deployment system there was no Graphical User Interface (GUI) at all; it was a Command Line Interface (CLI). It worked well enough, but it was not user friendly.
The solution for the GUI of the OS deployment system is to create graphics on a virtual console, and to use a basic text web browser with support for displaying graphics on the console.
The technology used to provide graphics on a console is known as the framebuffer [Ref 10]; it is widely used to improve the visual style of the command line interface, as shown in the image below.
The framebuffer console can also be used in conjunction with boot splash software to show pleasant images during Linux boot instead of the old-fashioned black screen users were used to. The boot splash software selected is gensplash, the Gentoo Linux boot splash software. The client is based on the Gentoo Linux distribution, so it is logical to install this specific software.
Image 3.2 shows a boot screen without the selected technologies; as can be observed, it is not a friendly interface.
Links2 has been chosen as the web browser application because it provides support for framebuffer devices. Links2 also supports a wide diversity of image formats and JavaScript.
To serve our web GUI, the selection was clear: Apache2 [Ref 11]. This web server ensures reliability and easy configuration for the OS deployment system.
The following sections describe the chosen technologies with more detail.
3.3 Framebuffer
A framebuffer is a video output device that drives the display from a memory buffer. This memory buffer contains a complete frame of data: the colour values for every pixel shown on the screen.
The first framebuffer implementation was created to allow the Linux kernel to show a text console on systems that do not have one, like the Apple Macintosh.
Later it was used on the IBM PC compatible architecture. To see whether it is activated, just verify that the Tux logo is shown at boot time (one logo per CPU), as shown in Fig. 3.3.
There are many software packages which can use the framebuffer device, like the Links2 web browser used in this project, or the multimedia player MPlayer. The framebuffer device must be supported by an application in order to use it.
The development of framebuffer drivers for Linux started in 1999 and it continues nowadays, adding support for new hardware.
The framebuffer driver is located at the Device Drivers section and Graphics
Support subsection.
One should select the option "Support for frame buffer devices" and activate support for the correct driver; in most cases the "vesa-tng" driver should work fine. It is important to select "Console display driver support" too.
Figure 3.4 shows how the kernel configuration menu looks; in particular, it is a snapshot of the "Graphics support" kernel properties. In this section it is possible to configure every kernel feature related to graphics on the console. We enable "video mode selection" and "frame buffer console support". Also, to make the Tux logo appear at boot, enter "Logo configuration" and select the logo with the most colours.
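The menu choices above roughly correspond to the following kernel .config options (a sketch; exact option names vary between kernel versions):

```
CONFIG_FB=y                   # "Support for frame buffer devices"
CONFIG_FB_VESA=y              # VESA driver ("vesa-tng" in Gentoo kernels)
CONFIG_FRAMEBUFFER_CONSOLE=y  # "Console display driver support"
CONFIG_VIDEO_SELECT=y         # "Video mode selection"
CONFIG_LOGO=y                 # "Logo configuration"
CONFIG_LOGO_LINUX_CLUT224=y   # the logo with the most colours
```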
Chapter 3. Graphic User Interface 19
Since the system must be used on heterogeneous hardware, the kernel must support all the framebuffer drivers; this way, independently of the hardware it is running on, the framebuffer will work.
Once the drivers are loaded, it is time to configure and compile the web browser to work with the framebuffer, as shown in the next section. It is also very important to compile the PS/2 and USB mouse and keyboard drivers to let the user interact with the GUI.
3.4 Links2
Links is an open source text and graphics web browser developed by Mikuláš Patočka and his group, Twibright Labs. Links2 was developed later; its most important added features were graphics support via the framebuffer and the X.org server, and JavaScript-enabled browsing.
The image below shows how the Google home page looks using Links2 with the framebuffer activated.
3.5 Gensplash
Splashutils is the name of the package which contains the tools for boot splash creation; this package must be installed.
Once the kernel is compiled and the tools are installed, it is time to compile the boot image into the kernel using a ramfs image. When the Linux kernel is compiled, it creates an empty initramfs. Splashutils can fill the ramfs with the image and the configuration of the boot splash using the next command: the path to the kernel's ramfs is the first parameter, -v adds verbosity, -r selects the resolution, and the last parameter indicates the boot splash theme.
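The command itself is not reproduced in this text; based on the parameter description above, it likely resembled the splashutils call below (the ramfs path, resolution and theme name are assumptions):

```shell
splash_geninitramfs -v -g /usr/src/linux/usr/initramfs_data.cpio.gz \
    -r 1024x768 default-theme
```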
Once the ramfs is filled, the kernel must be recompiled to include the ramfs
inside.
Finally, the boot loader, in our case PXELINUX, has been configured to use gensplash.
The design of the graphical user interface is based on the features of the Links2 browser. The premise was to make it as simple as possible, so that the user experience would be easy. A table and simple colours should be enough, in a similar way to what Google shows on its home page. Figure 3.6 shows a screenshot of the project's final graphical user interface.
Using a web browser as a GUI in our architecture has the trade-off that there is no direct way to perform local executions. The OS deployment system needs to execute, on the client computer, some scripts to restore and boot the different OSs.
To accomplish this, the OS deployment system starts a netcat process in the background that runs in parallel with the Links2 web browser. The netcat software starts listening on port 80 of localhost at the client's boot time. When a user clicks on a link, the link points to a URL similar to this:
http://localhost/usr/bin/ls
As said before, netcat is listening, so it receives this HTTP petition and passes it, using a UNIX pipe, to a script whose function is to parse the petition and execute /usr/bin/ls on the localhost.
Illustration 3.7 shows how the Apache server serves the HTML petitions related to the GUI. When a local script needs to be executed on the client, the HTTP petition is not directed to the Apache server; it is handled locally instead.
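The parsing step can be sketched as a small shell function (the real wwwexec.sh internals are not reproduced here; the function name and behaviour are assumptions): it extracts the path component from the HTTP request line, which doubles as the path of the local executable to run.

```shell
# Extract the executable path from an HTTP request line such as
# "GET /usr/bin/ls HTTP/1.1" -- the second whitespace-separated field.
parse_petition() {
    set -- $1            # split the request line on whitespace
    printf '%s\n' "$2"   # the URL path, e.g. /usr/bin/ls
}

# The script would then simply execute the parsed path:
#   cmd=$(parse_petition "$request_line") && "$cmd"
```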
Six scripts have been created, one for each available option of the system, in addition to the wwwexec.sh script mentioned before:
• Boot Windows (windowsboot)
• Boot Linux (linuxboot)
• Linux fast restoration (syncL)
• Windows fast restoration (syncW)
• Linux complete restoration (linrestore)
• Windows complete restoration (winrestore)
4.1 Introduction
In the last version of the OS deployment system, the restore system created exact copies of the disk's partitions using a low-level byte-by-byte copy operation.
This process allows the end user to make a complete restoration without taking care of the system structure or the information stored inside the files. This method becomes ineffective when a small change is made, because it has to restore the whole disk instead of just the updated file, which would be faster.
Fast image restoration can be achieved using file system synchronisation software. Rsync is software that can synchronise a whole OS, testing file integrity. We call "syncing" a file system the action of synchronising it by means of rsync.
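A minimal sketch of syncing a file system with rsync (the server module and target path are assumptions): --archive preserves permissions, ownership and timestamps, --delete removes files that are not present in the source image, and --checksum verifies file integrity rather than trusting size and modification time.

```shell
rsync --archive --delete --checksum \
    deployserver::images/linux/ /mnt/localdisk/
```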
By the time the OS deployment system was developed last year, file system syncing was only available for open file systems, so Microsoft's NTFS was excluded; this is the reason why it was not implemented. Nowadays, thanks to new technologies which will be explained throughout this chapter, it is possible to synchronise NTFS, Microsoft's proprietary file system.
Thanks to this file system support, it is now possible to use new software for the system restoration. This software is called PartImage and it improves the capabilities of the OS deployment system in many ways. It demonstrates once more the flexibility of the system: just by installing the software it is possible to improve the whole deployment system.
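A hypothetical PartImage invocation for creating and later restoring a partition image (the device and file names are assumptions):

```shell
# Create a gzip-compressed image of the partition...
partimage -z1 save /dev/hda1 linux.partimg.gz
# ...and restore it later (partimage appends a .000 suffix to volumes).
partimage restore /dev/hda1 linux.partimg.gz.000
```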
This chapter first introduces file systems. Then the types of file systems, and the file systems used nowadays by the two most important operating systems, are explained. Later, the problem of the Linux driver for Microsoft Windows NTFS is treated. Once the file system paradigm has been understood, the new technologies used in this project for OS restoring close the chapter.
File systems are a logical organisation method for the computer's physical data. This organisation allows the computer to search for information about files (or directly for the files themselves), give permissions to files, copy and move files, and a large list of other possibilities.
In a more formal way, a file system is a set of abstract data types implemented for the storage, hierarchical organisation, manipulation and retrieval of data.
The file system also stores data such as file names, file lengths, last modification dates, creation dates and file owners, among other properties.
Looking at their functionality, there are five types of file systems [Ref 14]:
- Disk file systems: designed for storage devices such as hard drives, USB pen drives and similar block devices.
- Database file systems: oriented to working with databases; files are identified by their attributes, like type or author.
- Transactional file systems: log events or changes. This feature lets a system recover data after a failure.
- Network file systems: dedicated to working over the network, providing access to files on a server.
- Special purpose file systems: any file system not mentioned above, like virtual file systems.
In this work the focus will be on disk file systems, network file systems and special purpose file systems.
Each OS has its own file system for storage devices (its disk file system). Linux supports a big variety of file systems, some of which are proprietary, such as the Mac OS HFS.
Linux and Windows are the target images of the OS deployment system; Linux uses, in most cases, ext2, ext3 or ReiserFS as its file system, while Windows uses FAT32 or NTFS.
At the beginning of Linux development, the Minix file system [Ref 15] was the default and the only file system supported. This file system had a maximum size limit of 64 megabytes and a file name length limit of 14 characters. The extended (ext) file system was created to overcome these limits, pushing the maximum size to 2 gigabytes and the file name length to 255 characters.
Chapter 4. System restoring improvements 25
The ext file system was based on inodes, as all UNIX file systems were. An inode is a data structure which stores basic information about a file, directory or any other file system object.
The second extended file system (ext2) was introduced in early 1993 to solve three important issues of the extended file system: there was no support for separate access permissions, inode modification or data modification timestamps [Ref 16].
The separate access permissions introduced the Unix security domains into Linux files; that is, the read/write/execute permissions for the user/group/other domains. Inode modification permits the file system to change the basic information about a regular file, directory or any other file system object. And in order to know when data has been changed, it is important for a file system to add data modification timestamps.
The next illustration shows the extended file system's internal inode structure, and how to find the data blocks of a single file.
The third version of the extended file system was released in November 2001 [Ref 17]. The ext3 file system adds, over its predecessor:
• Journaling: the addition of a journal to the disk file system. A file system with journaling reduces its chances of becoming corrupted after a power failure or a system crash.
• Tree-based directory indexes for directories spanning multiple blocks: in large directories this tree structure allows faster access to the files.
• Online file system growth: this functionality lets the file system change its size after it has been created.
FAT is a file system supported by almost all existing operating systems due to its simplicity, well-known structure and compatibility. Bill Gates and Marc McDonald created the FAT file system in 1977 for managing disks in Microsoft Disk BASIC [Ref 18].
FAT is the acronym for File Allocation Table. This table centralises the information about which areas belong to files and which areas are free or corrupted, and it also records where each file is located on the disk.
Disk space is allocated to files in contiguous groups of hardware sectors called
clusters.
The first version of FAT limits the cluster address to 12 bits. This limitation restricts the file system to 4096 clusters; with clusters of 4 blocks of 1 Kbyte, a total of 16 megabytes could be stored on a disk.
Due to the growth of hard disks, FAT16 was released, increasing the cluster address to 16 bits and supporting up to 2 gigabytes of disk space. Years later Microsoft created a new version with a 32-bit cluster address, increasing the capacity up to 2 terabytes. Microsoft limited the space allowed on FAT32 in its systems to 32 gigabytes, citing performance problems. FAT32 cannot contain a file larger than 4 gigabytes.
The FAT file system does not contain mechanisms to prevent newly written files from becoming scattered across the disk. FAT does not impose an internal structure, so files are given the first available location on the volume. The first free cluster number becomes the address of the first cluster used by the file. Each cluster then contains a pointer to the next cluster in the file, or an indication (0xFFFF) that this cluster is the end of the file.
This illustration shows three files. File1.txt is large enough to use three clusters. The second file, File2.txt, is a fragmented file that also requires three clusters. A small file, File3.txt, fits completely in one cluster. In each case, the folder structure points to the first cluster of the file.
The solution to disk fragmentation is to link all free clusters into one or more lists, as is done in Unix file systems. With FAT, instead, the file allocation table has to be scanned in order to find free clusters, which directly affects performance on large hard disks, where larger allocation tables are used.
This image shows how the disk is organised using the FAT file system: first of all the boot sector, then the two file allocation tables (the original and the backup), and then the data region until the end of the partition.
NTFS
Since Windows XP, Microsoft's operating systems use NTFS as the standard disk file system. Microsoft Windows is the most used operating system, so supporting its file system was mandatory in our project.
The information about how NTFS works is not very clear because of Microsoft's closed-source policy. NTFS is quite similar to FAT in that it also has a table, but instead of a plain table it is closer to a database, making for a more complex and faster file system [Ref 19].
Figure 4.4 shows how NTFS is structured. The NTFS boot sector contains sufficient information for the system to boot. The Master File Table is the new file allocation table designed by Microsoft in order to prevent fragmentation problems. The File System Data is described on the Microsoft TechNet web as the place that stores the data not contained in the Master File Table. To improve the robustness of the file system, a copy of the Master File Table is placed in the NTFS structure.
We found hardly any information available about the Master File Table structure. Microsoft's closed-source policy is the biggest problem in the development of the NTFS driver; extensive testing and analysis was the only way to understand the internal structure and create the driver.
In contrast to FAT, which worked with a simple table, NTFS works with a very complex table called the Master File Table. This table controls everything within the file system using a relational-database-like structure, which makes the development of an NTFS driver a hard task. The new NTFS structure solves the fragmentation problems FAT had.
Nowadays there are two NTFS drivers: one in kernel space, which can only read the NTFS file system, and a user-space driver, which can read and write NTFS without problems.
The development of a Linux driver that handles this database was complex; as most developers know, developing software for the kernel is quite difficult. Due to the importance of these drivers to the Linux community, the NTFS driver developers decided to make their work easier by creating a user-space driver with a new technology called FUSE. In any case, a complete NTFS kernel driver is just a question of time.
Chapter 4. System restoring improvements 29
Kernel space is reserved for running the kernel, device drivers and kernel
extensions.
User space is the region of memory where end users run their
applications.
An application is divided into processes, each of which requires some of the
user-space memory. By enforcing this memory separation, the operating system
ensures that the kernel always has enough memory to run and cannot be
corrupted by user applications.
The image above shows how user space and kernel space are divided. User
space holds only user applications and libraries, while kernel space handles
device drivers and file systems.
4.4.3 VFS
A virtual file system is an abstraction layer over a real file system. It
makes it possible to build a file system around a concept that remains
abstract to the kernel.
Using VFS it is possible, for example, to create a file system that reads the
metadata of your MP3 files and sorts them into virtual folders. The file
system would automatically create two folders, artist and album, and
depending on which folder is opened it would show the MP3 files sorted one
way or the other, while all the information remains in one place: the
original folder.
The user can still access the original folder and browse the MP3 files unsorted.
VFS acts as a bridge between file systems and open-source operating systems,
providing uniform access regardless of which file system is actually being
written to.
The virtual file system establishes a contract between the kernel and the
real file system. This contract simplifies adding support for new file
systems to the kernel: implementing and fulfilling the contract is enough to
make a new file system compatible. The contract is a set of rules that tells
the kernel how to interact with the real file system and how to present it to
the end user.
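As a purely illustrative sketch, the contract idea can be pictured as an abstract interface that any concrete file system must implement. Note that the real Linux VFS contract is a C interface of function pointers (for example struct file_operations), not Python; the class names and toy in-memory implementation below are inventions for illustration only.

```python
from abc import ABC, abstractmethod

class FileSystemContract(ABC):
    """Conceptual stand-in for the VFS contract: any file system that
    implements these operations can be driven uniformly by the kernel."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def read(self, path, size, offset): ...

    @abstractmethod
    def write(self, path, data, offset): ...

class InMemoryFS(FileSystemContract):
    """Toy file system that fulfils the contract using a plain dict."""

    def __init__(self):
        self.files = {}

    def open(self, path):
        # Create the file on first open, like O_CREAT.
        self.files.setdefault(path, b"")

    def write(self, path, data, offset):
        buf = bytearray(self.files.get(path, b""))
        buf[offset:offset + len(data)] = data
        self.files[path] = bytes(buf)

    def read(self, path, size, offset):
        return self.files.get(path, b"")[offset:offset + size]
```

Any object honouring this interface could be mounted and used the same way, which is exactly the property the kernel exploits.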
Once the virtual file system concept is understood, it is much easier to
approach the concept of a file system in user space, also known as FUSE.
4.4.4 FUSE
FUSE is a UNIX/Linux kernel module that lets any user create file system
drivers in user space; in this case, the FUSE kernel module provides the
contract [Ref 21].
As explained above, file systems in user space are very useful because they
act as a bridge between the kernel and the file systems.
The image shows how FUSE works with the NTFS driver. The disk device
/dev/hda1 contains an NTFS file system and is mounted on /mnt/ntfs. When a
user wants to list a folder, the NTFS driver, which contains all the NTFS
logic (how to handle the MFT, how to write to the disk, how to read a
file...), uses the support library libfuse to implement basic calls such as
open, read or write. These calls pass through the fuse kernel module, which
bridges to the VFS layer and, in this example, allows the disk to be read.
User-space drivers are very convenient for NTFS driver developers: using
FUSE they do not need to deal with kernel internals and can write the NTFS
logic in any programming language that implements libfuse.
FUSE speeds up driver development; on the other hand, it increases CPU
consumption because of the extra layers added to the file system stack.
The FUSE driver has two parts: the FUSE module, which resides in the kernel,
and the FUSE file system driver, which runs in user space. The FUSE module
must be compiled into the kernel in order to use it.
NTFS-3G is the FUSE file system driver that makes it possible to read from
and write to an NTFS partition or disk [Ref 22]. It still has some
limitations: the driver can copy neither access control lists nor
permissions. Windows desktops are not access-control-list aware, so this
issue is only a problem on servers. Since our project is designed for normal
desktop PCs, the access control list limitation does not affect it.
The combination of these three technologies (VFS, FUSE and NTFS-3G) is used
in the OS deployment system to synchronise NTFS partitions at the file system
layer.
Once the NTFS-3G driver is installed, some utilities are required in order to
administer NTFS partitions. These are found in the ntfs-utils package.
As we will see later in this chapter, these utilities are used by PartImage
to clone and resize NTFS partitions, so installing this package is very
important.
4.5 RSYNC
Time is precious; therefore, saving the users' time is a key feature of the
OS deployment system. To reduce the time spent in a data transfer, one may
either increase the bandwidth or reduce the amount of data sent.
An OS deployment system only controls the data transfer, so only the second
method can be applied. A personal computer usually breaks down because of a
change in its operating system. In the previous version of the OS deployment
system, any change to the OS meant downloading the whole image, which took
about 20 minutes of the user's time.
Rsync is open-source software that synchronises files and directories between
computers while minimising data transfer by using delta encoding when
appropriate.
The client splits its copy of the data into fixed-size chunks and, for each
chunk, computes an MD4 hash and a weaker 'rolling checksum'. The rolling
checksums are sent to the server; only the chunks whose checksums differ will
be transferred.
The server calculates the rolling checksum for every chunk of the files to be
synchronised (this checksum is based on the one used in zlib) and compares it
with the ones received. If a rolling checksum matches, the server then
calculates the chunk's MD4 checksum and verifies whether it matches the MD4
sent by the client.
Once both checksums have been checked, the server sends only the chunks that
failed one of the two comparisons, together with instructions on how to merge
the new blocks into the outdated version of the file, producing identical
files on both sides.
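A minimal sketch of this two-level checksum idea follows. It is deliberately simplified: it compares aligned chunks only (real rsync slides the weak checksum over every byte offset to detect moved data), uses a tiny chunk size, and substitutes MD5 for rsync's MD4, since MD5 is always available in Python's standard library.

```python
import hashlib

CHUNK = 4  # tiny chunk size for illustration; rsync uses hundreds of bytes

def weak_sum(block):
    # Adler-style weak checksum: cheap to compute and, in real rsync,
    # cheap to update as the window slides one byte at a time.
    a = sum(block) % 65536
    b = sum((len(block) - i) * c for i, c in enumerate(block)) % 65536
    return (b << 16) | a

def strong_sum(block):
    # rsync uses MD4; MD5 stands in here as it is universally available.
    return hashlib.md5(block).hexdigest()

def changed_chunks(old, new, chunk=CHUNK):
    """Return indices of chunks whose checksums differ between versions.
    Only these chunks would need to be sent over the network."""
    diffs = []
    for i in range(0, max(len(old), len(new)), chunk):
        o, n = old[i:i + chunk], new[i:i + chunk]
        # Cheap weak check first; confirm a match with the strong hash.
        if weak_sum(o) != weak_sum(n) or strong_sum(o) != strong_sum(n):
            diffs.append(i // chunk)
    return diffs
```

For two nearly identical files, only the chunks that actually changed are flagged, which is the property that makes fast restoration possible.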
In most cases the image differences are small, and when differences are small
the OS deployment software does not need to send much data to restore the
initial image, saving the end user a lot of time.
Rsync runs on the client, and on the server as a daemon. Big files take more
time to synchronise because more checksums and comparisons are needed, and
big changes increase the time further because more chunks have to be sent.
This mechanism gives the OS deployment system the capability to restore only
the changes: a fast restoration. The rsync daemon runs on the server, while
on the client the following parameters are used in the fast restoration script:
• -r: recursive synchronisation.
• --delete: removes from the client any files that are not present in the
original.
• server IP: the IP address of the rsync daemon.
• synchronising name: an alias for the server disk folder the administrator
wants to restore on the client.
• destination folder: the client folder where the data will be restored.
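Putting those parameters together, the fast restoration call can be sketched as below. The server address, module name and destination folder are hypothetical placeholders, not the actual values used in the project's scripts.

```python
import subprocess

SERVER_IP = "192.168.1.100"   # hypothetical rsync daemon address
MODULE = "winimage"           # hypothetical synchronising name (rsync module)
DEST = "/mnt/windows"         # hypothetical client destination folder

# -r recurses into directories; --delete removes client files absent from
# the server copy; "host::module" is rsync's daemon-mode syntax.
cmd = ["rsync", "-r", "--delete", f"{SERVER_IP}::{MODULE}", DEST]
# subprocess.run(cmd, check=True)  # left commented: needs a live daemon
```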
Although full image creation and restoration worked fine in the previous
version of the OS deployment system, we found a tool with many interesting
features to improve them.
Even though this was not an initial objective of the project, we have
included it here. The new software providing complete image restoration is
called PartImage.
While searching for a better way to handle image restoration and the image
server, we found PartImage [Ref 15]. PartImage is a utility that provides:
• File system layer image creation
• File system layer image restoration
• Image compression
• Partition images
• Graphic user interface to manage the server & the client
• Command line interface
• Secured connection to transfer the images via SSL
• User authentication
• NTFS support
Creating a partition image with PartImage is quite easy thanks to its simple
user interface. Two things must be specified:
• The partition to save, given as the appropriate device file in the /dev
directory.
• The filename where the selected partition will be saved.
Once a partition is selected, the application lets the user choose the
compression level. Gzip compression produces a smaller image, and bzip2 an
even smaller one.
The compression level directly affects image creation and restoration times.
More compression reduces the image file size, and thus the time needed to
send the image, but increases the time needed to copy it to disk because of
decompression.
With no compression, the image file is as big as the selected partition;
creating and restoring the image is very fast, but the transfer
is slow.
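This size-versus-time trade-off can be observed with Python's standard compression modules. The snippet below is illustrative only; the input data and the resulting figures are not measurements from the project.

```python
import bz2
import gzip
import time

# Highly repetitive sample data, standing in for a disk image.
data = b"The quick brown fox jumps over the lazy dog. " * 2000

sizes, times = {}, {}
for name, compress in [
    ("gzip-1", lambda d: gzip.compress(d, compresslevel=1)),  # fast, bigger
    ("gzip-9", lambda d: gzip.compress(d, compresslevel=9)),  # slow, smaller
    ("bzip2", bz2.compress),
]:
    start = time.perf_counter()
    out = compress(data)
    times[name] = time.perf_counter() - start
    sizes[name] = len(out)

# Higher compression yields a smaller file to transfer, at CPU cost.
assert sizes["gzip-9"] <= sizes["gzip-1"] < len(data)
```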
This trade-off between space and speed affects the efficiency of the
application. During the test phase, over a direct wired connection, no
compression was the fastest method and the one that best competes with
commercial applications.
The next image shows the graphical user interface provided by PartImage on
the client side. The user may choose among the partitions on the disk, the
file name and the action to perform. Notice that images can be created and
stored directly on the server.
PartImage offers two ways to restore an image: through the graphical user
interface or via the command line.
The graphical user interface is the same as the one shown above for image
creation; the user just needs to tick the restore-partition box and fill in
the image name and the server IP. In the OS deployment system the end user
should know neither the name of the image nor the server IP, so the graphical
interface is not suitable for the project; instead, the command line makes it
possible to automate the connection with a script.
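Such an automated restore might be scripted as follows. The device, image name and server address are hypothetical placeholders, and the -s (server) and -b (batch mode) flags are assumptions based on PartImage's usage text, not values taken from the project's own scripts.

```python
import subprocess

SERVER_IP = "192.168.1.100"   # hypothetical image server address
DEVICE = "/dev/hda1"          # hypothetical partition to restore
IMAGE = "windows.img"         # hypothetical image name on the server

# -s points at the PartImage server, -b requests non-interactive batch
# mode; both flags are assumed here rather than verified in this project.
cmd = ["partimage", "-s", SERVER_IP, "-b", "restore", DEVICE, IMAGE]
# subprocess.run(cmd, check=True)  # left commented: needs a running server
```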
The PartImage server stores the image files. In order to run the PartImage
server software, the administrator must create a 'partimag' user; the server
runs under this UID.
The server can require a username and a password; this option is enabled at
compile time by default. As said before, the user does not interact directly
with PartImage, so in our case the software must be compiled without the
authentication option.
As the next image shows, a graphical user interface can be used to monitor
connections to the server. This user interface shows:
• The number of clients connected to the server
• The state of each client: saving, restoring or waiting
• The IP address of each client
• The location of the operating system image
Chapter 5 Conclusions
The project's initial idea was to create a system to compete with REMBO
using open-source technologies. Part of this objective was accomplished in a
previous version of the project. To keep improving on the initial goals,
new objectives were added and have been fulfilled within this degree thesis.
The next sections list the three objectives the project has fulfilled and
review the technologies used:
Using the new web interface, it is easy for the user to choose the desired
option. Besides, it provides a convenient development platform for creating
new web designs.
The inclusion of the NTFS drivers has opened new ways of restoring, greatly
improving the restoration process. Now rsync can be used with all the
available operating systems.
Moreover, the project has achieved an important objective that was not
previously planned:
Besides the three initial objectives, this new feature has been added as
well. No more disks broke down during the tests, thanks to PartImage and its
reliable complete file system restoration.
In summary, the project has achieved all its objectives and has added new
improvements to the initial design.
With the OS deployment system, the number of CDs and DVDs wasted on this
operation is significantly reduced. Fewer CDs and DVDs mean less
environmental impact.
References
[Ref 2] Grub:
http://www.gnu.org/software/grub/
[Ref 3] PXE:
http://download.intel.com/design/archives/wfm/downloads/pxespec.pdf
[Ref 5] Ncurses:
http://en.wikipedia.org/wiki/Ncurses
[Ref 6] Bios:
http://en.wikipedia.org/wiki/BIOS
[Ref 8] Lilo:
http://www.acm.uiuc.edu/workshops/linux_install/lilo.html
[Ref 9] Kexec:
http://www.xmission.com/~ebiederm/files/kexec/README
Annex
Script SyncW
Script SyncL
Script winrestore
Script linrestore
Script windowsboot
Script linuxboot
WWWEXEC.SH
EOF
( $torun )
exit