Comp F
# What is Slack Space?
Slack space is a form of internal fragmentation: wasted space on a hard drive that is not fully used by the currently allocated file and which may contain data from a previously deleted file.
Linux offers several advantages for forensic work:
- Support for numerous file system types (many of which are not recognized by Windows)
- Ability to mount a file, such as a disk image
- Ability to analyze a live system in a safe and minimally invasive manner (no hardware or software write blocker needed)
- Ability to redirect standard output to input (multiple commands on one line)
- Ability to review the source code for most utilities
- Ability to create bootable media
- Linux is free, as is its source code
- Tools are mostly free or cost-efficient
Linux Kernel and the Forensic Acquisition of Hard Discs with an Odd Number of Sectors
You will probably never encounter this issue today, but older Linux kernels were unable to see the last sector of a hard drive with an odd number of sectors. This was resolved in kernel 2.6: systems running version 2.6 or later of the Linux kernel can completely and forensically acquire disks or partitions with an odd number of sectors.
# DD
dd is a powerful tool for making exact copies of entire drives or of individual partitions on those drives. Command for cloning with dd:

# dd if=/dev/sda of=/tmp/forensic

if: input file; of: output file
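As a sketch of this workflow (the paths below are illustrative; a real acquisition would read from a device node such as /dev/sda and would usually add conv=noerror,sync so bad sectors do not abort the copy), cloning should always be followed by a hash comparison:

```shell
# Stand-in "drive": a plain file; in a real case the input would be a
# device node such as /dev/sda (illustrative path only).
printf 'evidence data' > /tmp/suspect.img

# Clone with dd; on a real device, add conv=noerror,sync so read errors
# do not abort the acquisition.
dd if=/tmp/suspect.img of=/tmp/forensic.dd bs=4096 2>/dev/null

# The image is only forensically sound if source and copy hash the same.
md5sum /tmp/suspect.img /tmp/forensic.dd
```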
# FDISK
fdisk is commonly used to partition hard drives; in digital forensics it can be used to carve partition information out of a device.

# fdisk -ul /path/of/cloned/device
# MD5SUM
md5sum is usually used for calculating and generating hashes in order to verify the integrity of a file or device. The command for md5sum:

# md5sum file_or_device
# XXD
xxd produces a hex dump, which makes it easy to read the byte offset of data within a file or image. The command is:

# xxd clone_image
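As an illustration of reading offsets from a clone (the image path and its contents are fabricated for this sketch), grep -b can locate a file signature and xxd can then dump from that byte offset:

```shell
# Fabricated stand-in image with a PDF signature 8 bytes in.
printf 'xxxxxxxx%%PDF-1.4 (demo contents)' > /tmp/clone.img

# grep -abo prints byte_offset:match for the "%PDF" magic.
off=$(grep -abo '%PDF' /tmp/clone.img | cut -d: -f1)
echo "PDF header at byte offset $off"

# Seek xxd straight to that offset and dump 16 bytes.
xxd -s "$off" -l 16 /tmp/clone.img
```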
The Linux file system components can be broadly divided into two categories: user space and kernel space.
One of the main user space components is the set of applications accessed through the user interface. For file system calls, such as writing, reading, opening, and closing, Linux has another user space component, the GNU C Library (glibc), which provides the user interface for such actions. The system call interface acts as the middleman that matches system calls from user space to the right receivers in kernel space. As the individual file systems in kernel space may have different behavioural rules from each other, there is an overlying primary interface that accommodates these differences. This component is called the Virtual File System (VFS), and its main function is to export a set of interfaces which are then abstracted to the individual file systems. The inode cache and the directory cache, which provide a collection of recently used file system objects, are involved in this process. Down one level are the various individual file systems. They pass requests to the device drivers through a buffer cache that keeps least-recently-used (LRU) lists, allowing for faster access.
High-level Architecture
In Linux, the file system is made up of four components: the boot block, superblock, inode block, and data block. The smallest disk allocation unit in the file system is known as a block, which ranges from 512 bytes upward.
# BOOTBLOCK
The boot block contains the bootstrap code, i.e. the instructions for startup. In a Linux-based computer, there is only one boot block, located on the main hard disk.

# SUPERBLOCK
The superblock is a structure that represents a file system, containing vital information about the system, and is considered part of the metadata; it is a very "high-level" metadata structure for the file system. It records disk geometry, available space, and the location of the first inode, and it keeps track of all inodes.

# INODE
An inode (index node) is a data structure found in many Unix file systems in which each inode stores all the information about a file system object (file, device node, socket, pipe, etc.) except its data content, file name, and location in the file system.

# DATABLOCK
The data block is where files and directories are stored on a disk drive. This location is linked precisely to inodes. A data block is similar to a cluster of disk sectors on a FAT or NTFS volume. Blocks range between 1024 and 4096 bytes each on a Linux volume.
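On a live Linux system, the inode concept can be observed directly with stat; the file below is an arbitrary example created just for the demonstration:

```shell
# Create an arbitrary file and ask stat for its inode metadata.
touch /tmp/example.txt
# %i = inode number, %b = 512-byte blocks allocated, %s = size in bytes
stat -c 'inode=%i blocks=%b size=%s' /tmp/example.txt
```

Every file on the volume, however it is named in directories, ultimately resolves to one such inode number.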
File System Foundations
The analysis of Linux file systems involves a number of important concepts, which will now be discussed in the context of digital forensics.
A. Actual Data vs. Meta-Data
The general purpose of a file system is to effortlessly store and retrieve data on secondary storage devices like hard disks or USB devices such as memory "sticks". To achieve this objective, file systems have to store not just the actual data, but also management data such as defect lists, allocation tables, and file and directory names. This management data is often referred to as meta-data.
The software that implements a file system is usually part of the operating system. A set of algorithms is followed to create, update or delete meta-data in correspondence to creation, modification or deletion of actual data.
B. Forensic Software
The high-level process of digital forensics proceeds through the following steps: data acquisition from the source, data analysis and the extraction of evidence, preservation, and finally the presentation of the evidence. Depending on the type of source, digital forensics may comprise media analysis, network packet analysis, code analysis, etc.
In this example, we will primarily focus on the data analysis, extraction, and presentation of evidence performed by forensic media analysis software. Given the distinction between actual data and meta-data, media analysis tasks can be structured into the categories listed below.
- Examine actual data
  - Examine allocated data for specific content such as keywords
  - Examine partly used allocated data blocks for hidden data in the "slack" space
  - Examine unallocated data blocks for the content of deleted files and folders
- Examine meta-data
  - List allocated meta-data in formats particular to forensic analysis, such as the MAC time-stamps of each file in the file system
  - Examine unallocated meta-data for the purpose of reconstructing past events or recovering deleted files
C. Linux file system and disk layout
Meta-data resides at the beginning of the disk storage space. The superblock at the very beginning contains general file system state information such as its size, the location of free unallocated space, and other details. It is followed by the inode list, which holds per-file meta-data such as serial numbers, timestamps, access control information, and disk block allocation lists.
There are some performance limitations in this approach, such as the long disk head movements required between the meta-data zone and the actual data zone on disk. From a forensic point of view, however, this disk layout appears misleading at first sight because it seems to offer a straightforward separation between meta-data and actual data.
An in-depth look at the inode structure discloses that this separation is not entirely straightforward, given the concept of indirect, double indirect, and even triple indirect disk block addresses.
The inode data structure illustrates the concept of indirect block addresses, where data blocks are used to store lists of block addresses which, depending on the level of indirection, lead either to blocks of actual data or again to blocks of disk block addresses. This approach is required to keep both the inode size and the disk block size small, while still allowing the creation of files larger than the number of block pointers in the inode times the disk block size.
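A back-of-the-envelope calculation shows why indirection matters; the numbers below assume 4096-byte blocks, 4-byte block addresses, and the classic 12 direct pointers of Ext2/3:

```shell
bs=4096                       # assumed block size in bytes
ptrs=$((bs / 4))              # 4-byte addresses -> 1024 pointers per block
direct=12                     # direct pointers held in the inode itself

# The 12 direct pointers alone cap a file at 49152 bytes (48 KiB)...
echo $((direct * bs))

# ...while single, double and triple indirection lift the cap to ~4 TiB:
max_blocks=$((direct + ptrs + ptrs*ptrs + ptrs*ptrs*ptrs))
echo $((max_blocks * bs))
```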
Data Recovery in Linux
When a file gets deleted, it goes to the Trash/Recycle Bin, provided that the file is not larger than the bin's capacity. Files can be restored easily even after the Recycle Bin has been emptied. And even after permanently deleting the file using Shift+Delete or an rm command at the command line, when you may think the file is irretrievable, it isn't: only a pointer to the file is marked as deleted, not the data, which still remains on the disk.

The file deletion and tracking mechanism for available storage space works as follows: a "table of contents" marks the location in which each file is stored on the drive. Upon deletion of any particular file, its individual entry is removed from the "table of contents", freeing up "available" space to store other data. When a new file is written to the disk, it uses this "available" space, replacing the previous data with the new data. Recovery of the old file becomes very difficult under such circumstances. Thus, when you notice that you might have accidentally lost data that you want back, you should avoid writing any new data to the medium. It is advisable not to continue running the OS that is using that partition or medium in read-write mode; even a browser caching files could cause the space of the deleted file to be overwritten.
File Deletion Process: Ext2 vs. Ext3
We will review how Ext2 deletes a file and compare it to the Ext3 procedure. On an Ext2 file system, deletion of a file by the OS can be summarized as marking the directory entry, inode, and data blocks that make up the file as unallocated. This marking appears in the block and inode bitmaps of each block group. Taking the example of a file called "ST2602.pdf" with a size of 562378 bytes, its structure is similar to the diagram below.
When the file "ST2602.pdf" is deleted on Ext2, the inode and content blocks are marked as unallocated so that the OS can reuse them when needed, while all the information is still kept in place, as shown in the next diagram.
Lab Setup
For the following demonstration, we will be using multiple file systems created on a temporary server running Ubuntu 12.04.
- /dev/sda1 will be an Ext2 FS called "storage", which will be used to store images of the other file systems
- /dev/sda7 will be an Ext3 FS that has its own journal, mounted with default options and named "ext3default"
We will be copying several PDF files, opening them, and then deleting some after mounting the file systems with their default options, and finally unmounting them. The images were then created with dd, one of the common tools we introduced earlier on. The images were stored on /media/storage with the notation <dev>img.dd; in this case, sda6img.dd will be the forensic image of /dev/sda6.
We will be using the TSK (The Sleuth Kit) tools as part of this forensic investigation. As we are on an Ubuntu distribution of Linux, we install TSK with the following command:
# apt-get install sleuthkit
Let's continue by seeing what really happens on an Ext3 file system. After having deleted some files and imaged the entire file system on sda6, the process to find and restore a file is repeated:
[root@st2602]# ils -r sda6img.dd
class|host|device|start_time
ils|st2602||1194252024
st_ino|st_alloc|st_uid|st_gid|st_mtime|st_atime|st_ctime|st_mode|st_nlink|st_size|st_block0|st_block1
144014|f|0|0|1194231994|1194231994|1194231994|40755|0|0|0|0
144015|f|0|0|1194231994|1194231982|1194231994|100644|0|0|0|0
[REMOVED]
[root@st2602]# istat sda6img.dd 144015
inode: 144015
Not Allocated
Group: 9
Generation Id: 3670456940
uid / gid: 0 / 0
mode: -rw-r--r--
size: 0
num of links: 0
Extended Attributes (Block: 295595)
security.selinux=root:object_r:file_t:s0
Inode Times:
Accessed:      Sun Nov 16 21:06:22 2012
File Modified: Sun Nov 16 21:06:34 2012
Deleted:       Sun Nov 16 21:06:34 2012
Direct Blocks:
[root@st2602]#
I have deliberately highlighted the above to show that the links to the data blocks and the size of the file have been zeroed out. This is why recovering a deleted file from Ext3 could be considered an almost impossible task, but we will show a method that a forensic investigator could use to recover and restore a file under Ext3.
[root@st2602]# ils -r sda6img.dd
class|host|device|start_time
ils|st2602||1194252024
st_ino|st_alloc|st_uid|st_gid|st_mtime|st_atime|st_ctime|st_mode|st_nlink|st_size|st_block0|st_block1
144014|f|0|0|1194231994|1194231994|1194231994|40755|0|0|0|0
144015|f|0|0|1194231994|1194231982|1194231994|100644|0|0|0|0
144016|f|0|0|1194231994|1182970801|1194231994|100644|0|0|0|0
144017|f|0|0|1194231994|1182970801|1194231994|100644|0|0|0|0
144018|f|0|0|1194231994|1182970801|1194231994|100644|0|0|0|0
144019|f|0|0|1194231994|1182970801|1194231994|100644|0|0|0|0
144020|f|0|0|1194231994|1194231982|1194231994|100644|0|0|0|0
[root@st2602]# istat sda6img.dd 144014
inode: 144014
Not Allocated
Group: 9
Generation Id: 3670456939
uid / gid: 0 / 0
mode: drwxr-xr-x
size: 0
num of links: 0
Extended Attributes (Block: 295595)
security.selinux=root:object_r:file_t:s0
Inode Times:
Accessed:      Sun Nov 16 21:06:34 2012
File Modified: Sun Nov 16 21:06:34 2012
Deleted:       Sun Nov 16 21:06:34 2012
Direct Blocks:

[root@st2602]# istat sda6img.dd 144015
inode: 144015
Not Allocated
Group: 9
Generation Id: 3670456940
uid / gid: 0 / 0
mode: -rw-r--r--
size: 0
num of links: 0
Extended Attributes (Block: 295595)
security.selinux=root:object_r:file_t:s0
Inode Times:
Accessed:      Sun Nov 16 21:06:22 2012
File Modified: Sun Nov 16 21:06:34 2012
Deleted:       Sun Nov 16 21:06:34 2012
Direct Blocks:
We can see from here that inode 144014 was linked to a directory while 144015 contained a file, but their block pointers and stats were lost. Inode 144015 also belongs to block group 9 and is now marked as unallocated as a result of the deletion process.
[root@st2602]# fsstat sda6img.dd | grep -A 10 -i "group: 9"
Group: 9:
  Inode Range: 144001 - 160000
  Block Range: 294912 - 327679
  Layout:
    Super Block: 294912 - 294912
    Group Descriptor Table: 294913 - 294913
    Data bitmap: 295093 - 295093
    Inode bitmap: 295094 - 295094
    Inode Table: 295095 - 295594
    Data Blocks: 295595 - 327679
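Assuming a 4096-byte block and the typical 128-byte Ext2/3 inode size (verified later with dumpe2fs), the arithmetic for locating inode 144015 inside this group can be sketched as:

```shell
block_size=4096
inode_size=128                          # typical Ext2/3 inode size
per_block=$((block_size / inode_size))  # 32 inodes per inode-table block

inode=144015
first=144001                            # first inode of group 9 (fsstat)
index=$((inode - first))                # 0-based position: 14 -> 15th entry

# Which inode-table block holds it, and how many inodes to skip inside:
echo "table block offset: $((index / per_block))"  # 0 -> block 295095
echo "inodes to skip:     $((index % per_block))"  # 14, as used in skip=14
```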
Looking closer at the information in group 9 reveals that there are 16000 inodes with an inode table spanning 500 blocks: the inode range is 144001 - 160000, and the inode table occupies blocks 295095 - 295594. Dividing 16000 by 500, we get 32, which is the number of inodes per block. Inode 144015 is therefore the 15th entry in the table, with its content found in the inode table's first block. Recall that the journal works only at the block level, so we are looking for block 295095, the first block of the inode table. The output from jls shows where block 295095 has been referenced; if a particular block is referenced multiple times, we may have to analyze each occurrence one by one. To determine the chronological order of multiple references, we check the sequence number of the transaction: the higher that number, the newer the inode copy.
[root@st2602]# jls sda6img.dd
JBlk  Description
0:    Superblock (seq: 0)
1:    Allocated Descriptor Block (seq: 2)
2:    Allocated FS Block 183
3:    Allocated Commit Block (seq: 2)
4:    Allocated Descriptor Block (seq: 3)
5:    Allocated FS Block 295094
6:    Allocated FS Block 1
7:    Allocated FS Block 295095
8:    Allocated FS Block 295093
9:    Allocated FS Block 295595
Block 7 of the journal contains an operation belonging to the inode table of group 9, as shown in the above output. We can search for a copy of inode 144015 in the journal because it records copies of modified metadata. As mentioned earlier, we are looking for the 15th entry of the inode table within block group 9, and using jcat together with dd and xxd will help us extract inode 144015 from the journal. However, we first need the inode size in order to do that. On Ext2/3 file systems, the inode size is usually 128 bytes. For verification's sake, we can run fsstat or dumpe2fs on the file system's image. Here is an example of dumpe2fs being used.
[root@Akula1 workbench]# dumpe2fs sda6img.dd|grep -i "inode size" dumpe2fs 1.42.4 (12-June-2012) Inode size: 128
Now we execute the command below:

jcat sda6img.dd 8 7 | dd bs=128 skip=14 count=1 | xxd

jcat takes three parameters: the first is the image file, the second is the inode where the journal begins (inode 8), and the third is the journal block we want (block 7). We use dd to carve one inode out of block 7: since we need the 15th inode of that block, 14 records are skipped (skip=14), with the inode size of 128 bytes as the record size (bs=128). Piping the output from jcat and dd to xxd, we get the hex dump shown below:
[root@st2602# jcat sda6img.dd 8 7 | dd bs=128 skip=14 count=1 | xxd 1+0 records in 1+0 records out 128 bytes (128 B) copied, 0.00402034 seconds, 31.8 kB/s 0000000: a481 0000 ca94 0800 b1b3 8246 6c13 2e47 ...........Fl..G 0000010: dd96 8246 0000 0000 0000 0100 6004 0000 ...F........`... 0000020: 0000 0000 0000 0000 9abb 0400 9bbb 0400 ................ 0000030: 9cbb 0400 9dbb 0400 9ebb 0400 9fbb 0400 ................ 0000040: a0bb 0400 a1bb 0400 a2bb 0400 a3bb 0400 ................ 0000050: a4bb 0400 a5bb 0400 a6bb 0400 0000 0000 ................ 0000060: 0000 0000 6cba c6da ab82 0400 0000 0000 ....l........... 0000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
The above is a copy of what 144015 used to be, so interpretation is required. We aim to find block pointers in order to get the structure of the inode, as seen in the table below:
Looking at the above, we know that the size of the file linked to inode 144015 was 562378 bytes (0x0894ca):
0000000: a481 0000 ca94 0800 b1b3 8246 6c13 2e47 ...........Fl..G
0000 0000 0000 0000 9abb 0400 9bbb 0400 9cbb 0400 9dbb 0400 9ebb 0400 9fbb 0400 a0bb 0400 a1bb 0400 a2bb 0400 a3bb 0400 a4bb 0400 a5bb 0400 a6bb 0400 0000 0000
0000050: a4bb 0400 a5bb 0400 a6bb 0400 0000 0000 ................
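Assuming the 4096-byte block size, the figures quoted in this walkthrough can be double-checked with a little shell arithmetic (the size field bytes ca 94 08 00 are stored little-endian):

```shell
# Size bytes ca 94 08 00, read little-endian, give 0x000894ca:
filesize=$(printf '%d' 0x0894ca)
echo "$filesize"                      # the 562378-byte total file size

bs=4096
direct_bytes=$((12 * bs))             # 12 direct pointers cover 49152 bytes
indirect_blocks=$(( (filesize - direct_bytes + bs - 1) / bs ))
echo "$indirect_blocks more blocks reachable only via the indirect block"
```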
No double or triple indirect block pointers can be found. Right now we only have 12 data blocks, which is 49152 bytes out of the total file size of 562378 bytes. Block 310182 (0x4bba6) probably contains the pointers to the rest of the file. We will use dcat to check the content of block 310182:
[root@st2602]# dcat -h sda6img.dd 310182 0 a7bb0400 a8bb0400 a9bb0400 aabb0400 16 abbb0400 acbb0400 adbb0400 aebb0400 32 afbb0400 b0bb0400 b1bb0400 b2bb0400 48 b3bb0400 b4bb0400 b5bb0400 b6bb0400 64 b7bb0400 b8bb0400 b9bb0400 babb0400 80 bbbb0400 bcbb0400 bdbb0400 bebb0400 96 bfbb0400 c0bb0400 c1bb0400 c2bb0400 112 c3bb0400 c4bb0400 c5bb0400 c6bb0400 128 c7bb0400 c8bb0400 c9bb0400 cabb0400 144 cbbb0400 ccbb0400 cdbb0400 cebb0400 160 cfbb0400 d0bb0400 d1bb0400 d2bb0400 176 d3bb0400 d4bb0400 d5bb0400 d6bb0400 192 d7bb0400 d8bb0400 d9bb0400 dabb0400 208 dbbb0400 dcbb0400 ddbb0400 debb0400 224 dfbb0400 e0bb0400 e1bb0400 e2bb0400 240 e3bb0400 e4bb0400 e5bb0400 e6bb0400 256 e7bb0400 e8bb0400 e9bb0400 eabb0400 272 ebbb0400 ecbb0400 edbb0400 eebb0400 288 efbb0400 f0bb0400 f1bb0400 f2bb0400 304 f3bb0400 f4bb0400 f5bb0400 f6bb0400 320 f7bb0400 f8bb0400 f9bb0400 fabb0400 336 fbbb0400 fcbb0400 fdbb0400 febb0400 352 ffbb0400 00bc0400 01bc0400 02bc0400 368 03bc0400 04bc0400 05bc0400 06bc0400 384 07bc0400 08bc0400 09bc0400 0abc0400 400 0bbc0400 0cbc0400 0dbc0400 0ebc0400 416 0fbc0400 10bc0400 11bc0400 12bc0400 432 13bc0400 14bc0400 15bc0400 16bc0400 448 17bc0400 18bc0400 19bc0400 1abc0400 464 1bbc0400 1cbc0400 1dbc0400 1ebc0400 480 1fbc0400 20bc0400 21bc0400 22bc0400 496 23bc0400 24bc0400 00000000 00000000 512 00000000 00000000 00000000 00000000
Lastly, to recover what we presume to be the file, we can use dd to carve out data based on the inode copy in the journal, and finish the job either manually or using foremost. If the files are fragmented, one solution can be to carve out the blocks to a single file and execute foremost to find out what file type it is. In this case, the file is not fragmented:
[root@st2602]# dd bs=4096 skip=310168 count=141 if=sda6img.dd of=recover.dd
141+0 records in
141+0 records out
577536 bytes (578 kB) copied, 0.00310458 seconds, 186 MB/s
[root@Akula1 workbench]# foremost -b 4096 -o recovery -t pdf recover.dd
Processing: recover.dd
|*|
[root@st2602]# cd recovery
[root@st2602]# more *.txt
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus
Audit File
Foremost started at Fri Nov 16 19:59:13 2012
Invocation: foremost -b 4096 -o recovery -t pdf recover.dd
Output directory: /media/workbench/ext3default/recovery
Configuration file: /usr/local/etc/foremost.conf
------------------------------------------------------------------
File: recover.dd
Start: Fri Nov 16 19:59:13 2012
Length: 564 KB (577536 bytes)
Num  Name (bs=4096)  Size    File Offset  Comment
0:   00000002.pdf    553 KB  8192
Finish: Fri Nov 16 19:59:13 2012
1 FILES EXTRACTED
pdf:= 1
------------------------------------------------------------------
Foremost finished at Fri Nov 16 19:59:13 2012
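A small sanity check: the 141 carved blocks of 4096 bytes match the byte count dd reported:

```shell
# 141 carved file-system blocks of 4096 bytes each:
echo $((141 * 4096))   # 577536, matching dd's "577536 bytes copied"
```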
We carved out some blocks before and after the location where the information indicated the file resided, and foremost finished the job. Below is a graphical summary of this technique.
Conclusion
Knowledge of a file system's internals is essential for analysis without the use of specialized tools. Even in cases where such tools do not work, we can develop our own based on that knowledge alone; the external journal is an example of such a scenario. For the purpose of recovering forensic evidence, research has demonstrated that the Ext3 file system trumps Ext2 due to the presence of more available options, such as:
- Allowing file recovery through the journal's metadata copies
- Viewing file activity across time via debugfs
- External journal analysis (assuming there is access to the journal's internal structures)
However, there is a caveat: the journal has a tendency to rewrite itself from the very beginning when it runs out of space but additional changes still need to be recorded. Due to this cyclic nature, the journal possesses a very short lifespan, with a commonly fixed size of 128MB or a theoretical maximum size of 400MB (assuming the block size is 4096 bytes, with 102400 file system blocks). That being said, the journal is still considered relevant when scanning for recent activity.
References
Carrier, B. (2005). File System Forensic Analysis.
Ext2 and Ext3 Concepts and Analysis.
Nelson, B., Phillips, A., & Steuart, C. (2009). Computer Forensics and Investigations.
Anatomy of the Linux Kernel (IBM Linux Technical Library).
EC-Council Computer Hacking Forensic Investigator v8.
The Sleuth Kit (www.sleuthkit.org).

I have taken great efforts in this report; however, it would not have been possible without references to the above sources. I would like to extend my sincere thanks to all of them.
Reflections
The purpose of computer forensics is to recover and investigate digital evidence which can then be used to prosecute criminals in court. With the rise of computer crime and computer-related crime, the importance of computer forensics cannot be overstated. Data recovery is one of the many digital forensics techniques used to acquire the incriminating evidence needed to put away dangerous lawbreakers. Through this project, I have learnt that data recovery is a fundamental and important technique in computer forensics. One reason is the widespread use of personal computers, laptops, and digital data. With the proliferation of data being stored and transferred on digital media, be it pictures, videos, or documents, there is an increasing chance that relevant evidence can be discovered and used for litigation purposes. However, retrieving important data is rarely a walk in the park. Usually, criminals have the common sense to delete incriminating files. This is where data recovery comes in. Most criminals are unable to anticipate the use of sophisticated data recovery techniques to extract deleted file fragments from slack space and rebuild them. Even if they can conceive that files could be reconstructed, they may not have the capability to prevent law enforcers from retrieving and rebuilding files through data recovery techniques. The usual behaviour of criminals is simply to delete file data, and most common operating systems and file systems tend to leave the deleted data somewhere in the physical disk sectors. That is why file carving, or reconstructing deleted data by searching for known file headers within the disk image, is an extremely effective technique for most computer-related crime cases.