Module-3
Understanding Hard Disks and File Systems

Different Types of Disk Drives and their Characteristics,


Disk drives are essential components for data storage in modern computing systems. Over the years, various
types of disk drives have evolved, each with unique characteristics in terms of technology, performance,
capacity, and reliability. The most common types of disk drives are Hard Disk Drives (HDDs) and Solid-
State Drives (SSDs). Below is a detailed comparison of these types, along with their internal workings,
benefits, and limitations.

1. Hard Disk Drives (HDDs)


A Hard Disk Drive (HDD) is a mechanical, non-volatile data storage device that uses magnetic storage to
store and retrieve digital information. The data is stored on rapidly rotating platters coated with a magnetic
material, and data is read written using moving mechanical arms with read write heads. This technology has
been around for decades and is still commonly used for large-scale storage due to its cost-effectiveness.

Characteristics:
- Mechanical Components: HDDs contain moving parts such as spinning platters and read/write heads,
making them susceptible to wear and physical damage.
- Magnetic Storage: Data is recorded magnetically on the platters. The drive heads magnetize small areas on
the disk to represent data.
- Rotational Speed: Measured in Revolutions Per Minute (RPM), this factor significantly impacts the drive’s
performance. Common speeds include 5400 RPM and 7200 RPM for consumer drives, while high-end
enterprise models can reach up to 15,000 RPM.
- Capacity: HDDs generally offer higher storage capacities (up to 20TB and beyond) at a lower cost
compared to SSDs.
- Cost: HDDs are more affordable than SSDs on a per-gigabyte basis, making them ideal for users seeking
large storage at a lower cost.
- Durability: Due to their mechanical components, HDDs are more prone to failures and physical damage,
particularly from shocks or drops.

Performance Metrics:
1. Seek Time: Time taken for the read/write heads to position themselves over the correct track. Typical
values range between 8 to 12 milliseconds.
2. Rotational Latency: Time taken for the platter to rotate until the correct sector is under the read/write head. Average
latency is half the time of one full revolution.
3. Data Transfer Rate: Measured in megabytes per second (MB/s), this rate depends on platter RPM, platter
density, and the data's location on the disk.
4. Access Time: Sum of the seek time and rotational latency, representing the total time to begin reading or
writing data.
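
As a quick illustration of how these metrics combine, the following sketch (plain Python, with assumed example values rather than measurements of any particular drive) computes the average rotational latency and access time for a 7200 RPM disk:

```python
# Rough illustration of HDD timing metrics; the RPM and seek time are
# example values, not measurements of any specific drive.

def avg_rotational_latency_ms(rpm: float) -> float:
    """Average latency = half the time of one full revolution."""
    ms_per_revolution = 60_000.0 / rpm          # 60,000 ms per minute
    return ms_per_revolution / 2

def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time = seek time + average rotational latency."""
    return seek_ms + avg_rotational_latency_ms(rpm)

if __name__ == "__main__":
    rpm = 7200          # common consumer drive speed
    seek = 9.0          # assumed average seek time in ms (typical range 8-12 ms)
    print(f"Avg rotational latency: {avg_rotational_latency_ms(rpm):.2f} ms")  # ~4.17 ms
    print(f"Access time:            {access_time_ms(seek, rpm):.2f} ms")       # ~13.17 ms
```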

Data Arrangement:
- Tracks, Cylinders, and Sectors:
- Tracks: Concentric circles on each platter where data is stored.
- Cylinders: The collection of tracks on all platters at a given head position.
- Sectors: Subdivisions of tracks, each typically storing 512 bytes or, in newer HDDs, 4096 bytes (4KB
sectors).
- Zoned Bit Recording (ZBR): Outer tracks of the platter store more sectors than inner tracks, increasing
overall storage capacity.

Interfaces:
HDDs use various interfaces to connect to a computer:
- IDE/PATA: Older interface, slower data rates, and bulkier cables.
- SATA (Serial ATA): Most common interface, offering speeds up to 6 Gbps (SATA III).
- SCSI and SAS (Serial Attached SCSI): High-performance interfaces used in enterprise storage
environments. SAS supports more devices and faster speeds compared to traditional SCSI.

2. Solid-State Drives (SSDs)


A Solid-State Drive (SSD) is an advanced data storage device that uses NAND flash memory instead of
mechanical parts to store data. SSDs are non-volatile, meaning they retain data even when power is lost.
They offer significantly faster performance than HDDs and are more durable due to the lack of moving
parts.

Characteristics:
- No Moving Parts: SSDs consist entirely of electronic components, making them less prone to mechanical
failure.
- NAND Flash Memory: Most SSDs use NAND-based flash memory, which is non-volatile and stores data
in memory cells.
- Performance: SSDs are much faster than HDDs. Their speed advantage is particularly noticeable in tasks
like booting up the operating system, opening applications, and file transfers.
- Form Factors: SSDs come in various sizes, including 2.5-inch drives (common in laptops), M.2 form
factors (which attach directly to the motherboard), and PCIe-based SSDs.
- Durability: Since SSDs do not have moving parts, they are more durable and less susceptible to physical
damage caused by vibrations or shocks.
- Cost: SSDs are more expensive per gigabyte than HDDs, though the price gap has been narrowing over
time.

Performance Metrics:
1. Data Transfer Rate: SSDs are significantly faster than HDDs. Typical SSDs using the SATA interface
offer read/write speeds of 500–600 MB/s, while PCIe NVMe SSDs can achieve 3,500 MB/s or more.
2. Access Time: With no moving parts, SSDs offer near-instantaneous access times, often in the range of
microseconds.
3. Input/Output Operations per Second (IOPS): SSDs can handle far more IOPS than HDDs, making them
ideal for environments where quick access to data is critical.
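
The relationship between IOPS, I/O size, and throughput can be shown with a short calculation; the figures below are hypothetical examples, not vendor specifications:

```python
# Throughput = IOPS x I/O size. Example figures only, not vendor specs.

def throughput_mb_s(iops: int, io_size_kb: int) -> float:
    return iops * io_size_kb / 1024  # KB/s -> MB/s

# A hypothetical SATA SSD doing 4 KB random reads:
print(throughput_mb_s(90_000, 4))    # ~351.6 MB/s
# A hypothetical NVMe SSD doing 4 KB random reads:
print(throughput_mb_s(500_000, 4))   # ~1953.1 MB/s
```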

Types of SSDs:
1. NAND-based SSDs:
- Most common, used in consumer devices.
- MLC (Multi-Level Cell), TLC (Triple-Level Cell), and QLC (Quad-Level Cell) refer to the number of
bits stored per cell. More bits per cell result in higher density but reduced endurance and slower write
speeds.

2. Volatile RAM-based SSDs:


- Uses DRAM for data storage, providing very fast access. However, data is lost when power is cut, so
these SSDs usually come with backup batteries or secondary storage to retain data.

SSD Interfaces:
- SATA SSD: Older standard, slower than newer interfaces (limited to 6 Gbps).
- PCIe SSD: Uses PCIe lanes for significantly faster speeds (up to 32 Gbps in PCIe 4.0 and 64 Gbps in PCIe
5.0).
- NVMe (Non-Volatile Memory Express): A storage protocol designed for PCIe SSDs that leverages the
parallelism of flash storage. NVMe offers low latency and high performance, making it the preferred
interface for high-speed SSDs.

3. Hybrid Drives (SSHDs)


Solid-State Hybrid Drives (SSHDs) combine the large storage capacity of an HDD with the speed of an
SSD. They incorporate a small amount of SSD storage (typically used as a cache) to store frequently
accessed files, while the bulk of the data is stored on the slower but higher-capacity HDD portion.

Characteristics:
- Speed: SSHDs offer better performance than traditional HDDs, though they are slower than full SSDs.
- Cost-Effective: SSHDs offer a middle ground between the cost of HDDs and SSDs, providing a balance
between capacity and speed.
- Capacity: Since most of the data is stored on the HDD portion, SSHDs can offer high storage capacities
(up to several terabytes).

4. PCIe and NVMe SSDs


PCIe SSDs connect directly to the motherboard using a high-speed PCI Express (PCIe) interface. These
drives are typically faster than SATA SSDs because they use more data lanes for higher throughput. NVMe
(Non-Volatile Memory Express) is a protocol used by modern PCIe SSDs, offering lower latency and higher
parallelism than SATA-based drives.

Characteristics:
- Speed: PCIe SSDs with NVMe can reach speeds up to 3,500 MB/s for read operations, far surpassing SATA
SSDs.
- Low Latency: NVMe SSDs are designed for minimal delay, making them ideal for performance-intensive
applications like gaming, video editing, and large-scale data processing.
- Form Factors: NVMe SSDs are available in various form factors, including M.2, U.2, and add-in cards
(AIC).

Usage:
- Gaming and High-Performance Computing: PCIe and NVMe SSDs are favored in systems where
performance is paramount, such as in gaming PCs, servers, and workstations.

Logical Structure of a Disk,


The logical structure of a disk determines how data is organized, accessed, and managed by the operating
system (OS). It helps the OS understand how to interact with the physical disk, and it directly impacts
performance, storage efficiency, and the ability to recover from errors. Below, we’ll dive deeper into each
key component that makes up the logical structure of a hard disk.

1. Master Boot Record (MBR)


- Purpose:
The MBR is the first sector of the hard drive, responsible for storing crucial information such as the boot
loader and partition table. It tells the system where the OS is located and initiates the boot process.

- Components of MBR:
- Partition Table:
The MBR contains a 64-byte partition table that stores details about the partitions on the disk, including
size and type. This table can describe up to four primary partitions, with the possibility of one being an
extended partition to create logical drives.

- Master Boot Code:


This is a small program (446 bytes) that loads and executes the OS. It identifies the active partition (the
partition from which the OS is booted) and loads the first sector of this partition into memory to continue the
boot process.

- Disk Signature:
A unique 4-byte identifier (at offset 0x1B8) that the operating system uses to distinguish between different
disks. The MBR itself ends with the 2-byte boot signature 0x55AA.

- Limitations:
- Can only manage disks up to 2 TiB in size.
- Limited to four primary partitions. To bypass this limit, extended partitions are used to create additional
logical drives.
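
For illustration, here is a minimal sketch of parsing the partition table and boot signature from the first sector of a raw disk image; the image path is hypothetical, and real forensic tools perform far more validation:

```python
# Minimal sketch of parsing the 512-byte MBR from a raw disk image.
# 'disk.img' is a hypothetical image file path.
import struct

def parse_mbr(path: str):
    with open(path, "rb") as f:
        mbr = f.read(512)

    # Boot signature: the last two bytes of the MBR must be 0x55 0xAA.
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("Not a valid MBR (boot signature missing)")

    # The 64-byte partition table starts at offset 446: four 16-byte entries.
    for i in range(4):
        entry = mbr[446 + i * 16 : 446 + (i + 1) * 16]
        boot_flag = entry[0]              # 0x80 marks the active partition
        part_type = entry[4]              # e.g., 0x07 NTFS/exFAT, 0xEE protective MBR (GPT)
        start_lba, num_sectors = struct.unpack("<II", entry[8:16])
        if part_type != 0:
            print(f"Partition {i + 1}: type=0x{part_type:02X} "
                  f"active={boot_flag == 0x80} start_lba={start_lba} "
                  f"sectors={num_sectors}")

parse_mbr("disk.img")
```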

2. Clusters
A cluster is the smallest unit of storage that the file system can manage. Physically, a cluster is made up of
a set of contiguous sectors on the disk. When files are stored, the file system assigns the necessary number
of clusters to hold the file’s data.

- Importance of Cluster Size:


- Small Files and Slack Space:
A file smaller than the size of a cluster will still occupy the entire cluster, resulting in wasted space called
slack space.

- Impact on Performance:
Larger clusters reduce fragmentation (when files are broken into non-contiguous blocks), improving
read/write performance. However, they also increase slack space, especially if many small files are stored on
the disk.

- Cluster Sizing:
Cluster size can range from 512 bytes to 4096 bytes or more, depending on the disk volume and
formatting scheme. Larger disks usually have larger clusters, but this trade-off must be managed to minimize
wasted space.

- Cluster Chaining:
Files do not need to be stored in contiguous clusters. The file system can store parts of a file in different
locations on the disk and link these clusters through a process known as cluster chaining.
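
Cluster chaining can be pictured with a toy, FAT-style table in which each entry stores the next cluster number of a file; this is a simplified model for illustration, not an actual on-disk structure:

```python
# Toy model of cluster chaining: each table entry holds the next cluster
# number of the file, and a sentinel marks the end of the chain.
END_OF_CHAIN = -1

# Hypothetical allocation table: a file starts at cluster 2 and is stored
# non-contiguously in clusters 2 -> 5 -> 3.
fat = {2: 5, 5: 3, 3: END_OF_CHAIN}

def follow_chain(table, start_cluster):
    chain = [start_cluster]
    while table[chain[-1]] != END_OF_CHAIN:
        chain.append(table[chain[-1]])
    return chain

print(follow_chain(fat, 2))   # [2, 5, 3]
```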

3. Slack Space
Slack space is the unused portion of a cluster that remains after a file has been written. For example, if the
cluster size is 4096 bytes and the file only uses 3000 bytes, the remaining 1096 bytes become slack space.

- Types of Slack Space:


- RAM Slack:
The space from the end of the file to the end of the sector. RAM slack contains leftover data from the
system’s memory, which might be unrelated to the file itself.

- Drive Slack:
The space from the end of the last sector of the file to the end of the cluster. This space is often filled with
leftover data from the drive itself.

- Forensic Importance:
Since slack space may still contain old data from previously deleted files, forensic investigators can recover
partial or whole fragments of deleted files by analyzing slack space.
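
Using the 3,000-byte file and 4,096-byte cluster example above (and assuming 512-byte sectors), the split between RAM slack and drive slack can be computed as follows:

```python
# Slack-space breakdown for the example in the text: a 3,000-byte file
# stored in a 4,096-byte cluster made of 512-byte sectors.
import math

def slack_breakdown(file_size, sector_size=512, cluster_size=4096):
    sectors_used = math.ceil(file_size / sector_size)
    end_of_last_sector = sectors_used * sector_size
    ram_slack = end_of_last_sector - file_size        # end of file -> end of sector
    drive_slack = cluster_size - end_of_last_sector   # end of sector -> end of cluster
    return ram_slack, drive_slack

ram, drive = slack_breakdown(3000)
print(f"RAM slack:   {ram} bytes")          # 72
print(f"Drive slack: {drive} bytes")        # 1024
print(f"Total slack: {ram + drive} bytes")  # 1096, matching the example above
```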

4. Lost Clusters
Lost clusters occur when the OS marks a cluster as "in use," but no file is associated with it. This happens
if files aren’t properly closed or if the system crashes while writing data.

- Reasons for Lost Clusters:


- System crashes.
- Power failures.
- Improperly shutting down the system.

- Impact:
Lost clusters reduce available disk space and can lead to data integrity issues. They may also slow down
system performance, as the OS struggles to manage these orphaned data blocks.

- Detection and Repair:


Disk-checking utilities such as `chkdsk` (Check Disk) can scan for and resolve lost clusters. These utilities
either recover the data and link it to a new file or free up the lost clusters, returning them to available space
on the disk.

5. File Allocation Table (FAT) vs. New Technology File System (NTFS)

- FAT (File Allocation Table):


- Old and Simple:


FAT is an older file system used in earlier versions of Windows and MS-DOS. It uses a simple table to
keep track of which clusters are used, and which are free.
- Inefficient for Large Volumes:
FAT’s simplicity results in inefficiencies on larger disks, as it often leads to large cluster sizes, which in
turn causes more slack space.
- Variants:
FAT12, FAT16, and FAT32 are different versions, with the number indicating the width, in bits, of each entry
in the file allocation table. FAT32, for instance, is more efficient than its predecessors but is still not optimal
for modern large disks.

- NTFS (New Technology File System):


- Modern and Advanced:
NTFS is the file system used in current versions of Windows. It supports larger volumes and smaller
clusters, making it much more efficient in terms of storage management.
- Enhanced Features:
NTFS supports file compression, encryption, access control lists (ACLs), and better recovery features,
making it more secure and efficient.
- Recovery:
NTFS has built-in features that help with data recovery, reducing the chance of lost clusters or other file
system errors.

6. GUID Partition Table (GPT)


GPT is a modern partitioning scheme designed to replace MBR. It supports larger disk sizes and overcomes
many of MBR’s limitations.

- Advantages of GPT:
- Larger Disk Support:
GPT can manage disks larger than 2 TiB, whereas MBR cannot.

- More Partitions:
GPT supports up to 128 partitions, while MBR is limited to four primary partitions.

- Redundancy and Security:


GPT stores multiple copies of partition and boot data across the disk, ensuring that even if the primary
copy becomes corrupt, the system can still boot using the backup copy. It also uses Cyclic Redundancy
Check (CRC) to detect data corruption.

- GPT Structure:
- LBA 0: Contains the protective MBR, which prevents older systems from misidentifying a GPT disk.
- LBA 1: Stores the GPT header, which points to the partition table (Partition Entry Array).
- LBA 2: Contains the Partition Entry Array, where each partition’s details are stored.
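
A minimal sketch of reading the GPT header at LBA 1 of a raw image is shown below; it assumes 512-byte logical sectors and a hypothetical image file, and checks the header CRC as described above:

```python
# Minimal sketch of reading the GPT header at LBA 1 of a raw image
# ('disk.img' is hypothetical). Assumes 512-byte logical sectors.
import struct, zlib

SECTOR = 512

with open("disk.img", "rb") as f:
    f.seek(1 * SECTOR)                     # LBA 1 holds the GPT header
    header = f.read(SECTOR)

if header[0:8] != b"EFI PART":
    raise ValueError("No GPT header found at LBA 1")

header_size, stored_crc = struct.unpack("<II", header[12:20])
entries_lba, = struct.unpack("<Q", header[72:80])
num_entries, entry_size = struct.unpack("<II", header[80:88])

# CRC check: recompute over the header with its own CRC field zeroed out.
zeroed = header[:16] + b"\x00\x00\x00\x00" + header[20:header_size]
computed_crc = zlib.crc32(zeroed) & 0xFFFFFFFF

print(f"Partition Entry Array at LBA {entries_lba}: "
      f"{num_entries} entries of {entry_size} bytes")
print("Header CRC OK" if computed_crc == stored_crc else "Header CRC mismatch")
```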

7. Disk Partitions

- Primary Partition:
A primary partition is one that can host an operating system or data. Only one primary partition can be
active at any given time during the boot process.

- Extended Partition:
Since MBR only allows four primary partitions, extended partitions are used to create additional logical
drives. These logical drives can be used for data storage, but they cannot be used to boot the OS.

- Hidden Partitions:
Hidden partitions are often used by manufacturers for recovery purposes. These partitions are not visible to
the OS by default but can be accessed with specialized tools.

8. BIOS Parameter Block (BPB)


The BIOS Parameter Block (BPB) is a data structure found in the volume boot record. It describes the
physical layout of a disk and contains details about the volume, such as the number of sectors per cluster and
total sectors on the disk. It is used by the file system to manage the disk.

Booting Process of Windows, Linux, and Mac Operating Systems


1. Booting Process of Windows Operating System

The Windows boot process varies slightly between systems using BIOS with MBR (Master Boot Record)
and those using UEFI with GPT (GUID Partition Table). Both methods have distinct phases that ensure the
operating system loads correctly.

A. BIOS-MBR Boot Process

1. Power On:
- The user presses the power button, sending a signal to the CPU.
- The CPU checks the Power Good signal from the power supply, ensuring stable voltage levels.

2. BIOS Initialization:
- The BIOS firmware initializes hardware components, loading essential firmware settings stored in the
non-volatile memory (CMOS).
- It performs the Power-On Self-Test (POST), which checks components like RAM, CPU, keyboard, and
storage devices for functionality.
- If any hardware fails POST, the BIOS emits beep codes or displays error messages.

3. Boot Device Selection:


- Upon successful POST, the BIOS looks for a bootable device in a pre-defined order (e.g., hard drive,
USB, optical drive).
- It searches for the MBR on the first sector (sector 0) of the bootable disk.

4. Loading MBR:
- The MBR contains the partition table and the boot code. The BIOS loads the MBR into memory.
- The MBR identifies the active partition, usually containing the Windows Boot Manager.

5. Executing Boot Manager:


- The MBR triggers the execution of Bootmgr.exe (Windows Boot Manager), which reads the Boot
Configuration Data (BCD) to determine the installed operating systems and their boot parameters.

6. Locating Windows Loader:


- Bootmgr locates the Winload.exe file on the Windows boot partition, which is responsible for loading the
OS.

7. Loading the Kernel:


- Winload.exe loads the Windows kernel (`ntoskrnl.exe`) into memory.
- It also loads the Hardware Abstraction Layer (HAL) (`hal.dll`) and any boot-start device drivers marked
in the BCD.

8. Session Manager Process:


- Control is passed to the Session Manager Subsystem (SMSS.exe), which initializes system components.
- It loads the SYSTEM registry hive and initializes non-essential drivers.

9. Winlogon and User Login:


- SMSS.exe triggers Winlogon.exe, which displays the user login screen for authentication.
- After user credentials are verified, Windows creates a user session.

10. Service Control Manager:


- Winlogon starts the Service Control Manager (SCM), which initializes services and non-essential
drivers.
- It also starts the Local Security Authority Subsystem Service (LSASS.EXE) for security processes.

11. Desktop Initialization:


- Once logged in, the SCM starts explorer.exe, which creates the user interface and the Desktop Window
Manager (DWM) initializes the desktop environment.

B. UEFI-GPT Boot Process

1. Platform Firmware Initialization:


- The UEFI firmware initializes the platform, running any necessary pre-boot diagnostics.
- It configures hardware settings and sets up an execution environment for the boot manager.

2. Security Phase:
- The UEFI firmware checks security settings, managing reset events and preparing the environment for
subsequent phases.

3. Pre-EFI Initialization (PEI) Phase:


- This phase initializes the CPU, memory, and the boot firmware volume (BFV).
- It runs pre-initialization modules (PEIMs) to ensure all necessary hardware is ready.

4. Driver Execution Environment (DXE) Phase:


- The DXE phase initializes the entire system memory, I/O devices, and memory-mapped I/O (MMIO).
- The DXE core produces a set of EFI boot services (for loading applications) and EFI runtime services
(for system-level services).

5. Boot Device Selection (BDS) Phase:


- The BDS interprets the boot configuration data and determines which device to boot from based on user
settings or defaults.
- It loads the bootloader from the EFI partition for UEFI systems.

6. Loading the OS:


- The OS loader is executed, transferring control to the Windows kernel, continuing the boot process
similarly to the MBR method.

Identifying MBR and GPT

- Windows PowerShell Cmdlets:


- `Get-GPT` can be used to analyze GPT structures, while `Get-BootSector` identifies whether the disk is
MBR or GPT.
- In Disk Management, users can view the partition style under the Volumes tab for any selected disk.

2. Booting Process of Linux Operating System

The Linux boot process is also composed of three major stages: BIOS, Bootloader, and Kernel.

A. BIOS Stage

1. Power On:
- Similar to Windows, the BIOS initializes when the system is powered on.
- POST checks hardware functionality.

2. Boot Device Search:


- The BIOS looks for bootable disks following the configured order.
- The first bootable disk found must have an MBR with a primary bootloader.

B. Bootloader Stage

1. Loading the Bootloader:


- The bootloader (commonly GRUB or LILO) is located in the MBR and is loaded into memory.
- Users can select which Linux kernel to boot if multiple kernels or OSes are available.

2. Kernel Loading:
- The bootloader loads the Linux kernel into memory, which is the core component of the operating
system.
- It may also load an initial RAM disk (`initrd`), which contains temporary filesystem data required to
mount the real filesystem.

C. Kernel Stage

1. Executing initrd:
- The kernel uses the initrd image to create a temporary root filesystem.
- It executes the `init` process (or systemd), which is responsible for managing system initialization tasks.

2. Mounting the Root Filesystem:


- The kernel identifies and mounts the actual root filesystem specified in the boot parameters.
- It searches for hardware devices, loading necessary drivers for system operation.

3. Starting System Services:


- The `init` process reads configuration files (like `/etc/inittab` or systemd service files) to start essential
system services and user applications.

4. User Login:
- Once services are up and running, the user is presented with a login prompt or graphical interface,
allowing them to access the system.

Bootloaders

- Common bootloaders include:


- GRUB (Grand Unified Bootloader): Supports various configurations and OS selections.
- LILO (Linux Loader): Older bootloader that does not support dynamic configuration but is simpler in
functionality.

3. Booting Process of macOS



The macOS boot process is also distinct, reflecting Apple's hardware architectures (PowerPC and Intel). It involves
several key steps that integrate hardware and software initialization.

A. BootROM Activation

1. System Power On:


- Similar to other systems, the user powers on the Macintosh.
- The BootROM initializes and performs POST to check essential hardware components.

B. OS Selection

1. OS Choice:
- If multiple operating systems are installed, users can hold down the Option key during startup to choose
which OS to boot.

C. Boot Loader Execution

1. Boot Loader:
- The system passes control to the boot loader (`boot.efi` for Intel Macs or `BootX` for older PowerPC
Macs).
- The boot loader is responsible for loading the macOS kernel and required components.

2. Kernel Loading:
- The boot loader loads a pre-linked version of the kernel located at
`/System/Library/Caches/com.apple.kernelcaches`.
- If the pre-linked kernel is unavailable, it attempts to load the mkext cache file, which contains device
drivers.

D. Driver Initialization

1. Driver Management:
- If the mkext cache is not found, the system searches for drivers in `/System/Library/Extensions`.
- The kernel initializes its environment by linking loaded drivers using the I/O Kit and the device tree
structure.

E. Launch Services

1. launchd Process:
- The `launchd` process replaces the older `mach_init`, responsible for running startup items and preparing
the user environment.
- It manages services and applications needed for user sessions.

2. User Login:
- Once the system is ready, the user is presented with a login screen to access their account.

File Systems of Windows, Linux, and Mac Operating Systems,



Windows File Systems


Windows operating systems utilize several file systems, primarily FAT (File Allocation Table), FAT32, and
NTFS (New Technology File System). Each of these file systems has distinct characteristics, functionalities,
and historical significance. Here’s a detailed exploration of these file systems, focusing on their structure,
features, and uses.

1. File Allocation Table (FAT)


The File Allocation Table (FAT) file system, developed in 1976, has been a foundational file system for
various operating systems, including DOS and Windows. It is designed for small hard disks and provides a
straightforward folder structure. FAT organizes files and folders using a file allocation table, which resides at
the beginning of the volume.

Versions
FAT has three versions, distinguished by the size of entries in the FAT structure:
- FAT12: Supports fewer than 4,087 clusters.
- FAT16: Handles between 4,087 and 65,526 clusters.
- FAT32: Accommodates between 65,526 and 268,435,456 clusters.
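
Using the cluster-count thresholds listed above, a simplified classifier for the FAT variant might look like this (real formatters apply additional rules, so treat this as illustrative only):

```python
# Selecting the FAT variant from the cluster count, using the thresholds
# given above (a simplified classifier, ignoring formatter-specific rules).
def fat_variant(cluster_count: int) -> str:
    if cluster_count < 4_087:
        return "FAT12"
    elif cluster_count < 65_526:
        return "FAT16"
    else:
        return "FAT32"

print(fat_variant(2_000))      # FAT12
print(fat_variant(40_000))     # FAT16
print(fat_variant(1_000_000))  # FAT32
```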

Key Characteristics
- Cluster Size: The volume size determines the cluster size, which is a fundamental unit of disk space
allocation.
- Redundancy: FAT creates two copies of the file allocation table to safeguard against damage, and it
maintains a permanent location for the root folder.
- Usage: Commonly used in portable devices like flash drives, digital cameras, and other removable storage
devices. Its simplicity makes it compatible with many operating systems.

2. New Technology File System (NTFS)


The New Technology File System (NTFS) is a more advanced file system introduced with Windows NT. It
offers significant improvements over FAT, focusing on performance, security, and data integrity.

Key Features
- Metadata Storage: NTFS uses a system file called the Master File Table (MFT) to store metadata about
files and folders, including file names, locations, and access times. This is crucial for forensic investigations.
- Self-Repairing: NTFS is designed to recover from disk errors automatically, reducing the risk of data loss.
- Security Features: It supports file-level security, allowing users to set permissions for different files and
folders. Data can also be encrypted for additional protection using the Encrypting File System (EFS).
- Fault Tolerance: NTFS maintains a log of all changes made to files in case of a system crash, enabling
recovery with minimal data loss.

NTFS Architecture
NTFS's architecture consists of several components:
- Master Boot Record (MBR): Contains boot code and partition information.
- Boot Sector (Volume Boot Record - VBR): The first sector in an NTFS volume that holds the boot code
and details about the file system.
- Ntldr: The boot loader that accesses the NTFS file system.
- Ntfs.sys: The file system driver for NTFS.
- Kernel Mode & User Mode: NTFS operates in two modes, with kernel mode allowing direct access to
system components, while user mode restricts access for security.

System Files
NTFS maintains various system files in the root directory that are crucial for file system management:
- $MFT: Contains a record for every file.
- $LogFile: Used for recovery purposes.
- $AttrDef: Contains definitions for system and user-defined attributes.
- $BadClus: Records all bad clusters on the disk.
- $Bitmap: Contains a bitmap for the entire volume.
- $Quota: Indicates disk quota for each user.

Sparse Files
NTFS supports sparse files, which are efficient in terms of disk space. When large portions of the file
contain no data, NTFS only allocates space for the non-zero data, effectively managing disk usage.
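
The effect of sparse allocation can be observed with a short experiment; whether a hole is actually created depends on the operating system and file system (Linux ext4 creates one by default, while NTFS requires the sparse attribute to be set), so this is an illustrative sketch:

```python
# Demonstration of a sparse file on a file system that creates holes for
# unwritten regions: the apparent size is large, but only the written
# region consumes disk blocks. The file name is illustrative.
import os

path = "sparse_demo.bin"
with open(path, "wb") as f:
    f.seek(100 * 1024 * 1024)   # jump 100 MB forward without writing data
    f.write(b"end-of-file marker")

st = os.stat(path)
print(f"Apparent size : {st.st_size} bytes")
# On Unix-like systems, st_blocks counts 512-byte blocks actually allocated.
if hasattr(st, "st_blocks"):
    print(f"Allocated     : {st.st_blocks * 512} bytes")
```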

Encrypting File System (EFS)


EFS is a built-in feature of NTFS that allows users to encrypt files for security. It employs symmetric key
encryption combined with public key technology:
- Digital Certificates: Users obtain a certificate containing a public and private key pair to facilitate
encryption.
- Automatic Encryption: Users do not need to manually decrypt files to modify them; the system handles
encryption seamlessly.
- Recovery Options: If a user loses their encryption certificate, recovery agents (such as domain
administrators in Windows 2000 Server networks) can restore access to the encrypted data.

Linux File System Architecture



The architecture of Linux file systems is composed of two main components: User Space and Kernel Space.

1. User Space: This is the memory area where user processes operate. It is protected from direct access to
kernel operations, providing a secure environment for user applications.

2. Kernel Space: This is where the kernel executes core functions and services. Access to kernel space is
facilitated through system calls, allowing user processes to interact with the underlying hardware and system
resources.

GNU C Library (glibc) serves as an intermediary between user space and kernel space, providing a system
call interface that connects user applications to kernel services.

Virtual File System (VFS) is an abstraction layer that enables applications to access various file systems
seamlessly. Its internal architecture consists of:
- Dispatching Layer: Provides file-system abstraction.
- Caching Mechanisms: Enhances performance by storing frequently accessed data. The main cached objects
are dentry (directory entry) and inode (index node) objects.

When a file is accessed, the dentry cache records directory levels, and the inode identifies file attributes and
locations. The caching mechanism improves performance, using techniques like least-recently-used (LRU)
algorithms to manage memory effectively.
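
The inode metadata that the VFS ultimately exposes for a path can be inspected with a few lines of Python; the path used here is only an example:

```python
# Inspecting the inode metadata that the VFS exposes for a path
# (run on a Linux/UNIX system; substitute any existing file).
import os, stat, time

st = os.stat("/etc/hostname")          # example path; any existing file works
print("inode number :", st.st_ino)
print("device       :", st.st_dev)
print("mode         :", stat.filemode(st.st_mode))
print("size (bytes) :", st.st_size)
print("last modified:", time.ctime(st.st_mtime))
```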

Filesystem Hierarchy Standard (FHS)

Linux file systems follow a single hierarchical directory structure, which is defined by the Filesystem
Hierarchy Standard (FHS). This standard organizes directories and files, making the system easier to
navigate. Here’s a summary of common directories within the FHS:

| Directory | Description |
|-----------|-------------|
| `/bin` | Essential command binaries (e.g., `cat`, `ls`, `cp`) |
| `/boot` | Static files for the boot loader (e.g., kernels, initrd) |
| `/dev` | Device files (e.g., `/dev/null`) |
| `/etc` | Host-specific configuration files |
| `/home` | User home directories |
| `/lib` | Libraries for binaries in `/bin` and `/sbin` |
| `/media` | Mount points for removable media |
| `/mnt` | Temporarily mounted file systems |
| `/opt` | Add-on application software packages |
| `/root` | Home directory for the root user |
| `/proc` | Virtual file system for process and kernel information |
| `/run` | Runtime process information |
| `/sbin` | System binaries |
| `/srv` | Site-specific service data |
| `/tmp` | Temporary files |
| `/usr` | Secondary hierarchy for read-only user data |
| `/var` | Variable data (e.g., logs, spool files) |

Popular Linux File Systems

1. Extended File System (ext)


The first extended file system (ext), introduced in 1992, addressed the limitations of the Minix file system,
allowing larger partitions (up to 2 GB) and longer filenames (up to 255 characters). However, it lacked
certain features, such as separate access timestamps for files, leading to fragmentation issues.

2. Second Extended File System (ext2)

Developed by Remy Card, ext2 became the foundation for many Linux distributions. It uses fixed-size data
blocks and has an inode structure that describes files. Each file is identified by a unique inode number, and
the file system is managed through a superblock containing critical metadata. Key aspects include:

- Inodes: Store file metadata.


- Superblock: Contains information about the file system's structure and health.
- Group Descriptor: Holds data for each block group, including bitmaps for block and inode allocation.
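
As an illustration of the superblock's role, the sketch below reads a few well-known fields from a raw ext2/3/4 partition image (the image path is hypothetical):

```python
# Minimal sketch of reading the ext2/ext3/ext4 superblock from a raw
# partition image ('ext_part.img' is hypothetical). The superblock
# begins 1,024 bytes into the partition.
import struct

with open("ext_part.img", "rb") as f:
    f.seek(1024)
    sb = f.read(1024)

inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
log_block_size, = struct.unpack_from("<I", sb, 24)
magic, = struct.unpack_from("<H", sb, 56)

if magic != 0xEF53:
    raise ValueError("Not an ext2/3/4 superblock (bad magic)")

print(f"Block size : {1024 << log_block_size} bytes")
print(f"Blocks     : {blocks_count}")
print(f"Inodes     : {inodes_count}")
```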

3. Third Extended File System (ext3)

Ext3, introduced by Stephen Tweedie in 2001, is a journaling file system that enhances data integrity by
logging changes before they are committed to the disk. It supports large file sizes (16 GB to 2 TB) and offers
improved reliability and performance compared to ext2. Features include:

- Journaling Modes: Options to balance data integrity and speed.


- Easy Transition: Supports converting ext2 to ext3 without data loss.

4. Fourth Extended File System (ext4)



As the successor to ext3, ext4 was designed for scalability and performance. It can handle file sizes up to 16
TB and volume sizes up to 1 exbibyte. Key enhancements include:

- Extents: Replaces traditional block mapping to reduce fragmentation.


- Delayed Allocation: Allocates larger data blocks to improve efficiency.
- Journal Checksumming: Enhances data reliability.
- Improved Performance: Faster filesystem checking and better timestamp granularity.

Journaling File System


Journaling file systems, such as ext3 and ext4, maintain a log of changes before applying them to the actual
file system. This mechanism protects against data loss in case of unexpected shutdowns or crashes. By
keeping a journal of updates, the system can restore data to a consistent state after recovery.

macOS File Systems


Apple's macOS is based on UNIX and employs a unique approach to data storage, differing significantly
from Windows and Linux file systems. Understanding macOS file systems is essential for forensic
investigators, as traditional forensic techniques for Windows and Linux may not apply. This discussion
covers the file systems utilized by various versions of macOS.

1. UNIX File System (UFS)


The UNIX File System (UFS) is a foundational file system used by many UNIX and UNIX-like operating
systems. It originates from the Berkeley Fast File System and was employed in the first version of UNIX
created at Bell Labs. Variants of UFS are utilized by BSD UNIX derivatives such as FreeBSD, NetBSD,
OpenBSD, NeXTStep, and Solaris.

Design:
- Boot Blocks: Initial blocks in the partition reserved for booting.
- Superblock: Contains a magic number to identify the file system as UFS and vital parameters describing
the file system's geometry, statistics, and tuning.
- Cylinder Groups: The file system is divided into cylinder groups, each containing:
- A backup copy of the superblock.
- A header with statistics and free lists.
- Numerous inodes, each representing file attributes.
- Numerous data blocks.

2. Hierarchical File System (HFS)


Developed by Apple in September 1985, HFS was created to support macOS in the proprietary Macintosh
environment, succeeding the Macintosh File System (MFS). HFS organizes a volume into logical blocks of
512 bytes, which are grouped into allocation blocks.

Structure:
1. Boot Blocks: Logical blocks 0 and 1 contain system startup information.
2. Master Directory Block (MDB): Found in logical block 2, it holds volume metadata, such as creation
timestamps and location of other volume structures. An Alternate MDB exists at the end of the volume for
utility purposes.
3. Volume Bitmap: Starts at logical block 3 and tracks the allocation of blocks, with bits indicating used or
free status.
4. Extents Overflow File: A B-tree storing additional extents when initial extents in the Catalog File are
exhausted, also includes records of bad blocks.
5. Catalog File: Another B-tree containing records for all files and directories, with unique catalog node IDs
for efficient lookup.

3. HFS Plus (HFS+)


HFS Plus, also known as Mac OS Extended, is the successor to HFS and serves as the primary file system in
macOS. It supports larger files and uses Unicode for file naming.

Features:
- Uses B-tree data structures for improved performance.
- Supports large files up to 64 bits and file names of up to 255 characters.
- Employs a 32-bit allocation table for mapping, allowing for more allocation blocks than HFS.

Advantages:
- Efficient disk space usage.
- Compatibility with international file naming.
- Allows booting on non-Mac operating systems.

4. Apple File System (APFS)


Introduced in 2017 for macOS High Sierra and iOS 10.3, APFS replaced HFS+ as the default file system
across all Apple OSes, including macOS, iOS, watchOS, and tvOS. It is optimized for SSDs and flash
storage.

Structure:
- Container Layer: Manages volume metadata, encryption states, and snapshots.
- File-System Layer: Contains data structures for file metadata, content, and directory organization.

Key Features:
- Copy-on-Write: Allows efficient data management without overwriting existing data.
- Snapshots: Captures the state of the file system at a specific time, facilitating backups and recovery.
- Cloning: Creates duplicates of files and directories without using additional disk space.
- High Timestamp Granularity: Provides precise timestamps for file changes.
- TRIM Support: Enhances performance by allowing the operating system to inform the SSD which blocks
of data are no longer in use.

Drawbacks:
- APFS-formatted drives are incompatible with OS X 10.11 (El Capitan) and earlier, complicating file transfers
to older Mac devices.
- The copy-on-write feature can lead to fragmentation issues on copied files.
- APFS is not suitable for traditional HDDs due to its design optimizations for SSDs.
- Lacks support for non-volatile RAM (NVRAM), compression, and Apple Fusion Drives.

File System Examination.


File system examination is a critical aspect of digital forensics that involves the analysis of computer file systems to
extract, recover, and investigate data. This process is essential for uncovering evidence in criminal investigations,
corporate fraud, and data breaches. The examination is typically performed using specialized tools that can analyze
disk images and file systems without altering the original data. Below, we discuss the primary tools used in file system
examination: The Sleuth Kit (TSK), Autopsy, and WinHex.

1. The Sleuth Kit (TSK)


The Sleuth Kit is a powerful suite of command-line tools designed for digital forensic investigations. It allows forensic
examiners to analyze disk images and file system structures, providing the ability to extract valuable data without
modifying the source.

Key Features:

- Volume and File System Analysis:


TSK enables detailed examination of disk layouts, including partitions and file systems. It supports various types of
partitions such as:
- DOS partitions
- BSD partitions
- Mac partitions
- GUID Partition Table (GPT) disks

- Support for Multiple File Systems:


TSK is compatible with a wide range of file systems, including:
- NTFS (Windows)
- FAT (Windows)
- exFAT (Windows)
- UFS (1 and 2)
- ext2, ext3, ext4 (Linux)
- HFS (Mac)
- ISO 9660 (CD/DVD)
- YAFFS2 (used in embedded systems)

- Command-Line Interface:
The Sleuth Kit's command-line tools allow for efficient data analysis through various commands. Investigators can
run scripts and automate tasks to streamline their workflow.

- Image Formats:
TSK can analyze several disk image formats, including:
- Raw images (e.g., created with the `dd` command)
- Expert Witness format (EnCase)
- Advanced Forensic Format (AFF)

- Plug-in Framework:
TSK supports a plug-in architecture, allowing users to extend its functionality by integrating additional modules that
can analyze specific file types or automate forensic processes.

- Documentation and History Tracking:


Comprehensive documentation and a history feature are included to assist users in tracking their analysis processes
and findings.

Use Case:
To utilize TSK effectively, a forensic examiner first creates an image of a hard disk or USB drive using disk imaging
tools like AccessData FTK Imager. They then load this image into TSK to analyze the file system structure, locate
files, and extract data.
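
Because TSK is command-line driven, its tools are easy to script. The hedged sketch below runs `mmls` (partition layout) and `fls` (recursive file listing) against a hypothetical image; the partition offset passed to `fls` is an assumed example value that would normally be read from the `mmls` output:

```python
# Driving two Sleuth Kit command-line tools from Python against a raw
# image. 'evidence.dd' is a hypothetical image; TSK must be installed
# and on PATH. Output parsing is left to the examiner.
import subprocess

image = "evidence.dd"

# mmls prints the partition (volume) layout of the image.
layout = subprocess.run(["mmls", image], capture_output=True, text=True)
print(layout.stdout)

# fls lists files and directories in a file system; -r recurses, and -o
# gives the partition's starting sector offset taken from the mmls output.
# 2048 is an assumed example offset, not a universal value.
listing = subprocess.run(["fls", "-r", "-o", "2048", image],
                         capture_output=True, text=True)
print(listing.stdout)
```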

2. Autopsy
Autopsy is a digital forensics platform that serves as a graphical user interface (GUI) for The Sleuth Kit. It simplifies
the forensic analysis process by providing a user-friendly interface that integrates various digital forensic tools and
modules.

Key Features:

- Graphical User Interface:


Autopsy’s GUI makes it accessible for forensic investigators, allowing them to navigate through file systems,
visualize data, and generate reports without needing extensive command-line experience.

- Timeline Analysis:
This feature provides an advanced graphical event viewing interface, enabling investigators to visualize and analyze
the sequence of activities on a computer over time.

- Hash Filtering:
Autopsy includes hash filtering capabilities that allow users to flag known malicious files while ignoring known good
files. This reduces noise in the investigation and helps focus on relevant evidence.

- Keyword Search:
The platform supports indexed keyword searches, enabling investigators to find files that contain specific terms or
phrases quickly. This is crucial for locating evidence related to investigations.

- Web Artifacts Extraction:


Autopsy can extract web browsing artifacts, including history, bookmarks, and cookies from various browsers like
Firefox, Chrome, and Internet Explorer, providing insights into user activities.

- Data Carving:
Autopsy employs data carving techniques to recover deleted files from unallocated space on a disk. It can utilize
tools like PhotoRec to extract data effectively.

- Multimedia Metadata Extraction:


The platform can extract Exif data from images and videos, which can provide additional context about when and
where a photo was taken, along with other metadata.

- Indicators of Compromise (IOC):


Autopsy can scan for potential security threats using Structured Threat Information Expression (STIX), helping to
identify files or actions that may indicate a compromise.

Use Case:
Autopsy serves as an all-in-one platform for digital forensic investigations. For instance, a forensic investigator might
analyze a suspect’s computer by creating a disk image, loading it into Autopsy, and using its various features to
recover deleted files, examine user activity, and compile evidence for reporting.

3. Recovering Deleted Files with WinHex


WinHex is a versatile hexadecimal editor and forensic analysis tool used for data recovery, low-level data processing,
and IT security. It is particularly useful for inspecting and editing all types of files, as well as recovering deleted data
from various storage devices.

Key Features:

- Disk Editor:
WinHex can edit data on multiple types of storage media, including:
- Hard disks
- Floppy disks
- CD-ROMs
- DVDs
- ZIP drives
- Memory cards (e.g., from cameras)

- File System Support:


The software supports several file systems, including:
- FAT12/16/32
- exFAT
- NTFS
- ext2/3/4
- Next3® (an ext3-based file system with snapshot support)
- CDFS (CD file system)
- UDF (Universal Disk Format)

- Data Recovery Techniques:


WinHex employs various recovery techniques to restore deleted files or lost data, particularly from corrupted file
systems.

- RAM Editor:
The RAM editor feature allows access to physical RAM and virtual memory, enabling investigators to analyze
running processes and capture volatile data.

- Data Analysis and Comparison:


Users can analyze and compare files to identify modifications, deletions, or other alterations. This is essential for
detecting tampering or unauthorized changes.

- Disk Cloning:
WinHex can create disk images and backups, ensuring that original data is preserved for examination.

- Security Features:
The software includes secure erasing capabilities to wipe confidential files and cleanse hard drives, making it useful
for data sanitization as well.

- Scripting and API:


WinHex supports scripting and provides an API, allowing users to automate repetitive tasks or integrate its
functionalities into larger forensic workflows.

Use Case:
WinHex is often used in scenarios where files have been deleted or systems have become corrupted. For example, an
investigator may use WinHex to recover critical documents from a damaged hard drive, analyze running processes in
memory, or securely wipe a device before disposal.
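
To illustrate the signature-based carving idea that such tools apply to raw data, here is a deliberately simplified sketch that scans an image for JPEG header and footer signatures; real carvers such as PhotoRec or WinHex handle fragmentation, validation, and large images far more robustly:

```python
# Simplified signature-based carving of JPEG files from a raw image.
# 'evidence.dd' and the output names are hypothetical; real carvers
# handle fragmentation and validation far better than this sketch.
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"

with open("evidence.dd", "rb") as f:
    data = f.read()          # fine for small demo images only

count = 0
pos = 0
while True:
    start = data.find(JPEG_HEADER, pos)
    if start == -1:
        break
    end = data.find(JPEG_FOOTER, start)
    if end == -1:
        break
    carved = data[start:end + len(JPEG_FOOTER)]
    with open(f"carved_{count}.jpg", "wb") as out:
        out.write(carved)
    count += 1
    pos = end + len(JPEG_FOOTER)

print(f"Carved {count} candidate JPEG files")
```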

Module-4
Data Acquisition and Duplication

Data Acquisition Fundamentals:


Forensic data acquisition is the systematic process of imaging or collecting data from various media using
established methodologies that meet certain standards for forensic value. It involves extracting
Electronically Stored Information (ESI) from suspect computers or storage media to gain insights into
crimes or incidents. As technology evolves, data acquisition becomes more accurate, simple, and versatile.
However, the acquisition process must remain forensically sound to ensure the admissibility of evidence in
court.

Importance of Forensic Data Acquisition


1. Integrity of Evidence:
The primary objective of forensic data acquisition is to preserve the integrity of the data being collected. If
evidence is altered during the acquisition process, it may be deemed inadmissible in court.

2. Verifiable and Repeatable Methods:


The methodologies used in data acquisition must be both verifiable and repeatable. This means that other
investigators should be able to replicate the acquisition process and obtain the same results, thereby ensuring
reliability.

3. Admissibility in Court:
Courts require evidence to be collected in a manner that minimizes risk and maintains integrity.
Forensically sound acquisition practices enhance the likelihood that evidence will be accepted during legal
proceedings.

Categories of Data Acquisition

Forensic data acquisition is generally categorized into two main types: Live Data Acquisition and Dead Data
Acquisition.

1. Live Data Acquisition:


Live data acquisition is the collection of volatile data from devices while they are live or powered on.
Volatile information resides in the contents of RAM, caches, DLLs, and similar locations, and this method
allows investigators to capture information that would be lost if the device were turned off.

Key Characteristics:
- Volatile Data Collection:
Involves collecting data stored in RAM, caches, and system registries, which are dynamic and can change
rapidly. Examples of volatile data include:
- System Data: Information related to the current configuration and running state of the computer (e.g.,
login activity, running processes, open files, etc.).
- Network Data: Information regarding the network state (e.g., open connections, routing information, ARP
cache, etc.).

- Real-Time Data Capture:


Investigators need to acquire this data in real-time to ensure that no critical information is lost.

- Post-Acquisition:
After live data acquisition, investigators may proceed with static (dead) acquisition by shutting down the
system and obtaining a forensic image of the hard disk.

Potential Uses:
- Collecting data from cloud services (e.g., Dropbox or Google Drive).
- Accessing unencrypted data that may be open on the system.

Considerations:
The order of data collection should follow the principle of Order of Volatility to prioritize the most volatile
data first (e.g., registers and processor caches) and proceed to less volatile sources.

2. Dead Data Acquisition


Dead data acquisition involves collecting non-volatile data from a powered-off device. This data remains
unchanged even when the system is shut down.

Key Characteristics:
- Unaltered Data Capture:
Non-volatile data is collected in an unaltered manner, ensuring the integrity of the evidence. Sources of
non-volatile data can include:
- Hard drives
- USB drives
- External storage media (e.g., DVDs, flash drives)

- Types of Data Recovered:


Investigators may recover various data types, including:
- Temporary files
- System registries
- Event and system logs
- Web browser caches
- Slack space and unallocated drive space

Repeatability:
Dead acquisition can be repeated on well-preserved disk evidence, allowing for comprehensive analysis.

Rules of Thumb for Data Acquisition



Best practices (or rules of thumb) help ensure successful data acquisition. Here are several important
considerations:

1. Avoid Modifying Original Evidence:


Investigators must never perform forensic investigations on original evidence to avoid altering data and
rendering it inadmissible.

2. Create Duplicate Bit-Stream Images:


Instead of working with the original media, investigators should create a bit-stream image. This approach:
- Preserves the original evidence.
- Allows for recreation of the duplicate if necessary.

3. Produce Two Copies:


It is essential to create two copies of the original media:
- Working Copy: For analysis.
- Control Copy: Stored for disclosure or backup purposes.

4. Integrity Verification:
After creating duplicates, investigators must verify their integrity by comparing hash values (e.g., MD5) of
the copies against the original media.
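
A minimal sketch of this verification step is shown below; the file names are hypothetical, and MD5 is included only because the text mentions it (a stronger hash such as SHA-256 is computed alongside it):

```python
# Verifying that working and control copies match the original by
# comparing hash values. File names are hypothetical examples.
import hashlib

def file_hashes(path, chunk_size=1024 * 1024):
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()

original = file_hashes("original.dd")
working  = file_hashes("working_copy.dd")
control  = file_hashes("control_copy.dd")

print("Working copy matches original:", working == original)
print("Control copy matches original:", control == original)
```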

Types of Data Acquisition:


Data acquisition in digital forensics is a crucial step that impacts the integrity and utility of evidence
gathered for investigation. The choice of method often depends on the specific requirements of the case,
available tools, and time constraints. Below are the primary types of data acquisition, with detailed
explanations of each.

1. Logical Acquisition
Logical acquisition involves selectively collecting specific files or directories from a storage medium rather
than capturing a complete image of the entire device. This method is utilized when investigators can identify
the exact data relevant to their investigation.

When to Use:
- Time Constraints: When the investigation needs to proceed quickly.
- Targeted Investigation: When investigators are aware of which files are crucial to the case, such as emails,
documents, or logs.

Examples:
- Email Investigations: Collecting .pst or .ost files from Microsoft Outlook. These files may contain
important communications relevant to a case.
- Database Investigations: Gathering specific records from large databases or RAID arrays, allowing for
rapid access to pertinent information.

Advantages:
- Efficiency: Fast data collection since only relevant files are extracted, reducing the time taken compared to
full imaging.
- Reduced Storage Needs: Requires less storage space than full images, which is beneficial in environments
with limited resources.
- Focused Evidence: Allows investigators to concentrate on specific files that are likely to contain evidence,
thus streamlining the analysis process.

Considerations:
- The risk of missing important data that may not have been initially identified as relevant. Therefore, careful
planning and knowledge of the system are required.

2. Sparse Acquisition
Sparse acquisition is a method that collects only fragments of deleted or unallocated data from a storage
device. This technique focuses on areas of the disk that may contain remnants of files that are no longer
visible in the file system.

When to Use:
- Non-Exhaustive Investigations: When full imaging is unnecessary, and investigators only need to retrieve
specific remnants of data.
- Targeted Recovery: When investigators suspect that deleted data may contain critical evidence.

Advantages:
- Resource Efficient: Saves time and storage space by focusing only on potentially useful fragments of data.
- Potential Recovery of Important Evidence: May recover deleted files that are still recoverable but not
present in the standard file system view.

Considerations:
- Incomplete Recovery: There’s a possibility that important data could be overlooked, as only fragments are
retrieved.
- Requires Advanced Tools: Tools must be capable of recognizing and extracting unallocated space
effectively.

3. Bit-Stream Imaging
Bit-stream imaging is the process of creating a complete bit-by-bit copy of a storage medium. This method
captures all data, including active files, deleted files, and hidden data, ensuring a comprehensive forensic
representation of the source.

Types of Bit-Stream Imaging Procedures:

1. Bit-Stream Disk-to-Image-File:
- Description: This method allows investigators to create one or more image files of the suspect drive,
capturing the complete data structure, including all sectors and clusters.
- Common Tools: ProDiscover, EnCase, FTK, The Sleuth Kit, and X-Ways Forensics.
- Advantages:
- Full Data Preservation: Captures all data, including deleted files that may still be recoverable.
- Forensic Integrity: Maintains the original media in an unaltered state, ensuring that evidence remains
admissible in court.
- Flexibility: The ability to create multiple copies allows for different analyses without risking original
evidence.

2. Bit-Stream Disk-to-Disk:
- Description: Used when direct imaging to an image file is not feasible, such as with outdated hardware or
specific data recovery needs. This involves making a direct copy from one disk to another.
- Tools: EnCase, SafeBack, Tableau Forensic Imager, and other specialized imaging hardware.
- Advantages:
- Efficient for Old Drives: Particularly useful for older drives where imaging software may be
incompatible.
- Real-Time Recovery: Facilitates the recovery of credentials or other data during the copying process.
- Customizable Geometry: Adjusts the geometry of the target disk to match the source, optimizing the
acquisition process.

General Advantages of Bit-Stream Imaging:


- Comprehensive Evidence: Ensures that no data is missed, as all information on the storage medium is
preserved.
- Facilitates Detailed Analysis: Enables investigators to conduct in-depth examinations using various
forensic tools without risking data integrity.
- Supports Legal Validity: Due to the nature of the process, it reinforces the admissibility of the evidence in
legal contexts.

Considerations:
- Time-Consuming: Creating a full image of a large drive can take significant time, which may be a
constraint in urgent situations.
- Storage Requirements: Requires adequate storage space to accommodate large image files, particularly for
high-capacity drives.
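
For illustration only, the sketch below mimics a dd-style bit-stream copy in Python, reading the source in fixed-size blocks, writing an image file, and hashing the data as it goes; the device path is an assumed example, and validated imagers used with write-blockers remain the proper tools in practice:

```python
# Minimal sketch of a bit-stream disk-to-image-file copy: read the source
# in fixed-size blocks, write them to an image file, and hash on the fly.
# The device path is an example and requires elevated privileges;
# dedicated imagers (dd, FTK Imager, EnCase) are the proper tools in practice.
import hashlib

SOURCE = "/dev/sdb"          # example suspect device (access it read-only!)
TARGET = "suspect.dd"        # raw image file
BLOCK = 4 * 1024 * 1024      # 4 MiB blocks

sha256 = hashlib.sha256()
with open(SOURCE, "rb") as src, open(TARGET, "wb") as dst:
    while True:
        block = src.read(BLOCK)
        if not block:
            break
        dst.write(block)
        sha256.update(block)

print("Image written to", TARGET)
print("SHA-256 of acquired data:", sha256.hexdigest())
```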

Data Acquisition Format:


Data acquisition in forensic investigations involves creating images or copies of suspect media to extract
valuable information while preserving the integrity of the original evidence. The format of the acquired data
plays a crucial role in ensuring that it can be effectively analyzed and utilized in legal proceedings. Below
are the primary formats in which data can be acquired from suspect media, along with their characteristics,
advantages, and disadvantages.

1. Raw Format
Raw format refers to a bit-by-bit copy of the suspect drive, creating an exact duplicate of all data, including
unallocated space and deleted files. The image is typically obtained using command-line tools, such as the
`dd` command.

Advantages:
- Fast Data Transfers: The raw format allows for rapid copying of data from the source media.
- Minor Data Read Errors Ignored: The acquisition process can overlook minor errors, ensuring that most
data is captured even if some sectors have issues.
- Wide Tool Compatibility: Most forensic tools can read raw image files, making this format versatile for
analysis.

Disadvantages:
- Storage Requirements: The raw image requires storage space equivalent to that of the original media,
which can be significant for large drives.
- Bad Sector Handling: Open-source tools may fail to recognize or collect data from marginal (bad) sectors
effectively. Commercial tools generally handle these situations better, employing more retries to ensure data
integrity.

2. Proprietary Format
Proprietary formats are specific to commercial forensic tools, which acquire data from the suspect drive and
save the image files in their own unique formats.

Advantages:
- Compression Options: These formats often include features to compress image files, reducing the amount
of space needed on target media.
- Segmentation: Proprietary formats can split large images into smaller segments, allowing for easier storage
on smaller media like CDs or DVDs without losing data integrity.
- Metadata Incorporation: They can embed relevant metadata within the image file, such as acquisition date,
time, hash values, and case details, which can enhance the contextual understanding of the evidence.

Disadvantages:
- Interoperability Issues: Image files created in one tool’s format may not be supported by other forensic
tools, potentially complicating cross-tool investigations.

3. Advanced Forensics Format (AFF)


The Advanced Forensics Format (AFF) is an open-source data acquisition format designed to store disk
images along with associated metadata. It aims to provide an alternative to proprietary formats and improve
interoperability among forensic tools.

Characteristics:
- File Extensions: AFF uses two main extensions: `.afm` for metadata and `.afd` for segmented image files.
- Simple Design: The format is designed for accessibility across multiple platforms and operating systems,
providing an open solution for forensic investigators.

Advantages:
- Compression Support: AFF supports image file compression, which can save storage space.
- Metadata Storage: It allocates space for metadata associated with images, allowing for better evidence
management.
- Internal Consistency Checks: AFF includes mechanisms for self-authentication to ensure data integrity.

Compression Algorithms:
- Zlib: Offers faster compression but is less efficient.
- LZMA: Provides more efficient compression but at the cost of slower performance.

4. Advanced Forensic Framework 4 (AFF4)


Advanced Forensic Framework 4 (AFF4) is a redesigned and improved version of the AFF format,
developed to handle storage media with larger capacities and provide a more structured data organization.

Characteristics:
- Object-Oriented Design: AFF4 uses generic objects (volumes, streams, graphs) to facilitate data
management and retrieval.
- Unified Data Model: It offers a consistent naming scheme and a unified approach to organizing
information, enhancing usability.

Advantages:
- Support for Large Data Sets: AFF4 can handle a vast number of images and supports various container
formats (e.g., Zip, Zip64).
- Network Storage: It allows for data to be stored over networks and supports WebDAV for imaging directly
to HTTP servers.
- Efficiency Improvements: The format supports zero-copy transformations, which enhance processing
efficiency by reducing the need for data duplication.

Basic Object Types:


- Volumes: These hold segments, which are indivisible blocks of data.
- Streams: Data objects that facilitate reading or writing operations, encompassing segments, images, and
maps.
- Graphs: Collections of RDF statements that describe relationships among data objects.

Choosing the Appropriate Format

When determining the data acquisition format, forensic investigators must consider several factors:

1. Case Requirements: The specific needs of the investigation, including the types of data needed and their
urgency.
2. Storage Capacity: The available storage space for the acquired data and whether compression is necessary.
3. Tool Compatibility: The forensic tools available for analysis and their support for various data formats.
4. Interoperability: The ability to share and access data across different forensic tools or platforms.

Data Acquisition Methodology:


Data acquisition in digital forensics is a critical phase where forensic investigators obtain data from suspect
media. This process must be systematic and forensically sound to ensure that any evidence collected is
admissible in a court of law. The following detailed explanation covers the steps and considerations involved
in forensic data acquisition methodology.

1. Determining the Data Acquisition Method

The method of data acquisition chosen depends on various situational factors. Key considerations include:

- Size of the Suspect Drive: For larger drives, disk-to-image copying is typically required. If the target drive
is smaller than the suspect drive, investigators may need to reduce the data size using:

- Microsoft Disk Compression Tools: Tools like DriveSpace or DoubleSpace can be used to exclude slack
disk space.
- Compression Methods: Archiving tools (e.g., PKZip, WinZip, WinRAR) can reduce file sizes.
- Testing Lossless Compression: Apply MD5, SHA-2, or SHA-3 hashes before and after compression to confirm
that no data was altered (see the sketch after this list).

- Time Required to Acquire the Image: Larger drives take longer to acquire. For example, acquiring a 1 TB
drive may take over 11 hours. Investigators should prioritize acquiring only data of evidentiary value to save
time.

- Retention of the Suspect Drive: If the drive cannot be retained (e.g., in civil litigation), logical acquisition
may be necessary. If the drive can be retained, a reliable data acquisition tool should be used to create a
copy.
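
A minimal sketch of verifying lossless compression with SHA-256 hashes, using the zip/unzip utilities as one example of a lossless archiver (the file names are hypothetical):

sha256sum evidence.tar
zip evidence.zip evidence.tar
unzip -p evidence.zip evidence.tar | sha256sum

- Description: The digest of the decompressed stream should match the digest recorded before compression, confirming that compression did not alter the data.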

2. Selecting the Data Acquisition Tool

Choosing the right tool is paramount in forensic data acquisition, depending on the acquisition technique.
The following requirements for data acquisition tools are crucial:

Mandatory Requirements:
- No Alteration of Original Content: The tool must not modify the original data.
- Logging of I/O Errors: The tool must log any input/output errors in an accessible and readable format.
- Comparison Capability: The tool must compare source and destination data and alert users if the
destination is smaller.
- Scientific Validation: Tools must pass scientific and peer review and yield repeatable results.
- Complete Acquisition: The tool must acquire all visible and hidden data sectors.
- Bit-Stream Copy: It must create a bit-stream copy when there are no errors, or a qualified bit-stream copy
when errors occur.
- Destination Size Check: The tool must ensure the destination is larger or equal to the source.
- Correct Documentation: Documentation must be accurate for expected outcomes.

Optional Requirements:
- Hash Value Calculation: The tool should compute hash values for the bit-stream copy and compare them
with the source hash.
- Logging Features: Optional logging of tool actions, settings, and errors is desirable.
- Partition Handling: The tool should allow for the creation of images of individual partitions.
- Visibility of Partition Tables: It should make the source disk partition table visible to users.

3. Sanitizing the Target Media

Before data acquisition, the target media must be sanitized to erase any previous data. Data sanitization
protects sensitive information from retrieval. Common methods include:

- Overwriting Data: This involves applying sequences of zeroes and ones to completely erase data. Methods
include:
- Russian Standard GOST P50739-95 (6 passes)

- German VSITR (7 passes)


- DoD 5220.22-M (7 passes)

- NIST SP 800-88 Standards: The guidelines propose three sanitization methods:


- Clear: Logical techniques for sanitizing data.
- Purge: Physical or logical techniques to make data recovery infeasible.
- Destroy: Techniques rendering the media unusable for storage.

Physical destruction (e.g., shredding) can also be employed for sensitive media.
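
As a simple illustration (a single-pass zero overwrite, not one of the multi-pass standards above), a Linux target disk can be sanitized with dd, assuming the target media appears as /dev/sdc (hypothetical):

dd if=/dev/zero of=/dev/sdc bs=4M status=progress

- Description: Writes zeroes over every sector of the target device, destroying any previously stored data before it is reused as acquisition media.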

4. Acquiring Volatile Data

Volatile data (e.g., data in RAM) is dynamic and must be acquired carefully. Actions performed on a live
system can alter data. The acquisition methods vary based on the operating system:

- Windows: Tools like Belkasoft Live RAM Capturer can capture the entire volatile memory. It's crucial to
conduct this without modifying the system to avoid data loss.

5. Enabling Write Protection on Evidence Media

To prevent alterations to the original evidence, write protection must be enabled. This can be achieved
through:

- Hardware Methods: Setting jumpers on disks to make them read-only.


- Software Methods: Utilizing write blocker tools that provide read-only access.
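
A minimal software-level sketch on a Linux forensic workstation, assuming the evidence disk appears as /dev/sdb (hypothetical); a dedicated hardware write blocker remains the preferred option:

blockdev --setro /dev/sdb
blockdev --getro /dev/sdb

- Description: The first command marks the block device read-only at the kernel level; the second returns 1 if the read-only flag is set.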

6. Acquiring Non-volatile Data

Non-volatile data can be obtained both live and dead:

- Live Acquisition: Tools like Netcat or bootable CDs/USBs can be used (see the sketch after this list).


- Dead Acquisition: Involves removing the hard drive from the suspect device, connecting it to a forensic
workstation, enabling write-blockers, and using tools like FTK Imager to create a forensic image.
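
A minimal sketch of a live acquisition over Netcat, assuming the forensic workstation listens at 10.0.0.5 on port 9999 (both hypothetical). Note that Netcat transfers data unencrypted, and some variants use `nc -l 9999` instead of `nc -l -p 9999`.

- On the forensic workstation:
nc -l -p 9999 > suspect_sda.img

- On the suspect system:
dd if=/dev/sda bs=4M | nc 10.0.0.5 9999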

7. Planning for Contingency

Contingency planning involves preparing for unexpected issues during data acquisition. It is essential for
ensuring the integrity of the evidence. Plans should include:

- Creating Multiple Images: At least two copies of evidence should be made to prevent loss due to
corruption.
- Utilizing Different Imaging Tools: If multiple tools are available, use them for redundancy.
- Handling Hardware Failures: Have hardware acquisition tools ready for BIOS-level data access.
8. Validating Data Acquisition

Validation is crucial to ensure the integrity of the acquired data. This involves calculating and comparing
hash values. Hashing algorithms like CRC-32, MD5, SHA-1, and SHA-256 can be used for validation. In
Windows, the Get-FileHash cmdlet can be employed for hash calculations, and forensic tools often include
built-in validation features.
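
For example, in PowerShell (the image paths are hypothetical):

Get-FileHash -Path E:\Evidence\suspect.img -Algorithm SHA256
(Get-FileHash E:\Evidence\suspect.img -Algorithm SHA256).Hash -eq (Get-FileHash F:\Backup\suspect.img -Algorithm SHA256).Hash

- Description: The first command computes the SHA-256 digest of the acquired image; the second comparison returns True only if the working copy's hash matches the original image's hash.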

Windows Validation Methods:


- Forensic tools provide metadata that includes hash values for original media, ensuring that any
discrepancies are identified immediately, indicating potential corruption.

Defeating Anti-forensics Techniques


Anti-forensics and its Techniques:
Anti-forensics refers to a range of techniques and practices employed by individuals to obstruct, manipulate,
or eliminate evidence that could be collected during forensic investigations. These methods aim to hinder
investigators' ability to uncover the truth about criminal activities or illicit behavior.

Goals of Anti-Forensics:
- Impediment of Information Collection: Reduce the availability of relevant evidence, complicating the
investigator's task.
- Concealment of Criminal Activity: Hide or obscure actions taken by the perpetrator to avoid detection.
- Compromise Evidence Integrity: Alter evidence to mislead investigations and forensic analysis.
- Disguise Use of Anti-Forensics Tools: Remove indicators that anti-forensics measures have been applied.

Techniques of Anti-Forensics
1. Data/File Deletion
- Description: When files are deleted, the OS typically only removes the pointers to the data, marking the
space as available for future use. The actual data can often be recovered until it is overwritten.
- Implications: Although standard file deletion might seem effective, forensic investigators can often recover
deleted files using recovery tools, such as Recuva or EnCase. Advanced deletion methods, like using the
Secure Erase command, can mitigate this risk by overwriting the data.

2. Password Protection
- Description: This involves securing files or drives with passwords, limiting access to authorized users only.
- Techniques: Attackers may use strong, complex passwords, and tools like KeePass for management.
- Implications: Password cracking techniques (e.g., brute-force, dictionary attacks) can sometimes bypass
these protections. Investigators might use tools like John the Ripper or Hashcat to attempt recovery of
passwords.

3. Steganography
- Description: This technique involves concealing information within other non-suspicious files, such as
images, audio, or video.
- Tools: Software like OpenStego or Steghide enables users to hide data without significantly altering the
original file.
- Implications: Steganography can be difficult to detect with traditional forensic methods, making it a
powerful tool for evading detection.

4. Data Hiding in File System Structures



- Description: Some file systems (e.g., NTFS) support Alternate Data Streams (ADS), which allow data to be
stored within a file in a way that is not visible through standard file operations.
- Implications: Investigators may need specialized tools (like StreamArmor or ADS Scout) to detect and
analyze these hidden streams, complicating investigations.
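
As an illustration on an NTFS volume (the file and stream names are hypothetical), an alternate data stream can be created and then revealed from the Windows command prompt:

echo hidden text > note.txt:secret.txt
dir /r
more < note.txt:secret.txt

- Description: The first command writes data into an alternate stream attached to note.txt; `dir /r` lists any alternate data streams attached to files; `more <` displays the hidden stream's contents.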

5. Trail Obfuscation
- Description: Manipulating system logs, timestamps, and other records to mislead forensic investigators.
This includes deleting or altering log files and timestamps.
- Tools: Log cleaning tools, such as CCleaner or BleachBit, can be employed to erase traces of user activity.
- Implications: Accurate timeline reconstruction becomes challenging for investigators, obscuring the
perpetrator's actions.

6. Artifact Wiping
- Description: Permanently removing digital evidence through specialized software that overwrites data on a
hard drive multiple times.
- Tools: Software like DBAN (Darik's Boot and Nuke) or Eraser can securely wipe data.
- Implications: This technique significantly hampers data recovery efforts, as overwritten data is much
harder to retrieve.

7. Overwriting Data/Metadata
- Description: Attackers may overwrite existing data or modify metadata to obscure their activities, such as
altering the timestamps on files to mislead investigators.
- Implications: This complicates the forensic analysis and reconstruction of events, as the evidence trail
becomes less reliable.

8. Encryption
- Description: Data is encoded into an unreadable format using encryption algorithms, accessible only with
the correct decryption key.
- Tools: Common encryption tools include VeraCrypt, BitLocker, and TrueCrypt. These can encrypt
individual files or entire drives.
- Implications: Strong encryption can make data inaccessible to forensic investigators, especially if the key
is not available.

9. Program Packers
- Description: These tools compress and obfuscate executable files, making reverse engineering difficult.
- Implications: Investigators may struggle to analyze the true nature of the files, hindering malware detection
and analysis.

10. Minimizing Footprint


- Description: Reducing one’s digital presence by deleting unnecessary logs, files, and traces of online
activities.
- Implications: This makes it challenging for investigators to gather sufficient evidence to establish patterns
of behavior.

11. Exploiting Forensics Tool Bugs



- Description: Some forensic tools may have vulnerabilities that attackers can exploit to evade detection or
manipulate evidence.
- Implications: If investigators rely on flawed tools, they may draw incorrect conclusions or miss critical
evidence.

12. Detecting Forensics Tool Activities


- Description: Attackers may monitor forensics tools using rootkits or monitoring software to disable or
circumvent their functions.
- Implications: This proactive approach complicates investigations, as evidence may be tampered with or
removed before forensic analysis can occur.

Anti-forensics Countermeasures.
Anti-forensics poses significant challenges to digital forensic investigations by hindering the collection and
analysis of evidence. To combat these techniques effectively, investigators can employ various
countermeasures aimed at mitigating the impact of anti-forensics. Here’s an in-depth exploration of these
countermeasures:

1. Train and Educate Forensic Investigators


- Description: Continuous training and education for forensic investigators is essential. Understanding the
latest anti-forensics techniques empowers investigators to recognize and respond to such tactics effectively.
- Implementation: Conduct workshops, seminars, and training programs focused on anti-forensics. Include
case studies of previous incidents involving anti-forensic measures to enhance practical understanding.

2. Validate Results Using Multiple Tools


- Description: To ensure the accuracy of findings, investigators should validate their results by employing
multiple forensic tools. Different tools may have varying capabilities and may uncover different aspects of
the evidence.
- Implementation: Use a combination of commercial and open-source forensic tools to cross-check findings.
This reduces the likelihood of false positives or negatives resulting from tool-specific limitations.

3. Impose Strict Laws Against Illegal Use of Anti-forensics Tools


- Description: Legislative measures can deter the use of anti-forensic tools for illicit purposes. Stricter laws
create a legal framework that discourages individuals from employing such tactics.
- Implementation: Advocate for the development and enforcement of laws specifically targeting the illegal
use of anti-forensic tools. Collaborate with law enforcement agencies to ensure compliance and prosecution.

4. Understand Anti-forensic Techniques and Their Weaknesses


- Description: A thorough understanding of the various anti-forensic techniques and their inherent
weaknesses allows investigators to devise effective counter-strategies.
- Implementation: Develop comprehensive guides and resources that outline common anti-forensics methods
and their vulnerabilities. Encourage investigators to stay updated with the latest research in digital forensics.

5. Use Latest and Updated CFTs (Computer Forensics Tools)


- Description: Employing up-to-date forensic tools helps investigators stay ahead of potential anti-forensic
strategies. Older tools may lack the necessary features to combat new anti-forensics techniques.

- Implementation: Regularly assess and upgrade forensic tools to ensure they include the latest features and
functionalities. Conduct routine vulnerability tests on these tools to identify and address potential
weaknesses.

6. Save Data in Secure Locations


- Description: Storing data in secure, controlled environments minimizes the risk of tampering or deletion by
individuals employing anti-forensic measures.
- Implementation: Use encrypted storage solutions and secure access controls to safeguard evidence.
Implement robust backup procedures to ensure data integrity.

7. Use Intelligent Decompression Libraries


- Description: Compression bombs, which are files that expand massively when decompressed, can
overwhelm forensic tools. Employing intelligent decompression libraries can mitigate this risk.
- Implementation: Integrate advanced decompression libraries capable of detecting and handling
compression bombs within forensic tools, ensuring that they do not disrupt investigations.

8. Replace Weak File Identification Techniques with Stronger Ones


- Description: Utilizing strong file identification techniques enhances the ability to detect hidden or altered
files that might be part of anti-forensics strategies.
- Implementation: Adopt more sophisticated methods of file identification, such as analyzing file signatures
and employing heuristic-based detection, to improve the accuracy of forensic analyses.

Caution Against Over-reliance on Specific Tools


- Note: While employing various tools can enhance investigative capabilities, it is crucial not to depend
entirely on specific tools. Every tool has its limitations and can be vulnerable to attacks. A holistic approach
that combines multiple strategies, ongoing education, and a thorough understanding of anti-forensics will
yield better results in forensic investigations.

Anti-forensics Tools
Understanding the landscape of anti-forensics also involves recognizing the tools that may be employed by
individuals aiming to conceal their activities. Some commonly known anti-forensics tools include:

- Steganography Studio: A tool for hiding data within other files.


- CryptaPix: An application for encrypting images and hiding data.
- GiliSoft File Lock Pro: Software designed to protect files and folders.
- wbStego: A steganography tool for hiding messages in images.
- Data Stash: A program for secure data storage and retrieval.
- OmniHide PRO: Software that helps in hiding sensitive files.
- Masker: A tool for obscuring file information.
- DeepSound: An application for hiding data in audio files.
- DBAN (Darik's Boot and Nuke): A disk-wiping utility for securely erasing data.
- east-tec InvisibleSecrets: Software that hides and encrypts sensitive files.

Module-5
Windows and Linux Forensics

Volatile and Non-Volatile Information in Windows and Linux:


Volatile Information in Windows
Volatile information is crucial in digital forensics, as it contains valuable data that is temporarily stored in a
system's RAM (Random Access Memory). This information is lost when the system is powered off or
rebooted, making it essential for investigators to collect it quickly during a live data acquisition process.

1. Characteristics of Volatile Information


- Temporary Nature: Data stored in RAM is dynamic and can change frequently, often in response to user
interactions or system events. If the system is turned off, all data in RAM is lost.
- Critical Artifacts: Volatile information can provide important forensic artifacts, including:
- Logged-on Users: Information about users currently signed into the system, both locally and remotely.
- Command History: A record of commands executed in command-line interfaces.
- Shared Resources: Information about shared files and folders on the network.
- Network-related Information: Data on active network connections, IP addresses, and network statuses.
- Processes and Open Files: Information on running processes, the state of the system, and open files.

2. Collecting Volatile Information


The collection of volatile information must occur during live data acquisition. This process requires careful
attention to preserve the integrity of the data.

Key Steps in Collection:


1. Immediate Data Capture: Investigators should collect volatile data as soon as they arrive at the scene. Any
delay increases the risk of losing valuable information.

2. Locard's Exchange Principle: This principle states that whenever two objects come into contact, they
exchange material. Thus, investigators should minimize actions that could alter volatile data in memory.

3. Use of Command-Line Tools: Command-line tools are essential for retrieving volatile data. Here are some
important commands and their usage:

Key Commands for Collecting Volatile Information:

1. Collecting System Time


- Command: date /t & time /t

- Description: This command displays the current date and time, helping investigators establish the context
of events.

2. Collecting Logged-On Users


- PsLoggedOn: Displays local and remote logged-on users.
- Command:
psloggedon [-l] [-x] [\\computername | username]

- Net Sessions: Lists all logged-in sessions on the local computer.


- Command:
net sessions

- LogonSessions: Lists currently active logged-on sessions.


- Command:
logonsessions

3. Collecting Open Files


- Net File: Displays files opened on the server.
- Command:
net file

- NetworkOpenedFiles: Displays all files opened by remote systems.


- Command:
networkopenedfiles

4. Collecting Network Information


- Netstat: Displays network connections and their states.
- Command:
netstat -ano

- Nbtstat: Displays NetBIOS name resolution information.


- Command:
nbtstat -c

5. Process Information
- Tasklist: Lists all running processes along with their process IDs (PIDs).
- Command:
tasklist

- PsList: Provides detailed information about running processes.


- Command:
pslist

6. Process-to-Port Mapping
- Netstat (with -o switch): Displays PIDs associated with each network connection.
- Command:
netstat -ano

7. Examining Process Memory


- Process Explorer: Provides detailed information about running processes and their memory usage. While
not a command-line tool, it can be used to analyze processes visually.

- ProcDump: Captures process memory dumps during a spike in CPU usage.


- Command:
procdump -ma [pid] [dumpfile.dmp]

8. Collecting Network Status


- IPConfig: Displays the configuration of network interfaces.
- Command:
ipconfig /all

- PromiscDetect: A tool to check if network adapters are in promiscuous mode.

Non-Volatile Information in Windows


Non-volatile information encompasses data that persists even after the system is powered off. This type of
data is critical in digital forensics as it can reveal user activity, system configurations, and even recover
deleted files. Below, we delve deeper into the characteristics, sources, and examination techniques for non-
volatile information, incorporating relevant command-line utilities.

1. Characteristics of Non-Volatile Information

- Persistence: Unlike volatile data, which resides in RAM and disappears when the power is cut, non-volatile
data remains stored on disk drives or other permanent storage devices.
- Content Types: Examples include:
- Documents: Emails, spreadsheets, and word processing files.
- Deleted Files: Files that have been removed but not yet overwritten.
- Configuration Settings: Information stored in the Windows Registry about user preferences and system
settings.
- Historical Data: Logs and system events that provide insights into user actions.

2. Importance of Collecting Non-Volatile Information

- Evidence Recovery: It allows forensic investigators to retrieve evidence of criminal activity, user behavior,
and system configuration.
- Establishing Timelines: Non-volatile data helps in building a timeline of actions leading up to an incident.
- User Behavior Analysis: Understanding how users interact with the system can provide clues about their
intent and activities.

3. Sources of Non-Volatile Data

Non-volatile information can be collected from various sources:

File Systems
The Windows file system is structured to manage data efficiently. Understanding its components is essential
for forensic analysis.

- File System Data: Includes information about the file system structure (e.g., NTFS, FAT32).

- Content Data: The actual data stored within files.


- Metadata: Information like file size, creation, modification, and access timestamps (MAC).
- Application Data: Information related to applications that interact with the file system.

4. Command-Line Tools for Data Collection

Several command-line utilities can aid in the collection and analysis of non-volatile information. Here are
some key commands:

Viewing Directory Structure and Metadata

- Command:
dir /o:d

- Description: Lists files in the current directory sorted by date. This command can help identify the last
accessed or modified files, providing insights into user activity.

Examining File Properties

- Command:
dir /s /t:w

- Description: Recursively lists all files in the current directory and subdirectories, displaying their last
modified time. This is helpful for identifying recently modified files that may be of interest.

Accessing the Windows Registry

The Windows Registry holds a wealth of information about system configuration and user settings.
Investigators can use the following commands to access registry data.

- Command:
reg query HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall

- Description: Lists all installed applications. This can help investigators identify software that may be
relevant to the investigation.

- Command:
reg query HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs

- Description: Displays recently accessed documents. This can provide insights into user behavior.

5. ESE Database Files and Tools

ESE (Extensible Storage Engine) databases store important user and system information. These files
typically have the `.edb` extension and can be examined using specific tools.

Common ESE Database Files

- Windows.edb: Contains indexed data for Windows Search.


- contacts.edb: Stores contact information for Microsoft applications.
- DataStore.edb: Holds information about Windows updates.

Using ESEDatabaseView

The ESEDatabaseView utility is a powerful tool for analyzing .edb files. It presents the data in an organized
format, making it easier to extract evidence.

- Steps to Use ESEDatabaseView:


1. Download and Install ESEDatabaseView from Nirsoft.
2. Open the .edb file with the tool.
3. View tables and records.
4. Export data to formats like CSV or HTML for further analysis.

6. Windows Search Index Analysis

The Windows Search Index accelerates file searches by indexing content. The primary file is located at:

- Path:
C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb

Extracting Data from Windows.edb

To parse the Windows Search Index, investigators can use the ESEDatabaseView tool mentioned above.
This allows for analysis of indexed data, including deleted items and user searches.

7. Detecting Externally Connected Devices

Identifying devices that were connected to the system can reveal potential avenues of attack or evidence of
data exfiltration.

Using DriveLetterView

DriveLetterView is another utility that lists all drive letter assignments:

- Features:
- Shows local, remote, and removable drives, even if they are not currently plugged in.
- Allows exporting of drive lists to various formats.

8. Analyzing Slack Space

Slack space is the unused space in disk clusters that may contain remnants of previously stored files.
Analyzing this space can reveal hidden or deleted information.

Using DriveSpy

DriveSpy can collect slack space from an entire partition, making it easier for investigators to analyze this
potentially significant data.

9. Example of Analyzing Slack Space

To identify slack space:


1. Use a hex editor to analyze disk sectors directly.
2. Command:

fsutil sparse queryflag <filename>

- Description: Check if a file is sparse, which can indicate that it may have slack space.

Volatile Information in Linux:


Volatile information refers to data stored in a system's memory or temporary files, which is lost when the
system is powered down or restarted. In Linux forensics, collecting volatile data is crucial for constructing a
timeline of events related to an incident, identifying potential threats, and understanding the state of a system
during an investigation.

1. Hostname, Date & Time, and Time Zone

- Purpose: The hostname, current date and time, and the system's time zone help in correlating logs, network
traffic, and other data with the specific system and time period.

- Commands:
- Hostname: hostname

This command retrieves the system's hostname, which is useful for identifying the machine in a network.

- Date and Time:


date

This command displays the system’s date and time.

- Time Zone:
cat /etc/timezone

This retrieves the current time zone configuration. You can also obtain the epoch time, which is critical for
precise event timelines:
date +%s

2. System Uptime

- Purpose: The uptime command provides insight into how long the system has been running, which can be
crucial for determining the timeframe of malicious activities.

- Command:
uptime

This command shows the system's uptime, the number of logged-in users, and load averages for the past 1,
5, and 15 minutes.

3. Network Information

- Purpose: Network details such as active network interfaces, IP addresses, and routing tables help
investigators analyze the system's communication and detect abnormal connections.

- Commands:
- Network Interfaces and IP Addresses:
ip addr show

This command displays active network interfaces and their associated IP addresses.

- Checking for Promiscuous Mode:


Promiscuous mode allows a network interface to capture all packets, not just those addressed to it, often
used for malicious snooping. To check whether a network interface is in promiscuous mode, run the following
and look for the PROMISC flag in the output:
ifconfig <interface>

- Routing Table:
Displays the kernel's routing table to trace network communication paths:
netstat -rn

4. Open Ports

- Purpose: Open ports can indicate services running on the system. Attackers often exploit these to infiltrate
networks.

- Command:
netstat -tulpn

This command lists open TCP and UDP ports and the associated processes.

- Alternative:
Use nmap to scan for open ports on the local machine:

nmap -sT localhost (for TCP)

nmap -sU localhost (for UDP)

5. Running Processes

- Purpose: Investigating active processes helps identify malicious activities. Suspicious processes can
indicate malware, rootkits, or other threats.

- Command:
ps auxww

This command lists all processes running on the system, along with their resource consumption (e.g., CPU
and memory usage).

6. Open Files

- Purpose: Investigating open files is useful for identifying malicious programs or unusual activity.

- Command:
lsof

This lists all open files and the processes associated with them. You can filter results for specific users:
lsof -u <username>

7. Mounted Filesystems

- Purpose: Understanding the mounted filesystems helps investigators identify active disk partitions and
potential external storage devices connected to the system.

- Command:
mount

This command lists all mounted filesystems along with their mount points.

8. Kernel Modules

- Purpose: Attackers may load malicious kernel modules to manipulate the system. Investigating loaded
modules helps in identifying unauthorized changes to the kernel.

- Command:
lsmod

This command displays all currently loaded kernel modules.

9. Swap Areas and Disk Partition Information

- Purpose: Swap memory can hold remnants of data from running processes. Investigating swap areas can
provide insight into recently running applications or files.

- Command:

swapon --show

This command lists all active swap areas on the system.

- For Disk Partitions:


fdisk -l

This lists all disk partitions, useful for identifying attached storage devices.

10. Kernel Messages

- Purpose: The kernel logs important system events that may not appear in user-level logs. Reviewing kernel
messages can reveal suspicious activities, such as hardware failures or attempts to access restricted
resources.

- Command:
dmesg

This command retrieves kernel messages, which can be critical in analyzing system-level events.

Non-Volatile Information in Linux:


Non-volatile information refers to data that remains intact even after the system is powered off. In Linux
forensics, this type of data is crucial for reconstructing a timeline of events and uncovering potential traces
of an attack. It includes system configuration, user activities, and log files that persist across reboots.
Collecting non-volatile data helps investigators in building a more comprehensive understanding of the
incident under investigation.

Here are key types of non-volatile data, along with the methods to collect them.

1. System Information

- Purpose: System information reveals essential hardware and software configurations, which help
investigators understand the environment in which an incident occurred.

- Commands:
- CPU Details:
cat /proc/cpuinfo

This command displays detailed information about the system's CPU, including model, architecture, and
speed.

- Mount Points:
cat /proc/self/mounts

This shows currently mounted file systems and external devices, which could be significant if external
storage was used for illicit activities.

2. Kernel Information

- Purpose: The kernel version and configuration are critical for identifying vulnerabilities or security patches
that may have been exploited during an attack.

- Commands:
- Kernel Version:
uname -r

This provides the current kernel version, which is useful to determine if the system has outdated or
vulnerable components.

- Alternative Command:

cat /proc/version
or
hostnamectl | grep Kernel

3. User Account Information

- Purpose: Investigating user accounts helps identify suspicious accounts, unauthorized users, or privilege
escalation.

- Command:
- User Accounts:
cat /etc/passwd

This file contains information about all user accounts on the system. Each line in the file corresponds to a
user and includes details like username, user ID, group ID, home directory, and default shell.

- Filtering Usernames:

cut -d: -f1 /etc/passwd

This command lists only the usernames from the `/etc/passwd` file.

4. Currently Logged-in Users and Login History

- Purpose: Collecting login information helps determine who was logged into the system at the time of the
incident and if any unauthorized access occurred.

- Commands:
- Currently Logged-in Users:
w

This command shows information about users currently logged into the system.

- Login History:
last -f /var/log/wtmp

This retrieves the login history, including remote access attempts and system reboots, by analyzing the
`/var/log/wtmp` file.

5. System Logs

- Purpose: System logs store various activities related to the system’s operations, security events, and user
actions. These logs are key sources of evidence in forensic investigations.

- Important Logs:
- Authorization Logs:
cat /var/log/auth.log

This file contains information about user authentication, sudo command executions, and other
authorization events.

To specifically retrieve sudo-related logs:


grep sudo /var/log/auth.log

- System Messages:
cat /var/log/syslog

This retrieves general system messages, including error reports and status updates from running services.

- Kernel Logs:
cat /var/log/kern.log

This file stores all kernel-related messages, including initialization, errors, and warnings.

6. Linux Log Files

- Purpose: Different log files provide specific types of information based on the services running on the
system, such as web servers, mail servers, or database services. Understanding the contents of these logs
helps in pinpointing the timeline and scope of an attack.

- Common Linux Log File Locations:


- /var/log/auth.log: Logs related to user logins and authentication mechanisms.
- /var/log/kern.log: Kernel-related messages.
- /var/log/faillog: Logs failed user login attempts.
- /var/log/mysql: Logs related to the MySQL database server.
- /var/log/apache2: Logs for the Apache web server.
- /var/log/debug: Debugging messages.
- /var/log/dpkg.log: Logs related to package installations or removals.

7. User History Files

- Purpose: Shell history files can reveal commands executed by users, which can provide evidence of
malicious activities or attempts to cover up traces of an attack.

- Common Locations:
- Bash History:
cat ~/.bash_history

This file contains the history of commands executed by the user in the shell.

8. Hidden Files and Directories

- Purpose: Hidden files may contain malicious scripts or files intentionally concealed by an attacker.
Investigators need to uncover these files to fully assess the extent of an attack.

- Command:
find / -type f -name ".*"

This command lists hidden files (those starting with a `.`) throughout the system.

9. File Information and Suspicious Data

- Purpose: Investigating suspicious files and verifying file integrity are crucial for detecting tampered or
unauthorized files.

- Commands:
- File Information:

file <filename>

This command identifies the type of a given file, which can be useful when examining files with unknown
or altered extensions.

- Strings within a File:


strings <filename>

This command extracts printable strings from a binary file, which helps investigators search for hidden or
embedded data.

10. Writable Files

- Purpose: Investigating writable files helps identify files or directories that could have been modified during
the attack, especially if they contain unauthorized or malicious data.

- Command:
find / -type f -perm /222

This lists all writable files on the system, which may have been altered or tampered with during an
intrusion.

Windows Memory and Registry Analysis:


Windows memory and registry analysis are critical components of digital forensics, allowing investigators to
gather evidence of user actions, processes, and system configurations. This detailed explanation will cover
various aspects of memory and registry analysis, along with relevant commands and tools used in the
process.

1. Windows Memory Analysis


Windows memory analysis is an integral part of forensic analysis and involves acquiring physical memory
(RAM) dumps from the Windows machine. Examining these memory dumps helps investigators detect hidden
rootkits, find hidden objects, identify suspicious processes, and more.

Importance of RAM Analysis


Random Access Memory (RAM) plays a crucial role in storing volatile information essential for forensic
investigations. It contains traces of:
- Processes and Threads: Active applications and their states.
- Open Files: Files currently in use by applications.
- Network Connections: Active connections and communications.
- Hidden Applications: Programs running without user awareness.
- Encryption Keys: Sensitive data used for encryption and decryption.

Memory Dumps

A memory dump (or crash dump) captures the system's memory state during failures or specific triggers.
Analyzing these dumps can reveal evidence of internal errors or potential attacks.

In Windows, there are several types of memory dumps:


1. Automatic Memory Dump: The default setting for Windows to store memory dumps when a crash occurs.
2. Complete Memory Dump: Captures all the contents of RAM, providing comprehensive data.
3. Kernel Memory Dump: Only captures memory allocated to the kernel.
4. Small Memory Dump: A minimal set of information, typically useful for troubleshooting.

Crash Dumps

Crash dumps include information about system failures, providing insights into:
- Stop Messages: Error codes generated during failures.
- Loaded Drivers: Information about active drivers at the time of the crash.
- Processor State: Context of the processor when the failure occurred.

The crash dump is essential for forensic investigators, as it can indicate whether a crash was caused by a
system error or an external threat.

Analyzing Memory Dumps:

To analyze memory dumps, tools such as DumpChk can be utilized. This tool verifies the integrity of the
dump files and provides summary information.

Command for DumpChk:


DumpChk [-y SymbolPath] DumpFile

- SymbolPath: Specifies the symbol files to be used for analysis.


- DumpFile: The path to the dump file to be analyzed.

Collecting Process Memory


Investigators often need to capture specific process memory rather than all running processes. This can
involve capturing:
- Physical Memory: Data currently held in RAM.
- Virtual Memory: Data in the page file, which can include more extensive information about the process.

Tools for Process Memory Dumping:

1. Userdump.exe:
- This tool allows investigators to dump process memory without terminating the process.
- The generated dump file can be analyzed using Microsoft debugging tools.

Command to use Userdump:


userdump.exe <PID>

- <PID>: The process ID of the application to dump.

2. adplus.vbs:
- This VBScript tool can automate memory dumps based on specific conditions.

3. Process Dumper (procdump):


- A Sysinternals tool that provides advanced options for capturing process memory.

Command for procdump:


procdump -ma <PID> <DumpFilePath>

- -ma: Captures a full memory dump.


- <DumpFilePath>: The output file for the dump.

Random Access Memory (RAM) Acquisition

During investigations, capturing RAM is crucial for acquiring volatile data. Tools commonly used for RAM
acquisition include:

- Belkasoft RAM Capturer:


- This tool is designed for live memory acquisition, producing a physical memory dump.

- AccessData FTK Imager:


- A versatile tool for capturing memory as well as disk images.

To use FTK Imager for RAM capture:


1. Open FTK Imager.
2. Go to `File > Capture Memory`.
3. Select the destination to save the memory dump.

Memory Forensics: Malware Analysis Using Redline

Redline is a tool developed by FireEye for analyzing memory and identifying malicious activities. The
analysis involves:

1. Loading the RAM Dump:


- Navigate to the ‘Analyze Data’ section and load the RAM dump file.

2. Examining Processes:
- Under the ‘Processes’ tab, inspect all running processes at the time of the dump.

3. Checking Ports:
- Click on ‘Ports’ to view all network connections established by the processes.

Example:
- If `rundll32.exe` is shown to connect to a suspicious IP (e.g., `172.20.20.21` over port `4444`), further
investigation is warranted.

2. Windows Registry Analysis

The Windows Registry is a hierarchical database that stores settings and configurations for the operating
system and applications. Analyzing the registry helps investigators uncover user actions, installed software,
and system settings.

Key Components of the Windows Registry

The registry is divided into several hives:


- HKEY_CLASSES_ROOT (HKCR): Contains information about file associations and OLE objects.
- HKEY_CURRENT_USER (HKCU): Stores settings for the user currently logged in.
- HKEY_LOCAL_MACHINE (HKLM): Contains settings for the local machine, including software and
hardware configurations.
- HKEY_USERS (HKU): Stores settings for all users.
- HKEY_CURRENT_CONFIG (HKCC): Contains information about the current hardware configuration.

Types of Registry Hives



- Non-volatile: Stored on the disk (e.g., HKEY_LOCAL_MACHINE, HKEY_USERS).


- Volatile: Captured during live analysis (e.g., HKEY_CLASSES_ROOT, HKEY_CURRENT_USER).

Forensic Analysis of the Windows Registry


1. Static Analysis:
- Investigators examine registry files found in `C:\Windows\System32\config`. These files can be accessed
after acquiring a forensic image of the drive.

2. Live Analysis:
- Use the Windows Registry Editor or tools like FTK Imager to capture registry hives from a live system.
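
As an alternative to a GUI tool, registry data can also be exported from an elevated command prompt (the output paths are hypothetical), for example:

reg save HKLM\SAM C:\Evidence\SAM.hiv
reg export HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs C:\Evidence\recentdocs.reg

- Description: `reg save` copies a binary hive for offline analysis, while `reg export` writes a readable .reg text file of a selected subkey.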

Capturing Windows Registry Files with FTK Imager:


1. Open FTK Imager.
2. Navigate to `File > Obtain Protected Files`.
3. Select the relevant registry files for extraction.

Example of Extracted Registry Subkeys:


- SAM (Security Account Manager): Stores user account information and password hashes.
- Security: Contains current user security policy settings.
- Software: Lists installed applications and their settings.
- System: Contains configurations for hardware drivers and services.

Analyzing Registry Files

Investigators can use various tools to analyze registry files:


- Hex Workshop: A hex editor for examining binary data and editing registry files.

Example of using Hex Workshop:


1. Open Hex Workshop.
2. Load the extracted registry file.
3. Utilize features to search, edit, and analyze binary data.

Cache, Cookie, History Recorded in Web Browsers:


Web browsers maintain a record of user activities through caches, cookies, and browsing history. This
information is vital for forensic investigations, as it provides insights into online behavior, including visited
websites, downloaded files, and timestamps. This detailed analysis will focus on the mechanisms of caching,
cookie storage, and browsing history in popular web browsers, namely Google Chrome, Mozilla Firefox,
and Microsoft Edge.

1. Cache
Cache is a temporary storage area that holds copies of frequently accessed web resources, such as images,
scripts, and other data files. By caching these elements, browsers can quickly retrieve them, improving load
times for websites previously visited.

Purpose of Caching:

- Improved Performance: Faster access to web resources reduces loading times and improves user
experience.
- Reduced Bandwidth Usage: By storing copies of resources locally, less data needs to be downloaded
repeatedly, saving bandwidth.

Cache Locations by Browser:

- Google Chrome:
- Location: `C:\Users\{username}\AppData\Local\Google\Chrome\User Data\Default\Cache`
- Analysis Tool: ChromeCacheView
- Functionality: Displays a list of cached files along with details like URL, content type, file size, last
accessed time, expiration time, server name, and response code.

- Mozilla Firefox:
- Location: `C:\Users\<Username>\AppData\Local\Mozilla\Firefox\Profiles\XXXXXX.default\cache2`
- Analysis Tool: MZCacheView
- Functionality: Similar to ChromeCacheView, it allows investigators to see cached content and associated
metadata.

- Microsoft Edge:
- Location: `C:\Users\Admin\AppData\Local\Microsoft\Windows\WebCache`
- Analysis Tool: IECacheView
- Functionality: Displays cached items along with pertinent information for forensic analysis.

2. Cookies
Cookies are small pieces of data stored on the user’s device by web browsers to remember user preferences,
login sessions, and other information between visits. Cookies can be essential for maintaining user sessions
and personalizing browsing experiences.

Types of Cookies:
- Session Cookies: Temporary cookies that are deleted once the browser is closed.
- Persistent Cookies: Remain on the user’s device for a specified period, even after the browser is closed.

Cookie Locations by Browser:

- Google Chrome:
- Location: `C:\Users\{username}\AppData\Local\Google\Chrome\User Data\Default\Cookies`
- Analysis Tool: ChromeCookiesView
- Functionality: Lists all cookies with details such as hostname, path, name, value, secure status, HTTP-
only status, last accessed time, creation time, and expiration time.

- Mozilla Firefox:
- Location: `C:\Users\<Username>\AppData\Roaming\Mozilla\Firefox\Profiles\XXXXXX.default\cookies.sqlite`
- Analysis Tool: MZCookiesView
- Functionality: Similar capabilities to ChromeCookiesView, allowing for comprehensive cookie
examination.

- Microsoft Edge:
- Location:
`C:\Users\Admin\AppData\Local\Packages\Microsoft.MicrosoftEdge_xxxxxxxxx\AC\MicrosoftEdge\Cookies`
- Analysis Tool: EdgeCookiesView
- Functionality: Displays cookies stored by Edge, providing insights into user activity and preferences.

3. Browser History

Browser History records all the websites visited by the user, including the URLs, timestamps, and additional
metadata. This information can be pivotal for understanding a user’s online behavior and identifying any
potential criminal activities.

History Locations by Browser:

- Google Chrome:
- Location: `C:\Users\{username}\AppData\Local\Google\Chrome\User Data\Default\History`
- Analysis Tool: ChromeHistoryView
- Functionality: Displays all visited web pages, including URL, title, visit date/time, number of visits, and
referrer information. The data can be exported into various formats for further analysis.

- Mozilla Firefox:
- Location: `C:\Users\<Username>\AppData\Roaming\Mozilla\Firefox\Profiles\XXXXXX.default\places.sqlite`
- Analysis Tool: MZHistoryView
- Functionality: Provides access to browsing history in a structured format, making it easier to analyze
user behavior over time.

- Microsoft Edge:
- Location: `C:\Users\Admin\AppData\Local\Microsoft\Windows\History`
- Analysis Tool: BrowsingHistoryView
- Functionality: Similar to the tools used for Chrome and Firefox, it offers an overview of user browsing
activity.
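
Because Chrome's History file (and Firefox's places.sqlite) is an SQLite database, it can also be examined with the sqlite3 client. A minimal sketch, run against a copy of the Chrome History file so that the live database is not locked or modified:

sqlite3 History_copy "SELECT url, title, visit_count FROM urls ORDER BY last_visit_time DESC LIMIT 20;"

- Description: Lists the 20 most recently visited URLs recorded in Chrome's urls table, along with their titles and visit counts.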

Windows Files and Metadata.


Examining Windows files and their associated metadata is an essential aspect of digital forensics. This
examination helps investigators detect unauthorized changes to application files, uncover user activity, and
understand the context surrounding specific files. This detailed overview covers various types of Windows
files and metadata, their significance, and methods for analysis.

1. Key Components of Windows Files and Metadata

a. Restore Point Directories

- Purpose: Restore points are snapshots of system settings and files at a specific time, created during
significant system events such as installations, uninstalls, or updates.

- Importance: Investigators can analyze these directories to track changes made to application files,
including when applications were installed or removed.

b. Prefetch Files

- Purpose: Prefetch files store information about applications that have been executed on a Windows system.
- Importance: Even if an application has been uninstalled, its prefetch file remains, providing evidence of the
application's execution history.

c. Metadata

- Definition: Metadata refers to data about data, offering insights into the characteristics of files, including
their creation, access, and modification details.
- Importance: It contains evidentiary information crucial for investigations, such as timestamps, file size, and
author information.

d. Image Files and EXIF Data

- Purpose: Image files, especially JPEGs, often contain EXIF data, which includes metadata about the
image, such as the camera model and settings.
- Importance: Investigating EXIF data can provide context about how and when an image was taken.

2. Windows File Analysis

System Restore Points

- Rp.log Files:
- These log files reside in the restore point directories and indicate the creation event of a restore point,
including its type and description.
- They are instrumental for noting the date of application installations or removals.

- Change.log.x Files:
- Change.log files document changes to monitored system and application files, providing a sequence of
alterations along with original filenames.
- When a monitored file is modified, it is preserved in a restore point directory with a new name format
(e.g., Axxxxxxx.ext).

3. Prefetch Files

- Location: C:\Windows\Prefetch

- Contents: Prefetch files provide:


- The number of times an application has been launched (DWORD value at offset 144).
- The last time the application was run (DWORD value at offset 120, in UTC format).

- Forensic Use: Investigators can correlate prefetch data with registry or Event Log information to identify
user sessions and application usage.
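
As a quick triage sketch in PowerShell (administrative rights may be required to read the Prefetch folder), the prefetch files can be listed with their last-updated timestamps:

Get-ChildItem C:\Windows\Prefetch -Filter *.pf | Sort-Object LastWriteTime -Descending | Select-Object Name, LastWriteTime

- Description: Shows which applications have .pf entries and when each prefetch file was last written, giving a rough view of recent program execution.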

Prefetching

- Definition: A Windows feature to speed up application launches by collecting data for the first 10 seconds
after an application starts.
- Registry Control: The prefetching process can be controlled via:

HKEY_LOCAL_MACHINE\SYSTEM\ControlSet00x\Control\Session Manager\Memory Management\PrefetchParameters

4. Image File Analysis and EXIF Data

- EXIF Metadata:
- Commonly found in JPEG files, this data includes details such as camera settings, timestamp, and GPS
coordinates.
- Tools like Exiv2, IrfanView, and ExifTool can be used to view and extract this metadata.
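
For example, using ExifTool (the image name is hypothetical):

exiftool photo.jpg
exiftool -Model -DateTimeOriginal -GPSPosition photo.jpg

- Description: The first command dumps all metadata stored in the file; the second prints only the camera model, original capture time, and GPS position tags.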

5. Understanding Metadata
- What is Metadata? Metadata provides structured data about files, including details on who created,
accessed, and modified them.
- Examples of Metadata:
- Organization name
- Author name
- Computer name
- Hidden text
- Document versions
- GPS data

Usefulness in Investigations

- Metadata can reveal hidden data, identify individuals who attempted to obscure data, and correlate
information from various documents.

6. Metadata in Different File Systems

MAC Timestamps

- Definition: MAC timestamps refer to the Modified, Accessed, and Created times of a file.
- Storage Differences:
- FAT File System: Uses local time for timestamps.
- NTFS File System: Stores timestamps in Coordinated Universal Time (UTC).

Impact of File Operations on Timestamps



- FAT16 System:
- Copying a file: Modification date remains, creation date updates.
- Moving a file: Both dates remain unchanged.

- NTFS System:
- Copying a file: Same behavior as FAT16.
- Moving a file: Same behavior as FAT16.

7. Metadata in Specific File Types

PDF Files

- Contents: PDF metadata can include the author, creation date, and application used.
- Extraction Tools: Tools such as `pdfmeta.pl` and `pdfdmp.pl` can be used to extract metadata.

Word Documents

- Metadata Characteristics: Word documents can include hidden data, previous revisions, and a list of
authors.
- Extraction Methods: Use built-in inspection tools in Microsoft Word to view metadata:
1. Click on the File tab.
2. Select Info.
3. Click Check for Issues → Inspect Document.

8. Metadata Analysis Tools

- Metashield Analyzer: An online tool that allows investigators to analyze file metadata.
- Usage Steps:
1. Select the file for analysis.
2. Accept terms and conditions.
3. Click Analyze to view the output.

Analyze Filesystem Images Using the Sleuth Kit or Autopsy.


Introduction to The Sleuth Kit (TSK)
The Sleuth Kit (TSK) is an open-source digital forensic toolset that allows investigators to analyze disk
images, particularly file systems and volumes, to uncover important information during investigations. It is
often used in incident response and forensic investigations to perform deep dives into file systems, examine
metadata, and recover deleted files. Autopsy, which is a graphical interface for TSK, further simplifies the
process by providing a user-friendly platform for analysis.

TSK and Autopsy can be used to analyze both live and dead systems (disk images) for recovering files,
examining metadata, and tracking changes. This section explains how to use specific TSK commands and
the features of Autopsy for forensic file system analysis.

File System Analysis Using The Sleuth Kit



1. Using the `fsstat` Command – File System Information

The `fsstat` command in TSK provides comprehensive information about a file system. This command helps
forensic investigators get details about the volume, such as file system type, size, layout, and timestamps. It
can also reveal file system metadata, such as the last time it was mounted or the volume ID. Knowing these
attributes helps investigators understand the structure of the file system under investigation.

Command:

fsstat -i <input_filetype> <filename.extension>

- `-i` option specifies the type of input (e.g., disk image type or file type).
- `filename.extension` is the path to the disk image or file system being examined.

Output Information:
- File System Type: The type of file system (e.g., ext3, NTFS, FAT).
- Volume Size: Size of the file system in sectors or bytes.
- Mounted Timestamps: The last time the volume was mounted.
- Volume ID: A unique identifier for the volume.
- Mount Directory: The directory where the file system was last mounted.

This information can provide insight into when and how the disk was used, possibly revealing if it was
tampered with or mounted during unauthorized activities.

2. Using the `fls` Command – Listing Files and Directories

The `fls` command in TSK lists all files and directories on a specified image file, including deleted files.
This command is useful for quickly getting an overview of the contents of the disk image or file system
under investigation, which is crucial for identifying important files or artifacts.

Command:

fls -i <image_type> <imagefile_name>

- `-i` option specifies the image type (e.g., raw, ewf).


- `imagefile_name` is the disk image to be analyzed.

Output Information:
- List of Files and Directories: Including file names, directory structures, and their inodes.
- Recently Deleted Files: These are often listed, making it possible to examine deleted data that hasn't been
fully removed from the file system.

This command is highly useful for locating files, including hidden and deleted files, that are essential in
forensic investigations.
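As a minimal sketch (image name assumed), deleted entries can be listed recursively, and the same command can also feed timeline creation:

fls -r -d -i raw -f ntfs evidence.img
fls -r -m / -i raw -f ntfs evidence.img > bodyfile.txt

- `-r` recurses through directories, `-d` restricts output to deleted entries, and `-m /` writes body-file output that timeline tools such as `mactime` can consume.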

3. Using the `istat` Command – Viewing File Metadata



The `istat` command is used to display detailed metadata of a file, including information about its MAC
times (Modification, Access, and Creation times), file size, access permissions, and other attributes. This is
critical in tracking user actions on files and determining the timeline of file access, modifications, and
deletions.

Command:

istat -f <fstype> -i <imgtype> <imagefile_name> <inode_number>

- `-f` option specifies the file system type (e.g., ext3, NTFS).
- `-i` option specifies the image type.
- `inode_number` is the inode number of the file whose metadata you want to inspect.

Output Information:
- MAC Times (Modification, Access, and Creation Times): These timestamps give a history of when the file
was last modified, accessed, and created, helping investigators to build a timeline of activity.
- File Size and Permissions: Size of the file and access permissions (read, write, execute).
- Other Metadata: Information such as the number of hard links to the file and the block addresses used by
the file.

The `istat` command is often used in combination with `fls` to examine specific files based on their inode
numbers.
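A minimal example, assuming the inode number 1280 was taken from earlier `fls` output on a hypothetical image:

istat -i raw -f ntfs evidence.img 1280

The output includes the MAC times, file size, permissions/attributes, and the block (or cluster) addresses allocated to that entry.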

Autopsy: A Graphical Interface for TSK

While TSK provides command-line utilities for analyzing file systems and disk images, Autopsy provides a
graphical user interface (GUI) that makes the process easier and more intuitive. It is particularly useful for
investigators who are not familiar with command-line interfaces, as it allows them to leverage TSK’s
powerful features through a point-and-click interface.

Key Features of Autopsy:

1. Disk Image Analysis:


Autopsy allows investigators to load and analyze disk images (raw, E01, etc.). Once loaded, it performs an
automated scan of the image to identify and classify files.

2. File System and Metadata Viewing:


Similar to the `fsstat`, `fls`, and `istat` commands in TSK, Autopsy provides a visual interface to browse
through the file system and view metadata such as MAC times, file permissions, and ownership.

3. File Recovery:
Autopsy can locate and recover deleted files that may still reside on the disk but are no longer visible in
the file system.

4. Timeline Analysis:

One of the most powerful features of Autopsy is its ability to create a timeline of events. By correlating
timestamps from file system metadata, log files, and other artifacts, Autopsy helps investigators build a
detailed timeline of user activities and system events leading up to and during an incident.

5. Keyword Search:
Investigators can search for specific keywords or patterns across the file system, helping them locate
incriminating evidence such as malicious scripts, confidential data, or communication logs.

6. Hashing and Signature Analysis:


Autopsy includes built-in hashing functions that can be used to generate MD5, SHA1, or other hashes of
files, which is crucial for verifying the integrity of files or identifying known malicious files (a command-line equivalent is sketched after this list).

7. Reporting:
Autopsy automatically generates reports based on the analysis, including detailed logs of actions taken,
files analyzed, and key findings. These reports can be used as evidence in legal proceedings.
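The hashes Autopsy computes (feature 6 above) can also be generated manually for spot checks; the file name below is a placeholder:

sha1sum suspicious.exe
certutil -hashfile suspicious.exe SHA1

The first command applies on Linux, the second on Windows. Comparing the results against the values recorded by Autopsy, or against known-bad hash sets, confirms file integrity or flags known malware.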

Workflow for Using TSK and Autopsy Together

1. Load Disk Image:


- In TSK: Use the `fls` and `istat` commands to explore the image and gather information about files and
their metadata.
- In Autopsy: Load the disk image into Autopsy for a more interactive and detailed view of the file system.

2. Identify Suspicious Files:


- In TSK: Use `fls` to identify files, directories, and deleted files, and then inspect metadata with `istat`.
- In Autopsy: Browse the file system visually, and use keyword search to locate specific files or patterns.

3. Recover Deleted Data:


- Both TSK and Autopsy can recover deleted files. In TSK, use `fls` to list deleted files. In Autopsy, use
the recovery tools to restore these files.

4. Analyze Metadata:
- Analyze MAC times, file permissions, and other attributes to track user activity and potential tampering.
Use `istat` in TSK or the metadata viewer in Autopsy.

5. Generate Reports:
- Use Autopsy’s built-in reporting tools to generate comprehensive reports, which include file listings,
metadata analysis, and keyword search results.
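As a rough end-to-end sketch of the TSK side of this workflow (the image name, inode number, and output file are assumptions for illustration):

fsstat -i raw -f ntfs evidence.img                 # confirm file system type and layout
fls -r -d -i raw -f ntfs evidence.img              # list deleted entries and note inode numbers
istat -i raw -f ntfs evidence.img 1280             # review metadata for a file of interest
icat -i raw -f ntfs evidence.img 1280 > recovered.bin   # extract its content for analysis

The recovered content and the same disk image can then be loaded into Autopsy for keyword search, timeline analysis, and reporting.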

Investigating Web Attacks-


Web Application Forensics:
Web application forensics is a specialized field within digital forensics that focuses on the investigation of
security incidents involving web-based applications. Web applications are used to manage and exchange
information across various sectors, such as enterprises and government agencies. Due to their widespread
use, web applications are prime targets for attackers, necessitating forensic investigations to trace the origin
of attacks, identify how they were executed, and determine what systems or devices were involved.

Introduction to Web Application Forensics:

Web applications allow users to interact with a central server through a browser, enabling data submission
and retrieval from a database. This communication happens through standardized formats such as HTML
and XML, regardless of the operating system or browser. However, due to vulnerabilities, attackers can
exploit web applications to access sensitive information like user credentials and financial data.

When an attack occurs, web application forensics is needed to:


- Investigate logs, directories, configuration files, and other data.
- Trace the origin of the attack.
- Understand how the attack propagated.
- Identify which devices (mobile or desktop) were involved.

Forensic investigators must examine server, network, and host machine logs to gather clues about the attack.

Challenges in Web Application Forensics:

1. Distributed Nature: Web applications interact with various hardware and software components, making it
difficult to trace attacks as logs may be spread across multiple systems (e.g., IIS, Apache).
2. Volatile Data: Capturing live/volatile data (e.g., processes, network connections) without taking the
website offline is a challenge. Taking a site offline for imaging can impact business operations.
3. Log Volume: High-traffic websites generate large amounts of logs, making it hard to collect and analyze
them.
4. Diverse Logs: Investigators must be familiar with different web and application servers to analyze logs
that may be in various formats.
5. Anonymization Tools: Attackers often use proxies and anonymizers, making it difficult to trace them.
6. Access Restrictions: Many web applications restrict access to HTTP information, which is essential for
distinguishing between valid and malicious HTTP requests.

Key Data Collected in Web Application Forensics:

- Date and time of the request.


- IP address initiating the request.
- HTTP method (GET/POST).
- URI and query.
- HTTP headers and request body.
- Event logs, file listings, and timestamps.

Indicators of a Web Attack:

- Denial of Service (DoS): Legitimate users are denied access, and customers may report unavailability of
services.
- Redirecting to Malicious Sites: Users being redirected to unknown, malicious websites is a sign of an
attack.

- Anomalies in Logs: Suspicious log entries, changes in passwords, and creation of new user accounts
indicate potential breaches.
- Error Messages: HTTP error pages like "500 Internal Server Error" can signify SQL injection attempts.

Common Web Application Threats:

1. Cookie Poisoning: Modifying cookies to bypass security or gain unauthorized access.


2. SQL Injection: Injecting malicious SQL commands to extract sensitive data.
3. Cross-Site Scripting (XSS): Injecting malicious scripts that can hijack sessions or deface websites.
4. Cross-Site Request Forgery (CSRF): Making authenticated users unknowingly perform actions on the
attacker’s behalf.
5. Directory Traversal: Accessing directories outside the web server’s root directory.
6. Denial of Service (DoS): Overloading a server to deny service to legitimate users.
7. Sensitive Data Exposure: Poor encryption techniques lead to unauthorized access to sensitive information.
8. Broken Authentication: Exploiting vulnerabilities in authentication mechanisms to impersonate users.
9. Security Misconfigurations: Using default configurations or outdated software increases the risk of
attacks.
10. Log Tampering: Altering logs to hide traces of the attack.
11. Broken Access Control: Exploiting flaws in access control policies to gain unauthorized access.

Web Attack Investigation Methodology:

1. Interviews: Interview individuals involved to gather preliminary information about the attack.
2. Server Seizure: Identify and isolate compromised servers or devices to prevent further damage.
3. Forensic Imaging: Create forensic images of affected systems for analysis.
4. Log Collection: Gather logs from various sources, such as web server logs, SIEM tools, and web
application firewall (WAF) logs.
5. Encryption and Integrity: Use encryption and checksums to ensure log integrity during collection and
analysis.
6. Log Analysis: Analyze logs to find patterns or suspicious entries that correlate to the attack.
7. IP Tracing: Attempt to trace the attacker’s IP address, though anonymization tools may complicate this
process.
8. Documentation: Meticulously document every step of the investigation for potential legal proceedings.

IIS, Apache Web Server Logs:

IIS (Internet Information Services) Logs

IIS is a Microsoft-developed web server that handles HTTP requests and other protocols such as HTTPS,
FTP, and SMTP. It operates on Windows servers and can host websites and applications. Its log files capture
essential data that can be used in security investigations, as well as for performance monitoring. Logs play a
critical role in identifying suspicious activity and reconstructing security incidents.

Components and Architecture of IIS


IIS includes several core components that work together to process HTTP requests:
- HTTP.sys: The protocol listener that intercepts client requests.

- World Wide Web Publishing Service (WWW Service): Configures the HTTP.sys listener.
- Windows Process Activation Service (WAS): Manages application pools and worker processes, ensuring
the system is properly configured for the incoming requests.

When a browser sends a request to IIS:


1. HTTP.sys intercepts the request.
2. WAS reads configuration data from `ApplicationHost.config`.
3. The worker process handles the request, and the response is sent back through HTTP.sys.

IIS Log Format and Storage Location


- The default location for IIS log files is: `%SystemDrive%\inetpub\logs\LogFiles`.
- The logs use the W3C Extended Log File Format and contain essential fields such as:
- Date/Time (UTC)
- Client IP Address
- User Agent
- HTTP Method (GET/POST)
- Status Code
- Request Processing Time

Example of IIS Log Entry:

2019-12-12 06:11:41 192.168.0.10 GET /images/content/bg_body_1.jpg - 80 - 192.168.0.27 Mozilla/5.0 Chrome/48.0 200 0 0 365

Fields explained:
- `2019-12-12 06:11:41`: Timestamp of the request.
- `192.168.0.10`: Server IP address.
- `GET`: HTTP method.
- `/images/content/bg_body_1.jpg`: Requested resource.
- `192.168.0.27`: Client IP address.
- `Mozilla/5.0 Chrome/48.0`: Client browser information.
- `200`: HTTP status code (OK).
- `365`: Time taken to process the request in milliseconds.

IIS Log Analysis in Forensics


- Locate Logs: Go to IIS Manager, expand the server, and navigate to `Logging` to see the log directory.
- Time Analysis: Logs are recorded in Coordinated Universal Time (UTC), requiring careful time zone
adjustments during analysis.
- Investigation Use Cases:
- Client IP Tracking: Investigators can determine which IP addresses accessed specific resources.
- HTTP Methods: A high number of POST requests could indicate brute-force attempts or SQL injection.
- Error Codes: Unusual error codes like `404` (Not Found) or `500` (Internal Server Error) could indicate
failed exploits or attempts to locate vulnerabilities.
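For quick triage, the built-in `findstr` utility can filter an IIS log from the command line; the log file name below follows the usual u_exYYMMDD.log naming convention and is an assumption:

C:\> findstr /i "POST" C:\inetpub\logs\LogFiles\W3SVC1\u_ex191212.log
C:\> findstr /c:" 500 " C:\inetpub\logs\LogFiles\W3SVC1\u_ex191212.log

The first command surfaces POST-heavy activity (possible brute-force or injection attempts); the second isolates server errors that may correspond to failed exploits.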

Apache Web Server Logs



The Apache HTTP Server is a highly modular, open-source web server that supports many operating
systems (Windows, Linux, macOS, etc.). Its logging capabilities capture all HTTP requests made to the
server, helping administrators troubleshoot issues and track malicious activities.

Apache Architecture and Components


The server consists of:
- Apache Core: Manages basic server functions.
- http_protocol: Handles data exchange between the server and clients.
- http_main: Manages server startups and client connections.
- http_request: Manages request processing and error handling.
- Apache Modules: Extend functionality to handle authentication, authorization, and content generation.

Apache Log Types


1. Access Log: Captures every HTTP request received by the server.
2. Error Log: Records errors and diagnostics that occur during request processing.

Access Log Location:


- Linux:
- `/var/log/httpd/access_log` (Red Hat/CentOS)
- `/var/log/apache2/access.log` (Debian/Ubuntu)
- Windows: `C:\Program Files\Apache Group\Apache2\logs\access.log`

Access Log Formats


1. Common Log Format (CLF): Contains basic information like client IP, timestamp, request method, status
code, and response size.
2. Combined Log Format: Extends CLF by adding the `Referer` and `User-Agent` fields.

Example of Apache Access Log Entry (Combined Log Format):

10.10.10.10 - Jason [17/Aug/2019:00:12:34 +0300] "GET /images/bg.jpg HTTP/1.0" 500 1458 "http://abc.com/login.php" "Mozilla/5.0 Firefox/73.0"

Fields explained:
- `10.10.10.10`: Client IP address.
- `Jason`: Authenticated user (if applicable).
- `[17/Aug/2019:00:12:34 +0300]`: Date and time.
- `"GET /images/bg.jpg HTTP/1.0"`: Request method, URI, and protocol.
- `500`: HTTP status code (error).
- `1458`: Response size in bytes.
- `"http://abc.com/login.php"`: Referer (previous page).
- `"Mozilla/5.0 Firefox/73.0"`: Client browser and platform.

Apache Error Logs


The error log captures any issues or warnings, such as missing files, permission errors, or server
misconfigurations.

Example of Apache Error Log Entry:

[Wed Aug 28 13:35:38.878945 2020] [core:error] [pid 12356:tid 8689896234] [client 10.0.0.8] File not
found: /images/folder/pic.jpg

Fields explained:
- `[Wed Aug 28 13:35:38.878945 2020]`: Timestamp.
- `[core:error]`: Module and severity level.
- `[pid 12356:tid 8689896234]`: Process and thread IDs.
- `[client 10.0.0.8]`: Client IP.
- `File not found: /images/folder/pic.jpg`: Error description.

Forensic Analysis of Apache Logs


- IP Address Tracking: Helps trace suspicious activity by identifying the source IP.
- HTTP Methods: Anomalies in HTTP methods (e.g., unexpected PUT or DELETE requests) can be an
indicator of attempted exploitation.
- Error Log Investigation: Helps pinpoint issues like resource unavailability, unauthorized access, or
configuration problems.
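Two quick checks on a hypothetical Debian/Ubuntu system illustrate this kind of triage:

grep -E '"(PUT|DELETE) ' /var/log/apache2/access.log                              # flag unusual HTTP methods
awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head   # rank client IPs by request count

High request counts from a single IP address, or unexpected methods, are starting points for deeper analysis rather than proof of an attack.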

Investigating Web Attacks on Windows-based Servers:


Investigating web attacks on Windows-based servers requires a systematic approach to identifying
vulnerabilities and signs of compromise. Attackers often exploit weaknesses in web applications hosted on
these servers to gain unauthorized access, perform malicious activities, or disrupt services. This process
involves analyzing logs, network connections, active sessions, processes, and system configurations to
detect potential security breaches. Below is a detailed step-by-step guide on how to investigate web attacks
on Windows-based servers.

Understanding the Context:


- Prevalence of Windows-based Servers: According to recent data, Microsoft Windows operating systems
hold about 77.64% of the global market share in web server environments. Due to their popularity,
Windows-based servers are prime targets for attackers. These servers often host a variety of web
applications, which may have exploitable vulnerabilities.
- Types of Attacks: Common attacks on web applications and servers include:
- Directory Traversal: Attackers try to access restricted directories and files on the server.
- Command Injection: Malicious commands are injected into web forms or URLs to control the server.
- Denial of Service (DoS): Attackers overwhelm the server with traffic to disrupt its operations.

Steps to Investigate Web Attacks on Windows-based Servers

1. Review Event Logs Using Event Viewer


- Command: `C:\> eventvwr.msc`
- Purpose: Event Viewer is a built-in tool that records system events, including errors, warnings, and
information related to system activity. Logs can reveal:
- Failed logins: Numerous failed login attempts could indicate a brute-force attack.
- Service failures: If critical services like logging are disabled, it may indicate tampering.

- File protection status: Windows File Protection, when disabled, leaves the system vulnerable to file
modification by attackers.
- Telnet service running: Telnet is rarely used in modern environments. Its unexpected activation can
signal unauthorized remote access attempts.

Suspicious Events to Look For:


- Event log service has ended: If logging stops, attackers may be trying to hide their tracks.
- Windows File Protection inactive: Disabling this protection could allow unauthorized file changes.
- MS Telnet Service running: Rarely used legitimately, this could indicate backdoor access.

2. Check for Failed Login Attempts or Locked-out Accounts


- Purpose: A spike in failed login attempts or locked-out user accounts could indicate an attempted brute-
force attack or unauthorized access attempts.
- How to Investigate:
- Check Event Viewer under "Security" logs for failed login attempts (Event ID 4625).
- Look for locked-out accounts, as attackers might try multiple passwords against administrative or high-
privilege accounts.
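As an illustration, recent failed logons can also be pulled from the command line with `wevtutil`; the event count of 20 is arbitrary:

C:\> wevtutil qe Security /q:"*[System[(EventID=4625)]]" /c:20 /rd:true /f:text

This queries the Security log for Event ID 4625 entries and returns the newest 20 in readable text form.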

3. Review File Shares


- Command: `C:\> net view <IP Address>`
- Purpose: File shares may be exploited if they are not properly secured. Attackers often look for open
shares to gain unauthorized access to sensitive files.
- How to Investigate:
- Use the `net view` command to list shared resources on the server.
- Ensure that file shares are not unnecessarily open to the public or unauthorized users.

4. Analyze Open Sessions


- Command: `C:\> net session`
- Purpose: This command shows all active user sessions on the server. Unauthorized sessions can indicate
an ongoing attack or compromise.
- How to Investigate:
- List all active sessions on the server.
- Identify any unusual or unexpected sessions, especially those from suspicious IP addresses.

5. Check Connections to Other Systems


- Command: `C:\> net use`
- Purpose: Attackers may attempt lateral movement within a network after compromising a server. The `net
use` command shows network shares and connections with other systems.
- How to Investigate:
- Review the connections the server has with other networked systems.
- Look for unauthorized or unusual connections that could indicate data exfiltration or further spread of
the attack.

6. Analyze NetBIOS over TCP/IP Activity


- Command: `C:\> nbtstat -S`

- Purpose: The NetBIOS protocol over TCP/IP is used for file sharing and name resolution in a Windows
environment. Monitoring NetBIOS activity can help identify unauthorized access.
- How to Investigate:
- Check the current NetBIOS sessions.
- Look for connections from untrusted or suspicious devices attempting to use NetBIOS services for data
or resource access.

7. Check for Unusual Port Activity


- Command: `C:\> netstat -na`
- Purpose: `netstat` displays active network connections and listening ports. Unusual or unexpected
listening ports may indicate a backdoor or malware installed on the server.
- How to Investigate:
- List all active TCP and UDP connections and listening ports.
- Pay close attention to high-numbered or unexpected ports, as these could be used by attackers for covert
communication.
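A common follow-up is to map a suspicious listening port back to its owning process; the PID below is a placeholder:

C:\> netstat -nao | findstr LISTENING
C:\> tasklist /FI "PID eq 4321"

The `-o` switch in `netstat` adds the process ID to each connection, and `tasklist /FI` filters the process list down to that PID so the executable behind the port can be identified.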

8. Examine Scheduled Tasks


- Command: `C:\> schtasks.exe`
- Purpose: Scheduled tasks can be used by attackers to maintain persistence by running malicious scripts
or programs at specific intervals.
- How to Investigate:
- Use `schtasks` to list scheduled tasks on the server.
- Look for unauthorized or unusual tasks that could be part of an attack (e.g., running malware
periodically).
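Verbose output makes it easier to see exactly what each task executes; for example (field names may vary slightly by Windows version):

C:\> schtasks /query /fo LIST /v | findstr /c:"TaskName" /c:"Task To Run"

This lists every task name alongside the command it runs, which helps spot persistence mechanisms at a glance.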

9. Check for New Administrative Accounts


- Command: `Start -> Run -> lusrmgr.msc -> OK`
- Purpose: Attackers often create new accounts with administrative privileges to maintain access after
compromising a system.
- How to Investigate:
- Open the Local Users and Groups Manager (`lusrmgr.msc`).
- Review the list of users, especially those in the Administrators group.
- Look for any unfamiliar or unauthorized accounts.

10. Review Running Processes in Task Manager


- Command: `Start -> Run -> taskmgr -> OK`
- Purpose: Suspicious processes running on the server can indicate malware or unauthorized software
being executed.
- How to Investigate:
- Use Task Manager to check for unusual or unauthorized processes.
- Pay close attention to processes that mimic legitimate system processes but have unusual names or
locations.

11. Check Active Network Services


- Command: `C:\> net start`

- Purpose: The `net start` command lists all active network services on the server. Attackers might install
rogue services to perform malicious activities.
- How to Investigate:
- List all running network services.
- Investigate any unfamiliar services, as they could be linked to malicious software or unauthorized
activity.

12. Monitor Disk Space Usage


- Command: `C:\> dir`
- Purpose: Unexpected changes in disk space (e.g., sudden decreases in available space) could be due to
the installation of malware, creation of large log files, or uploading of files by attackers.
- How to Investigate:
- Use the `dir` command to monitor disk usage.
- Look for significant decreases in free space, especially if the server is running normally otherwise.

Detect and Investigate Attacks on Web Applications.


Web applications are frequently targeted by attackers because of vulnerabilities such as improper input
validation and lack of sanitation methods, which can lead to various forms of attacks. Detecting and
investigating these attacks is crucial to understanding the extent of the compromise and preventing future
incidents. This process often involves examining log files from web servers, intrusion detection systems
(IDS), web application firewalls (WAFs), and security information and event management (SIEM) systems
to trace attack signatures and establish a timeline of events. Below, we explore methods to detect and
investigate two common types of attacks on web applications: Cross-Site Scripting (XSS) and SQL
Injection.

1. Investigating Cross-Site Scripting (XSS) Attack

Cross-Site Scripting (XSS) is a common attack where an attacker injects malicious scripts into web pages
that users view in their browsers. These scripts are often in the form of JavaScript but can also include
HTML or CSS. Attackers aim to hijack user sessions, deface websites, or spread malware.

1.1 XSS Detection Methods


To bypass security mechanisms like firewalls, IDSs, IPSs, and antivirus software, attackers often use various
obfuscation techniques. Common methods include:

- Hex encoding: Replaces characters with their hexadecimal equivalents.


- In-line comments: Inserts comments within the attack string to disrupt pattern detection.
- Char encoding: Encodes characters as their ASCII or Unicode values.
- Toggle case: Alternates the case of keywords to evade case-sensitive filters.
- Replaced keywords: Substitutes keywords like `alert` or `script` with other terms.
- White-space manipulation: Inserts extra white space or encoded spaces in the script.

1.2 Regular Expressions for Detecting XSS Attacks


XSS attacks can be detected using regular expressions, which help identify patterns that match known
XSS signatures. Below are common examples used for detecting different types of XSS:

- Simple XSS detection:

/((\%3C)|<)((\%2F)|\/)*[a-zA-Z0-9\%]+((\%3E)|(\%253E)|>)/ix

- This checks for opening and closing angle brackets `< >` or their hex equivalents in a web request, which
are typical indicators of injected HTML tags.

- Detecting `<img src>` XSS attacks:

/((\%3C)|<)((\%69)|i|(\%49))((\%6D)|m|(\%4D))((\%67)|g|(\%47))[^\n]+((\%3E)|>)/i

- This looks specifically for obfuscated `<img>` tags, which are often used in XSS attacks to execute
arbitrary JavaScript.

- Detecting general HTML tag-based XSS:

/(javascript|vbscript|script|embed|object|iframe|frameset)/i

- This detects various HTML tags that can be used to execute malicious code within a browser.

- Paranoid regex for CSS-based XSS:

/((\%3C)|<)[^\n]+((\%3E)|>)/I

- This generic pattern detects any HTML tags that may include CSS-based XSS attempts, looking for
opening and closing tags.
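These patterns translate directly into log searches. As a non-exhaustive sketch, a quick search for plain or URL-encoded script tags in an Apache access log (path assumed) could be:

grep -iE "(%3C|<)(%2F|/)?script" /var/log/apache2/access.log

Matches should then be reviewed manually, since legitimate requests can occasionally contain similar strings.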

1.3 Log Analysis for XSS Detection


In an investigation, the following steps can be taken to detect XSS attacks:

- Apache Logs:
Investigators can search through web server logs, such as Apache’s `access.log`, for malicious HTML tags
and their encoded equivalents. Tools like `grep` can be used to filter relevant log entries that contain
suspected attack payloads.
- Example log entry:

GET /wordpress/wp-admin/admin.php?page=newsletters-subscribers&%3Cscript%3Ealert%28XSS%29%3C%2Fscript%3E

This log entry contains an encoded XSS script, which when decoded translates to
`<script>alert(XSS)</script>`. The log reveals the page where the attack was attempted, the timestamp, and
the client’s IP address.

- Snort Alerts:
IDS tools like Snort can generate alerts for XSS attacks. These alerts will show details such as the
attacker’s IP address, source port, and destination IP address. For example:
- Source IP: 192.168.0.233

- Source Port: 64580


- Destination IP: 192.168.0.115
- Destination Port: 80

- SIEM Systems:
SIEM tools like Splunk collect data from multiple log sources, including web servers, IDS tools, and
WAFs. They can be configured to detect XSS attack signatures and generate alerts for further investigation.

2. Investigating SQL Injection (SQLi) Attack

SQL Injection (SQLi) is a common attack where an attacker exploits vulnerabilities in a web application's
database query handling. They insert or "inject" SQL queries into user inputs to manipulate database
behavior, often gaining unauthorized access to sensitive data.

2.1 SQLi Detection Methods


SQLi attacks often involve the use of meta-characters such as quotes (`'`), semicolons (`;`), and comment
indicators (`--`). Attackers also employ obfuscation techniques to bypass detection mechanisms, such as:

- In-line comments: Inserts comments within SQL queries to disrupt pattern detection.
- Char encoding/double encoding: Encodes characters in hexadecimal or other formats to bypass filtering.
- Toggle case: Uses mixed case to avoid simple pattern-based filters.
- White-space manipulation: Uses encoded white spaces to bypass pattern-based detection.

2.2 Regular Expressions for Detecting SQLi Attacks


Below are some examples of patterns used to detect SQLi attacks:

- Detecting SQL meta-characters:

/(\%27)|(\')|(\-\-)|(\%23)|(#)/ix

- This looks for common SQL meta-characters such as single quotes, comments (`--`), and hash (`#`), which
are often used in SQLi attacks.

- Error-based SQLi detection:

/((\%3D)|(=))[^\n]*((\%27)|(\')|(\-\-)|(\%3B)|(;))/i

- This detects error-based SQLi attacks by looking for a combination of the equals sign (`=`) followed by
single quotes, comments, or semicolons, which are used to terminate SQL queries.

- Union-based SQLi detection:

/((\%27)|(\'))union/ix

- This detects SQLi attacks that use the `UNION` keyword to combine results from multiple queries.
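Applied to a web server log (path assumed), these expressions become simple searches, for example:

grep -iE "(%27)|(')|(--)|(%23)|(#)" /var/log/apache2/access.log
grep -iE "((%27)|('))union" /var/log/apache2/access.log

The first search is intentionally broad and will produce noise; the second, which looks for a quote followed by the UNION keyword, is far more specific.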

2.3 Log Analysis for SQLi Detection



Similar to XSS investigations, SQLi attacks can be detected by examining logs and identifying suspicious
patterns. Below are steps to investigate SQLi:

- IIS and Apache Logs:


Investigators can search for SQL meta-characters and common SQLi payloads in web server logs.
- Example log entry:

GET /login.asp?username=blah' or 1=1;--

This query contains a typical SQL injection payload attempting to bypass authentication.

- Snort Alerts:
Similar to XSS detection, Snort can generate alerts for SQLi attempts, detailing the source IP, destination
port, and other relevant information.

- SIEM Systems:
SIEM tools like Splunk are valuable for SQLi detection as they consolidate data from multiple sources,
allowing investigators to search for SQLi signatures and trace attacks. An example event from Splunk might
include:
- Attacker’s IP: 10.10.10.55
- SQL query: `' OR 1=1;--`
- Timestamp: December 11, 2019, 18:37:31
