DataCompression-Section1

The document discusses data compression, outlining the differences between data and information, various data types, and methods of data representation and storage. It emphasizes the importance of data compression for conserving storage space, reducing file transfer time, and saving network bandwidth, while explaining the feasibility of compression due to data redundancy. Additionally, it touches on concepts like self-information and entropy in the context of data compression.

Uploaded by

ayaalaakamal15

DATA COMPRESSION

CS411
Section 1
Lecturer: Dr. Saleh Mesbah
TA: Mahmoud ElMorshedy
Agenda

• Data Vs. Information
• Data Types
• Data Representation and Storage
• Why Data Compression?
• Why Data Compression is feasible
Data Vs. Information

• Data is raw, unorganized facts that need to be processed.

• When data is processed, organized, structured, or presented in a given context so as to make it useful, it is called information.

Data → Processing → Information

• Example: Each student's test score is one piece of data, but the average score of the class is information.
Data Types

Data: Text, Numbers, Programs, Audio, Image, Video


Data Representation and Storage

• Text
• Represented as characters that are encoded using
• ASCII (1 byte)
• Unicode
• UTF-8 (1-4 bytes) variable-length encoding
• UTF-16 (2 or 4 bytes) variable-length encoding
• UTF-32 (4 bytes) fixed-length encoding
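
The byte counts listed above can be checked directly. The following is a minimal Python sketch; the sample characters are illustrative choices, not taken from the slides, and the little-endian codec names are used only so that Python does not prepend a byte-order mark to the output.

```python
# Byte length of single characters under the three Unicode encodings.
# The "-le" codecs avoid the byte-order mark that "utf-16"/"utf-32" would add.
samples = ["A", "é", "€", "😀"]  # illustrative characters (assumption)


def encoded_sizes(ch):
    """Return (UTF-8, UTF-16, UTF-32) byte counts for one character."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )


for ch in samples:
    u8, u16, u32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8={u8}  UTF-16={u16}  UTF-32={u32}")
```

UTF-8 grows from 1 to 4 bytes as the code point increases, UTF-16 uses 2 bytes until a character falls outside the Basic Multilingual Plane, and UTF-32 always uses 4.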
Data Representation and Storage Cont.

• Numbers
• Numbering systems (Decimal – Hexadecimal – Octal – Binary)
• Integers: uint8 – uint16 – uint32 – uint64 – int8 – int16 – int32 – int64
• Floating-Point Numbers: 32-bit Single-Precision, 64-bit Double-Precision
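
As a rough illustration of the number formats above, this Python sketch prints one value in the four numbering systems and looks up the storage size of each fixed-width type. The value 52 and the `formats` table are illustrative assumptions; the format codes are those of Python's standard `struct` module.

```python
import struct

value = 52  # illustrative value (assumption)
# The same quantity in decimal, hexadecimal, octal, and binary.
print(value, hex(value), oct(value), bin(value))

# struct format codes; "<" selects standard sizes with no padding.
formats = {
    "uint8": "<B", "uint16": "<H", "uint32": "<I", "uint64": "<Q",
    "int8": "<b", "int16": "<h", "int32": "<i", "int64": "<q",
    "float32": "<f", "float64": "<d",
}

# Storage size in bytes for each type.
sizes = {name: struct.calcsize(fmt) for name, fmt in formats.items()}

for name, nbytes in sizes.items():
    print(f"{name}: {nbytes} byte(s), {nbytes * 8} bits")
```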
Data Representation and Storage Cont.

• Programs
• Programs include code (opcode) and data (operands)
• Example: MOV AL, 34h

• Audio
• Analog audio data is converted into digital form through “Sampling” using an analog-to-digital converter (ADC)
• Example: Sampling Rate 44.1 kHz with 16-bit resolution
• Audio file size (in Kbytes) = Sample Rate (in kHz) * Bit Resolution/8 * Length (in seconds) * Number of channels
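
The audio size formula can be sketched in Python as follows. The sampling rate and bit resolution come from the example above; the 60-second length and the 2 channels are assumptions made only for illustration. The function works in Hz and bytes, so dividing by 1000 gives the Kbytes figure.

```python
def audio_size_bytes(sample_rate_hz, bit_resolution, length_s, channels):
    """Uncompressed audio size in bytes (no file header or container overhead)."""
    return sample_rate_hz * bit_resolution * length_s * channels / 8


# 44.1 kHz and 16-bit are from the slide; 60 s and stereo are assumptions.
audio_bytes = audio_size_bytes(44_100, 16, 60, 2)
print(f"{audio_bytes:,.0f} bytes = {audio_bytes / 1000:,.0f} Kbytes")
```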
Data Representation and Storage Cont.

• Images
• Digital images are made up of pixels. Each pixel is represented by one or more binary numbers.
• Images may be grayscale or color.

• Video
• Video is made up of a sequence of image frames plus an audio track.
• Digital video file size (in bytes) = Frame size (in pixels) * Color depth (in bits)/8 * Frame Rate * Length (in seconds)
• Example: Frame size (160*120), Color resolution (8-bit), Frame rate (25 fps), Duration = 10 mins
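
A short Python sketch of the video example above, assuming the color depth is given in bits per pixel and divided by 8 to obtain bytes. The function name is hypothetical, and the result covers the image frames only, not the audio track or any container overhead.

```python
def video_size_bytes(width, height, color_depth_bits, frame_rate_fps, length_s):
    """Uncompressed size of the image frames in bytes."""
    frame_bytes = width * height * color_depth_bits / 8  # bytes per frame
    return frame_bytes * frame_rate_fps * length_s


# Values from the slide: 160*120 pixels, 8-bit color, 25 fps, 10 minutes.
video_bytes = video_size_bytes(160, 120, 8, 25, 10 * 60)
print(f"{video_bytes:,.0f} bytes")
```

Even this small, low-color clip needs 288,000,000 bytes uncompressed, which is the kind of figure that motivates compression.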
Why Data Compression?

• Conserve storage space
• Reduce file transfer time
• Save network bandwidth
Why Data Compression is feasible

• Most data from nature has redundancy
• There is more data than the actual information contained in the data.
• Squeezing out the excess data amounts to compression.
• However, the compressed data must be reconstructed (decompressed) before it can be interpreted.
• Information theory is needed to understand the limits of compression and give clues on how to compress well.
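
The effect of redundancy can be observed with any general-purpose compressor. The sketch below uses Python's standard `zlib` module (DEFLATE), which is a stand-in chosen for convenience and not a method named in these slides; the input strings are likewise illustrative assumptions.

```python
import os
import zlib

# Highly redundant input: the same 8-byte pattern repeated 1000 times.
redundant = b"ABABABAB" * 1000
# Input with essentially no redundancy: random bytes of the same length.
random_bytes = os.urandom(len(redundant))

packed_redundant = zlib.compress(redundant)
packed_random = zlib.compress(random_bytes)

print(f"redundant: {len(redundant)} -> {len(packed_redundant)} bytes")
print(f"random:    {len(random_bytes)} -> {len(packed_random)} bytes")

# Reconstruction: decompression must return the original data exactly.
restored = zlib.decompress(packed_redundant)
print("lossless round trip:", restored == redundant)
```

The repeated pattern shrinks to a small fraction of its size, while the random input typically does not shrink at all, since there is no excess data to squeeze out.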
Self-Information
• If an event A occurs with probability P(A), then its self-information is given by:
$$i(A) = \log_b \frac{1}{P(A)} = -\log_b P(A)$$
• With b = 2, self-information is measured in bits.
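
As a small numeric illustration, the Python sketch below evaluates the standard definition of self-information, the logarithm of 1/P(A), with base 2 as the default so that the result is in bits. The function name and the sample probabilities are assumptions for illustration.

```python
import math


def self_information(p, base=2):
    """Self-information of an event with probability p, in units set by base."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return math.log2(1 / p) / math.log2(base)


# Rarer events carry more information.
for p in (1, 1 / 2, 1 / 4, 1 / 8):
    print(f"P(A) = {p}: {self_information(p)} bits")
```

A certain event (P(A) = 1) carries 0 bits, and each halving of the probability adds one bit.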
Entropy
• The entropy is a measure of the minimum average number of bits needed to encode a symbol.
• Example: {a, b, c}
• P(a) = 1/8
• P(b) = 1/4
• P(c) = 5/8
$$H = \frac{1}{8} \times 3 + \frac{1}{4} \times 2 + \frac{5}{8} \times 0.678 \approx 1.3 \text{ bits/symbol}$$
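
The entropy example can be reproduced with a few lines of Python. This is a minimal sketch that weights each symbol's self-information, log2(1/P), by its probability and sums the results; the function and variable names are illustrative.

```python
import math


def entropy(probabilities):
    """Average self-information in bits per symbol."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


# Probabilities from the example above.
probs = {"a": 1 / 8, "b": 1 / 4, "c": 5 / 8}
H = entropy(probs.values())
print(f"H = {H:.3f} bits/symbol")
```

For comparison, a fixed-length code for three symbols needs 2 bits per symbol, so the gap between 2 and about 1.3 is what a good code could save on this source.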
Summary

• Data Vs. Information
• Data Types
• Data Representation and Storage
• Why Data Compression?
• Why Data Compression is feasible
