Data Storage and Compression
Data Storage
As computers are binary, the binary prefixes (KiB, MiB, GiB: multiples of 1024) are the most exact,
but the decimal prefix units (kB, MB, GB: multiples of 1000) are still used because it is far easier
to multiply and divide by 1000 than by 1024.
Example
• A hard disk is described as having a storage capacity of 1.5 TB. What is that in megabytes?
1.5 TB = 1.5 * 1000 * 1000 MB = 1500000 MB
• An image file has a file size of 363143213 bits. Create an expression to convert this size to
mebibytes (MiB) and show the result.
363143213 / (8 * 1024 * 1024) ≈ 43.29 MiB
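The two conversions above can be checked with a short script. This is a minimal sketch: the names `BYTES_PER_UNIT`, `convert` and `bits_to_unit` are hypothetical helpers chosen for illustration, not part of any library.

```python
# Decimal (SI) prefixes step by 1000; binary (IEC) prefixes step by 1024.
BYTES_PER_UNIT = {
    "B": 1,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}

def convert(value, from_unit, to_unit):
    """Convert a size between any two byte-based units."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]

def bits_to_unit(bits, to_unit):
    """Convert a size in bits to a byte-based unit (8 bits per byte)."""
    return bits / 8 / BYTES_PER_UNIT[to_unit]

print(convert(1.5, "TB", "MB"))                   # 1500000.0
print(round(bits_to_unit(363143213, "MiB"), 2))   # 43.29
```

Keeping every unit as a number of bytes means any conversion is one multiplication followed by one division.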
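• A photograph is 1024 * 1080 pixels and uses a color depth of 32 bits. How many photographs of
this size would fit onto a memory stick of 64 GiB?
Size of one photograph = 1024 * 1080 * 32 / 8 = 4423680 bytes
Capacity of memory stick = 64 * 1024 * 1024 * 1024 = 68719476736 bytes
Number of photographs = 68719476736 / 4423680 = 15534 (rounded down to a whole photograph)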
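The photograph calculation can be sketched in a few lines. The function names `image_size_bytes` and `photos_that_fit` are assumptions made for this illustration.

```python
def image_size_bytes(width, height, color_depth_bits):
    """Uncompressed image size: pixels * bits per pixel, rounded up to whole bytes."""
    total_bits = width * height * color_depth_bits
    return (total_bits + 7) // 8

def photos_that_fit(capacity_bytes, photo_bytes):
    """Only whole photographs count, so use integer (floor) division."""
    return capacity_bytes // photo_bytes

photo = image_size_bytes(1024, 1080, 32)   # 4423680 bytes
stick = 64 * 1024**3                       # 64 GiB = 68719476736 bytes
print(photo, stick, photos_that_fit(stick, photo))   # 4423680 68719476736 15534
```

The sketch ignores file headers and file-system overhead, so a real memory stick would hold slightly fewer photographs.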
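File size of digital audio file (in bits) = sample rate * bit depth * duration (in seconds) * number of channels
Dividing the result by 8 gives the file size in bytes.
For example, a file size of 635040000 bytes converts to megabytes as follows:
635040000 / 10^6 = 635040000 / (1000 * 1000) ≈ 635 MB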
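The audio formula can be sketched as follows. The input values are an assumption: the parameters behind the 635040000-byte figure are not stated, but 44100 Hz, 16 bits, two channels and 3600 seconds (one hour) reproduce it exactly. The name `audio_size_bytes` is hypothetical.

```python
def audio_size_bytes(sample_rate_hz, bit_depth, duration_s, channels):
    """Uncompressed audio size: the formula gives bits, so divide by 8 for bytes."""
    total_bits = sample_rate_hz * bit_depth * duration_s * channels
    return total_bits // 8

# Assumed parameters (not given in the text): CD-quality stereo, one hour long.
size = audio_size_bytes(44100, 16, 3600, 2)
print(size)                    # 635040000
print(size / (1000 * 1000))    # 635.04 (megabytes)
```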
Exercise
1. A hard disk is described as having a storage capacity of 1.5 TB. What is that in kilobytes?
2. A camera detector has an array of 1920 by 1536 pixels. A color depth of 16 bits is used.
Calculate the size of a photograph taken by this camera, giving your answer in MB.
3. Photographs have been taken by a smartphone which uses a detector with a 1024 *
1536 pixel array. The software uses a color depth of 24 bits. How many photographs
could be stored on a 640 MB memory card?
4. The typical song stored on a music CD is 3 minutes and 30 seconds long. Assume each
song is sampled at 44100 Hz with 16 bits used per sample, and that each song uses two
channels. Calculate how many typical songs could be stored on a 740 MiB CD.
Compression
Sound and image files can be very large.
It is therefore often necessary to reduce (or compress) the size of a file, for the following reasons:
• To save storage space
• To reduce transmission time
Compression: changing the format of a data file so that the size of the file becomes smaller.
There are two types of compression:
• Lossless
• Lossy
Lossless compression
• When lossless compression is used, no data is lost and the original file can be
restored exactly.
• This is particularly important for files where any loss of data would be disastrous.
• Lossless compression is used for text files, source code and executable files, as
missing data would change the meaning so that the file could no longer be
understood or run.
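The defining property of lossless compression, that the restored file is identical to the original, can be demonstrated with Python's standard `zlib` module. This is only an illustration of the round trip. zlib uses the DEFLATE algorithm, which is one lossless method among several, and the sample text is made up for the example.

```python
import zlib

# Sample data (an assumption for this sketch): repetitive text compresses well.
original = b"Lossless compression must restore every byte. " * 20

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))   # compressed size is smaller
print(restored == original)             # True: nothing was lost
```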
Run-Length Encoding (RLE)
• RLE is used to reduce the size of a repeating string of items.
• It is a form of lossless file compression.
• The repeating string is called a run and is represented by two bytes:
  • The first byte represents the number of times the item of information is repeated.
  • The second byte represents the item of information.
• RLE is only effective where there are long runs of repeated units/bits.
• The file may not be compressed very much at all if the characters are not repeated.
RLE version: 3c5m5s3d5c = 10 bytes (5 runs * 2 bytes), which decodes to the 21-byte
string cccmmmmmsssssdddccccc.
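A minimal sketch of run-length encoding, following the scheme of one count plus one item per run. The input string is an assumption: it is what the RLE version 3c5m5s3d5c decodes to. The function names are hypothetical.

```python
from itertools import groupby

def rle_encode(text):
    """Return a list of (count, item) runs, e.g. 'cccmm' -> [(3, 'c'), (2, 'm')]."""
    return [(len(list(group)), item) for item, group in groupby(text)]

def rle_decode(runs):
    """Rebuild the original string by repeating each item `count` times."""
    return "".join(item * count for count, item in runs)

def rle_to_text(runs):
    """Write the runs in the notation used in these notes, e.g. '3c5m'."""
    return "".join(f"{count}{item}" for count, item in runs)

original = "cccmmmmmsssssdddccccc"       # assumed input, decoded from 3c5m5s3d5c
runs = rle_encode(original)
print(rle_to_text(runs))                 # 3c5m5s3d5c
print(len(original), 2 * len(runs))      # 21 10  (two bytes per run)
print(rle_decode(runs) == original)      # True: RLE is lossless
```

Encoding a string with no repeats, such as "abc", gives three runs (6 bytes) for 3 bytes of input, which shows why RLE only helps when runs are long.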
• For a black and white image, RLE would be effective.
• A color image in which there are very short runs of different colors would not be encoded
as effectively.
To demonstrate RLE, the letters w and b are used to represent white and black pixels.

Code        RLE version    Size of RLE code
wwbbbwww    2w3b3w         6 bytes
wwbwwwww    2w1b5w         6 bytes
wwbwwwww    2w1b5w         6 bytes
wwbbbwww    2w3b3w         6 bytes
wwbbbwww    2w3b3w         6 bytes
wwwwbwww    4w1b3w         6 bytes
wwwwbwww    4w1b3w         6 bytes
wwbbbwww    2w3b3w         6 bytes
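The image example can be reproduced row by row with a short sketch. The helper name `rle_row` is hypothetical, and each character of the output is counted as one byte, as in the two-bytes-per-run scheme described earlier.

```python
from itertools import groupby

def rle_row(row):
    """Encode one row of pixels as <count><pixel> pairs, e.g. 'wwb' -> '2w1b'."""
    return "".join(f"{len(list(group))}{pixel}" for pixel, group in groupby(row))

image = [
    "wwbbbwww", "wwbwwwww", "wwbwwwww", "wwbbbwww",
    "wwbbbwww", "wwwwbwww", "wwwwbwww", "wwbbbwww",
]
encoded = [rle_row(row) for row in image]

for row, code in zip(image, encoded):
    print(row, code, len(code))

# Total size before and after encoding, at one byte per character.
print(sum(len(row) for row in image), sum(len(code) for code in encoded))   # 64 48

# Very short runs make the result larger than the input.
print(rle_row("wbwbwbwb"))   # 1w1b1w1b1w1b1w1b (16 bytes for 8 pixels)
```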
Lossy Compression
BitMap Image
A lossy compression algorithm analyses all of the data in the image.
When it finds tiny color differences, it gives them the same color value, so that it can
rewrite the file using fewer bits.
It may also reduce the resolution and color depth of the image.
JPEG files (extension .jpg) use lossy compression.
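The idea of giving nearly identical colors the same value can be illustrated with a simple reduction of color depth. This sketch is not how JPEG works internally (JPEG uses a more elaborate transform-based method). It only shows the principle, and the pixel values and function names are made up for the example.

```python
def quantize_channel(value, bits_kept):
    """Keep only the top `bits_kept` bits of an 8-bit channel value (0-255)."""
    shift = 8 - bits_kept
    return (value >> shift) << shift

def quantize_pixel(pixel, bits_kept=4):
    """Reduce the color depth of one (red, green, blue) pixel."""
    return tuple(quantize_channel(channel, bits_kept) for channel in pixel)

# Three slightly different shades (hypothetical sample data).
pixels = [(200, 120, 33), (203, 122, 35), (198, 119, 40)]
reduced = [quantize_pixel(pixel) for pixel in pixels]

print(reduced)                               # all three become (192, 112, 32)
print(len(set(pixels)), len(set(reduced)))   # 3 1
```

After quantization the three pixels share one value, so they need fewer bits and form longer runs, but the original shades cannot be recovered.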
Audio File
Much of the data in an audio file encodes tones and frequencies that our ears cannot hear,
as well as small differences in volume and frequency.
Lossy compression removes this redundant and excess data.
MP3, MPEG and MP4 use lossy compression.
MP3
• Is a digital audio format.
• Is a lossy audio compression format used to convert music and other sounds
into MP3 files.
• Reduces the size of a normal music file by about 90 per cent.
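As a rough worked example of the 90 per cent figure, the sketch below applies it to the 635040000-byte audio file from the Data Storage section. The function name `size_after_reduction` is hypothetical, and 90 per cent is the approximate figure quoted above, not a guaranteed ratio.

```python
def size_after_reduction(original_bytes, reduction_percent=90):
    """Estimate the compressed size after removing `reduction_percent` of the data."""
    return original_bytes * (100 - reduction_percent) // 100

original = 635040000                      # uncompressed audio file, in bytes
compressed = size_after_reduction(original)
print(compressed)                         # 63504000
print(compressed / (1000 * 1000))         # 63.504 (megabytes)
```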