
Chapter 1

The document discusses various data representation systems including binary, denary, hexadecimal, and character sets like ASCII and Unicode. It also covers topics such as sampling of sound, representation of images as bitmaps, file compression techniques including lossy formats like MP3 and JPEG as well as lossless compression using run-length encoding, and units of data storage.

Uploaded by

golchhavidhi
Copyright
© All Rights Reserved

Data representation

> Binary System


- The binary number system is a base 2 number system.
- Two ‘values’ 0 and 1 can be used in this system to represent all values.

> Denary system


- Denary uses ten separate digits, 0-9, to represent all values.
- Denary is known as a base 10 number system.

- Binary place values (powers of 2) used when converting between binary and denary: 2048 1024 512 256 128 64 32 16 8 4 2 1
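The place values above can be used to convert between binary and denary. The following is a minimal Python sketch, assuming unsigned values of at most 12 bits; the function names are illustrative only.

```python
# Place values for a 12-bit binary number, largest first (taken from the notes).
PLACE_VALUES = [2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1]


def binary_to_denary(bits: str) -> int:
    """Add up the place value of every position that holds a 1."""
    if len(bits) > len(PLACE_VALUES) or set(bits) - {"0", "1"}:
        raise ValueError("expected up to 12 binary digits")
    # Right-align the bit string against the place values.
    values = PLACE_VALUES[-len(bits):] if bits else []
    return sum(value for bit, value in zip(bits, values) if bit == "1")


def denary_to_binary(number: int) -> str:
    """Work from the largest place value down, writing 1 whenever it fits."""
    if not 0 <= number <= sum(PLACE_VALUES):
        raise ValueError("number does not fit in 12 bits")
    digits = []
    for value in PLACE_VALUES:
        if number >= value:
            digits.append("1")
            number -= value
        else:
            digits.append("0")
    return "".join(digits)


print(binary_to_denary("01110110"))  # 64 + 32 + 16 + 4 + 2 = 118
print(denary_to_binary(118))         # 000001110110
```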

> Hexadecimal System


- Hexadecimal is a base 16 system and therefore needs 16 different ‘digits’ to represent each value.
- The numbers 0 to 9 and the letters A to F (A = 10 through to F = 15) are used to represent each hex digit.
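Because 16 = 2^4, each hex digit corresponds to exactly four binary digits. A short Python sketch of that conversion follows; the helper names are assumptions made for illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_binary(hex_string: str) -> str:
    """Replace every hex digit with its 4-bit binary pattern."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper())


def binary_to_hex(bits: str) -> str:
    """Split the bits into groups of four from the right, one hex digit per group."""
    padded = bits.zfill((len(bits) + 3) // 4 * 4)  # pad on the left to a multiple of 4
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(hex_to_binary("4F"))       # 01001111
print(binary_to_hex("1011111"))  # 5F
```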

> Uses of hexadecimal System


- Hex numbers are shorter than their binary equivalents and are thus far easier for humans to remember, copy and work with.

- Error Codes
~ Shown in Hexadecimal
~ Refer to the memory location of the error
~ Automatically generated by computer

- Media Access Control (MAC) addresses


~ A number which uniquely identifies a device on a network.
~ Refers to network interface card (NIC) which is part of the device.
~ Rarely change and thus devices can always be identified regardless of where they are.
~ Usually made up of 48 bits (64-bit addresses also exist), shown as six groups of two hexadecimal digits.
~ NN-NN-NN-DD-DD-DD or NN:NN:NN:DD:DD:DD
~ First half (NN-NN-NN) – identity number of the manufacturer
~ Second half (DD-DD-DD) – serial number of the device
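The two halves of a 48-bit MAC address can be separated as sketched below. This is a minimal Python example; the function name and the sample address are made up for illustration.

```python
import string


def split_mac(mac: str) -> tuple[str, str]:
    """Return (manufacturer half, device serial half) of a 48-bit MAC address."""
    pairs = mac.replace("-", ":").split(":")
    if len(pairs) != 6 or any(len(pair) != 2 for pair in pairs):
        raise ValueError("expected six groups of two hex digits")
    if any(ch not in string.hexdigits for pair in pairs for ch in pair):
        raise ValueError("groups must contain hexadecimal digits only")
    pairs = [pair.upper() for pair in pairs]
    return ":".join(pairs[:3]), ":".join(pairs[3:])


# Illustrative address only, in the NN-NN-NN-DD-DD-DD layout.
manufacturer, serial = split_mac("00-1C-B3-4F-25-FE")
print(manufacturer)  # 00:1C:B3
print(serial)        # 4F:25:FE
```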

- Internet Protocol (IP) addresses


~ All devices connected to a network are given an IP address
~ IPv4 – 32 bits, written as four denary (or hexadecimal) numbers separated by dots, e.g. 109.108.158.1
~ IPv6 – 128 bits broken down into eight 16-bit chunks, written in hexadecimal, e.g.
a8fb:7a88:fff0:0fff:3d21:2085:66fb:f0fa
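The bit counts quoted for the two address formats can be checked with a few lines of Python. This is only a sketch using the example addresses from the notes; the function names are assumptions.

```python
def ipv4_to_bits(address: str) -> str:
    """Turn dotted denary (e.g. 109.108.158.1) into its 32-bit pattern."""
    parts = [int(part) for part in address.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError("expected four numbers between 0 and 255")
    return "".join(format(part, "08b") for part in parts)


def ipv4_to_hex(address: str) -> str:
    """Write each of the four denary numbers as two hex digits."""
    return ".".join(format(int(part), "02X") for part in address.split("."))


print(ipv4_to_bits("109.108.158.1"))  # 32 binary digits
print(ipv4_to_hex("109.108.158.1"))   # 6D.6C.9E.01

# An IPv6 address has eight groups of 16 bits each: 8 x 16 = 128 bits.
ipv6_groups = "a8fb:7a88:fff0:0fff:3d21:2085:66fb:f0fa".split(":")
print(len(ipv6_groups) * 16)          # 128
```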

- Hypertext Mark-up Language (HTML) colour codes


~ Used to write and develop web pages.
~ HTML isn’t a programming language; it’s a mark-up language.
~ Mark-up languages are used in the processing, definition, and presentation of text.
~ HTML uses <tags>

~ Often used to represent colours of text on the computer screen.


~ Different intensities of red, green and blue make up all colours.
~ Thus, different hexadecimal values represent different colours.
~ The # symbol always precedes hexadecimal colour values in HTML code.
~ Six hexadecimal digits represent the R, G and B components (two digits each, #RRGGBB), with 256 possible values (00 to FF) for each.
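A colour code can be split into its three components as follows. This is a small Python sketch assuming the usual #RRGGBB order (red, green, blue); the function names are illustrative.

```python
def parse_colour(code: str) -> tuple[int, int, int]:
    """Split '#RRGGBB' into red, green and blue intensities (0 to 255 each)."""
    if len(code) != 7 or code[0] != "#":
        raise ValueError("expected a code of the form #RRGGBB")
    red, green, blue = (int(code[i:i + 2], 16) for i in (1, 3, 5))
    return red, green, blue


def make_colour(red: int, green: int, blue: int) -> str:
    """Build the '#RRGGBB' code from three intensities."""
    if any(not 0 <= value <= 255 for value in (red, green, blue)):
        raise ValueError("each intensity must be between 0 and 255")
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


print(parse_colour("#FF8000"))    # (255, 128, 0)
print(make_colour(255, 128, 0))   # #FF8000
```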

> Two’s complement


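- Two’s complement allows negative integers to be represented in binary.

- The left-most bit has a negative place value (-128 in an 8-bit number), so an 8-bit two’s complement number ranges from positive 127 (01111111) to negative 128 (10000000).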
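The 8-bit range quoted above can be reproduced with a short Python sketch. The function names and the default width of 8 bits are assumptions made for illustration.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as a two's complement bit pattern."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"value must be between {lowest} and {highest}")
    # Masking keeps the lowest `bits` bits, which wraps negatives correctly.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's complement bit pattern back to a signed integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # left-most bit set means a negative number
        value -= 1 << len(pattern)
    return value


print(to_twos_complement(127))           # 01111111
print(to_twos_complement(-128))          # 10000000
print(from_twos_complement("11111111"))  # -1
```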

> Character sets


- Used in communication systems and computer systems.

- American Standard Code for Information Interchange (ASCII)


~ Standard ASCII code (1963)
* 7-bit codes (0 to 127 in denary or 00 to 7F in hexadecimal).
* Represents letters, numbers, and characters found on a standard keyboard together with 32 control codes.

~ Extended ASCII code (1986)
* 8-bit codes (0 to 255 in denary or 00 to FF in hexadecimal).
* This gives another 128 codes to allow for characters in non-English alphabets.

DISADVANTAGES:
Does not represent characters in non-Western languages, for example, Chinese characters.
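To see the codes behind individual characters, Python's built-in `ord` and `chr` can be used. The sketch below lists each character's denary, hexadecimal and 7-bit binary code; the function name is an assumption.

```python
def ascii_codes(text: str) -> list[tuple[str, int, str, str]]:
    """Return (character, denary, hexadecimal, 7-bit binary) for each character."""
    rows = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not in standard 7-bit ASCII")
        rows.append((ch, code, format(code, "02X"), format(code, "07b")))
    return rows


for row in ascii_codes("Hi!"):
    print(row)

print(chr(65))  # A  (chr goes the other way, from code to character)
```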

- Unicode
~ ASCII uses one byte to represent a character, whereas Unicode supports up to four bytes per character.
~ The aims of Unicode are to:
* create a universal standard that covers all languages and all writing systems
* produce a more efficient coding system than ASCII
* adopt uniform encoding where each character is encoded as a 16-bit or 32-bit code
* create unambiguous encoding where each 16-bit and 32-bit value always represents the same character
* reserve part of the code for private use, to enable a user to assign codes for their own characters and symbols (useful for Chinese and Japanese character sets, for example).
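The "up to four bytes per character" point can be illustrated with UTF-8. Note that UTF-8 is just one Unicode encoding, chosen here as an assumption for the example; the sample characters are also arbitrary.

```python
# Sample characters, written as escapes so the code itself stays ASCII-only.
SAMPLES = {
    "A": "Latin capital letter A",
    "\u00e9": "Latin small letter e with acute",
    "\u4e2d": "Chinese character",
    "\U0001F600": "emoji (grinning face)",
}


def utf8_byte_count(character: str) -> int:
    """Number of bytes UTF-8 needs to store one character."""
    return len(character.encode("utf-8"))


for character, description in SAMPLES.items():
    code_point = f"U+{ord(character):04X}"
    print(f"{code_point} {description}: {utf8_byte_count(character)} byte(s)")
```

Characters that already exist in ASCII still take one byte, while the others take two, three or four.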
> Representation of sound
- Soundwaves are vibrations in the air. The human ear senses these vibrations and interprets them
as sound.
- Each sound wave has a frequency, wavelength, and amplitude.
- The amplitude specifies the loudness of the sound.
- Sound is analogue as sound waves vary continuously.
- Computers cannot work with analogue data.
- Sampling means measuring the amplitude of the sound wave using an analogue to digital converter
(ADC).

- For example, 4 binary bits can be used to represent each amplitude value, giving 16 possible amplitude levels (0000 to 1111).

- The number of bits per sample is known as the sampling resolution (also known as the bit depth). So, in this example, the sampling resolution is 4 bits.
- Sampling rate is the number of sound samples taken per second. This is measured in hertz (Hz), where 1 Hz means ‘one sample per second’.

- How is sampling used to record a sound clip?


~ the amplitude of the sound wave is first determined at set time intervals (the sampling rate)
~ this gives an approximate representation of the sound wave
~ each sample of the sound wave is then encoded as a series of binary digits.
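The three steps above can be sketched in Python. This is an illustration only: a sine wave stands in for the analogue sound, and the function name and parameters are assumptions.

```python
import math


def sample_wave(frequency_hz: float, sampling_rate_hz: int,
                duration_s: float, bits: int) -> list[str]:
    """Sample a sine wave and encode each sample with `bits` binary digits."""
    levels = 2 ** bits                       # e.g. 4 bits give 16 levels
    count = int(sampling_rate_hz * duration_s)
    samples = []
    for n in range(count):
        t = n / sampling_rate_hz             # time of this sample in seconds
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # -1.0 to 1.0
        # Map the amplitude onto the nearest available level (0 to levels - 1).
        level = round((amplitude + 1) / 2 * (levels - 1))
        samples.append(format(level, f"0{bits}b"))
    return samples


# A 1 Hz wave, sampled 8 times per second for 1 second at 4-bit resolution.
print(sample_wave(frequency_hz=1, sampling_rate_hz=8, duration_s=1, bits=4))
```

Raising `bits` or `sampling_rate_hz` gives a closer approximation of the wave, at the cost of more data to store.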

- Benefits and drawbacks of using a larger sampling resolution

> Representation of bitmap images


- Bitmap images are made up of pixels (picture elements)
- An image is made up of a two-dimensional matrix of pixels.
- pixels can be in different shapes
- Each pixel can be represented as a binary number, and so a bitmap image is stored in a computer
as a series of binary numbers.
- If each pixel is represented by 3 bits, then each pixel can be one of eight colours (2^3 = 8).
- Black and white pictures only need 1 bit to represent each pixel.
- The number of bits used to represent each colour is called the colour depth.
- Increasing colour depth also increases the size of the file when storing an image.
- Image resolution refers to the number of pixels that make up an image.
- Increasing image resolution increases file size when storing the image.
- A certain amount of reduction in the resolution of an image is possible before the loss of quality
becomes noticeable.
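How a matrix of pixels becomes a series of binary numbers can be sketched as follows. The tiny 2 x 2 image, the colour numbers and the function name are all made up for illustration.

```python
def encode_bitmap(pixels: list[list[int]], colour_depth: int) -> str:
    """Store every pixel as a `colour_depth`-bit binary number, row by row."""
    max_colour = 2 ** colour_depth - 1       # 3 bits allow colours 0 to 7
    bits = []
    for row in pixels:
        for colour in row:
            if not 0 <= colour <= max_colour:
                raise ValueError(f"colour {colour} needs more than {colour_depth} bits")
            bits.append(format(colour, f"0{colour_depth}b"))
    return "".join(bits)


image = [
    [0, 7],
    [3, 5],
]
encoded = encode_bitmap(image, colour_depth=3)
print(encoded)       # 000111011101
print(len(encoded))  # 12 bits = 2 x 2 pixels x 3 bits per pixel
```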

> Measurement of data storage


- A bit is the basic unit of all computing memory storage terms and is either 1 or 0. The word comes from ‘binary digit’.
- The byte is the smallest addressable unit of memory in a computer.
- 8 bits = 1 byte
- 4 bits = 1 nibble

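- Units based on powers of 10 (1 KB = 1000 bytes, 1 MB = 1000 KB, and so on) are used for some storage devices, but this system is inaccurate for memory, whose size should be based on powers of 2.
- The newer IEC (International Electrotechnical Commission) system is based on the binary system: 1 KiB (kibibyte) = 1024 bytes, 1 MiB (mebibyte) = 1024 KiB, and so on.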
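The difference between the denary-based and IEC units can be shown with a short Python sketch. The function name and the list of units covered are assumptions.

```python
IEC_UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB"]


def to_iec(size_bytes: int) -> str:
    """Express a size in the largest IEC unit, dividing by 1024 at each step."""
    size = float(size_bytes)
    unit = 0
    while size >= 1024 and unit < len(IEC_UNITS) - 1:
        size /= 1024
        unit += 1
    return f"{size:g} {IEC_UNITS[unit]}"


print(to_iec(1024))      # 1 KiB
print(to_iec(1536))      # 1.5 KiB
print(to_iec(2 ** 30))   # 1 GiB

# A denary gigabyte (10^9 bytes) is smaller than a gibibyte (2^30 bytes).
print(10 ** 9 < 2 ** 30)  # True
```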

> Calculation of file size
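File sizes follow from the quantities defined earlier: image size = image resolution (in pixels) x colour depth, and sound size = sampling rate x sampling resolution x length of the clip. The Python sketch below assumes uncompressed data with no file headers; the function names and the example figures are illustrative.

```python
def image_size_bytes(width: int, height: int, colour_depth_bits: int) -> float:
    """Uncompressed image size: number of pixels x bits per pixel, in bytes."""
    return width * height * colour_depth_bits / 8


def sound_size_bytes(sampling_rate_hz: int, resolution_bits: int,
                     seconds: float, channels: int = 1) -> float:
    """Uncompressed sound size: samples per second x bits per sample x length."""
    return sampling_rate_hz * resolution_bits * seconds * channels / 8


# A 1024 x 768 image with a 24-bit colour depth.
print(image_size_bytes(1024, 768, 24))         # 2359296.0 bytes (2.25 MiB)

# One minute of stereo sound at 44 100 Hz with 16-bit resolution.
print(sound_size_bytes(44_100, 16, 60, 2))     # 10584000.0 bytes
```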

> Data Compression


- Files can be large, so we compress them to:
~ Save storage space on devices such as the hard disk drive/solid-state drive.
~ Reduce the time taken to stream a music or video file.
~ Reduce the time taken to download/upload files, as less network bandwidth is used. Bandwidth is the maximum rate of transfer of data across a network, measured in bits per second, and it is used up whenever a file is downloaded, for example, from a server. Compressed files contain fewer bits of data than uncompressed files and therefore use less bandwidth, which results in a faster transfer.
~ Reduce costs. For example, when using cloud storage, the cost is based on the size of the files stored. Also, an internet service provider (ISP) may charge a user based on the amount of data downloaded.
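The effect of compression on transfer time is simple arithmetic: time = number of bits / bandwidth. The Python sketch below uses made-up figures (a 40 MB file, a 16 Mbps connection, a 90% size reduction) and assumes the full bandwidth is available for the transfer.

```python
def transfer_time_seconds(file_size_bytes: int, bandwidth_bps: int) -> float:
    """Time to send a file: total bits divided by bits per second."""
    return file_size_bytes * 8 / bandwidth_bps


original = 40_000_000           # 40 MB file (denary megabytes)
compressed = original // 10     # the same file after a 90% size reduction
bandwidth = 16_000_000          # 16 megabits per second

print(transfer_time_seconds(original, bandwidth))    # 20.0 seconds
print(transfer_time_seconds(compressed, bandwidth))  # 2.0 seconds
```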

> Lossy file compression


- The file compression algorithm eliminates unnecessary data from the file.
- The original file cannot be reconstructed once it has been compressed.
- This results in some loss of detail when compared to the original file.
- In an image, it may reduce the resolution and/or the bit/colour depth.
- In a sound file, it may reduce the sampling rate and/or the resolution.
- Lossy files are smaller than lossless files, which is of great benefit when considering storage and data transfer rate requirements.

- MP3 (MPEG-1 Audio Layer 3)
~ MP3 files are used for playing music on computers or mobile phones. This compression technology reduces the size of a normal music file by about 90%.
~ Although MP3 music files can never match the sound quality found on a DVD or CD, the quality is satisfactory for most general purposes.
~ Sounds outside the range of human hearing are removed.
~ If two sounds are played at the same time, only the louder one can be heard by the ear, so the softer sound is eliminated. This is called perceptual music shaping.

- MPEG-4 (MP4)
~ This format allows the storage of multimedia files rather than just sound – music, videos, photos
and animation can all be stored in the MP4 format.
~ MP4 retains an acceptable quality of sound and video.

- JPEG
~ When a camera takes a photograph, it produces a raw bitmap file which can be very large in size.
These files are temporary in nature. JPEG is a lossy file compression algorithm used for bitmap
images.
~ Human eyes don’t detect differences in colour shades quite as well as they detect differences in image brightness.
~ By separating pixel colour from brightness, images can be split into 8 × 8 pixel blocks, for example, which then allows certain ‘information’ to be discarded from the image without causing any really noticeable deterioration in quality.

> Lossless file compression


- All the data from the original uncompressed file can be reconstructed.
- Lossless file compression is designed so that none of the original detail from the file is lost.

- Run-Length encoding (RLE)


~ RLE is a lossless file compression technique used to reduce the size of text and photo files.
~ Reduces the size of a string of adjacent, identical data (e.g. repeated colours in an image)
~ Repeating string is encoded into two values:
* The first value represents the number of identical data items (e.g., characters) in the run
* The second value represents the code of the data item (such as ASCII code if it is a keyboard
character)
~ RLE is only effective where there is a long run of repeated units/bits.

~ Using RLE on text data:

~ Using RLE on Images
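Both uses can be sketched with the same routine, since RLE only needs a sequence of items. The Python below is a minimal illustration: the function names, the sample text and the sample row of 1-bit pixels are made up, and real file formats store the pairs in their own ways.

```python
def rle_encode(data) -> list[tuple[int, object]]:
    """Encode a sequence as (run length, item) pairs."""
    runs = []
    for item in data:
        if runs and runs[-1][1] == item:
            runs[-1] = (runs[-1][0] + 1, item)   # extend the current run
        else:
            runs.append((1, item))               # start a new run
    return runs


def rle_decode(runs) -> list:
    """Rebuild the original sequence exactly (RLE is lossless)."""
    items = []
    for count, item in runs:
        items.extend([item] * count)
    return items


# Text: the second value of each pair is the ASCII code of the character.
text = "aaaaabbbbccddddd"
encoded_text = rle_encode([ord(ch) for ch in text])
print(encoded_text)   # [(5, 97), (4, 98), (2, 99), (5, 100)]
print("".join(chr(code) for code in rle_decode(encoded_text)) == text)  # True

# Image: one row of 1-bit pixels (1 = white, 0 = black).
pixel_row = [1, 1, 1, 1, 0, 0, 0, 1]
print(rle_encode(pixel_row))   # [(4, 1), (3, 0), (1, 1)]

# Without long runs RLE does not help: 4 characters become 4 pairs (8 values).
print(len(rle_encode("abcd")))  # 4
```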
