
Application of Computer in Economics

This document provides information about a course on the application of computers in economics. It discusses: 1) The course is titled "Application of Computer in Economics" and is taught by Dr. Sanatan Nayak in the Department of Economics at B.B. Ambedkar University in Lucknow, India. 2) It covers definitions of computer terms, the basic components and organization of computers, and the evolution of computer technology through different generations from the earliest mechanical devices to modern integrated circuits and microprocessors. 3) The goals of the chapter are to discuss the various generations of computers and the different types of computers defined by their hardware and software capabilities.


Application of Computer in Economics

Course: DE-403(ii)

Course teacher

Dr. Sanatan Nayak

Dept. of Economics,
B.B. Ambedkar University
Rae Bareli Road, Lucknow-25
Contents of Introductions
• Definitions
• Features or characteristics
• Basic computer Organization/ Components
• Evolution
Definitions
• The word “computer” is derived from the word “compute”, which means to calculate; a computer calculates at high speed.
Original objectives
• To create a fast calculating machine.
• Nowadays, about 80% of the data handled is non-mathematical.
• Computers are used to handle information and data such as bio-data, railway tickets, air tickets and government databases.
• What the computer does (it is a data processor):
• Stores the data
• Processes the data
• Retrieves the data
Characteristics of Computer
• High speed: operations are measured in milliseconds (1/1,000 of a second), microseconds (1/1,000,000), nanoseconds (1/1,000,000,000) and picoseconds (1/1,000,000,000,000).
• Accuracy: errors occur due to human rather than technological weakness.
• Diligence: it is free from monotony, tiredness and lack of concentration.
• Versatility: it can do different types of work.
• Power of remembering: it can store and recall any amount of information.
• No I.Q.: it does not have intelligence of its own.
• No feelings: no heart, no taste, no knowledge and no experience.
Evolution of Computer
• Necessity is the mother of invention.
• The earliest device that qualifies is the “abacus” or “soroban”. It was invented around 600 B.C.
• It does only addition and subtraction, at low speed.
• Manual calculating device: John Napier’s cardboard device, 17th century, updated in 1890 AD.
• First mechanical machine by Blaise Pascal in 1642 AD.
• Baron Gottfried Wilhelm von Leibniz of Germany: first calculator for multiplication.
• The keyboard originated in 1880 AD in the USA.
• Herman Hollerith: punched cards, which were extensively used as an input medium in digital computers.
Basic Computer’s Organization
• Five important operations:
1. Inputting
2. Storing
3. Processing
4. Outputting
5. Controlling
Therefore, five important functional units or blocks.
1. Input Unit:
• Data and instructions must be given to the computer through an outside device.
• The most common input device is the keyboard.
• All data and instructions are transformed into binary codes (an acceptable form), which are saved in primary memory.
• It supplies the converted instructions and data to the computer system for further processing.
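The conversion step carried out by the input unit can be sketched in a few lines of Python. This is only an illustration: the use of 8-bit ASCII as the code table and the function names `to_binary_codes` and `from_binary_codes` are assumptions, since the slides do not say which code a particular input unit uses.

```python
def to_binary_codes(text):
    """Return one 8-bit binary string per typed character (ASCII assumed)."""
    return [format(ord(ch), "08b") for ch in text]

def from_binary_codes(codes):
    """Reverse step, as an output unit would do: binary codes back to text."""
    return "".join(chr(int(code, 2)) for code in codes)

codes = to_binary_codes("GDP")
print(codes)                     # three 8-bit codes, one per key pressed
print(from_binary_codes(codes))  # the original text again
```

The second function mirrors what the output unit does when it converts coded results back into a human-readable form.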
Basic Computer’s Organization cont ….
2. Output Unit:
• It is the reverse of the input unit.
• It accepts the results produced by the computer, which are in coded form and cannot be easily understood by us.
• It converts them from binary form to a human-readable form.
• It is connected to the external environment through devices such as a printer.
• It supplies the information and results of the computer to the outside world.
3. Storage Unit:
• It stores all the data and instructions received from the input device and keeps them for processing.
• It stores the intermediate results of processing.
• It stores the final results of processing before they are released to an output device.
Basic Computer’s Organization cont ….
4. Arithmetic Logic Unit
• It is the place where the actual execution of instructions takes place.
• All calculations are performed and all decisions are made in the ALU.
• Data and instructions stored in primary storage prior to processing are transferred to the ALU as and when needed.
• Intermediate results generated in the ALU are temporarily transferred back to primary storage.
• All ALUs are designed to perform the four basic arithmetic operations (+, -, ×, /) and logic operations such as comparisons (=, >, <).
Basic Computer’s Organization cont ….
5. Control Unit:
• It is the central nervous system of the computer.
• It obtains instructions from the programme stored in main memory, interprets them and issues signals that cause the other units of the system to execute them.
• It handles the selection, interpretation and execution of instructions.
• Central Processing Unit (CPU)
• CU + ALU = CPU
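The division of labour between the control unit and the ALU can be shown with a toy model in Python. This is a simplified sketch, not a description of any real processor: the instruction format, the operation names and the example programme (profit = price × quantity - cost) are all hypothetical, and the programme is passed in as a list rather than held in memory.

```python
def alu(op, a, b):
    """Arithmetic logic unit: performs calculations and comparisons."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    if op == "MUL":
        return a * b
    if op == "DIV":
        return a / b
    if op == "GT":
        return a > b
    raise ValueError(f"unknown operation: {op}")

def run(program, memory):
    """Control unit: take each instruction in turn, interpret it, signal the ALU."""
    for op, src1, src2, dest in program:              # select the next instruction
        result = alu(op, memory[src1], memory[src2])  # execution happens in the ALU
        memory[dest] = result                         # result goes back to primary storage
    return memory

# Hypothetical programme: profit = price * quantity - cost
memory = {"price": 12, "quantity": 5, "cost": 40}
program = [
    ("MUL", "price", "quantity", "revenue"),
    ("SUB", "revenue", "cost", "profit"),
    ("GT", "profit", "cost", "profitable"),
]
print(run(program, memory))
```

Intermediate values such as `revenue` are written back to `memory`, which plays the role of primary storage in this sketch.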
References
• P.K. Sinha (latest), Computer Fundamentals,
BPB Publications, New Delhi.
Goals of the chapter
• This chapter deals with
• Various generations of computers
• Types of computers
Generations of Computers
Classification of generations is based on
• Development of hardware in computers
• Development of software and its applications
First Generations (FG) of Computers
• The first large electronic computer, completed in 1946 in the USA, was called the ENIAC (Electronic Numerical Integrator and Computer).
a. It was the first all-electronic computer.
b. It was designed by a team led by Eckert and Mauchly at the University of Pennsylvania, USA.
c. It was programmed through a wiring board and used high speed vacuum tube switching devices.
d. It had a very small memory and was designed primarily to calculate the trajectories of missiles.
e. ENIAC took about 200 microseconds for an addition and about 2800 microseconds for a multiplication.
EDSAC (Electronic Delay Storage Automatic
Calculator)
• A major breakthrough was the stored program concept proposed by John von Neumann in 1946.
• The idea is to store the machine instructions in the memory of the computer along with the data.
• The first computer using this principle was designed and commissioned at Cambridge by Maurice Wilkes.
• It was called EDSAC and was completed in 1949.
• It used mercury delay lines for storage.
UNIVAC
• It marked the commercial production of stored program electronic computers.
• It was built by the Univac division of Remington Rand and delivered in 1951.
• It used vacuum tubes.
• Each tube had a limited life and consumed about half a watt of power.
• It used about ten thousand tubes.
Language during this period
• Computer programming was done in machine language.
• Assembly languages were developed in the early 1950s.
• Computer applications were mainly in science and engineering.
• The first generation was mainly about hardware, with little software development.
The Second Generations
• The invention of the transistor by Bardeen, Brattain and Shockley in 1947 was a big revolution.
• Transistors were made of germanium semiconductor material and were more reliable than tubes.
• They had no filaments to burn out.
• They occupied less space and consumed only one tenth of the power.
• They also switched from one state to another in a few microseconds, about one tenth of the time needed by tubes.
• Thus switching circuits made with transistors were about ten times more reliable and ten times faster, occupied about one tenth of the space, and were cheaper.
• Computers thus changed from tubes to transistors.
• This generation lasted till 1965.
SG Continu…….
• Another major invention was magnetic cores for storage.
• Magnetic cores are tiny rings (0.05 cm diameter) made of ferrite that can be magnetized in either the clockwise or the anti-clockwise direction.
• Magnetic cores were used to construct large random access memories.
• Memory capacity in the SG was about 100 KB.
• Magnetic disk storage was developed during this period.
Due to development of Large Memories
• High level languages such as FORTRAN, COBOL, Algol and SNOBOL were developed.
• With higher CPU speeds and disk storage, operating systems were developed.
• Good batch operating systems, particularly on the IBM 7000 series computers, emerged during the SG.
SG Continu…….
• Computer use grew rapidly with the growth of business and industry, which accounted for about 80% of use.
• A number of operations research applications, such as linear programming, the critical path method (CPM) and simulation, were run on computers.
• New professions in computing, such as systems analysts and programmers, emerged during the second generation.
• Academic programmes in computer science were also initiated.
The Third Generations (TG)
• The TG began in 1965 with germanium transistors replaced by silicon transistors.
• Integrated circuits emerged: circuits consisting of transistors, resistors and capacitors grown on a single chip of silicon, eliminating wired interconnections between components.
• Technology progressed from small scale integration to medium scale integration, with about 100 transistors per chip.
• Switching speed of transistors went up by a factor of 10.
• Reliability increased by a factor of 10.
• Power dissipation was reduced by a factor of 10.
• Size was also reduced by a factor of 10.
• Powerful CPUs capable of carrying out 1 million instructions per second became available.
(TG) Conti……
• There were significant improvements in the design of magnetic core memories.
• The size of main memories reached about 4 MB.
• Magnetic disk technology improved rapidly.
• 100 MB drive became feasible.
• Time shared operating system was developed (combination of
high capacity memory, powerful CPU, large disk memories).
• Many important online systems became feasible.
• Dynamic production control system developed.
• Airline reservation, interactive query systems and real time
closed loop process control system were developed.
• Integrated data base management system was developed.
(TG) Conti……
• High level languages were developed further.
• FORTRAN and optimizing FORTRAN compilers were developed.
• COBOL 68 was standardized by the American National Standards Institute.
• The TG ended by 1975, without any revolutionary new concepts being developed.
The Fourth Generations (FG)
First Decade (1976-85)
• It is identified by the advent of microprocessor chip.
• Medium scale integrated circuits yielded to Large and Very
Large Scale Integrated (VLSI) circuits packing about 50000
transistors in a chip.
• Semiconductor memories of 16 MB with a cycle time of 200 nsecs were in common use.
Emergence of the microprocessor led to development in two directions
• Extremely powerful PCs.
FG Conti…….
Major impact on history of computing
• Due to development of IBM PC and Operating System (OS)
• Due to development of MS-DOS (Microsoft Disk Operating System) and Digital Research’s CP/M (Control Program for Microcomputers)
Many small companies made PCs conforming to IBM’s architecture
• Word processor,
• Spread Sheet
• Data base management
FG Conti…….
Decentralisation of computer organisation
• Network of computers and distribution of computer system
were developed.
• Disk memories became very large (1000 MB)
• Concurrent programming language, such as ADA
• Interactive graphic devices
• Language interface to graphic system
• UNIX OS
• OS became user friendly and highly reliable
Second Phase (1986-2000) of FG
• The speed of microprocessors and the size of main memory and hard disks went up by a factor of 4 every 3 years.
• Many features of the CPUs of the 1st decade of the fourth generation became part of the microprocessor architecture of the 2nd decade.
• The mainframe computers of the early 80s died out in the 90s.
• A microprocessor chip designed by DEC in 1994 packed 9.3 million transistors in a single chip and could carry out one billion operations per second (300 MHz clock).
• IBM, Apple Computer and Motorola jointly designed a processor called the PowerPC 600 series.
• Intel designed powerful chips called Pentium (1993).
• It was followed by Pentium with MMX (Multimedia Extension) and Pentium II.
• Celeron processor with a 300 MHz clock.
• Intel introduced a 64 bit processor called IA-64 or Itanium.
Second Phase (1986-2000) of FG
• Disk storage also saw vast improvement.
• 1 GB of disk on workstations became common in 1994.
• Optical disks also emerged as mass storage for read only files.
• New optical disks known as Digital Versatile Disk ROMs (DVD-ROMs), with a storage capacity of 17 GB, appeared in 1998.
• Writable CDs were developed during the same time.
• Local Area Networks could transmit at 100 Mbps to 1 Gbps.
• Rapid increase in the number of computers connected to the internet.
• Introduction of the WWW, which eased information retrieval.
• An object oriented language called Java was designed for the internet.
• C language became popular.
• C++ emerged as the most popular.
• PROLOG was designed as a logic oriented specification language.
• HASKELL and FP were designed as functional specification oriented languages.
Comparative Chart of generations
Generation | Years | Switching Devices | Storing Devices | Switching Time
1st | 1949-55 | Vacuum tubes | 1 KB memory | 0.1 to 1 milliseconds
2nd | 1956-65 | Transistors | 100 KB main memory | 1 to 10 microseconds
3rd | 1966-75 | Integrated circuits | Large disks (100 MB), 1 MB main memory | 0.1 to 1 microseconds
4th, 1st phase | 1975-84 | LSI (large scale integrated circuits) | 1000 MB disks, 10 MB main memory | 10 to 100 nanoseconds
4th, 2nd phase | 1985-2000 | VLSI (very LSI) | 100 GB disks, 1 GB main memory | 1 to 10 nanoseconds
Comparative Chart of generations (contd.)
Generation | MTBF (mean time between failures of processor) | Software | Applications
1st | 30 minutes to 1 hour | Machine language and simple monitor | Science and business
2nd | About 10 hours | FORTRAN, COBOL | Engineering, business, optimisation
3rd | About 100 hours | FORTRAN IV, COBOL 68 | DBMS, on-line systems
4th, 1st phase | About 1000 hours | FORTRAN 77, Pascal, ADA, COBOL 74 | PCDS, integrated CAD/CAM, real time control
4th, 2nd phase | About 10000 hours | C, C++, Java, PROLOG, Haskell, FORTRAN 90/95 | Simulations, visualisation, parallel computing, multimedia
The 5th Generations
• The fifth generation is radically different from the Von Neumann architecture.
• It uses specification oriented programming and incorporates artificial intelligence features.
• The processor architecture changes to what is called Very Long Instruction Word (VLIW). One instruction is about 128 to 256 bits long and contains several instructions that are executed in parallel.
• Any time and any place access to data and processing. This is made possible by wireless enabled processor chips (Centrino of Intel), which are used in laptop and hand held computers.
• Demand for multimedia, allowing users to use a simple graphical user interface and to enjoy good quality audio and video on desktop and mobile computers.
• The fifth generation means wireless enabled, multimedia, high performance mobile computers.
5th Generations …..
• Fifth generation computing devices, based on
• Artificial intelligence: Artificial Intelligence is the branch of
computer science concerned with making computers behave
like humans. The term was coined in 1956 by John McCarthy
at the Massachusetts Institute of Technology. Artificial
intelligence includes
• Games Playing: programming computers to play games such
as chess and checkers.
• Expert Systems: programming computers to make decisions
in real-life situations (for example, some expert systems help
doctors diagnose diseases based on symptoms)
• Natural Language: programming computers to understand
natural human languages.
5th Generations ……
• Neural Networks: Systems that simulate intelligence by
attempting to reproduce the types of physical connections
that occur in animal brains
• Robotics: programming computers to see and hear and react
to other sensory stimuli
• Voice recognition: Computer systems that can recognize
spoken words. Comprehending human languages falls under a
different field of computer science called natural language
processing.
• A number of voice recognition systems are available on the
market. The most powerful can recognize thousands of words.
However, they generally require an extended training session
during which the computer system becomes accustomed to a
particular voice and accent.
• Such systems are said to be speaker dependent.
5th Generations ……
• Quantum computation: First proposed in the 1970s,
quantum computing relies on quantum physics by taking
advantage of certain quantum physics properties of atoms or
nuclei that allow them to work together as quantum bits, or
qubits, to be the computer's processor and memory. By
interacting with each other while being isolated from the
external environment, qubits can perform certain calculations
exponentially faster than conventional computers. Qubits do
not rely on the traditional binary nature of computing.
5th Generations ……
• Molecular and nanotechnology: Nanotechnology is a field of
science whose goal is to control individual atoms and
molecules to create computer chips and other devices that
are thousands of times smaller than current technologies
permit. Current manufacturing processes use lithography to
imprint circuits on semiconductor materials. While
lithography has improved dramatically over the last two
decades -- to the point where some manufacturing plants can
produce circuits smaller than one micron(1,000 nanometers)
-- it still deals with aggregates of millions of atoms. It is widely
believed that lithography is quickly approaching its physical
limits. To continue reducing the size of semiconductors, new
technologies that juggle individual atoms will be necessary.
This is the realm of nanotechnology.
5th Generations ……
• Natural language: natural language means a human language.
For example, English, French, and Chinese are natural
languages. Computer languages, such as FORTRAN and C, are
not.
• Probably the single most challenging problem in computer
science is to develop computers that can understand natural
languages. So far, the complete solution to this problem has
proved elusive, although a great deal of progress has been
made. Fourth-generation languages are the programming
languages closest to natural languages.
5th Generations ……
Parallel processing and superconductors :
• The use of parallel processing and superconductors is helping to make
artificial intelligence a reality. Parallel processing is the simultaneous use
of more than one CPU to execute a program. Ideally, parallel processing
makes a program run faster because there are more engines (CPUs)
running it. In practice, it is often difficult to divide a program in such a way
that separate CPUs can execute different portions without interfering with
each other.
• Most computers have just one CPU, but some models have several. There
are even computers with thousands of CPUs. With single-CPU computers,
it is possible to perform parallel processing by connecting the computers
in a network. However, this type of parallel processing requires very
sophisticated software called distributed processing software.
• Note that parallel processing differs from multitasking, in which a single
CPU executes several programs at once.
• Parallel processing is also called parallel computing.
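The idea of dividing a programme into portions that run without interfering with each other can be sketched in Python. This is only an illustration of the divide-and-combine structure: the function names are hypothetical, and the sketch uses threads from the standard library, which in the usual Python interpreter share one CPU for this kind of work, so it shows how the work is split rather than a real speed-up on several CPUs.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Work done independently by one worker on its own portion of the data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    """Split the data into chunks, process them concurrently, combine the results."""
    size = max(1, (len(data) + workers - 1) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum_of_squares(list(range(1, 101))))
```

Each chunk is processed without reference to the others, which is the easy case; the difficulty mentioned above arises when the portions need each other's intermediate results.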
Moore’s Law
• In 1965, Gordon E. Moore predicted that the density of transistors in integrated circuits would double at regular intervals of about 2 years.
• Since 1965, his prediction has held true.
• The number of transistors per integrated circuit chip has approximately doubled every 18 months.
• In 1974, the largest Dynamic Random Access Memory chip had 16 kbits, whereas in 1998 it had 256 Mbits, an increase of about 16000 times in just 24 years.
• In 1984, the disk capacity in PCs was around 20 MB, whereas it was 80 GB by 2004, which is a 4000 fold increase.
• Now it is around 150 GB.
• This has come without an increase in price.
• Moore’s law implies that in the foreseeable future we will get more powerful computers at lower prices.
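The growth figures quoted for memory chips and disks can be checked with a short calculation. The sketch below is illustrative only: the helper name `doubling_time_years` is hypothetical, 1 Mbit is taken as 1024 kbits and 1 GB as 1000 MB, and steady exponential growth between the two dates is assumed.

```python
import math

def doubling_time_years(start, end, years):
    """Years per doubling implied by growth from `start` to `end` over `years`."""
    return years / math.log2(end / start)

# DRAM chip: 16 kbits (1974) to 256 Mbits (1998).
dram_growth = (256 * 1024) / 16
print(dram_growth, round(doubling_time_years(16, 256 * 1024, 24), 2))

# PC disk: 20 MB (1984) to 80 GB (2004).
disk_growth = (80 * 1000) / 20
print(disk_growth, round(doubling_time_years(20, 80 * 1000, 20), 2))
```

Under these assumptions both series imply a doubling roughly every 1.7 years, which lies between the 18 month and 2 year figures.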
Classification of computers
• Microcomputers
• Mainframe
• Supercomputers
But technology has changed and all computers use microprocessor
as their CPU. Thus classification is possible only through their
mode of use.
• Palms
• Laptop PCs
• Desktop PCs
• Workstations
Based on interconnected characteristics,
• Distributed computers
• Parallel computers
Palm PCs/Simputer
• They can be held in the palm.
• They rely on high density packing of transistors on a chip.
• Palm PCs have capabilities nearly those of PCs.
• They accept handwritten input using an electronic pen on the palm screen.
• They have small disk storage.
• They can be connected to a wireless network.
• They have facilities to be used as a mobile phone.
• They have fax and e-mail facilities.
• A version of the Microsoft OS called Windows CE is available for palm PCs.
Simputer
• A computer designed in India for the needs of the rural population is called the Simputer.
• Simputer is a mobile handheld computer with inputs through
icons on touch sensitive overlay on the LCD display panel.
• A unique feature of Simputer is the use of free open source OS
called GNU/Linux.
• Cost is low as there is no cost for software.
• Another unique feature of Simputer is a smart card
reader/writer which increases the functionality of the Simputer
including possibility of personalisation of a single Simputer
for several users.
Laptop
• It is a portable computer weighing around 2 kg.
• Laptops have a keyboard, a flat screen liquid crystal display and a Pentium or PowerPC processor.
• Colour displays are also available.
• Normally the Windows OS is used.
• Laptops come with a hard disk (20 GB), CD-ROM and floppy disk.
• They are designed to conserve energy by using power efficient chips.
• The trend is towards wireless connectivity for laptops so that they can read files from large stationary computers.
• Laptops are used for word processing and spreadsheet computing.
Personal Computers (PCs)
• Most PCs are desktop machines.
• Early PCs had the Intel 8088 microprocessor.
• The Intel Pentium IV is the most popular processor.
• The machines made by IBM are called IBM PCs.
• IBM PCs mostly use MS-Windows, Windows XP or GNU/Linux as the operating system.
• Till 2004, PCs had 64 to 256 MB of main memory with a 40 to 80 GB disk; disks are now 160 GB.
• A 650 MB CD-ROM is also provided in PCs for multimedia use.
• Apple PCs are called Apple Macintosh.
• IBM PCs are the most popular.
Workstations
• Workstations are also desktop machines.
• They have more powerful processors, about 10 times as powerful as those of PCs.
• Most workstations have a large colour video display unit.
• Normally they have main memory of around 256 MB to 4 GB and disk of 80 to 320 GB.
• Workstations normally use a RISC (Reduced Instruction Set Computer) processor such as MIPS (SGI), RIOS (IBM), SPARC (SUN), or PA-RISC (HP).
• Some manufacturers of workstations are Silicon Graphics (SGI), IBM, SUN Microsystems and Hewlett-Packard (HP).
• The standard OS of workstations is UNIX and its derivatives such as AIX (IBM), Solaris (SUN), and HP-UX (HP).
• Very good graphics facilities and large video screens are provided by most workstations.
• A system called X Windows is provided by workstations to display the status of multiple processes during their execution.
• Most workstations have built-in hardware to connect to a LAN.
Servers
• Workstations are characterized by high performance processors
with large screens for interactive programming,
• While servers are used for specific purposes such as high
performance numerical computing, web page hosting, data base
store, printing etc.
• Large interactive screens are not necessary for servers.
• Compute servers have high performance processors with large
main memory, database servers have big on-line disk storage (100s
of GB) and print servers support several high speed printers.
Mainframe Computers
• Insurance, banking and other companies need to process large numbers of transactions on-line.
• They require computers with very large disks to store several terabytes of data and transfer data from disk to main memory at several hundred megabytes/sec.
• The processing power needed from such computers is a hundred million transactions per second.
• These computers are much bigger and faster than workstations and several hundred times more expensive.
• They provide extensive services such as user accounting, file security and control.
• They are much more reliable.
• There are few manufacturers, viz., IBM and Hitachi.
Supercomputers
• Supercomputers are the fastest computers available at any given time.
• They are used to solve problems which require intensive numerical computations.
• Examples: prediction of weather conditions, designing supersonic aircraft, design of drugs, modelling complex molecules.
• All these problems require about 10^16 calculations.
• Such a problem can be solved in about 3 hours by a computer which can carry out a trillion calculations per second.
• Such computers were called supercomputers in 2004.
• Supercomputers are built by interconnecting several high speed computers and programming them to work co-operatively to solve the problems.
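The three-hour estimate can be checked with one line of arithmetic. The sketch below assumes a workload of 10^16 calculations (the figure consistent with that estimate) and a steady rate of one trillion (10^12) calculations per second; the function name is hypothetical.

```python
def hours_to_solve(calculations, calculations_per_second):
    """Running time in hours at a steady calculation rate."""
    return calculations / calculations_per_second / 3600

# 10**16 calculations at 10**12 calculations per second.
print(round(hours_to_solve(10**16, 10**12), 2))
```

The result is 10,000 seconds, a little under 3 hours.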
Supercomputers Conti………
• Their functions have expanded to analyzing large commercial databases, producing animated movies and playing games like chess.
• Besides these functions, SCs have a large main memory of 16 GB and secondary memory of 1000 GB.
• The speed of transfer of data from secondary memory to main memory should be at least a tenth of the memory to CPU data transfer speed.
• All SCs use parallelism to achieve their speed.
Parallel Computers
• A set of computers connected together by a high speed communication network and programmed in such a way that they co-operate to solve a single large problem is called a parallel computer.
• Two types of Parallel computers:
• Shared memory parallel computer (SMPC)
• distributed memory parallel computer (DMPC)
Shared Memory Parallel Computer
Process of SMPC
• A number of processing elements are connected to a common
main memory by a communication network.
• Programmes are written in such a way that multiple processors can work independently and co-operate to solve a problem.
• Programming of such a computer is relatively easy provided
the problem can be broken up into parts.
Shared Memory parallel Computers

Shared Memory

Communication Network

CPU CPU CPU CPU


SMPC Conti……
Limitations/Problems
• It is not scalable beyond about 16 processors as all
the processors share a common memory.
• This memory is accessed via single communication
network which gets saturated when many processors
try to read or write from memory.
DMPC
• A number of processors, each with its own memory, are interconnected by a communication network.
• A programme is divided into many parts and each computer works independently. Whenever computers need to exchange data to continue with the computation, they do so by sending messages to one another via the communication network.
• Such computers are called message passing multi-computers.
• DMPCs are scalable to over 1000 processors, as each computer works reasonably independently and there are multiple communication paths to exchange messages.
• A popular interconnection network is called the hypercube.
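The hypercube network can be illustrated with a short Python sketch. It assumes the usual labelling in which each processor gets a binary number and two processors are linked when their labels differ in exactly one bit; the function names are hypothetical and the slides themselves give no further detail.

```python
def hypercube_neighbours(node, dimensions):
    """Processors whose binary label differs from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]

def hops(source, target):
    """Minimum number of links a message crosses: the count of differing bits."""
    return bin(source ^ target).count("1")

# A 3-dimensional hypercube connects 2**3 = 8 processors.
print(hypercube_neighbours(0b000, 3))  # direct neighbours of processor 0
print(hops(0b000, 0b111))              # opposite corners of the cube
```

With d dimensions there are 2^d processors, each with only d links, and no message needs more than d hops, which is one reason such networks scale to many processors.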
Other Types of Parallel Computers
• Ethernet system: off-the-shelf high performance PCs are used and interconnected by Ethernet.
• Ethernet speed of 1 Gbps is now available.
• Linux system is available now.
Reference
• Rajaraman, V. (2008), Fundamentals of Computers, PHI Pvt. Ltd.
• http://www.techiwarehouse.com/engine/a046ee08/Generations-of-Computer
Input/Output Units
• Types of Input units, their advantage and
disadvantages
• Output units, their advantages and
disadvantages
Process from input to output
Data Written in documents

Data Conversion

Data in Machine readable form

Input Unit

Data Coded in Internal form

Memory and Processor

Processed data in internal form

Output Unit

Data Transformed to a readable form


Description of Computer Input Units
• General Purposes: Keyboard and Desktop
• Special purposes: Scanners, magnetic Ink character readers,
Optical mark readers, Optical Character readers and bar code
readers
• Compact Disk Read Only Memory (CDROM): used when large amounts of data are recorded for distribution to many users, for reading only and for storing in computer memory.
• 650 MB of data can be recorded on a CDROM.
• A floppy disk is used if a small amount of data, such as 1.2 MB, is to be transferred.
• Memory card, memory disk or flash memory: a solid state read only memory of 32 KB to 512 MB used to store and distribute data.
• Storage device: Floppy, CDROM and Flash memory
Keyboard
• It is used for manual entry of data
• It is used with all types of computers such as PCs, workstations and notebook computers.
• It is also called the QWERTY keyboard, as these are the first six letters in the top row of letter keys.
• Categories of keys
• Letter keys: 26 letters.
• Digit keys: 2 sets of digit keys.
• Special character keys: > < ? / { } [ ] ( ) , “ ” \ | @, some typed with the help of the shift key.
• Non-printable control keys: backspace, cursor movement, insert, space bar.
• Function keys: F1 up to F15.
• Functions of non-tabulated keys: Backspace key, Enter key, Tab key, Shift key.
Video Terminal (VDU)
What is a VDU:
• A video terminal or video display unit consists of a television screen and a keyboard.
• When a key is pressed, the corresponding character is displayed on the screen.
• Simultaneously, a cursor moves to the position where the next character will be displayed.
• A cursor is a small arrow, underline or small rectangle which can be moved horizontally or vertically to indicate the position of the next character.
VDU Conti……..
What is the function of the cathode ray tube
• The cathode ray television tube is scanned by an electron beam to
create a raster of horizontal lines. The intensity of the electron
beam is increased at certain moments creating bright spots on the
face of the tube. Each character is displayed by a matrix of 5 dots
along horizontal direction and 7 dots in vertical direction.
• A display normally has 80 characters per horizontal line and 24
such lines on the screen.
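The 5 × 7 dot matrix can be pictured with a small Python sketch. The dot pattern used for the letter "A" below is a hypothetical example, not taken from any particular terminal; "1" marks a position where the beam intensity is raised to create a bright spot.

```python
# Hypothetical 5 x 7 dot pattern for the letter "A" ("1" marks a bright spot).
GLYPH_A = [
    "01110",
    "10001",
    "10001",
    "11111",
    "10001",
    "10001",
    "10001",
]

def render(glyph):
    """Turn each row of dots into text, one string per raster line."""
    return ["".join("#" if dot == "1" else " " for dot in row) for row in glyph]

for line in render(GLYPH_A):
    print(line)

# Character positions on an 80-column, 24-line display.
print(80 * 24)
```

A full screen of 80 × 24 positions therefore holds 1920 characters, each built from up to 35 dots.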
How typed Characters are displayed on the Screen
• When a key on the keyboard is pressed, the corresponding character is displayed on the screen because an appropriately coded series of electrical pulses is sent to the computer's memory.
Output Units
There are three principal devices to output
• Printer: it is most common method
• Video terminal
• Computer output Micro-film: It is expensive and used in
special cases.
Hard Copy Devices of Output: printer and microfilm, as the data written using these devices can be read by human beings.
Soft Copy Devices of Output: floppy disks, CDROM (R/W), solid state memory.
• These are removable, portable devices; the data in them can be read by another computer and stored in its memory for processing.
Printers
Two main categories:
• Line Printers
• Serial Character Printers
• Line Printers: It prints complete line at a time. Printing speed
varies from 150 lines to 2500 lines per minute with 96 to 160
characters on a 15 inch line.
• Printers are available in almost all scripts: English, Arabic, Cyrillic (Russian), Hindi.
• Two types of Line Printers: Drum printers and Chain printers
Drum Printer
Features of DP:
• The characters to be printed are embossed on the surface of the drum.
• One complete set of characters is embossed for each print position on a line.
• A printer with 132 characters per line and a 96 character set will have 132 × 96 = 12672 characters embossed on its surface.
• The codes of all characters to be printed on one line are transmitted from the memory of the computer to a storage unit in the printer.
• A set of print hammers, one for each character position in a line, is mounted in front of the drum. A character is printed by striking a hammer against the embossed character on the surface.
• A carbon ribbon and paper are interposed between the hammer and the drum.
• It is expensive, and its character set cannot be changed quickly.
Chain Printers
Features of CP:
• It has a steel band on which character sets are embossed.
• For a printer with a 64-character set, 4 sets of 64 characters each would
be embossed on the band.
• All the characters in the line are sent from the memory to the print
buffer register.
• The band rotates at high speed.
• As the band rotates, a hammer is activated when the desired
character, as specified in the buffer register, comes in front of it.
• For 132 characters per line, 132 hammers are positioned to
strike the carbon ribbon, which is placed between the chain, the paper
and the hammers.
• Different fonts and different scripts may be used on the same printer.
Serial printers
Features of SP:
• It prints one character at a time with the print head moving
across a line.
• It is normally slow and prints 30 to 300 characters per second.
• The most popular serial printer is the dot-matrix printer.
• The print head consists of an array of pins.
• Characters to be printed are sent one character at a time from
the memory to the printer. The character code is decoded by
the printer electronics, which activate the appropriate pins in the
print head.
• Many dot-matrix printers are bidirectional: they print left to right and right to
left.
Serial printers (cont.)
Advantages of dot-matrix (DM) printers:
• They can print scripts other than English, including regional scripts such as
Devanagari and Tamil.
• They are low in cost, and multiple copies can be taken by using carbon
paper.
• Dot-matrix printers with 24 pins in a vertical line are available.
• They provide high-quality print.
• They are less expensive than line printers.
Inkjet Printers
Features of IP:
• Characters are formed by sharp, continuous lines.
• It consists of a print head, which has a number of small holes or
nozzles.
• Individual holes can be heated very rapidly by an integrated
circuit resistor.
• When the resistor heats up, the ink near it vaporizes and is
ejected through the nozzle, making a dot on the paper placed
near the head.
• The printer has enough memory to print an entire page
accommodating different fonts.
• It has multiple heads, one per colour, which allows colour
printing.
• Speed is about 120 characters per second, and the cost of ink cartridges is high.
Laser Printers
• The printers described earlier are slow because a head has to move and impinge on a ribbon
to print.
• In a laser printer, an electronically controlled laser beam traces out the
desired characters on a photoconductive drum.
• The drum attracts ink toner onto the exposed areas.
• This image is transferred to the paper, which comes into contact
with the drum.
• Low-speed laser printers print about 4 to 8 pages per minute.
• Graphics, art and colour printing facilities are available.
• Good-quality prints are produced.
Comparison of printers
(DP = drum printer, CP = chain printer, DMP = dot-matrix printer, IP = inkjet printer, LP = laser printer)
Type            Speed                        Resolution              Capital cost    Running cost     Drawing capacity  Capacity
DP and CP       100 lines/min, 132 char/line Average                 High            Low              No                More carbon copies
DMP             100 char/sec                 Average                 Low             Low              Poor              2 to 3 carbon copies
IP              100 char/sec                 Good, 100 dots/cm       Low             Higher than DMP  Good              Light duty, single copy
LP, low speed   10 pages/min                 Good, 120 dots/cm       Higher than IP  Lower than IP    Good              Light duty, single copy
LP, high speed  10000 lines/min              Very good, 600 dots/cm  High            Low              Good              Heavy duty
Reference
• Rajaraman, V. (2008), Fundamentals of Computers, PHI Pvt. Ltd.
Storage Unit
Storage units are ranked on the following criteria:
• Access time
• Storage capacity
• Cost per bit of storage
Two types of storage:
• Primary Storage Unit (Main Memory)
• Secondary Storage Unit
Primary Storage Unit (Main Memory), compared with secondary storage:
• Faster access time
• Smaller storage capacity
• Higher cost per bit of storage
Storage Location and Address
• Storage is basic to all computers.
• It is made up of many small storage areas called locations or cells.
• Each location can store a fixed number of bits, called the word length.
• Address of a location: a number used to identify the location.
• Each location can hold either a data item or an instruction.
Storage Capacity
• The capacity is defined in terms of bytes or words.
• Storage capacity is commonly expressed in K (kilo), which is equal to
2^10 or 1024 bytes or characters.
• 32 kilobytes means 32 × 1024 = 32,768 bytes or characters.
• It is necessary to know the word size in bits or bytes in order to
determine the actual storage capacity of the computer.
• It is necessary to know the number of bits per word and the total number of words.
• A 16-bit, 4096-word memory has 4096 locations, each with a different
address and each storing 16 bits.
• A 32K 16-bit memory has 2^15 words, each word of 16 bits.
• If the word size of a memory is 8 bits (equal to a byte), it becomes
immaterial whether the memory capacity is expressed in terms of
bytes or words.
• A memory having 2^16 words, each of 8 bits, is simply
referred to as a 64K memory.
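The arithmetic in the bullets above can be checked with a short Python sketch. The helper name `capacity_bits` is an illustrative assumption, not something from the lecture.

```python
# Storage-capacity arithmetic from the bullets above.
K = 2 ** 10  # 1 K = 1024


def capacity_bits(words: int, word_length_bits: int) -> int:
    """Total number of bits in a memory with `words` locations."""
    return words * word_length_bits


bytes_in_32k = 32 * K                    # 32 kilobytes expressed in bytes
bits_4096x16 = capacity_bits(4096, 16)   # 4096 locations of 16 bits each
words_in_32k = 32 * K                    # a 32K-word memory
words_in_64k = 64 * K                    # a 64K-word memory

print(bytes_in_32k)              # 32768
print(bits_4096x16)              # 65536
print(words_in_32k == 2 ** 15)   # True
print(words_in_64k == 2 ** 16)   # True
```

Note that the 4096-word, 16-bit memory and a 64K-bit memory hold the same number of bits, which is why the word size must be known before a capacity figure can be interpreted.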
Why Do We Need More Bits
• Meaning of 8-bit, 16-bit and 32-bit computers: the word size in
terms of the total number of bits.
• What is the advantage of having more bits per word
instead of having more words of smaller size?
• Analogy: highways with 4 lanes, 8 lanes and 16 lanes.
• More bits per word means a more rapid flow of electronic signals, and hence a
faster computer.
• Word-addressable computer: each numbered address location stores a fixed number of
characters. Such computers apply the fixed word-length
storage approach.
• Character-addressable computer: the primary storage section is
designed in such a way that each numbered address can
store only a single character. Such computers employ the variable word-length
storage approach.
Merit and Demerit of Fixed and Variable Word
Length Storage Approach
• The fixed word-length storage approach (FWLSA) is normally used in large scientific
computers to gain speed of calculation.
• Suppose the word length in a FWLSA is eight
characters. If most of the words stored have fewer than five
characters, much of the storage will be unused.
• The variable word-length storage approach (VWLSA) is used in small business computers
to optimise the use of storage space.
• It has no problem of unused space.
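A rough Python sketch of the trade-off described above. The stored items are hypothetical, and any bookkeeping overhead a real variable word-length machine needs (for example, end-of-word markers) is ignored.

```python
def fixed_word_cells(items, word_length):
    """Character cells used when every item occupies whole fixed-length words."""
    total = 0
    for item in items:
        words_needed = max(-(-len(item) // word_length), 1)  # ceiling division
        total += words_needed * word_length
    return total


def variable_word_cells(items):
    """Character cells used when each item takes exactly as many cells as it has characters."""
    return sum(len(item) for item in items)


items = ["TAX", "RENT", "WAGE", "GDP"]     # hypothetical short data items
fixed = fixed_word_cells(items, 8)         # eight-character words, as in the slide
variable = variable_word_cells(items)
unused = fixed - variable

print(fixed, variable, unused)             # 32 14 18
```

With eight-character words, more than half of the cells are left unused for these short items, which is the demerit of the fixed approach that the variable approach avoids.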
Types of Storage
RAM: Random Access Memory
• Primary storage is usually referred to as random access
memory because it is possible to randomly select and
use any location of this memory to directly store and retrieve
data and instructions.
• It is also referred to as read/write memory because
information can be both read from it and written into it.
ROM: Read Only Memory
• Information is permanently stored.
• The information can only be read and it is not possible to write
fresh information into it.
• When power is switched off, the stored information is not lost.
Micro-programs
• Micro-programs are special programs written to carry out
low-level machine operations.
• They are a substitute for additional hardware.
• Micro-programs are written to aid the control unit in directing all the
operations of the computer system.
• ROMs are mainly used by computer manufacturers for storing
these micro-programs, so that users cannot modify them.
Programmable ROM
• It is possible for a user to customise a system by converting
his or her own programs to micro-programs and storing them in
a PROM.
• Once the user's programs are stored in a PROM chip, they
can usually be executed in a fraction of the time previously
required.
• Once the chip has been programmed, the recorded information
cannot be changed, i.e., the PROM becomes a ROM.
• PROM is non-volatile storage, i.e., the stored information
remains intact even if power is switched off.
Erasable PROM
• Another type of memory chip, the EPROM, overcomes the problem that a
programmed PROM cannot be changed.
• It is possible to erase information stored in an EPROM chip,
and the chip can be reprogrammed to store new information using
a special PROM-programmer facility.
• An EPROM is erased by exposing the chip to ultraviolet light.
• When an EPROM is in use, information can only be read, and
the information remains on the chip until it is erased.
• EPROMs are mainly used by R&D personnel because they
frequently change the micro-programs to test the efficiency of
the computer.
CACHE MEMORY
• A special high-speed memory is used to speed up processing
by making current programs and data available to the CPU at a
rapid rate.
• Cache memory is the technique used to compensate for the mismatch in
operating speed between the CPU and main memory.
• It is a hidden memory and is not addressable by the user of
the computer system.
• Cache memory makes main memory appear faster than it really is.
• It improves the memory transfer rate and thus raises the
effective processor speed.
Registers
• Registers are special memory units that handle the movement of
information between the various units of the computer and speed it up.
• They are not considered part of the main memory and are used
to retain information on a temporary basis.
Functions of Registers
Sl. No.  Name of register       Function
1        Memory Address (MAR)   Holds the address of the active memory location
2        Memory Buffer (MBR)    Holds information on its way to and from memory
3        Program Control (PC)   Holds the address of the next instruction to be executed
4        Accumulator (A)        Accumulates results and data to be operated upon
5        Instruction (I)        Holds an instruction while it is being executed
6        Input/Output (I/O)     Communicates with the I/O devices
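How the registers in the table cooperate can be sketched with a toy fetch-and-execute loop in Python. The four-instruction machine below (LOAD, ADD, STORE, HALT) is a hypothetical illustration, not an instruction set described in the lecture.

```python
# Toy machine: memory holds (operation, operand) instructions followed by data.
memory = [
    ("LOAD", 4),     # address 0: A <- memory[4]
    ("ADD", 5),      # address 1: A <- A + memory[5]
    ("STORE", 6),    # address 2: memory[6] <- A
    ("HALT", None),  # address 3: stop
    7,               # address 4: data
    5,               # address 5: data
    0,               # address 6: the result is stored here
]


def run(memory):
    pc = 0    # program control register: address of the next instruction
    acc = 0   # accumulator: data being operated upon
    while True:
        mar = pc             # MAR: address of the active memory location
        mbr = memory[mar]    # MBR: word on its way from memory
        ir = mbr             # instruction register: instruction being executed
        pc += 1
        op, operand = ir
        if op == "HALT":
            return acc
        mar = operand        # the operand names the data location
        if op == "LOAD":
            mbr = memory[mar]
            acc = mbr
        elif op == "ADD":
            mbr = memory[mar]
            acc += mbr
        elif op == "STORE":
            mbr = acc        # word on its way to memory
            memory[mar] = mbr
        else:
            raise ValueError(f"unknown operation: {op}")


result = run(memory)
print(result, memory[6])   # 12 12
```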


Secondary Storage Devices
• Secondary storage is an additional memory, also called auxiliary memory.
• It is referred to as backup storage because it is used to store
large volumes of data on a permanent basis, which can be
partially transferred to the primary storage as and when
required for processing.
Methods of accessing information:
• Sequential access: information can be retrieved only in the same
sequence in which it is stored.
• Direct or random access: for example, a computerised bank.
Reference
• Sinha, P.K. (1996), Computer Fundamentals,
BPB Publications, New Delhi.
Meaning of Research
• Search for knowledge
• The Advanced Learner's Dictionary of Current English (ALDCE) defines research as "a careful investigation or inquiry specially through
search for new facts in any branch of knowledge".
• Research is an academic activity and as such the term should
be used in a technical sense.
• Clifford Woody defined research as follows: "it comprises defining and redefining
problems, formulating hypothesis or suggested solutions,
collecting, organizing and evaluating data, making deductions
and reaching conclusions and at last carefully testing the
conclusions to determine whether they fit the formulating
hypothesis."
Types of Research
• Descriptive vs. analytical: ex post facto description vs. use of
the facts and information already available.
• Applied vs. fundamental: finding a solution to
a present problem vs. generalisation.
• Quantitative vs. qualitative:
• Conceptual vs. empirical: abstract ideas or
theory vs. data-based.
Sampling, Design and Size

Sanatan Nayak
L-4
DE/SAS, BBAU
Sampling Difference in Quantitative and Qualitative Research
Quantitative Research                                                            Qualitative Research
Unbiased and representative sample of the population                             Ease of access to potential respondents, judgement, situation of interest
To draw inferences                                                               To gain in-depth knowledge
Pre-determined sample size                                                       No need for a pre-determined sample size
A relationship exists between variation among respondents and the sample size    No such relationship
Both probability and non-probability sampling                                    Only non-probability sampling
What is Sampling?
• Definition:
• Sampling is the process of selecting a few elements (a sample) from a
bigger group (the sampling population) as the basis for estimating or
predicting the prevalence of an unknown piece of information,
situation or outcome regarding the bigger group.
• Advantage:
• It saves time, money and human resources.
• Disadvantages:
• It does not cover the whole population. Hence, there is a possibility
of error.
• Principles of Sampling
• Mean age of four students, A = 18, B = 20, C = 23 and D = 25: mean = 21.5
years.
1. The probability of drawing any particular sample of two students is 2/4 × 1/3 = 1/6.
Principles of Sampling
• Principle 1: In the majority of cases, there will be a difference
between the sample mean and the true population mean. This
difference is attributed to sampling error. Example: prepare the probability chart
of the mean age for samples of two drawn from the population of four.
• Principle 2: The greater the sample size, the more accurate the
estimate of the true population mean. Example: prepare the
probability chart of the mean age for samples of three.
• Principle 3: For a given sample size, the greater the variation in the variable under study
in a population, the greater the difference
between the sample mean and the true population mean, and hence
the greater the sampling error. Example: take a
population with higher variation and prepare
the probability chart of the mean age for samples of two and three.
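The three principles can be checked by listing every possible sample. The Python sketch below uses the four ages from the slide; the second, more varied population (`wide_ages`) is a hypothetical example added only to illustrate Principle 3.

```python
from itertools import combinations
from statistics import mean


def sampling_errors(values, n):
    """Sample mean minus population mean for every possible sample of size n."""
    values = list(values)
    population_mean = mean(values)
    return [mean(s) - population_mean for s in combinations(values, n)]


def max_error(values, n):
    return max(abs(e) for e in sampling_errors(values, n))


ages = [18, 20, 23, 25]        # students A, B, C, D from the slide (mean 21.5)
wide_ages = [18, 26, 32, 40]   # hypothetical population with more variation

print(len(sampling_errors(ages, 2)))   # 6 possible samples, each with probability 1/6
print(max_error(ages, 2))              # 2.5   (Principle 1: sample means differ from 21.5)
print(round(max_error(ages, 3), 2))    # 1.17  (Principle 2: larger samples, smaller error)
print(max_error(wide_ages, 2))         # 7     (Principle 3: more variation, larger error)
```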
Factors Affecting Inferences Drawn from a Sample
• Size of the sample.
• Extent of variation in the sampling population.
1. The greater the variation in the sample, the greater the SD, the higher
the uncertainty and the greater the standard error.
2. For high heterogeneity, the sample size needs to be larger.
Types of Sample Design
[Diagram: Sample Design]
• Random/Probability sampling: simple random; stratified random (proportionate, disproportionate); cluster (single stage, double stage, multi-stage)
• Non-Random/Non-Probability sampling: quota, judgemental, accidental
• Mixed sampling: systematic
Types of Sample Design
A. Random/Probability Sampling:
1. Each element in the population has an equal and independent
chance of selection in the sample.
2. Equality means that the probability of selection is the same for
each element.
3. Independence means that the choice of one element does not depend upon
the choice of another element.
4. Example: a class has 80 students, of whom 20 are interested in your
study (a question of equality). Five are close friends and one of them is included
(a question of independence).
Advantages:
5. As random samples represent the total sampling population, the inferences
drawn from them can be generalised to the total sampling
population.
6. Statistical tests based upon the theory of probability can be applied
to data collected from random samples.
Types of Sample Design
B. Non-Random/Non-Probability Sampling:
• Used when either the number of elements in a population is
unknown or the elements cannot be individually identified.
Six methods are used in both qualitative and quantitative
research.
1. Quota Sampling
2. Accidental Sampling
3. Convenience Sampling
4. Judgemental or Purposive Sampling
5. Expert Sampling
6. Snowball Sampling
Types of Sample Design
C. Systematic /Mixed Sampling:
• It has characteristics of both random and non-
random methods.
• Suppose a 10% sample is to be selected from a
population of 50. The sample size is then 5, so every 10th item
(50/5 = 10) is selected from the population.
Specific Random/Probability Sample Designs
• Simple Random Sampling
• Stratified Sampling
• Cluster Sampling
• Sequential Sampling
• Area Sampling
• Multi Stage Sampling
• Sampling with Probability Proportional to Size
Specific Random/Probability Sample Designs
• Random/Probability Samplings:
• The Fishbowl Draw:
• Computer program:
• Table of randomly generated numbers:
• Number of possible samples of size n from a population of N = N(N-1)...(N-n+1)/n!
• Probability of getting any particular sample = n!/[N(N-1)...(N-n+1)]
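As a quick check of the two formulas, the sketch below counts the possible samples and uses Python's standard `random` module as one possible "computer program" for the draw. The four-student population is reused from the earlier example; the seed is arbitrary.

```python
import random
from math import comb, factorial, prod


def number_of_samples(N, n):
    """N(N-1)...(N-n+1) / n!  (samples drawn without replacement, order ignored)."""
    return prod(range(N - n + 1, N + 1)) // factorial(n)


count = number_of_samples(4, 2)
probability = 1 / count

print(count)                                     # 6
print(count == comb(4, 2))                       # True: same as the binomial coefficient
print(number_of_samples(50, 5) == comb(50, 5))   # True

# Drawing one simple random sample by computer (seed fixed only for repeatability).
rng = random.Random(0)
sample = rng.sample(["A", "B", "C", "D"], 2)
```

The probability `1 / count` equals 1/6 for two students out of four, matching the 2/4 × 1/3 calculation.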
A. Specific Random/Probability Sample Designs
• Stratified Random Sampling:
• The objective is to reduce the variability or heterogeneity in a large sampling
population.
1. If the population is not a homogeneous group, the stratified sampling technique (SST) is normally
applied.
2. The population is divided into several sub-populations, which
are called strata. The population within a stratum is homogeneous,
but across strata it is heterogeneous.
3. SST is more reliable and provides more detailed information.
• Important Questions on the Stratified Sampling Technique:
1. How to form strata?
2. How should items be selected from each stratum?
3. How to allocate the sample size of each stratum?
A. Specific Random/Probability Sample Designs
• How to form strata?
1. The elements within each stratum must be homogeneous.
2. Strata are formed on the basis of the researcher's experience.
3. A pilot study needs to be done carefully.
• How should items be selected from each stratum?
1. Either simple random sampling or systematic sampling is
applied.
• How many items to sample, or how to allocate the sample size to
each stratum?
1. Proportional sampling method.
• Example: total population N = 8000, divided into three strata of sizes
P1 = 4000, P2 = 2400 and P3 = 1600; total sample size n = 40.
With Pi/N as the proportion of the population in stratum i, how is the
sample size for each stratum calculated?
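Under proportional allocation each stratum receives n × Pi/N sample units. A minimal Python sketch with the figures from the example (the function name is an illustrative assumption):

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of n to strata in proportion to their sizes."""
    N = sum(strata_sizes)
    return [n * size / N for size in strata_sizes]


allocation = proportional_allocation([4000, 2400, 1600], 40)
print(allocation)   # [20.0, 12.0, 8.0]
```

When the results are not whole numbers they have to be rounded, taking care that the rounded values still add up to n.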
A. Specific Random/Probability Sample Designs
• How many items to sample, or how to allocate the sample size to
each stratum?
1. How should allocation be handled when comparisons are to be made across
strata that differ in size and in the variability of their elements?
2. A disproportionate sampling design is then required:
proportionately larger samples from larger strata and smaller
samples from smaller strata.
3. Write the Formula:
4. This method is called optimum allocation of samples through
disproportionate sampling.
5. Example:
6. Then how to optimise cost?
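The slide leaves the formula and example blank. One widely used rule for optimum (disproportionate) allocation, assumed here rather than taken from the lecture, makes each stratum's share proportional to Ni × σi, or to Ni × σi / √ci when the cost per unit ci differs across strata. The strata sizes, standard deviations and costs below are illustrative numbers only.

```python
from math import sqrt


def optimum_allocation(sizes, sds, n, costs=None):
    """Share of n for each stratum, proportional to N_i * sd_i / sqrt(c_i).

    With equal (or omitted) costs this reduces to allocation proportional
    to N_i * sd_i.
    """
    if costs is None:
        costs = [1.0] * len(sizes)
    weights = [N_i * sd_i / sqrt(c_i) for N_i, sd_i, c_i in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Illustrative strata: sizes, standard deviations and a total sample of 84.
sizes = [5000, 2000, 3000]
sds = [15, 18, 5]
equal_cost = optimum_allocation(sizes, sds, 84)
print([round(x) for x in equal_cost])   # [50, 24, 10]

# If units in the first stratum cost four times as much to survey (hypothetical),
# part of the sample shifts to the cheaper strata.
with_cost = optimum_allocation(sizes, sds, 84, costs=[4, 1, 1])
```

The more variable and larger strata receive larger samples, while strata that are expensive to survey receive smaller ones.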
A. Specific Random/Probability Sample Designs
• Cluster Sampling:
• Cluster sampling (CS) is used for a large population, such as that of a city or a country.
1. A smaller area within a bigger area, i.e., a cluster, is chosen
conveniently and randomly.
2. Clusters are visible or easily identifiable small groups defined by
geographical proximity or common characteristics.
3. Sampling within each cluster can be done through simple random sampling (SRS) or
systematic sampling.
4. Example: problems of higher education in the country.
5. Cluster sampling is extremely useful for random sampling.
A. Specific Random/Probability Sample Designs
• Different Stages of Cluster Sampling:
1. CS may start from the country or territory level. Then choose
either similar states, based on socio-economic profile, or all states.
2. Then select one or more institutions of higher
education.
3. Then one or more academic programmes from each
institution may be selected.
4. Students of a particular academic year are then taken.
5. Students may be selected on a proportionate basis.
A. Specific Random/Probability Sample Designs
• Area Sampling:
1. If the cluster happens to be a geographical area,
then CS is known as area sampling (AS).
A. Specific Random/Probability Sample Designs
• Multi Stage Sampling:
1. It is based on the principle of cluster sampling.
2. Example: bank efficiency in India.
• First select a state, then select several districts, then choose all
banks in the chosen districts. This is two-stage sampling.
• If certain towns are then selected and all banks in them are interviewed, it is three-stage
sampling.
• If banks are selected on a sample basis from the selected towns,
it is four-stage sampling.
• If random selection is used at all stages, it is called multi-stage random
sampling.
A. Specific Random/Probability Sample Designs
• Sampling with Probability Proportional to Size:
1. If the clusters do not contain the same number of elements, a
random selection process is used in which the probability of each cluster
being included in the sample is proportional to its size.
2. The numbers selected in this way do not refer to
individual elements; they indicate which clusters are selected and how
many elements are taken from each cluster.
3. Example: there are 15 cities, each with a cluster of departmental stores.
Select a sample of 10 stores from the 15 cities.
A. Sampling with Probability Proportional to Size
City   No. of departmental stores   Cumulative total   Sample
1      35                           35                 10
2      17                           52
3      10                           62                 60
4      32                           94
5      70                           164                110, 160
6      28                           192
7      26                           218                210
8      19                           237
9      26                           263                260
10     66                           329                310
11     37                           366                360
12     44                           410                410
13     33                           443
14     29                           472                460
15     28                           500
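The Sample column can be reproduced with a short Python sketch. It assumes, as the table implies, a sampling interval of 500/10 = 50 and a starting point of 10; each selected number is assigned to the first city whose cumulative total reaches it.

```python
from bisect import bisect_left
from itertools import accumulate

stores = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]
cumulative = list(accumulate(stores))   # 35, 52, 62, ..., 500
total = cumulative[-1]

n = 10                   # stores to be selected
interval = total // n    # 500 / 10 = 50
start = 10               # starting point implied by the table (normally chosen at random)

points = [start + i * interval for i in range(n)]   # 10, 60, 110, ..., 460

selected = {}
for point in points:
    city = bisect_left(cumulative, point) + 1       # cities are numbered from 1
    selected.setdefault(city, []).append(point)

for city, hits in selected.items():
    print(f"City {city}: {len(hits)} store(s), selection numbers {hits}")
```

City 5, the largest cluster with 70 stores, receives two selections, while several small cities receive none, which is what selection proportional to size means.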
A. Specific Random/Probability Sample Designs
• Sequential Sampling
1. It is complex in nature. The ultimate sample size is not fixed in advance; it
depends on the information yielded as the survey progresses.
2. If a lot is accepted or rejected on the basis of a single
sample, it is called single sampling.
3. If the decision is taken on the basis of two samples, it is called
double sampling.
4. If the decision is taken on the basis of many samples, but the
number of samples is certain and known in advance, it is called
multiple sampling.
5. If the decision is taken on the basis of many samples, but the
number of samples is not certain and not known in advance, it is
called sequential sampling.
B. Non-Random/Non-Probability Samplings
• It does not follow the theory of probability in the selection of
elements.
• Other considerations are required for selection of elements.
• Six methods of non-probability sampling are used in both qualitative and quantitative
research.
1. Quota Sampling
2. Accidental Sampling
3. Convenience Sampling
4. Judgemental or Purposive Sampling
5. Expert Sampling
6. Snowball Sampling
B. Non-Random/Non-Probability Samplings
1. Quota Sampling:
• Respondents are chosen on the basis of easy access and convenience, guided by visible
characteristics such as gender, race or caste.
• The process continues until the required
number of respondents has been reached.
• Advantages:
• Least expensive; no sampling frame is needed.
• Disadvantages:
• It is not probability sampling, so the findings cannot be generalised.
2. Accidental Sampling
• Similar to quota sampling, but not based on visible
characteristics.
• Data collection stops when the required number of respondents has been reached.
• It is mostly applied in market research and
newspaper reporting.
B. Non-Random/Non-Probability Samplings
• Convenience Sampling:
• Similar to accidental sampling, but geographical proximity,
known contacts, ready approval, etc. are the main criteria.
• Judgemental or Purposive Sampling and Expert Sampling:
• Respondents are those who, in your opinion, are the best people in a particular field,
such as a historical reality about which little is known.
• Snowball Sampling: a process based on networks.
• A few individuals are selected initially, and they are later
asked to identify other people in the group.
C. Systematic/Mixed Sampling
• It has both random and non-random
characteristics.
• The sampling frame is divided into a number of
segments called intervals.
• From the first interval, one element is selected
at random; the element in the same position is then taken from each subsequent interval.
• Width of interval (k) = total population
(N) / sample size (n)
• A sampling frame is needed.
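The procedure can be sketched in Python as follows. The frame of 50 numbered elements and the sample size of 5 are taken from the earlier 10% example; the function name and the seed are assumptions for illustration.

```python
import random


def systematic_sample(frame, n, rng=None):
    """Pick a random start in the first interval, then every k-th element."""
    rng = rng or random.Random()
    k = len(frame) // n        # width of interval, k = N / n
    start = rng.randrange(k)   # random position within the first interval
    return [frame[start + i * k] for i in range(n)]


frame = list(range(1, 51))     # sampling frame: elements numbered 1 to 50
sample = systematic_sample(frame, 5, random.Random(1))
print(sample)                  # five elements, each 10 positions apart
```

Only the starting position is random; every later element is fixed by the interval, which is why the design is described as mixed.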
Calculation of Sample Size
• Quantitative Research:
• Sample size depends on the purpose of the findings in quantitative
research.
• The greater the heterogeneity, the greater the sample size.
• Level of confidence or test of hypothesis.
• Degree of accuracy.
• Level of variation (SD).

• Qualitative Research:
• Sample size is less important in qualitative research.
• The sampling design may be purposive, judgemental, expert,
accidental or snowball.
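The slide lists the factors but gives no formula. One common textbook formula for estimating a mean, assumed here purely for illustration, is n = (z × SD / e)², where z reflects the level of confidence and e is the acceptable margin of error (degree of accuracy). The SD and margin values below are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sd, margin, confidence=0.95):
    """n = (z * sd / margin)^2, rounded up; ignores any finite-population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95% confidence
    return ceil((z * sd / margin) ** 2)


n_low_variation = sample_size_for_mean(sd=10, margin=2)
n_high_variation = sample_size_for_mean(sd=20, margin=2)
print(n_low_variation)    # 97
print(n_high_variation)   # 385
```

Doubling the standard deviation roughly quadruples the required sample, in line with the point that greater heterogeneity calls for a greater sample size.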
Bias and Error
• The difference between the sample mean and the population mean is
called error.
• It is caused by the selection of the sample.
• There are as many errors as there are
alternative samples.
• It is therefore useful to have a single summary
measure of sampling error; this is the Mean Square
Error (MSE).
• However, bias and error can also arise during data collection, data
entry and analysis. These are called non-sampling
errors, and they occur in sample surveys as well as in a
census.
• Error and bias are different, but both
affect the MSE.
Bias and Error
• The first part of equation 1.4 is termed bias.
• The first principle of sampling is the equal probability of
selection method (EPSEM).
• The second principle concerns the sampling
variance of the mean. Its square root is the
standard error of the mean.
• MSE(ȳ) = B² + sampling variance of the mean, where B is the bias
• Examples 1 and 2: see, in Excel, the daily wages of six
employees.
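The relation MSE = B² + sampling variance can be verified numerically. The six daily wages below are hypothetical stand-ins, since the Excel data are not reproduced in these notes. Two selection schemes are compared: simple random samples of two (EPSEM), and a deliberately biased scheme that never selects the lowest-paid employee.

```python
from itertools import combinations
from statistics import mean


def decompose(estimates, true_mean):
    """Bias, sampling variance and MSE of equally likely sample estimates."""
    expected = mean(estimates)
    bias = expected - true_mean
    variance = mean((e - expected) ** 2 for e in estimates)
    mse = mean((e - true_mean) ** 2 for e in estimates)
    return bias, variance, mse


wages = [100, 120, 150, 160, 180, 190]   # hypothetical daily wages of six employees
mu = mean(wages)                         # population mean = 150

# EPSEM: every pair of employees is equally likely to be the sample.
srs_means = [mean(s) for s in combinations(wages, 2)]
srs_bias, srs_var, srs_mse = decompose(srs_means, mu)

# Biased scheme: the first (lowest-paid) employee can never be selected.
biased_means = [mean(s) for s in combinations(wages[1:], 2)]
b_bias, b_var, b_mse = decompose(biased_means, mu)

print(srs_bias, abs(srs_mse - (srs_bias ** 2 + srs_var)) < 1e-9)   # 0 True
print(b_bias, abs(b_mse - (b_bias ** 2 + b_var)) < 1e-9)           # 10 True
```

Under equal-probability selection the bias is zero and the MSE equals the sampling variance; under the biased scheme both terms contribute.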
References
• Kumar, Ranjit (2014), Research Methodology: A Step-by-Step
Guide for Beginners, Sage Publications, New Delhi.
• Roy, Taru Kumar et al. (2016), Statistical Survey Design and
Evaluating Impact, Cambridge University Press, New Delhi.
• Ladu Singh, L. (2018), Survey Sampling Methods, Eastern
Economy Edition, New Delhi.
• Bryman, Alan (2009), Social Research Methods, OUP, New
Delhi.
• Kothari, C.R. (2004), Research Methodology, New Age
International Publications, New Delhi.
Methods of Research
• Research Process
• Formulating Research Problems
• Extensive Literature Survey
• Development of Hypothesis
• Preparing Research Design
• Determining Sample Design
• Data Collection
• Execution of Project
• Analysis of Data
• Hypothesis testing
• Generalization and Interpretations
• Preparation of Reports
Data: Sources and Methods
• The term data (singular: datum) refers to facts from which
other facts may be deduced.
• Bertrand Russell remarks: "The question of data has been
mistakenly, as I think, mixed up with the question of
certainty. The essential characteristic of a datum is that it is
not inferred."
Difference between Facts and Data
• A fact is a statement of actuality. In social studies it involves tangible things as
well as sentiments and feelings.
• A datum is a fact on which reasoning is based, and it thus serves as a
base for analysis and interpretation.
Sources of Data
[Diagram: Sources of Data]
• Primary: experiment, survey, observation
• Survey: complete enumeration, case study, sample survey
• Secondary
Methods of Collecting Data
• Observation Method
• Interview Method
• Questionnaire method
• Schedule
• Other Methods
1. Warranty cards
2. Distributor audits
3. Pantry audits
4. Consumer panels
5. Use of mechanical devices
6. Projective techniques
7. Depth interviews
8. Content analysis
Observation Methods
• Commonly used in the behavioural sciences
• Relies on the investigator's own direct observation, without asking any
questions of respondents
• It deals with current happenings, not with past behaviour
• It is independent of respondents' behaviour
• Limitations
• Expensive
• The information provided by this method is limited
• Unforeseen factors may also affect the observation
Experimental Methods
• It has been applied with a good deal of success in
certain cases to measure a group of factors
which operate as a social programme.
• Example: the impact of modern technology on the
behaviour of farmers (with and without
situations).
• Teaching on certain issues: with and without an exhibition;
with and without television.
• Experimental methodology
has not penetrated far into the social sciences.
Survey methods
• It is widely used technique.
• The economic survey was first introduced in the UK.
• Prof. G.F. Warren carried out a systematic survey study, which was
published in 1911.
• Campbell and Katona define a survey as follows: "Many research
problems require the systematic collection of data from
populations or samples of populations through the use of
personal interviews or other data gathering devices. These studies
are usually called surveys, especially when they are concerned
with large or widely dispersed groups of people. When they deal
with only a fraction of a total population, a fraction
representative of the total, they are called sample surveys".
Characteristics of Survey Methods
• It gets response directly from respondents
• It uses a representative sample of the population.
• It provides maximum information for a given
amount of effort, time and expenditure.
• It is conducted in natural environment
Types of Survey
• Complete enumeration: study of all individuals
in the universe
• Case studies: intensive investigation and
analysis of individuals or families
Characteristics of Case study
Definition: a case study is an intensive study of all details of the
domestic life of a few carefully chosen families. To work it
well requires a rare combination of judgment in selecting
cases and of insight and sympathy in interpreting them.
According to Palmer, a case study brings out:
• Characteristics which are common to every individual
• Variations of these common attributes which characterise
groups
• Other characteristics which belong uniquely to the
individual
Sample Survey
• It is the study of a sample of the whole population, which
provides information that can be generalised by
the use of adequate sampling criteria and with the aid of
statistical methods.
Types of Sample Surveys:
• Non-controlled: employed as an exploratory
technique.
• Controlled: standardisation of observational methods
• Formulate a hypothesis
• Prepare a questionnaire
• Select a sample to be studied
• Seek formal answers to the questions
Survey Procedures
Framing a questionnaire
• A questionnaire is a set of questions to be answered by the informant without the
personal aid of an investigator or enumerator.
Advantages of mailed questionnaires:
• Economical
• Convenient
• Standardised wording
Drawbacks:
• No assurance about the sample of information obtained
• Adequate replies may not be received.
Schedules:
A schedule is a data-recording device in which the interviewer fills
in the form.
Difference between Questionnaires and
Schedule
Sl. no.  Questionnaire                              Schedule
1        Filled in by the respondent                Filled in by the interviewer
2        More economical                            Involves travel expenses
3        Certain degree of secrecy                  No secrecy
4        Needed information may not be collected    Needed information can be collected
5        Not sure to be returned                    Sure to be returned
Collection of Data Through Questionnaire
• Main Aspects of Questionnaire
1. General form: structured or unstructured
2. Closed or open-ended
3. Measurement vs. categorical questions
Interview Methods
• Personal Interview: Face to Face contact
• Structured interview: predetermined questions and
standardized techniques of recording
• Unstructured interview: no system of predetermined
questions.
• Focused interview: based on the respondent's experience and its
effects
• Clinical interview: concerned with feelings or motivations, or with the course of
an individual's life experience
• Non-directive interview: no direction from the interviewer
Collection of Secondary Data
• Published data of various publications of central, state, and
local governments
• International bodies (UNO, UNDP, ILO, IMF) and other
national governments.
• Technical and trade journals
• Books, magazines and newspapers
• Reports and publications of various associations connected
with business, industry, banks and stock exchanges.
• Reports prepared by scholars, researchers, universities,
institutes.
• Public records and statistics, historical documents and other
sources of published information.
• Internet, E-journals, E-database
Characteristics of Secondary data
• Reliability of data: who collected the data, when, from what sources; was a proper
method applied; was there any bias on the part of the compiler; what level of
accuracy was achieved.
• Suitability of data: data suitable for one enquiry may not be suitable for another
enquiry.
• Adequacy of data: if the level of accuracy is inadequate, the
data should not be used by the researcher.
Selection of Appropriate Methods
Following factors must be kept in mind
• Nature, scope and objective of enquiry
• Availability of funds
• Time factors
• Precision required
Reference
• Kothari, C.R. (2004), Research Methodology,
New Age International Publications, New
Delhi.
• Bryman, Alan (2009), Social Research Methods,
OUP, New Delhi.
• Kumar, Ranjit (2014), Research Methodology:
A Step-by-Step Guide for Beginners, Sage
Publications, New Delhi.
Introduction to Stata
What Shall We Cover
• Introduction to Stata
• Data entry; file creation, saving and reopening
• Data Processing:
1. Data validation for both categorical and measurement data
2. Data manipulation
3. Data tabulation
4. Data interpretation
• Data Analysis
1. Descriptive statistics (three commands)
2. Modelling of time series (regression, panel regressions) and cross-section
analysis (dummy, logit and probit analysis)
• Use of large-scale data such as NSS, Census and NFHS
• Examination Pattern: MT-II, MT-III (Term Paper), End-Semester
Examination.
Introduction
• Stata is a multi-purpose statistical package that helps you
explore, summarise and analyse datasets.
• A dataset is a collection of several pieces of information
called variables (usually arranged by columns). A variable
can have one or several values (information for one or
several cases).
• It is a statistical package developed by Stata Corporation.
• Forms of Stata
• Intercooled Stata (IC)
• Small
• Extended (Special Edition)
• Types of Windows
• Command/Review window, Variables window, Output
window, Data editor/browser window, Do-file Editor
Comparison of Stata
Features           Stata                        SPSS                    SAS                 R                   Excel
Learning curve     Steep/Gradual                Gradual/Flat            Pretty steep        Pretty steep        Easy
User interface     Programming/Point and click  Mostly point and click  Programming         Programming         Programming and point
Data manipulation  Very strong                  Moderate                Very strong         Very strong         Moderate
Data analysis      Powerful                     Powerful                Powerful/Versatile  Powerful/Versatile  Moderate
Graphics           Very good                    Very good               Good                Excellent           Good
Costs              Affordable                   Expensive               Expensive           Free                Along with MS Office
Manuals of Stata
• Stata manuals (16 volumes)
• Getting Started with Stata: specific to the operating system
• Stata User's Guide: more general coverage of commands
• Stata Base Reference Manuals (four
volumes): details on commands and help files
• Stata Graphics Reference Manual (specialised
manual)
• Stata Programming Reference Manual
Reading Materials
• Hamilton (2004)
• Kohler and Kreuter (2004)
• Hills and De Stavola (2002)
• Sophia Rabe-Hesketh, Brian Everitt (2003), A Handbook of
Statistical Analyses Using Stata, Chapman and Hall/CRC
• Long and Freese (2003), Regression Models for Categorical Dependent
Variables Using Stata
• Cleves, Gould and Gutierrez (2004): An Introduction to Survival
Analysis Using Stata
• Hardin and Hilbe (2001), Generalized Linear Models and Extensions
• www.stata.com/bookstore/statabooks.html
• Through the Internet: FAQ
• Concept wise:
• Oscar Torres-Reyna, Data Consultant, [email protected]
How to write Syntax
• Type help language
• [by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [using filename] [,
options]
• [by varlist:] instructs Stata to repeat the command for each
combination of values of the variables in varlist
• command is the name of the command
• varlist is the list of variables
• =exp is an expression
• [if exp] restricts the command to the subset of observations that
satisfy the logical expression exp
• [in range] restricts the command to those observations whose indices lie
in a particular range
• [weight] allows weights to be associated with observations
• [using filename] specifies the file to be used
• [, options] is only needed if options are used
• Stata is case-sensitive: command names must be typed in lowercase
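The syntax elements above can be tried on the auto dataset that ships with Stata. This is only an illustrative sketch; the dataset and the new variable name price_per_mpg are not part of the course files.

```stata
sysuse auto, clear

* [by varlist:] command [varlist] [if exp]
bysort foreign: summarize price mpg if price > 5000

* command [varlist] [in range]
list make price in 1/5

* command [varlist] [weight] [, options]
summarize price [aweight=weight], detail

* command newvar = exp
generate price_per_mpg = price / mpg
```

bysort sorts the data by foreign before repeating the command, so no separate sort step is needed.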
Stata Commands
• Loading (or importing) data into main memory and saving: use,
infile, insheet, infix, save, outfile, outsheet
• Data manipulation: generate, egen, edit, sort, recode, xtile,
pctile
• Tabulation: tabulate, summarize, table, tabstat
• Combining two data files: append, merge, mmerge,
xmerge
• Reshaping data: reshape, compress, collapse, separate
• Controlling the working environment of Stata: log, cmdlog,
more, for, cd, dir, type, shell, mkdir, copy, erase, help, search,
view
• Auxiliary information: label, notes, rename
• Displaying the status of data: describe, inspect, cf, compare,
browse, list, count
1. Data Entry and Creation of Files
• Stata uses three types of files:
• Data file (.dta)
• Command file (.do)
• Output file (.log)
• How to create a data file
• Enter the data
• Rename the variables (rename)
• Label the variables (label variable; see the help menu)
• Define value labels (label define)
• Attach value labels to the variables (label values)
• Save the file
• Locate the file on the disk (D/E/F drives).
Command and Output File
• How to open a command/do file
• cmdlog using table1.do
• How to close a command file
• cmdlog close
• How to open an output file
• log using table1.log
• How to close an output file
• log close
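A short session that uses both kinds of log might look like the sketch below. The file names follow the slide; the auto dataset and the replace option are additions for illustration.

```stata
* Start both logs; replace overwrites files left over from an earlier session
cmdlog using table1.do, replace
log using table1.log, replace

* Example dataset shipped with Stata, used here in place of the course data
sysuse auto, clear
summarize price mpg

* Close both logs when the session is finished
log close
cmdlog close
```

The command log keeps only what was typed, while the output log keeps the commands together with their results.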
Knowing about the data File
• Use of basic commands
• describe, list, codebook, label list, di _N, browse for the file
• summarize for the measurement variables
• tab and histogram for categorical variables
• Use of other graphs such as bar, dot, histogram, pie and box
• Reopen the existing do-file
• cmdlog using table1.do, append
• Save the existing file
• Reopen the existing output file
• log using table1.log, append
• Open the existing data file
• use table1
• Shifting to another data file
• use table2, clear
2. Data Processing in Stata
• Data validation: summarize for measurement
variables, tab for categorical variables.
• Data manipulation: generate, merge, reshape,
egen, append, by, collapse, xtile, sort, recode,
pctile.
• Data tabulation: tab, table and tabstat
• Data interpretation: descriptive statistics vs. models.
2.1.Data Validation
1. Validation of Data: Check points, Data coding,
Convert open end to close end, Types of
variables
a. Categorical variable
b. Measurement Variable
2.2. Data Manipulation
1. Generation of New Variables
• Generating a variable
• gen pcl=land/fsize
• Grouping of Measurement Data
• Grouping with Cut Points
1. gen sizegr=recode(size, 5,7,12)
label define sizegr 5 "small" 7 "medium" 12 "large"
label values sizegr sizegr
2. Equal width
gen sizegrl = autocode(fsize, 3, 0,9)
3. Equal frequency
xtile sizegr = size, nq(4)
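The three grouping methods can be compared side by side on the auto dataset that ships with Stata. This is a sketch only; the cut points and the variable names mpg_cut, mpg_width and mpg_q are chosen for illustration.

```stata
sysuse auto, clear

* Cut points: recode() returns the smallest listed value that is >= mpg,
* and the last value (41) for anything larger than 30
gen mpg_cut = recode(mpg, 20, 30, 41)

* Equal width: 3 intervals of equal width between 10 and 45;
* autocode() returns the upper bound of the interval containing mpg
gen mpg_width = autocode(mpg, 3, 10, 45)

* Equal frequency: quartile groups coded 1 to 4
xtile mpg_q = mpg, nq(4)

tabulate mpg_cut
tabulate mpg_width
tabulate mpg_q
```

Cut points suit policy-defined classes, equal width suits evenly spread data, and equal frequency gives groups of roughly the same size.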
2.2. Data Manipulation
To understand more about the nature of households
• Create a new file as table2
• Create gender, edu, age, occupation
2. Use of merge command
• Master file vs. working file
• use table1
• sort hhno
• save, replace
• use table2, clear
• sort hhno
• merge hhno using table1, keep(caste religion fsize land)
• save, replace
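The merge command above uses the syntax from before Stata 11. In later versions the match type is stated explicitly and no prior sorting is needed. The sketch below uses two small made-up files instead of table1 and table2, and assumes that the household file has one row per hhno while the member file has several.

```stata
* Household-level file: one row per household (made-up values)
clear
input hhno caste fsize
1 1 4
2 2 3
end
tempfile households
save `households'

* Member-level file: several rows per household (made-up values)
clear
input hhno age
1 45
1 12
2 30
end

* m:1 = many member rows match one household row
merge m:1 hhno using `households', keepusing(caste fsize)

* _merge records where each observation came from
tabulate _merge
list, sepby(hhno)
```

Run the block as a do-file, because the temporary file name is held in a local macro.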
2.2. Data Manipulation
3. Use of collapse Command
• In table2, collapse fsize and income
• Save the result in a temporary file and take it to table1.
• Merge it into table1
• Then use collapse
• collapse (sum) income, by(hhno)
• collapse (count) fsize, by(hhno)
4. Use of egen Command
• Within table2, we can aggregate some of the variables without
collapsing the file by using the egen command
• egen newvar = function(var), by(groupvar)
• egen totalincome = sum(income), by(hhno)
• Then, use the table command to understand the relationship
between categorical and measurement variables.
• duplicates drop hhno, force
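The difference between egen and collapse is easiest to see on a few made-up rows. In this sketch the data are hypothetical, and total() is the documented name of the egen function written as sum() on the slide.

```stata
clear
input hhno income
1 100
1 250
2 300
3 50
3 75
end

* egen keeps every row and repeats the household total on each of them
bysort hhno: egen totalincome = total(income)
list, sepby(hhno)

* collapse replaces the data in memory with one row per household
collapse (sum) income, by(hhno)
list
```

Use egen when the member-level rows are still needed, and collapse when only household-level figures are wanted.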
2.2. Data Manipulation
5. Use of by Command
• sort education
by education: tab caste gender, row
• bysort occupation: tab caste religion (by needs sorted data; bysort sorts first)
6. Use of xtile Command
• xtile agegr = age, nq(5)
2.3. Data Tabulation
1. Frequency Distribution of a Categorical Variable by another
Categorical Variable
• tab caste religion
• tab caste religion, row col
• Test of Association
• tab caste religion, row col chi2
2. Descriptive Statistics of Measurement Variable(s) by a Categorical Variable
• table caste, c(mean land sd land min land max land) row f(%5.2f)
3. Descriptive Statistics among Measurement Variable(s)
• tabstat land age income, s(mean sd min max) f(%5.2f)
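The c() option of table shown above belongs to the table syntax used before Stata 17. A version-independent way to get the same kind of summary is tabstat with the by() option. The sketch uses the auto dataset that ships with Stata, not the course data.

```stata
sysuse auto, clear

* Mean, sd, min and max of a measurement variable for each category
tabstat price, by(foreign) statistics(mean sd min max) format(%9.2f)
```

With the course data the corresponding call would be tabstat land, by(caste) with the same statistics.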
2.3. Data Tabulation
4. tab and table
• Use of by
• bysort gender: summarize income
• by income: tab gender (works only if the data are already sorted by income)
• by income, sort: tab gender
• by gender income, sort: summarize age
• bysort caste: table edu gender, c(mean age) f(%5.2f)
Use of Large Scale data
• Introduction to NSS Data
• Introduction to NSS Data 69th Round
• How 69th Round Data is different from small data
• How to get a feel for the data (describe, list, codebook, label
list, di _N, browse for the file)
• Use of a few more commands and qualifiers (if, in, weight, recode,
xtile, regress, factor)
• Descriptive Statistics
• Inequality Analysis
• Regression Analysis: Multiple regression, Dummy
and Logit regression
• Factor Analysis or Indexing
Inequality Measurement on Consumption Expenditure
• xtile mpce_qt_r = mpce [w=weight] if sector==1, nq(5)
• xtile mpce_qt_u = mpce [w=weight] if sector==2, nq(5)
• gen mpce_qt2=mpce_qt_r if sector==1
• replace mpce_qt2=mpce_qt_u if sector==2
• drop mpce_qt_r mpce_qt_u
• tab mpce_qt2
Component of The Term Paper
• Based on the survey data, find out the
following:
• Descriptive statistics
• Inequality Measurement: Deciles, Quintiles
• Regression Analysis: Multiple Regression,
Dummy (Anova and Ancova)
• Logit Regression: Poverty line estimation and
analysis
• Multi Dimensional Poverty estimation
• Indexing: Factor Analysis
Regression Analysis
• Time Series Analysis
• Multiple Regression:
• regress pcexp expdur expnondur expservice
• Double-log model
• gen lpcexp=log(pcexp)
• (generate lexpdur, lexpnondur and lexpservice in the same way)
• regress lpcexp lexpdur lexpnondur lexpservice
• Log-lin model
• regress lpcexp year
• Use keep command
• Lin-log model
• regress pcexp lexpdur
• Structural change model
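For the log-lin model listed above, the coefficient on year is the instantaneous growth rate, and exp(b) - 1 gives the compound rate. The sketch below checks this on simulated data; the series is made up (built with roughly 5 percent growth per year) and only the variable names follow the slide.

```stata
clear
set obs 30
set seed 12345

* Simulated series growing at about 5 percent a year with small noise
gen year   = 1990 + _n
gen pcexp  = 1000 * exp(0.05 * _n + rnormal(0, 0.02))
gen lpcexp = log(pcexp)

* Log-lin model: the slope is the instantaneous growth rate
regress lpcexp year
display "compound annual growth rate (%) = " 100 * (exp(_b[year]) - 1)
```

In the double-log model the slopes are read as elasticities instead, because both sides are in logs.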


Regression Analysis
• Regression Analysis
• regress mpce hhnntotal [w=weight]
• regress mpce hhnntotal [w=weight] if state==9
• regress mpce hhnntotal [w=weight] if district==927
• regress mpce hhnntotal [w=weight] if district==927 & sector==2
• regress mpce hhnntotal new_edu_male [w=weight]
• regress mpce hhnntotal new_edu_male [w=weight] if state==9
• regress mpce hhnntotal new_edu_male [w=weight] if district==927
• regress mpce hhnntotal new_edu_male [w=weight] if district==927 & sector==2
• regress mpce hhnntotal new_edu_male new_edu_female [w=weight]
• regress mpce hhnntotal new_edu_male new_edu_female [w=weight] if state==9
• regress mpce hhnntotal new_edu_male new_edu_female [w=weight] if district==927
• regress mpce hhnntotal new_edu_male new_edu_female [w=weight] if district==927 & sector==2
Regression Analysis
• recode landpossessed 1=0.005 2=0.02 3=0.21 4=0.41 5=1.01 6=2.01 7=3.01 8=4.01 10=6.01 11=8.01 12= 10, gen(new_land)
• summarize new_land
• regress mpce hhnntotal new_edu_male new_edu_female new_land [w=weight]
• regress mpce hhnntotal new_edu_male new_edu_female new_land [w=weight] if state==9
• regress mpce hhnntotal new_edu_male new_edu_female new_land [w=weight] if district==927
• regress mpce hhnntotal new_edu_male new_edu_female new_land [w=weight] if district==927 & sector==2
Post-Mortem (Multicollinearity, Heteroscedasticity,
Autocorrelation)
• Heteroscedasticity
• hettest
• imtest
• hettest mpce (variable-wise)
• Multicollinearity
• corr mpce hhnntotal new_edu_male new_edu_female new_land [w=weight]
• vif (variance inflation factor)
• Autocorrelation
• estat dwatson (time-series data only, after tsset)
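In current Stata the post-estimation tests are called through estat. The sketch below runs them after an unweighted regression on the auto dataset that ships with Stata; the model itself is only an illustration.

```stata
sysuse auto, clear
regress price mpg weight length

* Heteroscedasticity: Breusch-Pagan / Cook-Weisberg test, then White's test
estat hettest
estat imtest, white

* Multicollinearity: variance inflation factors and pairwise correlations
estat vif
correlate mpg weight length
```

A small p-value in the heteroscedasticity tests rejects constant variance; a VIF well above 10 is a common rule-of-thumb warning for multicollinearity.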
Dummy Regression
• Anova Model:
• Create dummy variables by using tab with the gen() option
• Does gender play any discriminatory role?
• tab gender, gen(gender)
• table gender, c(mean mpce sd mpce min mpce max mpce)
• regress mpce gender1 [w=weight]
• regress mpce gender1 [w=weight] if state==9
• regress mpce gender1 [w=weight] if district==927
• regress mpce gender1 [w=weight] if district==927 & sector==2
• regress mpce gender1 [w=weight] if district==927 & sector==1
• Does caste play any discriminatory role?
• tab caste, gen(caste)
• table caste, c(mean mpce sd mpce min mpce max mpce)
Dummy Regression
• Ancova Model
• How do gender and family size impact the MPCE?
• regress mpce gender1 hhnototal [w=weight]
• regress mpce gender1 hhnototal [w=weight] if state==9
• regress mpce gender1 hhnototal [w=weight] if district==927
• regress mpce gender1 hhnototal [w=weight] if district==927 & sector==2
• regress mpce gender1 hhnototal [w=weight] if district==927 & sector==1
• How do caste and family size impact the MPCE?
• tab caste, gen(caste)
• regress mpce caste1 caste2 caste3 hhnototal [w=weight]
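When a categorical variable enters a regression, one category must be left out as the reference group. The sketch below shows this with hand-made dummies and with factor-variable notation (Stata 11 and later) on the auto dataset; rep78 stands in for caste and is not part of the course data.

```stata
sysuse auto, clear

* Manual dummies: rep78 has five categories, so tab creates rep1 to rep5
tabulate rep78, generate(rep)

* Leave one dummy out; rep1 becomes the reference group
regress price rep2 rep3 rep4 rep5 mpg

* Factor-variable notation builds the same dummies on the fly
regress price i.rep78 mpg
```

If all dummies of a variable are included together with the constant, Stata drops one of them because of perfect collinearity.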
How to estimate poverty line
• See Rangarajan Committee Report, p.4
• Rs.972 for rural areas and Rs.1407 for urban areas
• recode mpce (0/972=1) (972.01/174286=2) if sector==1 , gen (pov_r)
• recode pov_r (. = 0)
• recode mpce (0/1407=1) (1407.01/174286=2) if sector==2 , gen (pov_u)
• recode pov_u (. = 0)
• gen pov_i= pov_r + pov_u
• label var mpce "Monthly Per Capita Expenditure"
• label var pov_r "Poverty in Rural Sector"
• label var pov_u "Poverty in Urban Sector"
• label var pov_i "Poverty in Both Sector"
• label define pov_r 1 "Below Poverty Line" 2 "Above Poverty Line"
• label values pov_r pov_r
• label define pov_u 1 "Below Poverty Line" 2 "Above Poverty Line"
• label values pov_u pov_u
• label define pov_i 1 "Below Poverty Line" 2 "Above Poverty Line"
• label values pov_i pov_i
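The same classification can be written more compactly as a single 0/1 indicator, which is also the form a logit model expects as its dependent variable. The sketch below uses six made-up observations; the poverty lines are those quoted above, and the names pline and poor are hypothetical.

```stata
clear
input sector mpce
1 800
1 972
1 1500
2 1200
2 1407
2 3000
end

* Sector-specific poverty line: Rs. 972 rural (sector 1), Rs. 1407 urban (sector 2)
gen pline = cond(sector == 1, 972, 1407)

* 1 = at or below the poverty line, 0 = above; missing if inputs are missing
gen byte poor = (mpce <= pline) if !missing(mpce, sector)

label define poor 0 "Above Poverty Line" 1 "Below Poverty Line"
label values poor poor
tabulate sector poor
```

Households exactly on the line are counted as poor here, matching the 0/972 and 0/1407 ranges in the recode commands.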
How to Estimate Logit Model
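• First create the 0/1 dependent variable (pov_i1 = 1 for households below the poverty line)
• tab pov_i, gen(pov_i)
• Caste and Household Size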
• label list caste
• recode caste 1/3=1 9=2, gen(new_caste)
• tab new_caste
• tab new_caste, gen (new_caste)
• logit pov_i1 new_caste1
• logit pov_i1 new_caste1, or
• logit pov_i1 new_caste1 hhnototal
• logit pov_i1 new_caste1 hhnototal, or

• Male Education
• label list highestedumale
• recode highestedumale 1/6=1 7/10=2, gen(new_highestedumale)
• tab new_highestedumale
• tab new_highestedumale, gen (new_highestedumale)
• logit pov_i1 new_caste1 hhnototal new_highestedumale
• logit pov_i1 new_caste1 hhnototal new_highestedumale, or
How to Estimate Logit Model
• Female Education
• label list highestedufemale
• recode highestedufemale 1/6=1 7/10=2, gen(new_highestedufemale)
• tab new_highestedufemale
• tab new_highestedufemale, gen (new_highestedufemale)
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale, or

• Occupation
• recode occupation 1/2=1 3=9 4=2 5=3 6=4, gen(new_occupation)
• recode new_occupation 1/2=1 3/9=2, gen(new_occupation1)
• tab new_occupation1
• label define new_occupation1 1 "agriculture" 2 "non_agriculture"
• label value new_occupation1 new_occupation1
• tab new_occupation1
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1, or
How to Estimate Logit Model
• Religion
• tab religion
• label list religion
• recode religion 1=1 2/9=2, gen(new_religion)
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion, or

• Land Possession
• tab landpossessed
• recode landpossessed 1/5=1 6/12=2, gen(new_land)
• label define new_land 1 "marginal" 2 "others"
• label value new_land new_land
• tab new_land
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion new_land
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion new_land, or
How to Estimate Logit Model
• Status of Dwelling
• tab tstatusdwelling
• label list tstatusdwelling
• recode tstatusdwelling 1=1 2/9=2, gen(new_tenure)
• label define new_tenure 1 "owned" 2 "others"
• label value new_tenure new_tenure
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion new_land new_tenure
• logit pov_i1 new_caste1 hhnototal new_highestedumale new_highestedufemale new_occupation1 new_religion new_land new_tenure, or
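Each logit command above is run twice because the two outputs answer different questions: without options the coefficients are log-odds, and with the or option they are odds ratios. The sketch below shows both, plus average marginal effects, on the auto dataset that ships with Stata; the model is only an illustration.

```stata
sysuse auto, clear

* Coefficients reported as log-odds
logit foreign mpg weight

* Same model; the or option reports exp(b), the odds ratios
logit foreign mpg weight, or

* Average marginal effects on the predicted probability
margins, dydx(*)
```

An odds ratio above 1 means the variable raises the odds of the outcome, and one below 1 means it lowers them.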
How to estimate Multi Dimensional Poverty
Example of MPI calculation using Hypothetical data
Indicators | HH 1 | HH 2 | HH 3 | HH 4 | HH 5 | Weights
Household Size | 3 | 7 | 6 | 5 | 4 |
Health | | | | | |
V1: The household has any undernourished (BMI < 18.5) ever-married women (15 – 49 years) | 0 | 0 | 0 | 0 | 0 | (1/4)*(1/2) = 0.125
V2: The non-salaried household does not have any health insurance | 1 | 0 | 0 | 1 | 0 | (1/4)*(1/2) = 0.125
Education | | | | | |
V3: No one has completed five years of schooling (15 years and above) | 0 | 1 | 0 | 1 | 1 | (1/4)*(1/2) = 0.125
V4: At least one school-age child not enrolled in school (6 – 14 years of age) | 0 | 0 | 1 | 0 | 0 | (1/4)*(1/2) = 0.125
Economic Dimension | | | | | |
V5: The household falls below the consumption expenditure threshold limit | 0 | 1 | 0 | 0 | 0 | (1/4)*(1/2) = 0.125
V6: Any member of the household (15+) has not worked 183 days or more in the year preceding the survey | 1 | 1 | 0 | 1 | 0 | (1/4)*(1/2) = 0.125
Household Environmental Condition | | | | | |
V7: No access to clean drinking water | 0 | 0 | 1 | 1 | 0 | (1/4)*(1/2) = 0.125
V8: No access to improved sanitation | 1 | 1 | 0 | 0 | 0 | (1/4)*(1/2) = 0.125
Results | | | | | |
Weighted count of deprivation, c (sum of each deprivation multiplied by its weight) | 0.375 | 0.500 | 0.250 | 0.500 | 0.125 |
Is the household poor (c > 0.250)? | Yes | Yes | No | Yes | No |
Note: 1 indicates deprivation in the indicator; 0 indicates non-deprivation. Each of the four dimensions carries a weight of 1/4, split equally between its two indicators.
How to estimate Multi Dimensional Poverty
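• Weighted count of deprivation in household 1:
• c = (1*0.125) + (1*0.125) + (1*0.125) = 0.375
• Head Count Ratio (members of poor households 1, 2 and 4 over all members):
• H = (3 + 7 + 5) / (3 + 7 + 6 + 5 + 4) = 15/25 = 0.60
• (60 % of the population are multidimensionally
poor)
• Intensity of Poverty (size-weighted average of c over the poor households):
• A = (0.375*3 + 0.500*7 + 0.500*5) / (3 + 7 + 5) = 7.125/15 = 0.475
• (The average poor person is deprived in 47.5 % of
the weighted indicators)
• Multidimensional poverty index:
• MPI = H*A = 0.60*0.475 = 0.285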
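The hypothetical example can be reproduced in a few lines of Stata. The sketch below types in the five households from the table and weights each household by its size; the variable names hh, size, v1 to v8, ndep, c and poor are made up for the illustration.

```stata
clear
input hh size v1 v2 v3 v4 v5 v6 v7 v8
1 3 0 1 0 0 0 1 0 1
2 7 0 0 1 0 1 1 0 1
3 6 0 0 0 1 0 0 1 0
4 5 0 1 1 0 0 1 1 0
5 4 0 0 1 0 0 0 0 0
end

* Weighted deprivation count: every indicator carries the weight 0.125
egen ndep = rowtotal(v1-v8)
gen c = 0.125 * ndep

* A household is poor if c exceeds the cut-off 0.25
gen byte poor = c > 0.25

* H: share of persons living in poor households
summarize poor [fweight=size]
scalar H = r(mean)

* A: average deprivation score among poor persons
summarize c [fweight=size] if poor
scalar A = r(mean)

display "H = " H "   A = " A "   MPI = " H*A
```

The displayed values should match the slide: H = 0.60, A = 0.475 and MPI = 0.285. Household 3 has c = 0.25 exactly and is therefore not counted as poor.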
NSSO: An Overview
1st to 66th Rounds
Dr. Sanatan Nayak
Professor,
Deptt. of Economics
BBAU, Lucknow-25
Background
The National Sample Survey (NSS), which came into existence
in the year 1950, is a multi-subject integrated continuing
sample survey programme launched for the collection of data on
the various aspects of the national economy required by
different agencies of the Government, both Central and State.
It functions under the Ministry of Statistics & Programme Implementation.
These surveys are conducted in the form of rounds extending
normally over a period of one year, though in certain cases the
survey period was six months.
The organization has already completed 66 such rounds and
the 67th round survey is in progress.
The “Glossary of Technical Terms used in National Sample
Surveys” was first brought out in 1981.
It was found to be of immense use in promoting
standardization of the terms used up to the 35th round survey.
Subjects brought under the coverage
• (1) Household surveys on socio-economic
subjects
• (2) Surveys on land holding, livestock and
agriculture
• (3) Establishment surveys, and enterprise
surveys
• (4) Village surveys
Household surveys on socio-economic
subjects

• Population, birth, death, migration, fertility,
family planning, morbidity, disability,
• employment & unemployment,
• agriculture and rural labour, household
consumer expenditure, debt and
investment, savings, construction,
• capital formation, housing condition and
utilization of public services in health,
education and other sectors, etc.
Surveys on land holding, livestock and
agriculture
• land holding,
• land utilisation,
• livestock number,
• product and livestock enterprises
Establishment surveys, and enterprise
surveys
• Medium and small industrial establishments and
own-account enterprises not covered by the Annual
Survey of Industries (ASI),
• Surveys on other non-agricultural enterprises in the
unorganized sector and
• Collection of rural retail prices from markets and
shops in rural areas belong to the third category.
Village surveys
• on the availability of infrastructure facility in Indian
villages
Ad-hoc surveys and pilot enquiries for
methodological studies
• Surveys on small and medium irrigation projects
• Rural electrification,
• Railway travel,
• Pilot enquiries on employment-unemployment,
• Construction activities,
• Living condition of tribals,
• Estimation of catch of fish from inland water, etc
Decadal Programme of NSSO
NSS has now drawn up a ten-year programme for the
conduct of socio-economic surveys in 2000-2001.
(i) employment-unemployment, and consumer
expenditure
(ii) unorganised enterprises in non-agricultural sectors
(iii) population, births, deaths, disability, morbidity,
fertility, maternity & child care, and family planning
(iv) land holdings and livestock enterprises
(v) debt, investment and capital formation
Survey Nature
• (i) and (ii) are to be taken up quinquennially (every five years).
• The remaining three groups of subjects, i.e., (iii), (iv)
and (v), are to be taken up decennially (every ten years).
• Each survey extends over a period of a few months
or a year which is termed a round.
• Till the thirteenth round (1957-58), the period of a
round varied from three to nine months.
• Since the fourteenth round (1958-59), each round
has generally been of one year's duration spread
over the agricultural year July to June.
Seasonality and Glossary
• Seasonality is a factor to be reckoned with in data collection.
• The survey period of one year is divided into four or six equal
sub-periods called sub-rounds.
• Normally an equal number of representative sample villages
and urban blocks are allotted to each sub-round in such a
manner as to obtain valid estimates for each sub-round.
• The large number of technical terms and concepts used by the NSSO
were documented and published in the January 1980 issue of
Sarvekshana for the first time and later released as a
“Glossary” in 1981.
• This document is confined to socio-economic topics and
excludes terms used in the Annual Survey of Industries,
price-collection work and crop surveys.
General Description of NSSO
• SAMPLING DESIGN
• SAMPLING UNIT
• Villages and urban blocks are First Stage Sampling units
(FSU) in rural and urban areas respectively.
• The second or ultimate stage sampling units (SSU or USU)
are households for household surveys.
• DOMAIN OF STUDY
• In the NSS, the domains of study are usually rural and urban
areas within a zone, state, region or district. For example, for
rural labour enquiry in the 29th round only the rural labour
population within each region was the domain of study.
DOMAIN OF STUDY
Country/State     | Code | State                     | Code
All-India         | 01   | Nagaland                  | 18
Andhra Pradesh    | 02   | Orissa                    | 19
Arunachal Pradesh | 03   | Punjab                    | 20
Assam             | 04   | Rajasthan                 | 21
Bihar             | 05   | Sikkim                    | 22
Goa               | 06   | Tamil Nadu                | 23
Gujarat           | 07   | Tripura                   | 24
Haryana           | 08   | Uttar Pradesh             | 25
Himachal Pradesh  | 09   | West Bengal               | 26
Jammu & Kashmir   | 10   | Andaman & Nicobar Islands | 27
Karnataka         | 11   | Chandigarh                | 28
Kerala            | 12   | Dadra & Nagar Haveli      | 29
Madhya Pradesh    | 13   | Daman & Diu               | 30
Maharashtra       | 14   | Delhi                     | 31
Manipur           | 15   | Lakshadweep               | 32
Meghalaya         | 16   | Pondicherry               | 33
Mizoram           | 17   |                           |
* as per 1991 census
Region of the Country
• Regions are hierarchical domains of study below the level of
State/ Union Territory in the NSS.
• No region was formed during the first three rounds.
• From the 4th to 10th and 13th to 15th rounds of NSS, the 52 natural
divisions of the 1951 population census were used as regions.
• During the 16th and 17th rounds, 48 regions were formed for the
survey on land holdings in consultation with the Central
Ministry of Food & Agriculture and the State Statistical
Bureaus.
• In 1965, 64 regions were formed in consultation with
different Central Ministries, the Planning Commission, the Registrar
General and State Statistical Bureaus.
• These regions were in use up to the 31st round. This set of
regions was revised during 1977.
Region of the Country
• The total number of regions was increased to 73 in
consideration of the changed conditions.
• This revised set of regions was in use from the 32nd
to the 35th round.
• The total number of regions went up to 77 during the
36th to 43rd rounds after the State/ Union Territories
of Sikkim, Andaman & Nicobar Islands, Dadra and
Nagar Haveli and Lakshadweep were covered in
NSS from its 36th round.
• From NSS 44th round, total number of regions
became 78 after Goa was declared a separate
state.
REGION CODE

• Regions are assigned 3-digit codes termed
SR (State Region) codes, where the first two digits
indicate the State/ Union Territory and the third
indicates the region number within a State/ Union
Territory.
• The composition of regions (used for selection of
samples in NSS 49th round) and their SR codes are
shown in the Annexure 2.
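Given the structure of the SR code, the state and region parts can be separated with integer arithmetic. The sketch below uses three made-up codes; the variable names sr, state and region are hypothetical and the codes are not taken from the Annexure.

```stata
clear
input int sr
251
253
142
end

* First two digits: State/UT code; third digit: region number within the State/UT
gen byte state  = floor(sr / 10)
gen byte region = mod(sr, 10)
list
```

A code stored as a number loses its leading zero (021 becomes 21), but the same arithmetic still returns state 2 and region 1.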
RURAL AND URBAN AREAS
• The required information is available with
the Survey Design and Research Division of
the NSSO.
• The lists of census villages as published in
the Primary Census Abstracts (PCA)
constitute the rural areas,
• The lists of cities, towns, cantonments, non-
municipal urban areas and notified areas
constitute the urban areas.
URBAN AREA
• The urban area of the country was defined in the 1971 Census as
follows:
• all places with a Municipality, Corporation or Cantonment and
places notified as town areas
• all other places which satisfied the following criteria:
• a minimum population of 5000,
• at least 75 percent of the male working population being
non-agriculturists, and
• a density of population of at least 390 per sq. km.
• The definitions of urban area adopted for the 1981 and 1991
Censuses were the same as those for the 1971 Census,
• except that the 1991 Census used a density of at least 400 persons per sq. km.
RURAL AREA

• The rural sector covers
• whole villages as well as part villages.
• A village includes all its hamlets.
• When part of a revenue hamlet is treated as
urban area,
• the rural part of the revenue hamlet is
termed a part village.
FORMATION OF STRATA
• The objective of stratification in NSS is to
• increase efficiency of the survey design
• ensure administrative and operational convenience.
• Village strata:
• The strata relating to the first stage units (villages and urban
blocks) are geographical areas.
• Up to the 27th round, the number of strata formed within a
State or U.T. was usually half the number of investigators in
the respective State or U.T.
• Due to the increasing demand for district-wise estimates,
the districts are treated as the ultimate strata since the
28th round of the survey.
Urban Strata
• It is a district or group of districts within the same region.
• The above procedure of stratification continued up to the 37th
round of NSS.
• The same procedure has been continued since the 38th round for
the rural areas, with the change that the cut-off point of 1.5
million rural population
• was raised to 1.8 million rural population according to the
1981 Census, and
• again increased to 2.0 million according to the 1991 Census, for
the purpose of deciding whether the district will be divided
into more than one stratum or not.
Strata Conti…….
• In the 54th round, at first the following three
special strata (namely, strata types 1, 2 and 3)
were formed at the level of each State / UT:
• Stratum 1: uninhabited villages (as per 1991
Census)
• Stratum 2: villages with population 1 to 50
(including both the boundaries)
• Stratum 3: villages with population more than
15,000.
Strata Cont….
• 44th Round of NSSO

Stratum no. | Population size class of towns* | ST population group** | Construction level type***
1           | P < 0.5                         | A                     | -
2           | P < 0.5                         | B                     | -
3           | 0.5 ≤ P < 2                     | A                     | -
4           | 0.5 ≤ P < 2                     | B                     | -
5           | 2 ≤ P < 10                      | -                     | (i)
6           | 2 ≤ P < 10                      | -                     | (ii)
7           | P ≥ 10                          | -                     | (i)
8           | P ≥ 10                          | -                     | (ii)
• * P stands for population of the town in lakhs,
• ** A: towns with significant ST population, and
• B: other towns
• *** (i): UFS blocks falling in areas with high level of building
construction activity, and
• (ii): other UFS blocks.
Strata Cont…..
• During 40th to 49th rounds of NSS excepting 42nd, 47th and 48th
rounds, rural / urban strata so formed were further divided
into a number of ‘sub-strata’ or ‘ultimate strata’ taking
different types of auxiliary information for each village / block
into consideration.
• For example, sub-strata were formed in 40th, 41st, 45th and 46th
rounds (surveys on manufacturing and trade) by grouping
villages/ blocks into a few categories by looking at whether
they have different types of manufacturing / trading
enterprises or not.
• In the 51st round, if any district had a small number of
manufacturing enterprises, it was clubbed with the
neighbouring districts, within the same NSS region to form a
rural stratum to ensure minimum allocation of 8 villages at
the stratum level as far as possible.
Strata Cont …..
• In the 51st round
• (a) sub-stratum 1 consisting of villages having at least one DME
(Directory Manufacturing Establishment);
• (b) sub-stratum 2 consisting of remaining villages in the
stratum which had at least one NDME; and
• (c) sub-stratum 3 consisting of all the residual villages.
• In the 53rd round,
• each district was divided into two area types,
• (i) area type 1 consisting of the villages having at least one
NDTE (Non-Directory Trading Establishment)
• (ii) area type 2 consisting of the remaining villages of the
district.
Strata Cont……
• In the 53rd Round,
• In the urban areas, each town class within a district was
divided into two area types, namely, (i) area type 1 consisting
of the UFS blocks classified as ‘bazaar area’ and (ii) area type 2
consisting of the remaining, UFS blocks of the town class.
• In the 54th round,
• In the urban areas, each stratum was divided into 2 sub-strata
as follows:
• Sub-stratum 1: UFS blocks identified as ‘slum area’, and
• Sub-stratum 2: remaining UFS blocks of the stratum.
SAMPLING FRAME FOR THE RURAL FIRST STAGE
UNITS (FSU)
• The decennial Population Census provides a complete list of
villages grouped by tehsils and districts.
• This list is being used as sampling frame for the selection of
villages (rural fsu's).
• The 1941 Census frame was used during the first three rounds,
• The 1951 Census frame from the 4th to the 17th round,
• The 1961 Census frame from the 18th to the 26th round,
• The 1971 Census frame from the 27th to the 37th round,
• The 1981 Census frame from the 38th to the 49th round,
• The 1991 Census frame from the 50th to the 55th round,
• The 2001 Census frame from the 56th to the 66th round.
Thank You
