Week #2

This document provides an overview of several commonly used statistical software programs:
- SPSS is widely used for complex statistical analysis and easily produces descriptive statistics and graphs. It allows for more accurate data analysis and reporting.
- SAS is used widely in industry and social sciences for tasks like data manipulation, storage, and reporting. Coding can be difficult for new users.
- R is free and open-source software used for statistical analysis and graphics. It allows for large data analysis and leverages toolboxes for specific applications.
- Stata enables data analysis, management, and visualization without coding knowledge. It has both command line and graphical interfaces.
- MATLAB provides an analytical platform and programming language for matrix/array

Uploaded by

Muntaha Fawad
BS-VII (A&B): Fall Semester 2023 Data Analysis through Software’s(PY:405)

Statistical Software, Introduction to SPSS & Data Entry into SPSS

What is Statistical Software?

• Statistical software packages are specialized programs designed to perform complex statistical analysis.
• They are tools that assist in the organization, interpretation, and presentation of selected data sets to provide science-based insights into patterns and trends.
• They apply statistical theorems and methods such as correlation, mean differences, regression analysis, and time series analysis to carry out data analysis.

What Types of Statistical Analysis Software Exist?

• Statistical analysis comes in many different flavors. By its ultimate goal, an analysis falls into one of three types: descriptive, prescriptive, or predictive.
• Descriptive analysis simply looks at the data as it is and does not predict or recommend any future outcome. "What happened?"
• Prescriptive analysis uses historical data to recommend a particular outcome. "What should we do?"
• Predictive analysis uses historical data to predict future data or outcomes. "What will happen?"

Benefits of Statistical Software

• Increases efficiency of the work
• More accuracy in data analysis and management
• Less time consuming
• Easy customization
• Grants access to large databases
• Reduces sampling error
• Empowers users to make data-driven decisions
• https://guides.nyu.edu/quant/statsoft
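The difference between descriptive and predictive analysis can be sketched in a few lines of Python. This is only an illustration: the monthly sales figures are invented, and the straight-line forecast is one simple predictive technique chosen for the example, not something the slides prescribe.

```python
from statistics import mean, stdev

# Hypothetical monthly sales figures, invented purely for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

# Descriptive analysis: summarize the data as it is ("What happened?").
sales_mean = mean(sales)
sales_sd = stdev(sales)

# Predictive analysis: fit a least-squares line and extrapolate ("What will happen?").
x_bar, y_bar = mean(months), mean(sales)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, sales)) / sum(
    (x - x_bar) ** 2 for x in months
)
intercept = y_bar - slope * x_bar
forecast_month_7 = intercept + slope * 7

print(f"mean = {sales_mean:.2f}, sd = {sales_sd:.2f}")
print(f"forecast for month 7 = {forecast_month_7:.2f}")
```

A prescriptive step would go further and turn the forecast into a recommendation, for example by comparing it against a stocking threshold.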

Muhammad Usman, NIP, Quaid-i-Azam University, Islamabad 12 October 2023 - Page# 1


Statistical Package for Social Sciences (SPSS)

• SPSS is a widely used, powerful software package for complex statistical data analysis.
• It easily compiles descriptive statistics and parametric and non-parametric analyses, and delivers graphs and presentation-ready reports to communicate the results.
• More accurate reports are achieved through estimation and uncovering of missing values in the data sets.
• Widely used in the social sciences.

Screenshot of SPSS

Statistical Analysis System, then Strategic Analysis System (SAS)

• It is a cloud-based platform that provides ready-to-use programs for data manipulation, information storage, and retrieval.
• Used by administrators for paychecks, reports, etc. Widely used in industry, banks, the Census Bureau, and across a great variety of disciplines from Anthropology to Systems Engineering.
• Coding can be difficult for those new to this approach.
• Graphing capabilities are great but harder to use.
• Requires learning syntax and programming rules.

Screenshot of SAS for Windows



R

The name R was chosen to reflect the creators' first names: Ross Ihaka and Robert Gentleman.

• R is widely used free statistical software that provides statistical and graphical techniques, including linear and non-linear modeling.
• Toolboxes (essentially plugins) are available for a great range of applications. Knowledge of coding is required.
• It provides interactive reports and applications, leverages large amounts of data, and is compliant with security practices and standards.
• Used in data science, finance and economics, bioinformatics, and marketing.

Screenshot of R Software

Stata

The name Stata is a syllabic abbreviation of the words statistics and data.

• Stata is also widely used software that lets users analyze, manage, store, and produce graphical visualizations of data.
• Coding knowledge is not necessary to use it.
• The presence of both a command line and a graphical user interface makes its use more intuitive.
• It is generally used by researchers in economics, the social sciences, and biomedicine to examine data patterns.

Screenshot of Stata for Windows



MATLAB

MATLAB stands for MATrix LABoratory.

• MATLAB is software that provides an analytical platform and programming language.
• It supports matrix and array mathematics, plotting of functions and data, implementation of algorithms, and creation of user interfaces.
• A Live Editor is also included, which creates a script that combines code, output, and formatted text in an executable notebook.
• It is widely used by engineers, scientists, and mathematicians.
• Specially focused on quality improvement analytic techniques.

Screenshot of Matlab

Minitab

• Minitab offers both basic and fairly advanced statistical analysis tools.
• It can analyze all kinds of data sets, automates statistical calculations, and creates attractive visualizations.
• Minitab helps users look at current and past data to discover trends and patterns and uncover hidden relationships between variables, allowing users to focus more on data analysis.
• It makes it simpler to get insights from the data.

Screenshot of Minitab for Windows



Epidemiological Information (Epi Info)

• It is a public-domain software suite designed for researchers and public health practitioners around the globe, developed by the Centers for Disease Control and Prevention (CDC).
• It provides easy data entry form and database construction, and data analyses with epidemiologic statistics, maps, and graphs for those who may lack an information technology background.
• It is used for outbreak investigations; for developing small to mid-sized disease surveillance systems; and as the analysis, visualization, and reporting (AVR) components of larger systems.

Screenshot of Epi-info

NVivo

• It allows users to organize and store qualitative data for analysis.
• NVivo is used for the analysis of unstructured text, audio, video, and image data, including interviews, focus group discussions (FGD), surveys, social media, and journal articles.
• Word documents, PDFs, video, images, and audio can also be imported.
• It helps users organize, analyze, and find insights in unstructured or qualitative data more efficiently.
• The interface is easy to use and feels instantly familiar and intuitive.
• NVivo is used for qualitative data analysis and mixed-methods research.

Screenshot of NVIVO



LISREL

LISREL stands for LInear Structural RELations.

• Program for confirmatory factor analysis and structural equation modeling (SEM).
• Advanced statistical procedure requiring advanced statistical background and training.
• LISREL was first developed by Karl Jöreskog in 1970.
• Used for structural regression modeling. Structural equation models are systems of linear equations.
• LISREL makes it possible to combine structural equation modeling and factor analysis, and it can also generate path diagrams for structural equations.

Screenshot of Lisrel

AMOS

AMOS stands for Analysis of MOment Structures.

• Easy-to-use structural equation modeling program.
• AMOS is the product of Smallwaters, Inc.
• Distributed by SPSS and can read SPSS system files.
• Has a graphical user interface as well as command syntax. Some options are available only through commands.
• Windows-only version. No Unix or Macintosh versions.

AMOS Screenshot



Which one should I use?

• Complexity of your data
• Statistical techniques you plan to use
• Your programming skills
• Compatibility with other software or databases
• User interface preferences
• The cost of the software
• What your colleagues and others in your field are using
• Ease of learning and use
• Innovativeness
• Graphical capabilities

INTRODUCTION TO SPSS

What is SPSS?

• SPSS is a Windows based program that can be used to perform data entry and analysis and to create tables and graphs.
• SPSS is capable of handling large amounts of data and can perform all of the analyses covered in the text and much more.
• SPSS is commonly used in the social sciences and in the business world, so familiarity with this program should serve you well in the future.
• It is one of the most popular statistical packages and can perform highly complex data manipulation and analysis with simple instructions.

SPSS History

• First released in 1968; developed by Norman H. Nie, Dale H. Bent, and C. Hadlai Hull.
• Since its release, SPSS was under SPSS Inc.
• However, on July 28, 2009, SPSS was acquired by IBM for US$1.2 billion.
• Versions 17 and 18 were known as PASW (Predictive Analytics SoftWare).
• Version 19 was renamed SPSS Statistics. It is now officially named "IBM SPSS Statistics".



SPSS Key Features

• Built on GUI technology, very user friendly
• You can start with point and click with the mouse and easily move to building command strings that can be saved to files for repeated use later.
• Spreadsheet view of data
• Powerful software. Can run both menu-driven and command programs
• Easy and good cross-platform and cross-application file sharing.
• The SPSS mainframe version is the largest and most powerful version. It can analyze millions of records and thousands of variables
• Easy to use and easy to misuse

SPSS is designed to cater for your every statistical need. It carries out analyses that would take a couple of days by hand in a matter of seconds. Here's a list of what SPSS can and cannot do:

SPSS will:
• Do complex statistical analyses in a few seconds
• Produce graphs
• Save your time
• Do anything statistical that you ask it to do.

SPSS won't:
• Decide what test needs to be done in a given circumstance.
• Decide what graph is appropriate for your data

Although SPSS is a very powerful tool, it works only within the limits of your own statistical knowledge. In many respects it is rather stupid, because if you provide it with data and ask it to run a completely meaningless analysis, it will happily oblige. Therefore, you still need to use your brain to initiate the correct analysis.

SPSS ENVIRONMENT

SPSS provides a broad range of:
• Data Manipulation
• Data Transformation
• Graphical Representation of Data
• Advanced Statistical Capabilities
• Interactive Help System
• Handling of massive data sets
• Operating System Compatibility

Graphical Components

• Windows
    • Data Editor Window
        • Variable View
        • Data View
    • Viewer/Output Window
    • Syntax Window
    • Chart Editor Window
• Menus
• Toolbars
• Dialogue Boxes



Screenshots of SPSS Window

Spreadsheet-like system for defining, entering, editing, and displaying variables and data. The saved file has the extension ".sav".

Screenshots: Variable View, Data View

VARIABLE VIEW

Variable View contains descriptions of the attributes of each variable in the data file.
In Variable View:
• Rows are variables.
• Columns are variable attributes.
You can add or delete variables and modify attributes of variables, including the following attributes:
• Variable name
• Variable type
• Number of digits or characters
• Number of decimal places
• Descriptive variable and value labels
• User-defined missing values
• Column width
• Measurement level
• Role
All of these attributes are saved when you save the data file.

DATA VIEW

Many of the features of Data View are similar to the features that are found in spreadsheet applications. There are, however, several important distinctions:
1. Rows are cases. Each row represents a case or an observation.
2. Columns are variables. Each column represents a variable or characteristic that is being measured.
3. Cells contain values. Unlike spreadsheet programs, cells in the Data Editor cannot contain formulas.
4. The data file is rectangular. The dimensions of the data file are determined by the number of cases and variables. You can enter data in any cell. If you enter data in a cell outside the boundaries of the defined data file, the data rectangle is extended to include any rows and/or columns between that cell and the file boundaries.

Data View window

• The Data View window
    • This sheet is visible when you first open the Data Editor, and it contains the data
    • Click on the tab labeled "Variable View"
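The "rectangular data file" behavior described for Data View can be mimicked with a small Python sketch. This is not how SPSS is implemented; the function name `set_cell`, the sample variable names, and the use of `None` to stand in for a missing value are all assumptions made for illustration.

```python
# A minimal model of Data View: a list of rows (cases), one value per variable.
variables = ["sno", "gender", "age"]   # hypothetical variable names
cases = [
    [1, 2, 25],
    [2, 1, 30],
]

def set_cell(cases, n_vars, row, col, value):
    """Set a cell, extending the rectangle with missing values (None) if needed."""
    # Add new columns to every existing row if col is beyond the current width.
    if col >= n_vars:
        for r in cases:
            r.extend([None] * (col + 1 - n_vars))
        n_vars = col + 1
    # Add new rows if row is beyond the current height.
    while len(cases) <= row:
        cases.append([None] * n_vars)
    cases[row][col] = value
    return n_vars

# Typing a value well outside the 2 x 3 rectangle grows it to 4 x 5.
n_vars = set_cell(cases, len(variables), 3, 4, 99)
for r in cases:
    print(r)
```

Every row keeps the same length after the call, which is the sense in which the file stays rectangular.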



Variable View window

• This sheet contains information about the data set that is stored with the dataset
• Name
    • The first character of the variable name must be alphabetic
    • Variable names must be unique and can be at most 64 characters long.
    • Spaces are NOT allowed.

Variable View window: Type

• Type
    • Click on the 'type' box. The two basic types of variables that you will use are numeric and string. This column enables you to specify the type of variable.

Variable View window: Width

• Width
    • Width allows you to determine the number of characters SPSS will allow to be entered for the variable
    • Max. display width of variable may be 41

Variable View window: Decimals

• Decimals
    • Number of decimals
    • It has to be less than or equal to 16
    • Example: 3.14159265



Variable View window: Label

• Label
    • You can specify the details of the variable
    • You can write up to 256 characters, including spaces

Variable View window: Values

• Values
    • This is used to indicate which numbers represent which categories when the variable represents a category

Defining the value labels

• Click the cell in the Values column as shown in the screenshot
• For the value and the label, you can enter up to 120 characters.
• After defining the values, click Add and then click OK.

VIEWER/OUTPUT WINDOW

• A Viewer window opens automatically the first time you run a procedure that generates some output.
• All statistical results are displayed in this window.
• You can edit the output
• You can save the output
• SPSS automatically assigns the ".spv" extension to the file when you save it (versions before 16 used ".spo").



SYNTAX EDITOR WINDOW

• Command language for carrying out statistical and graphical analysis.
• Useful for repeating very similar analyses on the data set.
• Opens automatically when the Paste push button is pressed in any statistical procedure.
• You can run advanced statistical procedures that are not available through dialogue boxes.

CHART EDITOR WINDOW

• Provides a powerful, easy-to-use environment where you can customize your charts and explore your data. The Chart Editor features:
• Direct editing. You can directly select and edit parts of the chart using menus, context menus, and toolbars. You can enter text right on a chart.
• Wide range of formatting and statistical options. You can choose from a full range of styles and statistical options.
• Powerful exploratory tools. You can explore your data in various ways, such as by labeling, reordering, and rotating it.

Data File … SAV
Output File … SPV
Syntax File … SPS

Graphical Components of SPSS

• Menus
• Toolbars
• Status Bar
• Dialogue Box
• Sub-Dialogue Box
• Command Pushbuttons



MENUS

• Many of the tasks that you want to perform with SPSS are available through menu selections.
• Each window in SPSS has its own menu bar with menu selections that are appropriate for that window type.
• The "Analyze" and "Graphs" menus are available in all windows, making it easy to generate new output without having to switch windows.

DIALOG BOXES

Dialog boxes for statistical procedures and charts typically have two basic components:
• Source variable list: A list of variables in the active dataset. Only variable types that are allowed by the selected procedure are displayed in the source list.
• Target variable list: One or more lists indicating the variables that you have chosen for the analysis, such as dependent and independent variable lists.

Dialogue Box, Sub-Dialogue Box and Command Pushbuttons

Command Push Button

• OK
• Paste
• Reset
• Cancel
• Help



DATA ENTRY INTO SPSS

Key Points in Data Entry

• Each row represents data from one individual, while each column represents that individual's response on one variable.
• Any variable that measures different groups of people (between-group) should be represented by a single column.
• Any variable measured on the same participants (repeated measures) should be represented by several columns.
• Levels of a between-group variable go down the data editor, whereas levels of a within-subject (repeated-measures) variable go across the data editor.
• Before entering any data, create each variable by setting its characteristics.
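The layout rule for between-group and repeated-measures variables is easier to see with a tiny data set. The following Python sketch uses an invented study (two groups, two time points); the column names and scores are hypothetical.

```python
# Hypothetical study: two groups (between-group factor) measured at two time
# points (repeated-measures factor). All names and numbers are invented.
columns = ["id", "group", "score_time1", "score_time2"]
rows = [
    (1, 1, 12, 15),   # participant 1, group 1
    (2, 1, 10, 14),   # participant 2, group 1
    (3, 2, 11, 11),   # participant 3, group 2
    (4, 2, 13, 12),   # participant 4, group 2
]

# Between-group levels run DOWN the sheet: select rows by the value in one column.
group_col = columns.index("group")
group1 = [r for r in rows if r[group_col] == 1]

# Repeated-measures levels run ACROSS the sheet: compare two columns within a row.
t1, t2 = columns.index("score_time1"), columns.index("score_time2")
change = [r[t2] - r[t1] for r in rows]

print(len(group1), change)
```

Group membership takes a single coding column, while each repeated measurement gets its own column, so every participant still occupies exactly one row.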

Rules to Assign Variable Name

• Unique
• Short
• Begin with a letter
• No blank space
• No reserved words (ALL, TO, WITH, BY, AND, OR, NOT, EQ, NE, LE, LT, GE, GT)
• Not case sensitive

Basic Steps in Creating a Data File

Carry out the following steps in the Variable View of the Data Editor window:
• Create the variable by assigning an appropriate name according to the rules for variable names (Compulsory)
• Define the type of the variable (Compulsory)
• Assign the width and decimal places (Compulsory)
• Assign a comprehensive label to the variable name that will appear in your output results (Optional, but very useful for understanding the output)
• If it is a coding variable, assign a label to each value used in the variable (Compulsory)
• Assign an appropriate code to signify missing information in the variable. You can control missing values in your statistical analysis (Optional, but very useful for understanding the output)
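The naming rules above can be checked mechanically. The sketch below is a hypothetical helper (the name `check_name` is not part of SPSS) that applies the rules as listed on the slides; the 64-character limit follows the earlier Variable View slide, and treating exactly 64 characters as acceptable is an assumption. "Short" is a style recommendation and is not tested.

```python
import re

# Reserved words exactly as listed on the slide.
RESERVED = {"ALL", "TO", "WITH", "BY", "AND", "OR", "NOT",
            "EQ", "NE", "LE", "LT", "GE", "GT"}

def check_name(name, existing=()):
    """Return a list of rule violations for a proposed variable name (empty = OK)."""
    problems = []
    if not name or not name[0].isalpha():
        problems.append("must begin with a letter")
    if re.search(r"\s", name):
        problems.append("no blank spaces allowed")
    if len(name) > 64:          # assumed boundary, see lead-in
        problems.append("longer than 64 characters")
    if name.upper() in RESERVED:
        problems.append("reserved word")
    # Names are not case sensitive, so compare everything in upper case.
    if name.upper() in {e.upper() for e in existing}:
        problems.append("not unique")
    return problems

print(check_name("age"))                      # no violations
print(check_name("my var"))                   # contains a blank
print(check_name("AGE", existing=["age"]))    # clashes with an existing name
```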



Variable Labels and Value Labels

• Variable Labels
    • You can assign descriptive variable labels up to 256 characters long. Variable labels can contain spaces and reserved words, which are not allowed in variable names.
• Value Labels
    • You can assign descriptive value labels for each value of a coding variable. This is particularly useful if your data file uses numeric codes to represent non-numeric categories (for example, codes of 1 and 2 for male and female).
    • Value labels are saved with the data file. You do not need to redefine value labels each time you open a data file.
    • Value labels can be up to 120 characters long.

Missing Values

• Missing Values defines specified data values as user-missing. It is often useful to know why information is missing. Data values specified as user-missing are flagged for special treatment and are excluded from most calculations.
• You can enter up to three discrete missing values, a range of missing values, or a range plus one discrete value.
• Ranges can be specified only for numeric variables.
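A short Python sketch can illustrate what value labels and user-missing values do to an analysis. The gender coding follows the slide's example (1 and 2 for male and female); the age data and the choice of 99 as the missing-value code are invented for illustration.

```python
from statistics import mean

# Value labels: numeric codes mapped to category names (coding from the slide).
gender_labels = {1: "Male", 2: "Female"}
gender_codes = [2, 1, 2, 1, 2]
gender_text = [gender_labels[c] for c in gender_codes]

# Hypothetical age data in which 99 is declared as a user-missing code.
ages = [25, 30, 99, 27, 22]
user_missing = {99}   # a set of discrete user-missing codes

# User-missing values are flagged and excluded from the calculation.
valid_ages = [a for a in ages if a not in user_missing]
mean_age = mean(valid_ages)

print(gender_text)
print(f"mean age over {len(valid_ages)} valid cases = {mean_age}")
```

Without the exclusion step the code 99 would be averaged in as if it were a real age, which is why missing-value codes must be declared before analysis.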

READING DATA FROM NON-SPSS FORMAT

Data files come in a wide variety of formats; SPSS is designed to support many of them.
• Spreadsheets created in MS-Excel
• Data files created in dBase-III and FoxPro formats.
• Tab-delimited and other types of ASCII text files.
• Data files in SPSS format created on other operating systems.
• Many other computer file formats.

Reading Excel Files

• Read variable names.
    • SPSS reads variable names from the first row of the file or the first row of the defined range.
    • Values that don't conform to variable naming rules are converted to valid variable names, and the original names are used as variable labels.
• Worksheet.
    • Excel files can contain multiple worksheets. By default, the Data Editor reads the first worksheet. To read a different worksheet, select the worksheet from the drop-down list.
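The idea of reading variable names from the first row, and keeping the original heading as a label when it is not a valid name, can be sketched in Python. The conversion rule used here (replace disallowed characters with underscores) is an assumption for illustration only; SPSS's actual conversion may differ. The sample text and the helper `to_valid_name` are hypothetical, and the sketch does not handle duplicate names.

```python
import csv
import io
import re

# Hypothetical spreadsheet export; the first row holds the column headings.
raw = "Resp ID,Age (years),ht\n1,25,64\n2,30,72\n"

def to_valid_name(heading):
    """Illustrative conversion only: SPSS's own conversion rules may differ."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", heading.strip())
    if not name or not name[0].isalpha():
        name = "V" + name       # force the name to begin with a letter
    return name[:64]

reader = csv.reader(io.StringIO(raw))
headings = next(reader)                      # first row = variable names
names = [to_valid_name(h) for h in headings]

# Keep the original heading as the variable label whenever the name changed.
labels = {n: h for n, h in zip(names, headings) if n != h}
data = [dict(zip(names, row)) for row in reader]

print(names)
print(labels)
```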



BASIC STEPS IN DATA ANALYSIS USING SPSS

Good data analysis involves a mixture of Common Sense, Technical Expertise and Curiosity. A valid Data Analysis starts with a solid planning of study.

Instrument Analysis
• Identification of Data
    • Identify each & every variable
    • Identify the type of each variable
    • Identify Measurement Level
• Description of Data
    • Missing values in the variable
    • Variable label
    • Value label
Get your data into Data Editor (DATA ENTRY)
(Continued…)

Data Screening & Transformation
• Modification
• Transformation
• Create new variable
Note: A valid data analysis must have a valid set of data (GIGO).

Analyze the Data
• Select an appropriate statistical procedure
• Select the variables for the analysis
• Run the procedure
Note: An appropriate statistical procedure is the key to correct analysis.

Interpret the Results
• Extract appropriate values
• Construct table according to APA manual
Note: An appropriate analysis needs correct interpretation of results.

Saving the Files
1. Data File
2. Output File
3. Syntax File

Lab Activity: Data Entry

• Objective: In this activity you will take the data provided below and create an SPSS data file and define all variables consistent with the coding scheme presented below the actual data. You will note that there are 12 columns of data and 20 rows of data representing 12 variables and 20 individual survey respondents.
• Define your data first and then input data

Activity Data Sheet

Col.1  Col.2  Col.3  Col.4  Col.5  Col.6  Col.7  Col.8  Col.9  Col.10  Col.11  Col.12
1      2      2      25     64     62     66     2      3      4       1       4
2      1      2      30     72     75     72     3      4      5       5       4
3      2      3      29     61     80     62     1      2      1       2       3
4      1      1      27     71     98     71     5      4      3       3       1
5      2      1      22     75     63     63     4      3      2       2       4
6      1      2      20     67     83     68     4      4      3       4       3
7      1      2      26     69     65     74     4      5      4       3       2
8      1      2      28     58     71     73     5      5      4       5       5
9      2      3      29     52     85     65     1      3      2       1       2
10     2      2      23     61     67     64     2      3      1       1       2
11     2      2      22     64     71     63     2      4      4       3       2
12     1      3      25     64     62     64     3      4      2       4       1
13     1      2      20     70     64     65     5      4      3       2       4
14     2      2      20     63     70     62     4      2      3       1       2
15     2      1      23     75     63     60     3      2      1       3       2
16     2      2      24     63     82     60     4      3      3       5       4
17     2      2      24     65     75     58     4      1      4       2       3
18     1      2      19     71     65     76     4      3      4       5       3
19     1      2      19     72     67     75     5      5      4       4       4
20     2      1      20     58     71     65     1      2      3       3       4



The information below tells you what variable is represented in each column and what the numeric figures mean or represent. Create appropriate labels and values for each variable. Variable names are in parentheses, e.g., for Col.5 "ht" is the variable name and "Respondent's Height" is the variable label.

Data Description

Col.1 - Respondent's Identification Number (sno)
    Assign all respondents a unique identifier starting with 1.
Col.2 - Respondent's Gender (gender)
    1 = Male, 2 = Female
Col.3 - Respondent's Marital Status (ms)
    1 = Married, 2 = Unmarried, 3 = Widowed
Col.4 - Respondent's age (age)
    All data reflect actual values (in years).
Col.5 - Respondent's Height (ht)
    All data reflect actual values (in inches).
Col.6 - Respondent's Weight (wt)
    All data reflect actual values (in kilograms).
Col.7 - Respondent's same sex parent's Height (pht)
    All data reflect actual values (in inches).
Col.8 to Col.12 - Health Rating Questionnaire (h1 to h5)
    1 = Strongly Disagree (SDA), 2 = Disagree (DA), 3 = Neutral (N), 4 = Agree (A), 5 = Strongly Agree (SA)
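As a cross-check before entering the data in SPSS, the coding scheme can be written down as a small Python sketch and applied to the first two respondents from the data sheet. The variable names, labels, and codes come from the data description; the per-item label text for h1 to h5 and the helper `decode` are assumptions added for illustration.

```python
# Variable names and labels taken from the data description.
variable_labels = {
    "sno": "Respondent's Identification Number",
    "gender": "Respondent's Gender",
    "ms": "Respondent's Marital Status",
    "age": "Respondent's age",
    "ht": "Respondent's Height",
    "wt": "Respondent's Weight",
    "pht": "Respondent's same sex parent's Height",
    # Item-level label wording is assumed; the source names only the scale.
    **{f"h{i}": f"Health Rating Questionnaire item {i}" for i in range(1, 6)},
}
likert = {1: "Strongly Disagree", 2: "Disagree", 3: "Neutral",
          4: "Agree", 5: "Strongly Agree"}
value_labels = {
    "gender": {1: "Male", 2: "Female"},
    "ms": {1: "Married", 2: "Unmarried", 3: "Widowed"},
    **{f"h{i}": likert for i in range(1, 6)},
}
names = list(variable_labels)   # column order Col.1 to Col.12

# First two respondents from the activity data sheet.
rows = [
    [1, 2, 2, 25, 64, 62, 66, 2, 3, 4, 1, 4],
    [2, 1, 2, 30, 72, 75, 72, 3, 4, 5, 5, 4],
]

def decode(row):
    """Replace coded values with their labels; leave scale values unchanged."""
    return {n: value_labels.get(n, {}).get(v, v) for n, v in zip(names, row)}

first = decode(rows[0])
for name, value in first.items():
    print(f"{variable_labels[name]} ({name}): {value}")
```

Variables without value labels (sno, age, ht, wt, pht) pass through unchanged, mirroring how only coding variables need entries in the Values column.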
