
Introduction to Statistical Analysis: Yale Braunstein, School of Information

This document provides an introduction and approximate schedule for a course on statistical analysis. It covers descriptive statistics, research design, sample size, sources of error, measures of central tendency, and how to use Excel and SPSS for statistical analysis. The schedule indicates topics to be covered on specific dates, including data collection instruments, sample size, central tendency measures, and a statistics assignment. It also introduces key concepts in quantitative analysis, data types, research design issues, and how data is analyzed and results presented.


Introduction to Statistical Analysis

Yale Braunstein
School of Information

Approximate (!) Schedule
• Today
    • Data, data collection instruments (e.g., surveys)
        – Focus is on descriptive statistics
    • Research design
    • Sample size, sources of error (maybe)
• Thursday
    • Sample size, sources of error
    • Measures of central tendency
    • Demos of Excel & SPSS
    • Discussion of statistics assignment
• Next Tuesday
    • More on SPSS with lots of examples
    • Q & A on the assignment

Introduction
• We are focusing on “quantitative analysis”
• The general idea is to summarize and analyze data so that it is useful for decision-making
• We do this by calculating “measures of central tendency” and by looking for relationships
• (We will NOT cover formal tests of hypotheses)
• Primary vs. secondary data sources
• Data on uses (system) vs. data on users (people)
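
As a rough illustration of the two activities named on this slide, the sketch below computes three common measures of central tendency and one measure of relationship (the Pearson correlation coefficient) with Python's standard library. The data values, variable names, and the helper `pearson_r` are invented for this example; the course itself works in Excel and SPSS.

```python
import math
import statistics

# Hypothetical data: weekly sessions and pages viewed for eight users.
sessions = [2, 3, 3, 5, 6, 8, 9, 12]
pages = [10, 14, 15, 22, 30, 33, 41, 50]

# Measures of central tendency for one variable.
mean_sessions = statistics.mean(sessions)      # arithmetic average
median_sessions = statistics.median(sessions)  # middle value of the sorted data
mode_sessions = statistics.mode(sessions)      # most frequent value


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(sum_sq_x * sum_sq_y)


# A value near +1 or -1 suggests a strong linear relationship.
r = pearson_r(sessions, pages)

print(mean_sessions, median_sessions, mode_sessions, round(r, 3))
```

A coefficient close to 1 here only describes these made-up numbers; it says nothing about why the two variables move together.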

Data
• Data may be continuous or discrete
• Just looking at the data often does not enable one to ascertain what is actually happening
• Solution: Use appropriate descriptive statistics to summarize and present results
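
One way to picture the distinction: discrete data is often summarized with a frequency table, while continuous data is summarized with range, mean, and spread. The sketch below assumes two small invented data sets (`connection` and `bills` are hypothetical names and values, not taken from the course).

```python
from collections import Counter
import statistics

# Hypothetical discrete data: connection type reported by six respondents.
connection = ["DSL", "cable", "DSL", "dial-up", "cable", "DSL"]

# Hypothetical continuous data: monthly bill in dollars for the same respondents.
bills = [19.95, 24.50, 29.99, 21.00, 45.00, 24.50]

# Discrete data: count how often each category occurs.
freq = Counter(connection)

# Continuous data: describe the range, center, and spread.
bill_summary = {
    "n": len(bills),
    "min": min(bills),
    "max": max(bills),
    "mean": round(statistics.mean(bills), 2),
    "stdev": round(statistics.stdev(bills), 2),  # sample standard deviation
}

for category, count in freq.most_common():
    print(f"{category:8s} {count}")
print(bill_summary)
```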

Another Data 

Analysis--Introduction
• The BIG Questions:
    • What are you trying to discover or show?
    • How will you present the results?
• From survey to report
• Flow of information
• Sample survey of California ISPs
• Brief comparison of Excel & SPSS

Data Collection Instruments
• Questionnaires & surveys
• Transaction logs
• Experimental observation
• Bills & invoices
• Census forms & reports
• Pre-packaged data sets

Interviewing & designing surveys require skill & experience. It is often useful to get professional help.

Issues in Research Design
• Case study vs. statistical sample
• What is the universe? (uses, users, etc.)
    • Example: political debate over “average tax cut” vs. “tax cut for the average family”
• Is the sample representative?
    • Volumes vs. titles in the library
• Does correlation imply causality?
    • Do we need to identify the pathogen?
• Controlling for outside factors
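
One plausible reading of the tax-cut example is that an “average tax cut” is a mean over all cuts, while the “tax cut for the average family” is closer to the median family's cut. The sketch below uses made-up dollar amounts (not real tax data) to show how far apart the two can be when the distribution is skewed.

```python
import statistics

# Hypothetical tax cuts in dollars for nine families; one cut is very large.
tax_cuts = [100, 150, 200, 250, 300, 350, 400, 450, 20000]

# "Average tax cut": the mean, pulled upward by the single large value.
average_tax_cut = statistics.mean(tax_cuts)

# "Tax cut for the average family": here taken as the median family's cut.
typical_family_cut = statistics.median(tax_cuts)

print(f"mean cut:   {average_tax_cut:.2f}")
print(f"median cut: {typical_family_cut}")
```

With these invented numbers the mean is several times the median, so the two phrases describe very different quantities even though both are computed from the same data.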

Sample Size
• How large a sample is needed?
    • The larger the sample, the more accurate the results (unless the response rate becomes very low)
    • The larger the sample, the greater the cost/effort
    • Sample size does NOT depend on the size of the population (provided the population is much larger than the sample)
• Rules of thumb
    • 100 for 95% confidence, 5% tolerance, 90-10 expected split
    • 400 for 95% confidence, 5% tolerance, 50-50 expected split
    • 30–50 in each cell of an n x m table of discrete classes
• Exact formula (use with care):
    • Size = 0.25 * (certainty factor / acceptable error)^2
    • Where the certainty factor = 1.96 for 95% confidence; 2.576 for 99%
    • The acceptable error is expressed as a proportion (e.g., 0.05 for 5%)
[Alternate approach: hire a statistical consultant.]
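
The formula on this slide can be checked with a few lines of Python. In the sketch below, the function name `sample_size` and the optional parameter `p` are assumptions added for illustration: the slide's constant 0.25 corresponds to p * (1 - p) at a 50-50 expected split, and `p` generalizes that term to other splits using the standard formula for a proportion. Results are rounded up to the next whole respondent.

```python
import math


def sample_size(certainty_factor, acceptable_error, p=0.5):
    """Return p * (1 - p) * (certainty_factor / acceptable_error)^2, rounded up.

    With the default p = 0.5 the leading term is 0.25, as on the slide.
    acceptable_error is a proportion, e.g. 0.05 for 5%.
    """
    raw = p * (1 - p) * (certainty_factor / acceptable_error) ** 2
    return math.ceil(raw)


n_95 = sample_size(1.96, 0.05)            # 95% confidence, 5% error, 50-50 split
n_99 = sample_size(2.576, 0.05)           # 99% confidence, 5% error, 50-50 split
n_95_skewed = sample_size(1.96, 0.05, p=0.1)  # 95% confidence, 90-10 split

print(n_95)         # 385
print(n_99)         # 664
print(n_95_skewed)  # 139
```

The first result is consistent with the rule of thumb of roughly 400 for a 50-50 split. For a 90-10 split the same formula gives about 139, somewhat above the rule of thumb of 100, which is a reminder that the rules of thumb are approximations.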

Sources of Error
• The respondent
• The investigator
• Sampling error
• Change in the system itself
• Coding & analysis
• Other
