This Study Resource Was: IS 665 Data Analysis For Information Systems Technical Assignment 1 Part I. Statistics (40 PTS.)

This document contains a technical assignment analyzing movie star data and various data sets related to commute times, interest rates, and search algorithms. It includes questions about drawing histograms to analyze the distribution of movie star salaries and commute times. It also involves analyzing the relationship between domestic and foreign movie gross, trends in interest rates over time, and the time complexity of different search algorithms for a sorted array. Binary search and linear search are described as two methods for searching a sorted array, with binary search having logarithmic time complexity and linear search having linear time complexity.

Uploaded by

Malik Asad

Keya Shah

2/7/2018

IS 665 Data Analysis For Information Systems

Technical Assignment 1

Part I. Statistics (40 pts.)


1. The file “assignment1_actors.xls” contains information on 66 movie stars. This data
set contains five variables:
a. Gender
b. DomesticGross: average domestic gross of star’s last few movies (in $ million)
c. ForeignGross: average foreign gross of star’s last few movies (in $ million)
d. Salary: current amount the star asks for a movie (in $ million)

Question 1. Using a bin width of 2 (0-2, 2-4, 4-6, etc.), draw a histogram of the
“Salary” variable using Excel.

[Figure: histogram “Actors' Salaries”. x-axis: Salary ($ millions); y-axis: Frequency (Number of Actors), 0 to 16.]
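The binning behind a histogram with a fixed bin width can be sketched in a few lines. This is only an illustration: the function name binCounts and the sample values are hypothetical and are not the assignment's data, and the rule that each bin includes its lower edge and excludes its upper edge is an assumption (Excel may place boundary values differently).

```javascript
// Count how many values fall into each fixed-width bin.
// Assumption: values are non-negative, and bin k covers [k * width, (k + 1) * width).
function binCounts(values, binWidth) {
  const counts = [];
  for (const v of values) {
    const bin = Math.floor(v / binWidth);
    // Grow the array so that empty bins in between show up as 0.
    while (counts.length <= bin) {
      counts.push(0);
    }
    counts[bin] += 1;
  }
  return counts;
}

// Hypothetical salaries in $ millions (not taken from assignment1_actors.xls).
const sampleSalaries = [1.5, 3, 3.5, 5, 7.2, 7.9, 12];
binCounts(sampleSalaries, 2).forEach((count, i) => {
  console.log(`${i * 2}-${(i + 1) * 2}: ${count}`);
});
```

Each count is the height of one bar, so the array can be charted directly.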

This study source was downloaded by 100000821659204 from CourseHero.com on 06-12-2021 17:41:48 GMT -05:00

https://www.coursehero.com/file/40494653/Assignment-1docx/
Question 2. Is there an association between a star’s domestic and foreign gross? Choose
an appropriate tool to analyze this problem. Show your work in Excel, and put some brief
analysis under the chart / table / picture you have in Excel

[Figure: “Domestic Gross vs. Foreign Gross”. Series: DomesticGross and ForeignGross; x-axis: Actors; y-axis: $ millions, 0 to 200.]

The domestic gross and foreign gross follow the same trend. For most of the 66
actors, when the domestic gross increases, the foreign gross increases as well, which
indicates a positive association between the two variables.
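One way to put a number on the association is the Pearson correlation coefficient, which Excel computes with its CORREL function. The sketch below shows the calculation itself; the function name pearson and the sample numbers are hypothetical and are not the assignment's data.

```javascript
// Pearson correlation coefficient of two equal-length numeric arrays.
// Ranges from -1 to 1; values near 1 mean the two variables rise together.
function pearson(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxy += dx * dy; // co-variation of x and y
    sxx += dx * dx; // variation of x
    syy += dy * dy; // variation of y
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Hypothetical gross figures in $ millions (not the assignment's data).
const sampleDomestic = [20, 45, 60, 90, 150];
const sampleForeign = [15, 50, 55, 100, 170];
console.log(pearson(sampleDomestic, sampleForeign).toFixed(3));
```

A coefficient close to 1 would support the "same trend" reading of the chart; a scatter plot of one variable against the other shows the same thing visually.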

2. The file “Assignment1_Commute.xls” contains average time it takes for a citizen of
each metropolitan area to commute to work and back home each day.

Question 3. Generate a histogram for distribution of daily commute times. Are shorter or
longer commute times generally more likely for these commuters? (Hint: analyze the
shape of the distribution).

[Figure: histogram “Daily Commute Time for U.S. Metropolitan Areas”. x-axis: Average Time (minutes), bins from 25 to 80 in steps of 5; y-axis: Frequency (Number of Areas), 0 to 120.]

The mean commute time across all the metropolitan areas is 42.1 minutes, the median
is 41.1 minutes, and the mode is 36.5 minutes. The chart above shows the peak in the
40-45 minute bin. Because the mode is less than the median, which is less than the
mean, the distribution is skewed right: most areas cluster at the shorter commute
times, and a tail of fewer areas stretches toward the longer times. Shorter commute
times are therefore generally more likely for these commuters.
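The mean and median comparison used in this answer can be sketched as follows. The helper names and the sample times are hypothetical and are not the assignment's data, and "mean greater than median" is only a rule of thumb for right skew, not a proof.

```javascript
// Arithmetic mean of a non-empty numeric array.
function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// Median: middle value of the sorted data (average of the two middle values if even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b); // copy so the input is not reordered
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical commute times in minutes (not from Assignment1_Commute.xls).
const sampleTimes = [30, 35, 36, 38, 40, 45, 70];
const m = mean(sampleTimes);
const med = median(sampleTimes);
console.log(m, med);
console.log(m > med ? "mean > median: suggests right skew" : "no right skew suggested");
```

In this sample the single long commute of 70 minutes pulls the mean above the median, which is the same pattern the answer describes.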

3. The file “Assignment1_Mortgage.xls” contains annual interest rates on 30-year fixed
mortgages over the years.

Question 4. What conclusions can be drawn through time series analysis generated from
this data?

[Figure: line chart “Average Annual Interest Rate on 30-Year Fixed-Rate Mortgages”. x-axis: Year, 1970 to 2005; y-axis: Rate, 0.00 to 18.00.]

The conclusion that can be drawn is that rates peaked between 1979 and 1985. In the
other years shown, the rates were lower and comparatively stable.

Part II. Data Structure and Algorithms (60 pts.)
In class, we discussed array and linked list’s operations such as insertion and
deletion. However, I left out an important operation: SEARCH

Question 1. Assume that you have a SORTED array of records. Assume that the length
of the array (n) is known. Give TWO different methods to SEARCH for a specific value
in this array. You can use English or pseudo-code for your algorithm. What is the time
complexity for each algorithm and why?

1. Binary Search

Start by comparing the target value with the element in the middle position of the
array. If they match, the position of that element is returned. If the target is less
than the middle element, it can only be in the lower half of the array; if it is
greater than the middle element, it can only be in the upper half. The same step is
repeated on the lower or upper half until the target is matched or no elements are
left to check. Each comparison halves the part of the array still being searched, so
the time complexity is O(log n). Binary search is worthwhile when the array holds a
large number of elements, and it only works because the array is sorted.
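The halving procedure described above can be sketched in JavaScript as follows. This is a minimal illustration rather than the assignment's required answer; the function name binarySearch is hypothetical, and the array is assumed to be sorted in ascending order.

```javascript
// Binary search over an array sorted in ascending order.
// Returns the index of target, or -1 if target is not present.
function binarySearch(values, target) {
  let low = 0;
  let high = values.length - 1;
  while (low <= high) {
    // Middle of the range that is still being searched.
    const mid = low + Math.floor((high - low) / 2);
    if (values[mid] === target) {
      return mid;
    }
    if (values[mid] < target) {
      low = mid + 1; // target can only be in the upper half
    } else {
      high = mid - 1; // target can only be in the lower half
    }
  }
  return -1; // range is empty: target is not in the array
}

console.log(binarySearch([1, 3, 5, 7, 9, 11], 7));
```

The loop runs at most about log2(n) + 1 times because the range between low and high shrinks by half on every pass.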

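2. Linear Search

function findIndex(values, target) {
  // Check each element in order, from the first to the last.
  for (var i = 0; i < values.length; ++i) {
    if (values[i] === target) {
      return i; // found: return the position of the match
    }
  }
  return -1; // reached the end without a match
}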

Linear search can be used when there is not much data involved. It checks the
elements of the list one at a time until the element is found. In the worst case it
examines all n elements, so the time complexity is O(n).

Question 2. Assume that you have a linked list of records. Assume that you have a head,
a current, and a tail pointer. Write an algorithm that swaps the data in the current node
and the node after it. You can use pseudo-code, English or drawing to describe your
solution.

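Let the node addresses be A, B, C, D, with B the current node; the objective is to swap B and C:

1) Traverse to B. Save the address of the current node in TEMP before you go to the next node.
When you get to B, TEMP will have the address of A.

2) TEMP->NEXT (remember TEMP is the address of A) is assigned B->NEXT. A is now
pointing to C.

3) B->NEXT is assigned B->NEXT->NEXT. B->NEXT is C, so B->NEXT->NEXT is really
C->NEXT, which is D. B is now pointing to D.

4) TEMP->NEXT->NEXT is assigned B. TEMP->NEXT is now C, so C is pointing to B and the
order becomes A, C, B, D.

TEMP->NEXT = B->NEXT;
B->NEXT = B->NEXT->NEXT;
TEMP->NEXT->NEXT = B;

If B is the head, there is no A, and the head pointer takes the place of TEMP->NEXT.
If C was the last node, the tail pointer must also be moved to B.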
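Two ways of doing the swap can be sketched in JavaScript. The node shape { data, next } and both function names are hypothetical. swapDataWithNext follows the question's wording literally and exchanges only the data fields, while swapNodesAfter relinks the nodes and includes the final step of pointing C back at B, which the relinking needs in order to be complete.

```javascript
// Swap the data of the current node and the node after it; no pointers change.
// Returns false if there is no node after current.
function swapDataWithNext(current) {
  if (current === null || current.next === null) {
    return false;
  }
  const tmp = current.data;
  current.data = current.next.data;
  current.next.data = tmp;
  return true;
}

// Relink so that prev -> b -> c -> rest becomes prev -> c -> b -> rest.
// prev plays the role of TEMP (the node before the current node).
function swapNodesAfter(prev) {
  if (prev === null || prev.next === null || prev.next.next === null) {
    return false;
  }
  const b = prev.next;
  const c = b.next;
  prev.next = c;   // A now points to C
  b.next = c.next; // B now points to D
  c.next = b;      // C now points to B
  return true;
}

// Build A -> B -> C -> D and swap the data of B and C.
const d = { data: "D", next: null };
const c = { data: "C", next: d };
const b = { data: "B", next: c };
const a = { data: "A", next: b };
swapDataWithNext(b);
console.log(a.data, a.next.data, a.next.next.data, a.next.next.next.data);
```

Swapping only the data leaves the head and tail pointers valid, which is why it is the simpler choice when the records are cheap to copy. With the relinking version, a caller that keeps a tail pointer has to move it when the second node was the tail.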

Question 3. Assume that you have a linked list of records. Assume that you have a head,
a current, and a tail pointer. Write an algorithm that DELETES the node BEFORE the
current node. You can use pseudo-code, English or drawing to describe your solution.
( this was, and remains to be, a popular technical interview question)

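We can start off with a list A, B, C in which C is the current node; the objective is
to delete B, the node before the current node. Assuming the list is singly linked, each
node only points forward: A can reach B and B can reach C, but C cannot reach B
directly. We therefore start at the head and move forward until we reach the node
whose NEXT->NEXT is the current node. Here that node is A. We then assign A->NEXT to
C, which unlinks B from the list, and B can be freed. There are two special cases. If
B is the head (there is no A), the head pointer is moved to C. If C is itself the
head, there is no node before it and nothing is deleted.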
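A minimal JavaScript sketch of deleting the node before the current node, assuming a singly linked list. The list shape { head, tail }, the node shape { data, next }, and the function name deleteBefore are all hypothetical.

```javascript
// Delete the node that comes immediately before `current`.
// Returns true if a node was removed, false otherwise.
function deleteBefore(list, current) {
  const head = list.head;
  if (current === null || head === null || head === current) {
    return false; // nothing comes before current
  }
  if (head.next === current) {
    list.head = current; // the node to delete is the head itself
    return true;
  }
  // Walk forward until prev.next is the node just before current.
  let prev = head;
  while (prev.next !== null && prev.next.next !== current) {
    prev = prev.next;
  }
  if (prev.next === null) {
    return false; // current is not in this list
  }
  prev.next = current; // unlink the node between prev and current
  return true;
}

// Build A -> B -> C with C as the current node, then delete B.
const nodeC = { data: "C", next: null };
const nodeB = { data: "B", next: nodeC };
const nodeA = { data: "A", next: nodeB };
const list = { head: nodeA, tail: nodeC };
deleteBefore(list, nodeC);
console.log(list.head.data, list.head.next.data);
```

The tail pointer never needs updating here: the deleted node always has the current node after it, so it cannot be the tail. The walk from the head makes the operation O(n); a doubly linked list would avoid that walk.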
