Data Analysis Assignment 2
Balance Demo variable Two-sample t-test with a demo variable (ignore the demo variable)
333 0
903 0
580 0 T-Test: Two-Sample Assuming Unequal Variances
964 0
331 0
1151 0 Mean
203 0 Variance
872 0 Observations
279 0 Hypothesized Mean Difference
1350 0 df
1407 0 t Stat
0 0 P(T<=t) one-tail
204 0 t Critical one-tail
1081 0 P(T<=t) two-tail
148 0 t Critical two-tail
0 0
0 0
368 0 The P-value is greater than the 0.05 significance level:
891 0 0.192227913652033 > 0.05
1048 0
89 0
968 0 Conclusion
0 0 With the two-sample t-test, we do not have enough evidence to reject the null
411 0 hypothesis, so we cannot conclude that the mean
balance differs from $500, as stated by the manager.
0 0
671 0
654 0
467 0
1809 0
915 0
863 0
0 0
526 0
0 0
0 0
419 0
762 0
1093 0
531 0
344 0
50 0
1155 0
385 0
976 0
1120 0
997 0
1241 0
797 0
0 0
902 0
654 0
211 0
607 0
957 0
0 0
0 0
379 0
133 0
333 0
531 0
631 0
108 0
0 0
133 0
0 0
602 0
1388 0
889 0
822 0
1084 0
357 0
1103 0
663 0
601 0
945 0
29 0
532 0
145 0
391 0
0 0
162 0
99 0
503 0
0 0
0 0
1779 0
815 0
0 0
579 0
1176 0
1023 0
812 0
0 0
937 0
0 0
0 0
1380 0
155 0
375 0
1311 0
298 0
431 0
1587 0
1050 0
745 0
210 0
0 0
0 0
227 0
297 0
47 0
0 0
1046 0
768 0
271 0
510 0
0 0
1341 0
0 0
0 0
0 0
454 0
904 0
0 0
0 0
0 0
1404 0
0 0
1259 0
255 0
868 0
0 0
912 0
1018 0
835 0
8 0
75 0
187 0
0 0
1597 0
1425 0
605 0
669 0
710 0
68 0
642 0
805 0
0 0
0 0
0 0
581 0
534 0
156 0
0 0
0 0
0 0
429 0
1020 0
653 0
0 0
836 0
0 0
1086 0
0 0
548 0
570 0
0 0
0 0
0 0
1099 0
0 0
283 0
108 0
724 0
1573 0
0 0
0 0
384 0
453 0
1237 0
423 0
516 0
789 0
0 0
1448 0
450 0
188 0
0 0
930 0
126 0
538 0
1687 0
336 0
1426 0
0 0
802 0
749 0
69 0
0 0
571 0
829 0
1048 0
0 0
1411 0
456 0
638 0
0 0
1216 0
230 0
732 0
95 0
799 0
308 0
637 0
681 0
246 0
52 0
955 0
195 0
653 0
1246 0
1230 0
1549 0
573 0
701 0
1075 0
1032 0
482 0
156 0
1058 0
661 0
657 0
689 0
0 0
1329 0
191 0
489 0
443 0
52 0
163 0
148 0
0 0
16 0
856 0
0 0
0 0
199 0
0 0
0 0
98 0
0 0
132 0
1355 0
218 0
1048 0
118 0
0 0
0 0
0 0
1092 0
345 0
1050 0
465 0
133 0
651 0
549 0
15 0
942 0
0 0
772 0
136 0
436 0
728 0
1255 0
967 0
529 0
209 0
531 0
250 0
269 0
541 0
0 0
1298 0
890 0
0 0
0 0
0 0
0 0
863 0
485 0
159 0
309 0
481 0
1677 0
0 0
0 0
293 0
188 0
0 0
711 0
580 0
172 0
295 0
414 0
905 0
0 0
70 0
0 0
681 0
885 0
1036 0
844 0
823 0
843 0
1140 0
463 0
1142 0
136 0
0 0
0 0
5 0
81 0
265 0
1999 0
415 0
732 0
1361 0
984 0
121 0
846 0
1054 0
474 0
380 0
182 0
594 0
194 0
926 0
0 0
606 0
1107 0
320 0
426 0
204 0
410 0
633 0
0 0
907 0
1192 0
0 0
503 0
0 0
302 0
583 0
425 0
413 0
1405 0
962 0
0 0
347 0
611 0
712 0
382 0
710 0
578 0
1243 0
790 0
1264 0
216 0
345 0
1208 0
992 0
0 0
840 0
1003 0
588 0
1000 0
767 0
0 0
717 0
0 0
661 0
849 0
1352 0
382 0
0 0
905 0
371 0
0 0
1129 0
806 0
1393 0
721 0
0 0
0 0
734 0
560 0
480 0
138 0
0 0
966 0
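The test of the manager's $500 claim can be cross-checked from summary figures that appear elsewhere in this worksheet. The sketch below is only an illustration: it assumes the overall mean can be rebuilt from the student and non-student group means, that the "Total" SS in the regression ANOVA tables is the total sum of squares of the balance column, and it uses a normal approximation in place of the t distribution (reasonable for 399 degrees of freedom). All variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Summary figures copied from the worksheet output.
n_nonstudents, mean_nonstudents = 360, 480.369444444445
n_students, mean_students = 40, 876.825
total_ss = 84339911.91      # "Total" SS of Balance (assumed to be the total sum of squares)
hypothesized_mean = 500.0   # the manager's claim

n = n_nonstudents + n_students
# The overall mean is the size-weighted average of the two group means.
overall_mean = (n_nonstudents * mean_nonstudents + n_students * mean_students) / n
sample_variance = total_ss / (n - 1)
standard_error = math.sqrt(sample_variance / n)

t_stat = (overall_mean - hypothesized_mean) / standard_error
# Normal approximation to the one-tail p-value (an assumption, not Excel's exact t value).
p_one_tail = 1.0 - NormalDist().cdf(abs(t_stat))

print(f"mean = {overall_mean:.3f}, t = {t_stat:.4f}, one-tail p ~ {p_one_tail:.4f}")
```

Under these assumptions the approximate one-tail p-value lands close to the 0.192 reported in the worksheet, which supports the conclusion that the $500 claim is not rejected.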
their credit cards is $500
Female F1
Mean 529.536231884058
Variance 210187.104263402
Observations 207
Not students M1 Students M2 Let's run a two-sample t-test assuming unequal variances,
333 903 with the hypothesized mean difference set to zero (null hypothesis: the two means are equal).
580 1350
964 654 t-Test: Two-Sample Assuming Unequal Variances
331 419
1151 1155 Nonstudents M1 Students M2
203 1241 Mean 480.369444444445 876.825
872 797 Variance 193085.136111111 240101.942948718
279 902 Observations 360 40
1407 532 Hypothesized Mean Difference 0
0 1380 df 46
204 375 t Stat -4.90277866136321
1081 431 P(T<=t) one-tail 6.09E-06
148 1587 t Critical one-tail 1.67866041355687
0 1404 P(T<=t) two-tail 1.21723866277E-05
0 1259 t Critical two-tail 2.01289559891943
368 868
891 1425
1048 156 The P-value is smaller than the 0.05 significance level:
89 1020 0.0000121723866277271 < 0.05
968 1687
0 1411 Conclusion
411 1216
With the two-sample t-test, the P-value is smaller than 0.05, so we
0 195 reject the null hypothesis: there is enough evidence that the mean balances of students and
671 1246 non-students are not equal.
467 1549
1809 16
915 0
863 98
0 1050
526 728
0 1255
0 269
762 1036
1093 5
531 415
344 1054
50 790
385 840
976 1003
1120 1352
997
0
654
211
607
957
0
0
379
133
333
531
631
108
0
133
0
602
1388
889
822
1084
357
1103
663
601
945
29
145
391
0
162
99
503
0
0
1779
815
0
579
1176
1023
812
0
937
0
0
155
1311
298
1050
745
210
0
0
227
297
47
0
1046
768
271
510
0
1341
0
0
0
454
904
0
0
0
0
255
0
912
1018
835
8
75
187
0
1597
605
669
710
68
642
805
0
0
0
581
534
0
0
0
429
653
0
836
0
1086
0
548
570
0
0
0
1099
0
283
108
724
1573
0
0
384
453
1237
423
516
789
0
1448
450
188
0
930
126
538
336
1426
0
802
749
69
0
571
829
1048
0
456
638
0
230
732
95
799
308
637
681
246
52
955
653
1230
573
701
1075
1032
482
156
1058
661
657
689
0
1329
191
489
443
52
163
148
0
856
0
0
199
0
0
132
1355
218
1048
118
0
0
0
1092
345
465
133
651
549
15
942
0
772
136
436
967
529
209
531
250
541
0
1298
890
0
0
0
0
863
485
159
309
481
1677
0
0
293
188
0
711
580
172
295
414
905
0
70
0
681
885
844
823
843
1140
463
1142
136
0
0
81
265
1999
732
1361
984
121
846
474
380
182
594
194
926
0
606
1107
320
426
204
410
633
0
907
1192
0
503
0
302
583
425
413
1405
962
0
347
611
712
382
710
578
1243
1264
216
345
1208
992
0
588
1000
767
0
717
0
661
849
382
0
905
371
0
1129
806
1393
721
0
0
734
560
480
138
0
966
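The t Stat and df in the students vs. non-students output can be reproduced from the reported means, variances, and counts. The following is a minimal sketch of the unequal-variances (Welch) calculation; the function name `welch_t` is hypothetical, and the critical value is simply copied from the worksheet rather than computed.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    a = var1 / n1  # squared standard error contribution of group 1
    b = var2 / n2  # squared standard error contribution of group 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Non-students (M1) vs. students (M2), values copied from the output.
t_stat, df = welch_t(480.369444444445, 193085.136111111, 360,
                     876.825, 240101.942948718, 40)

t_critical_two_tail = 2.01289559891943  # copied from the worksheet
reject_null = abs(t_stat) > t_critical_two_tail

print(f"t = {t_stat:.4f}, df = {df:.2f}, reject H0: {reject_null}")
```

The unrounded degrees of freedom come out slightly above 46; Excel reports the rounded value.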
rage balance is concerned?
Let us observe how the no. of credit cards(X) impacts the average balance of a credit card (Y)
[Scatter plot: balance (Y) against number of credit cards (X); trendline equation truncated in the source: "...31 x + 434.286100809367"]
Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 54.5692816856 7.958435 1.834E-14 327.00604035 541.5661613 327.00604035 541.56616126
Number of cards (X) 16.7430612955 1.731281 0.084177 -3.928944671 61.902841 -3.928944671 61.902840996
[...] of cards is 28.98, which means that if the number of cards increases by 1 unit, there will be
[...] credit cards by 28.98 dollars. This relationship is weak: the regression model also shows that the P-value is
[...], which means the number of cards is not a very good predictor of the variable Y, the credit card [...]
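The coefficient column for the number-of-cards regression is cut off in the export, but the slope can be recovered from the rest of the row, because a 95% confidence interval is centered on the estimate. The sketch below illustrates that check using only values from the row above; the variable names are hypothetical.

```python
# Values copied from the "number of cards" row of the coefficient table.
standard_error = 16.7430612955
lower_95, upper_95 = -3.928944671, 61.902841

# A 95% confidence interval is symmetric around the estimate,
# so the midpoint recovers the slope coefficient.
coefficient = (lower_95 + upper_95) / 2
t_stat = coefficient / standard_error

# The half-width divided by the standard error gives the implied critical t value.
implied_t_critical = (upper_95 - lower_95) / 2 / standard_error

# If the interval contains zero, the slope is not significant at the 5% level.
significant_at_5_percent = not (lower_95 <= 0 <= upper_95)

print(f"slope = {coefficient:.4f}, t = {t_stat:.4f}, "
      f"t critical = {implied_t_critical:.4f}, significant: {significant_at_5_percent}")
```

The interval contains zero, which agrees with the reported P-value of 0.084 being above 0.05.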
Let's see how demographic values such as age, years of education and marital status affect the credit card balance.
Let's see how age affects the credit card balance.
For this we will use correlation coefficient and scatter plot to see how good their relationship is
[Scatter plot: credit card balance (Y) against age (X)]
Significance F
0.97081387232
Let's see how years of education affects the credit card balance. For this we will use the correlation coefficient and a scatter plot to see how good their relationship is.
10 531
14 50 500
f(x) = − 1.18596356171414 x + 535.966209905055
13 385
14 976 0
4 6 8 10 12 14 16 18 20 22
16 1120
12 997
9 0
8 0 This is the linear equation we get from the scatter plot
18 379 Y = -1.186x + 535.97
14 133
17 531
20 631 Regression Model
10 108
7 0 SUMMARY OUTPUT
15 0
14 602 Regression Statistics
13 1388 Multiple R 0.008061576453627
16 889 R Square 6.498901491767E-05
12 1103 Adjusted R Square -0.00244741051017
12 1103 Standard Error 460.321142930303
18 663 Observations 400
10 945
15 145 ANOVA
11 391 df SS
9 162 Regression 1 5481.167793274
14 0 Residual 398 84334430.74221
12 1779 Total 399 84339911.91
12 0
9 1176 Coefficients Standard Error
15 0 Intercept 535.966209905055 101.8142179597
17 0 Years of Education -1.18596356171415 7.373874128537
17 155
12 745 Inference
18 210 The coefficient of years of education is -1.186, which shows a negative relationship with the
17 47 credit card balance: if years of education increase by one year, the balance falls by about 1.19
dollars. This is a weak effect, and the P-value of 0.8723 is greater than the 0.05 significance level, which
11 0 indicates that X is not a significant predictor of Y.
12 768
8 510
16 0
17 0
15 0
11 904
14 0
12 0
15 0
11 0
12 1018
15 835
16 8
8 75
11 0
10 710
15 805
11 0
14 581
13 534
9 0
13 429
12 653
11 836
16 0
14 0
11 0
12 1099
9 283
14 384
10 453
11 1237
9 789
15 0
17 1448
14 450
16 126
17 538
19 1426
16 802
17 69
16 0
15 1048
13 0
7 456
11 732
12 95
16 681
19 246
13 653
11 1230
15 1075
15 1032
14 482
11 156
18 191
19 52
13 163
14 148
12 856
15 0
18 0
14 199
18 0
8 1355
6 1048
14 0
9 1092
13 465
15 133
13 549
16 15
15 942
18 772
10 436
9 967
15 209
11 250
12 541
13 0
17 1298
17 890
11 0
15 0
16 863
13 309
10 0
17 293
18 0
11 711
7 172
13 295
13 905
14 0
14 0
18 885
15 844
18 843
13 0
15 81
12 732
14 984
16 121
16 846
15 594
7 194
8 320
17 426
17 410
13 503
13 0
18 413
8 1405
10 712
16 382
19 710
6 345
18 588
16 1000
10 767
14 661
15 905
13 1129
15 1393
15 721
12 138
7 966
15 903
19 1350
12 654
12 419
16 1241
14 375
16 431
15 1404
12 868
12 1425
11 156
16 1020
17 1687
16 1411
10 1216
10 195
13 1246
18 16
13 98
8 1050
15 728
8 269
16 840
16 1003
11 333
11 580
16 331
10 1151
9 872
14 1407
16 0
9 1081
9 1048
10 0
8 411
16 467
16 0
10 0
14 344
15 0
15 654
20 211
12 607
10 957
17 333
17 133
14 1084
12 357
15 601
16 29
14 0
15 99
11 503
11 0
15 815
8 579
16 1023
16 812
13 0
16 937
18 1311
11 298
15 1050
15 0
12 0
14 227
19 297
14 1046
15 271
17 1341
16 0
13 454
15 255
19 0
13 912
9 187
7 1597
10 605
17 669
14 68
14 642
16 0
10 0
17 0
16 0
13 0
14 0
12 1086
12 548
11 570
15 0
9 0
18 108
12 724
16 1573
17 0
18 0
16 423
12 516
13 188
15 0
11 930
14 336
14 0
12 749
7 571
18 829
13 638
13 0
15 230
15 799
13 308
19 637
18 52
14 955
17 573
6 701
9 1058
9 661
12 657
14 689
11 0
14 1329
11 489
15 443
11 0
19 0
12 132
12 218
8 118
14 0
14 0
12 345
10 651
12 0
13 136
14 529
12 531
6 0
8 0
9 485
11 159
15 481
9 1677
13 0
17 188
17 580
11 414
8 70
6 681
11 823
14 1140
16 463
14 1142
16 136
16 0
11 265
17 1999
16 1361
16 474
15 380
17 182
13 926
12 0
14 606
16 1107
9 204
10 633
17 0
18 907
17 1192
14 0
9 302
14 583
18 425
14 962
14 0
18 347
12 611
10 578
16 1243
14 1264
14 216
16 1208
12 992
15 0
10 0
19 717
10 0
11 849
9 382
16 0
10 371
9 0
18 806
10 0
13 0
8 734
13 560
17 480
13 0
15 1155
15 797
13 902
16 532
17 1380
16 1587
17 1259
9 1549
16 0
14 1255
16 1036
13 5
17 415
17 1054
15 790
9 1352
t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 5.2641587849 2.3086203E-07 335.805329636 736.12709017 335.8053296 736.1270902
Years of Education -0.1608331714 0.8723064016 -15.6825748027 13.310647679 -15.6825748 13.31064768
education which is -1.185 which clearly shows that there is a negative relationship with the
ears of education gets increased by one year the credit card balance will get reduced by 1.185
n if you see the P-value which is .8723 is again greater than the 0.05 significance level which
or for Y
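The headline figures of the years-of-education regression follow directly from the ANOVA sums of squares and the coefficient row. The sketch below recomputes R Square, the F statistic, and the slope's t Stat from the values reported in the summary output; it is an illustration with hypothetical variable names, not a reproduction of Excel's internals.

```python
# Values copied from the years-of-education regression output.
ss_regression = 5481.167793274
ss_residual = 84334430.74221
df_regression, df_residual = 1, 398
coefficient, standard_error = -1.18596356171415, 7.373874128537

ss_total = ss_regression + ss_residual

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# F compares the mean square explained with the mean square left unexplained.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# The slope's t statistic is the coefficient divided by its standard error.
t_stat = coefficient / standard_error

print(f"R^2 = {r_squared:.3e}, F = {f_stat:.5f}, t = {t_stat:.5f}")
```

With a single predictor, F equals the square of the slope's t statistic, which is why the Significance F (0.8723) matches the slope's P-value.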
Let's see how the credit card balance is affected by marital status. For that we need two columns: one with the credit card
balances of people who are married and the other with those of people who are not married.
Not Married (M2) Married (M1) Two sample T test assuming unequal Variances
964 204
203 0 t-Test: Two-Sample Assuming Unequal Variances
279 0
148 368 Not Married (M2)
89 891 Mean 523.290322580645
968 671 Variance 221735.038542103
0 1809 Observations 155
915 863 Hypothesized Mean Difference 0
526 0 df 319
531 762 t Stat 0.112233601335984
50 1093 P(T<=t) one-tail 0.455354388554626
531 385 t Critical one-tail 1.64964431932712
631 976 P(T<=t) two-tail 0.910708777109252
108 1120 t Critical two-tail 1.96742838690237
0 997
889 0 P-value which is greater than the significance level 0.05 ,
145 0 0.910708777109252 > 0.05
0 379
0 133 Conclusion
1176 0
0 602 The two-tail P-value is greater than 0.05, which indicates that we fail to reject the null
745 1388 hypothesis. We do not have significant evidence that
the means of the two groups differ, which means marital status has no significant effect on the
0 822 credit card balance.
510 1103
0 663
0 945
904 391
0 162
0 1779
8 0
75 155
710 210
836 47
0 768
453 0
1237 0
450 0
69 1018
456 835
653 0
1230 805
1032 0
156 581
0 534
199 0
1355 429
0 653
1092 0
133 0
436 1099
967 283
209 384
541 789
0 0
0 1448
0 126
172 538
0 1426
984 802
712 0
710 1048
345 0
767 732
661 95
905 681
1129 246
721 1075
966 482
654 191
419 52
375 163
431 148
156 856
1687 0
1411 0
1246 1048
728 465
269 549
840 15
580 942
1151 772
872 250
0 1298
1048 890
467 0
133 863
601 309
0 0
99 293
503 711
0 295
0 905
937 0
1311 0
298 885
1050 844
0 843
271 81
0 732
1597 121
669 846
642 594
0 194
0 320
0 426
0 410
108 503
1573 0
0 413
188 1405
0 382
930 588
638 1000
0 1393
230 138
799 903
637 1350
52 1241
661 1404
0 868
118 1425
0 1020
136 1216
485 195
681 16
1140 98
1142 1050
265 1003
380 333
926 331
606 1407
1107 1081
1192 0
425 411
962 0
216 0
992 344
0 0
0 654
371 211
0 607
806 957
0 333
0 1084
734 357
480 29
532 815
1380 579
1587 1023
1259 812
1549 0
1255 227
1036 297
1054 1046
1352 1341
454
255
0
912
187
605
68
0
0
0
0
1086
548
570
724
0
423
516
336
0
749
571
829
308
955
573
701
1058
657
689
0
1329
489
443
0
132
218
0
0
345
651
529
531
0
0
159
481
1677
0
188
580
414
70
823
463
136
0
1999
1361
474
182
0
204
633
0
907
0
302
583
0
347
611
578
1243
1264
1208
0
717
0
849
382
560
0
1155
797
902
0
5
415
790
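Null hypothesis: the mean balance of the two groups is the same, M1 = M2
Alternative hypothesis: the mean balances on the credit card are not equal (can be less or greater, but not equal), M1 ≠ M2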
Married (M1)
Mean 517.942857142857
Variance 205696.726229508
Observations 245
To assess this statement, we test whether the means of all the ethnicities
present in the data are equal. We divide the data set into the three ethnicities present.
Let the mean values of the three ethnicities be E1, E2 and E3.
Null hypothesis: the mean is the same for all ethnicities, E1 = E2 = E3
Alternative hypothesis: the mean differs for at least one of the ethnicities (E1, E2 and E3 are not all equal)
Average Variance
531 235839.163265306
512.313725490196 231748.336245389
518.497487437186 190922.412872443
Source of Variation MS F P-value F crit
Between Groups 9227.10023602843 0.043442783049627 0.957492 3.018452
Within Groups 212396.618915688
5 (significance level)
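The F value in the ethnicity ANOVA is the ratio of the two mean squares shown in the MS column. The sketch below recomputes it and applies the decision rule against the reported F crit. It assumes the first MS value is the between-groups mean square and the second is the within-groups mean square, which is the usual layout of Excel's single-factor ANOVA output; variable names are hypothetical.

```python
# Values copied from the ANOVA output (row roles are an assumption, see the text above).
ms_between = 9227.10023602843
ms_within = 212396.618915688
f_critical = 3.018452
p_value = 0.957492
alpha = 0.05

# The F statistic is the between-groups mean square over the within-groups mean square.
f_stat = ms_between / ms_within

# Two equivalent decision rules: compare F with F crit, or the p-value with alpha.
reject_by_f = f_stat > f_critical
reject_by_p = p_value < alpha

print(f"F = {f_stat:.6f}, reject H0 by F: {reject_by_f}, by p-value: {reject_by_p}")
```

Both rules agree that the null hypothesis of equal ethnicity means is not rejected.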
Rating Limit
681 9504 Correlation Matrix
259 3388
266 3300 Rating Limit
394 5308 Rating 1
269 3291 Limit 0.9968797370017 1
200 2525
286 3714
339 4378
448 6384
235 2860 [Scatter plot: Limit (Y) against Rating (X); trendline f(x) = 14.8716071166589 x − 542.928229986894]
458 6378
156 1757
326 4323
949 13414
411 5611
413 5666
563 7838
199 2646
455 6457
462 6481
300 3899
253 3327
351 4763
445 6257
469 6375 Inference
564 7569 The correlation coefficient of 0.99 (close to 1) shows a very strong linear relation between these two variables.
138 1499 To explain it further, we use a scatter plot to see how good the linear relation looks.
154 1786 The regression equation is y = 14.872x - 542.93, whose coefficient is
372 4742 14.87, which shows that if the rating increases by one unit, the limit will increase by about
367 4779
14.87 dollars, so rating is a significant predictor of the limit.
390 5294
364 5198
254 3089
160 1671
223 2937
320 4160
694 9704
380 5099
418 5619
538 7402
355 4923
418 5390
253 3254
468 6662
288 3449
122 1433
828 12066
182 2271
543 7518
245 3075
120 855
266 3388
365 4768
259 3182
250 3271
231 2959
474 6386
369 4828
186 2117
173 2161
128 1402
481 7056
117 1300
192 2529
195 2531
259 3411
427 5829
452 5835
257 3500
314 4116
175 2073
387 4896
371 5110
192 2420
435 5728
353 4831
143 1362
338 4284
406 5550
381 4865
203 2330
178 2327
219 2820
459 6179
299 4270
292 3965
316 4391
560 7499
459 6420
335 4090
805 11589
316 4442
326 4411
385 5352
730 10088
398 5384
304 3977
169 2000
529 7333
145 1448
392 5310
642 9156
243 3206
410 5289
337 4229
370 5222
633 8760
411 5673
527 6906
430 5614
341 4668
232 2923
236 2910
263 3557
262 3351
460 6617
147 1787
189 2001
265 3211
191 2430
621 8603
469 6386
135 1774
450 6196
296 4049
270 3536
379 5013
360 4952
433 5833
410 5565
347 4866
439 5869
257 3476
518 6982
377 5319
183 1852
581 8100
485 6396
156 1626
142 1552
387 5274
287 3665
149 1389
370 5140
204 2672
372 5051
289 3526
365 4964
536 7506
165 1924
298 3874
494 7010
396 5429
534 7685
129 1485
236 3096
364 5072
508 6662
297 3673
527 7576
351 4756
270 3409
301 3807
299 3922
280 3746
389 5145
155 1561
292 3873
832 11966
434 5891
362 4943
382 5101
440 5759
368 4840
413 5673
515 7167
538 7760
398 5640
482 6827
747 10578
472 6555
321 4171
415 5524
483 6645
491 6819
289 3625
220 2558
376 5043
241 2963
190 2433
433 5533
281 3821
459 6045
184 2120
406 5521
701 9560
499 6784
358 4391
149 1647
437 5765
128 1233
134 1551
665 9310
299 3690
248 3063
293 3782
383 5354
283 3606
514 7075
357 4897
569 8047
512 7114
589 8117
138 1311
511 6922
479 6626
213 2631
398 5179
333 4534
210 2733
162 1829
264 3461
205 2252
376 5183
301 3969
394 5441
413 5466
281 3480
251 2998
505 6819
318 3954
338 4523
224 3180
171 2101
317 4263
344 4433
232 2906
448 6340
352 4307
431 5767
456 6040
249 2832
388 5435
607 8494
256 3736
682 9540
115 1337
263 3189
449 6033
279 3261
491 6637
268 3326
626 9113
137 1410
599 8157
279 3461
162 1568
407 5443
278 3613
728 10384
483 6754
549 7416
228 2748
341 4673
150 1501
121 886
344 4612
235 3155
235 3000
160 1705
515 7530
429 5977
367 4527
214 2880
167 2021
344 4697
339 4745
750 10673
206 2168
221 2607
294 3584
382 5180
309 3806
167 2179
554 7667
287 3933
181 2120
517 7398
310 4159
383 5343
321 3878
180 2450
320 4327
397 5309
323 4351
383 5245
215 2762
392 5395
344 4613
584 7818
547 7555
387 5137
378 4776
360 4788
187 2278
579 8244
369 4986
388 5149
103 906
188 2220
267 3202
402 5182
304 4221
186 2493
215 2561
383 5184
380 5107
142 1349
217 3085
636 8732
353 5000
167 2047
272 3098
296 3907
268 3235
380 5096
817 11200
205 2532
321 4381
355 4632
352 4970
287 3762
332 4640
386 5227
656 9272
296 3907
522 7306
340 4712
229 2586
282 3484
982 13913
721 10230
291 3976
377 5228
261 3402
438 5884
119 855
377 5303
707 10278
260 2955
401 5199
137 1511
420 5380
754 10748
112 1134
374 5140
507 7140
342 4716
442 6090
188 2539
339 4336
344 4471
433 6127
685 9824
564 7871
263 3615
599 8028
466 6135
173 2150
142 1567
366 4941
214 2860
574 8029
282 3274
180 1870
287 3683
126 1357
503 7100
196 2308
138 1335
410 5758
307 4100
296 3838
192 2525
538 7659
320 4431
354 4569
251 3293
367 5382
531 7582
610 8376
451 6207
93 905
353 4706
369 4891
126 1160
223 2863
329 4543
489 6442
409 5495
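The Rating and Limit relationship reported in the correlation matrix and trendline can be illustrated with a short calculation. The sketch below squares the correlation to get the share of variance explained and applies the trendline to the first data row (Rating 681, Limit 9504). The function name `predict_limit` is hypothetical.

```python
# Values copied from the correlation matrix and the scatter plot trendline.
correlation = 0.9968797370017
slope, intercept = 14.8716071166589, -542.928229986894

# For a single predictor, R squared is the square of the correlation coefficient.
r_squared = correlation ** 2

def predict_limit(rating):
    """Predicted credit limit for a given rating, using the trendline."""
    return slope * rating + intercept

# First row of the data: Rating 681, Limit 9504.
observed_rating, observed_limit = 681, 9504
predicted_limit = predict_limit(observed_rating)
residual = observed_limit - predicted_limit

print(f"R^2 = {r_squared:.5f}, predicted = {predicted_limit:.1f}, residual = {residual:.1f}")
```

A residual that is small relative to the limit itself is what a correlation this close to 1 implies.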
mit to people with a higher credit rating
We will do a simple linear regression analysis to see how much X (CARD LIMIT) predicts the Y (BALANCE)
[Scatter plot: card balance (Y) against card limit (X)]
3615 216
1870 0
3683 371
1335 0
3838 480 Inference
8376 1259 The scatter plot equation, Y = 0.1716x - 292.79, clearly supports the regression model and the
4891 1036 coefficients derived from it: the balance increases by about 0.1716 dollars when the limit is
increased by one unit.
2525 0
3714 0 The regression model also gives a P-value smaller than the 0.05 significance level, which
4323 671 makes the card limit a strong predictor of the credit card balance.
13414 1809
2937 0
855 0
3388 155
5829 1018
5835 835
5728 581
6420 789
5384 802
2000 0
9156 732
2923 191
2001 0
5013 549
5051 711
3807 320
11966 1405
6819 1350
3821 868
6045 1425
2631 0
5179 411
5183 654
3969 211
3180 29
4307 579
5767 1023
3461 255
5443 912
3155 0
4745 724
5180 516
2120 0
7398 749
5343 829
8244 1329
906 0
5182 218
11200 1677
4970 414
2586 0
3402 182
855 0
2955 204
7140 583
4471 611
7659 1155
4569 902
1160 5
9504 964
2860 89
5198 631
3254 145
2271 0
3211 199
1774 0
6662 984
5640 905
5524 966
9560 1687
5765 1246
3690 728
7075 580
7114 872
6626 1048
2101 0
3189 0
7416 669
4673 642
3000 0
2021 0
4697 108
2179 0
2762 52
4640 681
7306 1142
1357 0
2308 0
5382 1380
4706 1255
4543 1054
5308 204
4378 368
6384 891
2646 0
4763 385
6257 976
1499 0
1786 0
4742 379
9704 1388
7402 1103
4923 663
12066 1779
3271 47
6386 768
1402 0
1362 0
5550 653
2820 0
3965 384
4090 0
5673 1075
3557 163
3351 148
2430 0
5833 942
1626 0
5274 863
7506 905
1924 0
3874 0
7685 843
5145 503
3873 413
4943 382
10578 1393
6645 903
5043 1241
5533 1404
1647 195
1233 16
9310 1050
2733 0
2252 0
5466 957
1337 0
3261 297
9113 1341
1568 0
3613 187
886 0
5977 548
3584 423
3933 336
4159 571
4613 573
7555 1058
4788 689
4986 489
5149 443
2493 0
5184 345
3235 159
2532 0
4381 188
5199 633
5380 907
2539 0
9824 1243
2863 415
3300 279
6378 968
1757 0
5611 915
7838 526
3899 531
5294 531
3089 108
1671 0
1433 0
4768 745
2117 0
2161 0
7056 904
3411 0
4865 836
4391 453
7499 1237
5310 456
5222 653
6906 1032
1787 0
8603 1355
6196 1092
4866 436
3476 209
1552 0
3526 172
1485 0
5891 712
5759 345
6827 1129
6555 721
3625 654
2558 419
2433 431
2120 156
3063 269
3782 840
8047 1151
1311 0
2998 133
4523 601
4263 99
4433 503
2906 0
2832 0
8494 1311
3736 298
9540 1050
3326 271
1410 0
4612 0
2168 0
3878 638
2450 0
5309 799
2220 0
1349 0
3085 136
3907 485
3484 265
5228 380
5884 926
5303 606
10748 1192
4716 425
6090 962
6135 992
2150 0
7100 806
5758 734
3293 532
7582 1587
6207 1549
5495 1352
5666 863
6457 762
6481 1093
6375 1120
7569 997
4779 133
4160 602
5619 822
5390 945
6662 391
3449 162
3182 210
2529 0
2531 0
2073 0
5110 805
2420 0
4831 534
4284 429
2327 0
6179 1099
4270 283
11589 1448
4411 126
5352 538
10088 1426
7333 1048
1448 0
3206 95
5289 681
4229 246
5614 482
2910 52
6617 856
6386 1048
4049 465
4952 15
5565 772
6982 250
8100 1298
6396 890
3665 309
1389 0
5140 293
4964 295
7010 885
5429 844
3096 81
5072 732
3673 121
7576 846
4756 594
3409 194
3922 426
3746 410
1561 0
4840 588
5673 1000
4171 138
5521 1020
4391 1216
1551 98
5354 1003
3606 333
4897 331
8117 1407
6922 1081
1829 0
3461 344
5441 607
3480 333
6819 1084
3954 357
6340 815
6040 812
6033 227
6637 1046
8157 454
6754 605
2748 68
1501 0
1705 0
7530 1086
4527 570
2607 0
4351 308
5395 955
7818 701
4776 657
2278 0
3202 132
2561 0
5107 651
8732 529
5000 531
2047 0
3098 0
5096 481
4632 580
3762 70
5227 823
3907 463
4712 136
13913 1999
10230 1361
3976 474
1511 0
1134 0
5140 302
4336 347
6127 578
7871 1264
8028 1208
1567 0
4941 717
2860 0
8029 849
3274 382
4100 560
2525 0
4431 797
905 0
6442 790
MS F Significance F
Regression 62624255.25 1147.764214 2.53058E-119
Residual 54561.9514
t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -10.9727522 1.184152E-24 -345.2485494 -240.3324415 -345.2485494 -240.3324415
Limit (X) 33.87866901 2.53058E-119 0.1616773537 0.181597203 0.1616773537 0.181597203
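The limit regression can be used to predict a balance and to check the reported Standard Error of the regression. The sketch below takes the coefficients from the Limit (X) summary output in this worksheet; the function name `predict_balance` and the example limit of 5000 are hypothetical choices for illustration.

```python
import math

# Values copied from the Limit (X) regression output.
intercept, slope = -292.7904955, 0.1716372784
slope_standard_error = 0.0050662344
ms_residual = 54561.9514

# The regression's Standard Error is the square root of the residual mean square.
standard_error_of_regression = math.sqrt(ms_residual)

# The slope's t statistic is the coefficient divided by its standard error.
t_stat = slope / slope_standard_error

def predict_balance(limit):
    """Predicted card balance for a given card limit."""
    return intercept + slope * limit

example_limit = 5000  # hypothetical input
print(f"s = {standard_error_of_regression:.3f}, t = {t_stat:.3f}, "
      f"predicted balance at {example_limit}: {predict_balance(example_limit):.2f}")
```

The typical prediction error (about 234 dollars) is worth keeping in mind when reading any single predicted balance.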
Let's see how much the credit card rating affects the balance.
We will do a simple linear regression analysis to see how well X (Rating) predicts Y (BALANCE).
We will use the Correlation tool from the Data Analysis tab to see their relation.
[Scatter plot: card balance (Y) against rating (X)]
304 118
656 1140
707 1107
263 216
180 0
287 371
138 0 Inference
296 480 The equation from the scatter plot clearly supports the
610 1259 regression model and the coefficients derived from it: the balance increases by
2.566 dollars when the rating increases by one unit.
369 1036
200 0 The regression model also gives a P-value smaller than the 0.05
286 0 significance level, which makes the rating a strong predictor of the
credit card balance.
326 671
949 1809
223 0
120 0
266 155
427 1018
452 835
435 581
459 789
398 802
169 0
642 732
232 191
189 0
379 549
372 711
301 320
832 1405
491 1350
281 868
459 1425
213 0
398 411
376 654
301 211
224 29
352 579
431 1023
279 255
407 912
235 0
339 724
382 516
181 0
517 749
383 829
579 1329
103 0
402 218
817 1677
352 414
229 0
261 182
119 0
260 204
507 583
344 611
538 1155
354 902
126 5
681 964
235 89
364 631
253 145
182 0
265 199
135 0
508 984
398 905
415 966
701 1687
437 1246
299 728
514 580
512 872
479 1048
171 0
263 0
549 669
341 642
235 0
167 0
344 108
167 0
215 52
332 681
522 1142
126 0
196 0
367 1380
353 1255
329 1054
394 204
339 368
448 891
199 0
351 385
445 976
138 0
154 0
372 379
694 1388
538 1103
355 663
828 1779
250 47
474 768
128 0
143 0
406 653
219 0
292 384
335 0
411 1075
263 163
262 148
191 0
433 942
156 0
387 863
536 905
165 0
298 0
534 843
389 503
292 413
362 382
747 1393
483 903
376 1241
433 1404
149 195
128 16
665 1050
210 0
205 0
413 957
115 0
279 297
626 1341
162 0
278 187
121 0
429 548
294 423
287 336
310 571
344 573
547 1058
360 689
369 489
388 443
186 0
383 345
268 159
205 0
321 188
401 633
420 907
188 0
685 1243
223 415
266 279
458 968
156 0
411 915
563 526
300 531
390 531
254 108
160 0
122 0
365 745
186 0
173 0
481 904
259 0
381 836
316 453
560 1237
392 456
370 653
527 1032
147 0
621 1355
450 1092
347 436
257 209
142 0
289 172
129 0
434 712
440 345
482 1129
472 721
289 654
220 419
190 431
184 156
248 269
293 840
569 1151
138 0
251 133
338 601
317 99
344 503
232 0
249 0
607 1311
256 298
682 1050
268 271
137 0
344 0
206 0
321 638
180 0
397 799
188 0
142 0
217 136
296 485
282 265
377 380
438 926
377 606
754 1192
342 425
442 962
466 992
173 0
503 806
410 734
251 532
531 1587
451 1549
409 1352
413 863
455 762
462 1093
469 1120
564 997
367 133
320 602
418 822
418 945
468 391
288 162
259 210
192 0
195 0
175 0
371 805
192 0
353 534
338 429
178 0
459 1099
299 283
805 1448
326 126
385 538
730 1426
529 1048
145 0
243 95
410 681
337 246
430 482
236 52
460 856
469 1048
296 465
360 15
410 772
518 250
581 1298
485 890
287 309
149 0
370 293
365 295
494 885
396 844
236 81
364 732
297 121
527 846
351 594
270 194
299 426
280 410
155 0
368 588
413 1000
321 138
406 1020
358 1216
134 98
383 1003
283 333
357 331
589 1407
511 1081
162 0
264 344
394 607
281 333
505 1084
318 357
448 815
456 812
449 227
491 1046
599 454
483 605
228 68
150 0
160 0
515 1086
367 570
221 0
323 308
392 955
584 701
378 657
187 0
267 132
215 0
380 651
636 529
353 531
167 0
272 0
380 481
355 580
287 70
386 823
296 463
340 136
982 1999
721 1361
291 474
137 0
112 0
374 302
339 347
433 578
564 1264
599 1208
142 0
366 717
214 0
574 849
282 382
307 560
192 0
320 797
93 0
489 790
Significance F
1.8989E-120
87237
[Scatter plot of card limit (X1) and card balance (Y); trendline f(x) = 0.171637278371482 x − 292.790495455992]
[Scatter plot of rating (X2) and card balance (Y); trendline f(x) = 0.066427766794535 x + 496.437128453948]
Limit (X1) Rating (X2) CARD BALANCE (Y)
3388 681 203
3291 259 148
3327 266 50
5099 394 889
7518 269 1176
3075 200 0
2959 286 0
4828 339 510
1300 448 0
3500 235 8
4116 458 75
4896 156 710
2330 326 0
4442 949 450
3977 411 69
8760 413 1230
4668 563 156
3536 199 133
5869 455 967
5319 462 541
1852 300 0
2672 253 0
5101 351 710
7167 445 767
7760 469 661
2963 564 375
6784 138 1411
4534 154 467
5435 372 937
10384 367 1597
2880 390 0
10673 364 1573
3806 254 188
7667 160 930
4327 223 230
5245 320 637
5137 694 661
4221 380 118
9272 418 1140
10278 538 1107
3615 355 216
1870 418 0
3683 253 371
1335 468 0
3838 288 480
8376 122 1259
4891 828 1036
2525 182 0
3714 543 0
4323 245 671
13414 120 1809
2937 266 0
855 365 0
3388 259 155
5829 250 1018
5835 231 835
5728 474 581
6420 369 789
5384 186 802
2000 173 0
9156 128 732
2923 481 191
2001 117 0
5013 192 549
5051 195 711
3807 259 320
11966 427 1405
6819 452 1350
3821 257 868
6045 314 1425
2631 175 0
5179 387 411
5183 371 654
3969 192 211
3180 435 29
4307 353 579
5767 143 1023
3461 338 255
5443 406 912
3155 381 0
4745 203 724
5180 178 516
2120 219 0
7398 459 749
5343 299 829
8244 292 1329
906 316 0
5182 560 218
11200 459 1677
4970 335 414
2586 805 0
3402 316 182
855 326 0
2955 385 204
7140 730 583
4471 398 611
7659 304 1155
4569 169 902
1160 529 5
9504 145 964
2860 392 89
5198 642 631
3254 243 145
2271 410 0
3211 337 199
1774 370 0
6662 633 984
5640 411 905
5524 527 966
9560 430 1687
5765 341 1246
3690 232 728
7075 236 580
7114 263 872
6626 262 1048
2101 460 0
3189 147 0
7416 189 669
4673 265 642
3000 191 0
2021 621 0
4697 469 108
2179 135 0
2762 450 52
4640 296 681
7306 270 1142
1357 379 0
2308 360 0
5382 433 1380
4706 410 1255
4543 347 1054
5308 439 204
4378 257 368
6384 518 891
2646 377 0
4763 183 385
6257 581 976
1499 485 0
1786 156 0
4742 142 379
9704 387 1388
7402 287 1103
4923 149 663
12066 370 1779
3271 204 47
6386 372 768
1402 289 0
1362 365 0
5550 536 653
2820 165 0
3965 298 384
4090 494 0
5673 396 1075
3557 534 163
3351 129 148
2430 236 0
5833 364 942
1626 508 0
5274 297 863
7506 527 905
1924 351 0
3874 270 0
7685 301 843
5145 299 503
3873 280 413
4943 389 382
10578 155 1393
6645 292 903
5043 832 1241
5533 434 1404
1647 362 195
1233 382 16
9310 440 1050
2733 368 0
2252 413 0
5466 515 957
1337 538 0
3261 398 297
9113 482 1341
1568 747 0
3613 472 187
886 321 0
5977 415 548
3584 483 423
3933 491 336
4159 289 571
4613 220 573
7555 376 1058
4788 241 689
4986 190 489
5149 433 443
2493 281 0
5184 459 345
3235 184 159
2532 406 0
4381 701 188
5199 499 633
5380 358 907
2539 149 0
9824 437 1243
2863 128 415
3300 134 279
6378 665 968
1757 299 0
5611 248 915
7838 293 526
3899 383 531
5294 283 531
3089 514 108
1671 357 0
1433 569 0
4768 512 745
2117 589 0
2161 138 0
7056 511 904
3411 479 0
4865 213 836
4391 398 453
7499 333 1237
5310 210 456
5222 162 653
6906 264 1032
1787 205 0
8603 376 1355
6196 301 1092
4866 394 436
3476 413 209
1552 281 0
3526 251 172
1485 505 0
5891 318 712
5759 338 345
6827 224 1129
6555 171 721
3625 317 654
2558 344 419
2433 232 431
2120 448 156
3063 352 269
3782 431 840
8047 456 1151
1311 249 0
2998 388 133
4523 607 601
4263 256 99
4433 682 503
2906 115 0
2832 263 0
8494 449 1311
3736 279 298
9540 491 1050
3326 268 271
1410 626 0
4612 137 0
2168 599 0
3878 279 638
2450 162 0
5309 407 799
2220 278 0
1349 728 0
3085 483 136
3907 549 485
3484 228 265
5228 341 380
5884 150 926
5303 121 606
10748 344 1192
4716 235 425
6090 235 962
6135 160 992
2150 515 0
7100 429 806
5758 367 734
3293 214 532
7582 167 1587
6207 344 1549
5495 339 1352
5666 750 863
6457 206 762
6481 221 1093
6375 294 1120
7569 382 997
4779 309 133
4160 167 602
5619 554 822
5390 287 945
6662 181 391
3449 517 162
3182 310 210
2529 383 0
2531 321 0
2073 180 0
5110 320 805
2420 397 0
4831 323 534
4284 383 429
2327 215 0
6179 392 1099
4270 344 283
11589 584 1448
4411 547 126
5352 387 538
10088 378 1426
7333 360 1048
1448 187 0
3206 579 95
5289 369 681
4229 388 246
5614 103 482
2910 188 52
6617 267 856
6386 402 1048
4049 304 465
4952 186 15
5565 215 772
6982 383 250
8100 380 1298
6396 142 890
3665 217 309
1389 636 0
5140 353 293
4964 167 295
7010 272 885
5429 296 844
3096 268 81
5072 380 732
3673 817 121
7576 205 846
4756 321 594
3409 355 194
3922 352 426
3746 287 410
1561 332 0
4840 386 588
5673 656 1000
4171 296 138
5521 522 1020
4391 340 1216
1551 229 98
5354 282 1003
3606 982 333
4897 721 331
8117 291 1407
6922 377 1081
1829 261 0
3461 438 344
5441 119 607
3480 377 333
6819 707 1084
3954 260 357
6340 401 815
6040 137 812
6033 420 227
6637 754 1046
8157 112 454
6754 374 605
2748 507 68
1501 342 0
1705 442 0
7530 188 1086
4527 339 570
2607 344 0
4351 433 308
5395 685 955
7818 564 701
4776 263 657
2278 599 0
3202 466 132
2561 173 0
5107 142 651
8732 366 529
5000 214 531
2047 574 0
3098 282 0
5096 180 481
4632 287 580
3762 126 70
5227 503 823
3907 196 463
4712 138 136
13913 410 1999
10230 307 1361
3976 296 474
1511 192 0
1134 538 0
5140 320 302
4336 354 347
6127 251 578
7871 367 1264
8028 531 1208
1567 610 0
4941 451 717
2860 93 0
8029 353 849
3274 369 382
4100 126 560
2525 223 0
4431 329 797
905 489 0
6442 409 790
e on credit cards on the
SUMMARY OUTPUT
Card balance (Y) regressed on Limit (X)
Regression Statistics
Multiple R 0.861697267
R Square 0.74252218
Adjusted R Square 0.7418752508
Standard Error 233.58499824
Observations 400
ANOVA
df SS MS F Significance F
Regression 1 62624255.2509 62624255 1147.764 2.53058E-119
Residual 398 21715656.6591 54561.95
Total 399 84339911.91
Coefficients Standard Error t Stat
Intercept -292.7904955 26.6834145155 -10.97275
Limit (X) 0.1716372784 0.0050662344 33.87867
SUMMARY OUTPUT
Card balance (Y) regressed on Rating (X2)
Regression Statistics
Multiple R 0.0223551513
R Square 0.0004997528
Adjusted R Square -0.002011554
Standard Error 460.22106007
Observations 400
ANOVA
df SS MS F Significance F
Regression 1 42149.1062229 42149.11 0.199001 0.65577096
Residual 398 84297762.8038 211803.4
Total 399 84339911.91
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 496.43712845 57.6458284031 8.611848 1.699E-16 383.1087546 609.76550231 383.1087546 609.76550231
Rating (X2) 0.0664277668 0.14890934588 0.446095 0.655771 -0.226319419 0.3591749523 -0.226319419 0.3591749523
Let's see how the number of cards a particular person holds and the limit of the card affect the balance.
We will run a multiple regression model to see what effect both have together
on the balance, and what effect each has when the other variable is held constant.
Balance Y
Multiple regression with two factors, meaning 2 X variables affecting the Y variable, which is balance
MS F Significance F
Regression 31566353.684 590.92382442 9.758453E-120
Residual 53418.651237
Let's run a multiple linear regression model with balance as the Y variable and a set of columns as X variables:
income, age, education, limit and rating.
Correlation matrix
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.936702578167
R Square 0.877411719944
Adjusted R Square 0.875856031111
Standard Error 161.9917646986
Observations 400
ANOVA
df SS MS F Significance F
Regression 5 74000827.16891 14800165.43378 564.0020685522 4.5908276E-177
Residual 394 10339084.74109 26241.33183018
Total 399 84339911.91
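The Adjusted R Square and F values in this multiple regression output can be recomputed from the other reported figures. The sketch below applies the standard adjustment for the number of predictors (5 here, taken from the Regression df) and forms F as the ratio of the two mean squares; variable names are hypothetical.

```python
# Values copied from the multiple regression summary output.
r_squared = 0.877411719944
n_observations = 400
n_predictors = 5            # Regression df
ms_regression = 14800165.43378
ms_residual = 26241.33183018

# Adjusted R squared penalizes R squared for each additional predictor.
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / (
    n_observations - n_predictors - 1
)

# The overall F statistic is the regression mean square over the residual mean square.
f_stat = ms_regression / ms_residual

print(f"adjusted R^2 = {adjusted_r_squared:.6f}, F = {f_stat:.4f}")
```

Because the adjustment is small here, the five predictors together explain close to 88% of the variation in balance even after accounting for model size.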