Data Analysis Assignment
PREPARED BY:
MUHAMMAD SYAHMI BIN KAMARUL AZMAN
2022362079
NIMBSF3A (202321)
PREPARED FOR:
DR. HAZILA BINTI TIMAN
SUBMISSION DATE:
25TH OF JUNE 2023
HONESTY STATEMENT
This is to state that I have completed this assignment on my own. I have not shared (given or received)
ANY help from another person in completing this assignment. I understand that any dishonest
activity discovered by the instructor will result, at a minimum, in a zero grade for this assignment.
Signature: syahmi
ABSTRACT
The primary goal of this research is to analyze census data and extract useful
insights in order to better understand demographic trends and patterns in the
designated areas. The study demonstrates the statistical approaches used in SPSS to
uncover noteworthy findings, identify critical variables, and examine relationships
between various population indicators (W. P. O'Hare, 2019).
Through its findings and conclusions, this investigation contributes to a better
understanding of societal aspects such as population growth, distribution, and
socioeconomic determinants. Using data.gov.my as the data source and SPSS as the
analytical tool, this assignment provides an instructive and extensive investigation
of several census-related datasets.
ACKNOWLEDGEMENT
I would like to express my deepest gratitude and appreciation for your assistance and
guidance throughout the process of completing my individual open data analysis
assignment. Your knowledge and commitment have been instrumental in shaping my
understanding of this topic.
I appreciate the opportunity to learn under your guidance, and I admire your dedication
to teaching and knowledge sharing. Your dedication to nurturing a supportive learning
environment has had a substantial impact on my academic progression.
Again, I'd like to extend my deepest gratitude for your unwavering support and
encouragement. Your assistance has been invaluable in augmenting my abilities and
broadening my horizons in the field of open data analysis. It has been a delight to
participate in your class.
Contents
Introduction
Descriptive Analysis Findings
Discussion
Conclusion
References
Appendices
INTRODUCTION
Countries around the world are beginning to recognize the significance of data
analysis in decision-making, as it has become an essential method for comprehending
complex data. The Malaysian government has taken a step forward by launching the
open data website data.gov.my. This assignment primarily involves analyzing the data
available on this platform. SPSS (Statistical Package for the Social Sciences) is an
easy-to-use programme that will be used to make sense of the data and uncover crucial
insights (Kelley, Karin, 2023).
SPSS is a robust piece of software for statistical analysis that is used in a variety
of fields. It helps researchers examine, summarize, and comprehend data. We will rely
heavily on descriptive statistics for this assignment, which summarize data using
measures such as the mean, frequency, and standard deviation. These figures provide
an overview of the data and allow us to identify patterns and trends (William,
Kate, 2022).
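For reference, the mean and standard deviation mentioned in this section can be written as follows. The symbols are assumptions introduced here for illustration: x_i is the i-th valid observation of a variable and n is the number of valid cases (the "N Valid" row in SPSS output). The standard deviation shown is the sample form with an n - 1 denominator, which is the form SPSS reports.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```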
Using both data.gov.my and SPSS, we can sift through the vast quantity of publicly
available data and uncover intriguing insights. With SPSS's descriptive statistics, we
will be able to summarize the data provided by the Malaysian government, identify the
most significant factors, and generate useful charts and graphs.
This assignment will therefore analyze data from data.gov.my, the open data portal
provided by the Malaysian government, using SPSS. We will concentrate on summary
statistics such as mean, frequency, and standard deviation to gain a deeper
understanding of the data and extract valuable information from it.
DESCRIPTIVE ANALYSIS FINDINGS
KLUSTER: BANCIAN (CLUSTER: CENSUS)
Frequencies
Statistics
TAHUN   DAERAH   KATEGORI   HOSPITAL   BILANGAN KATIL
N Valid 74 74 74 74 74
Missing 0 0 0 0 0
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
DAERAH
Frequency   Percent   Valid Percent   Cumulative Percent
Kuala Pilah 5 6.8 6.8 20.3
KATEGORI
Frequency   Percent   Valid Percent   Cumulative Percent
HOSPITAL
Frequency   Percent   Valid Percent   Cumulative Percent
Hospital Jempol 5 6.8 6.8 31.1
UCSI 1 1.4 1.4 100.0
BILANGAN KATIL
Frequency   Percent   Valid Percent   Cumulative Percent
199 1 1.4 1.4 85.1
Statistics
TAHUN   DAERAH   KATEGORI   HOSPITAL   BILANGAN KATIL
N Valid 74 74 74 74 74
Missing 0 0 0 0 0
Histogram
2. Transportation vehicle licenses in Sabah from 2018 – 2022
Frequencies
Statistics
KELAS BILANGAN
Missing 0 0 0 0 0 0
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
2020 180 13.8 13.8 72.6
NEGERI
Frequency   Percent   Valid Percent   Cumulative Percent
Valid 1 .1 .1 .1
DAERAH
Frequency   Percent   Valid Percent   Cumulative Percent
kudat 5 .4 .4 29.2
kunak 6 .5 .5 32.7
lahad datu 29 2.2 2.2 38.1
pitas 3 .2 .2 55.0
putatan 6 .5 .5 58.1
sipitang 5 .4 .4 72.8
tongod 5 .4 .4 91.0
PEJABAT
Frequency   Percent   Valid Percent   Cumulative Percent
KELAS LESEN
Frequency   Percent   Valid Percent   Cumulative Percent
Kereta Sewa & Pandu 59 4.5 4.5 60.4
pembawa A 3 .2 .2 68.8
pembawa C 1 .1 .1 74.9
Pembawa c 1 .1 .1 75.0
teksi 1 .1 .1 81.2
Histogram
3. Medical personnel in government and private hospitals from 2012 – 2021
Frequencies
Statistics
TAHUN   PERSONEL PERUBATAN   KERAJAAN   SWASTA   JUMLAH
N Valid 40 40 40 40 40
Missing 0 0 0 0 0
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
2014 4 10.0 10.0 30.0
PERSONEL PERUBATAN
Frequency   Percent   Valid Percent   Cumulative Percent
KERAJAAN
Frequency   Percent   Valid Percent   Cumulative Percent
228 1 2.5 2.5 15.0
2350 1 2.5 2.5 87.5
SWASTA
Frequency   Percent   Valid Percent   Cumulative Percent
455 1 2.5 2.5 47.5
JUMLAH
Frequency   Percent   Valid Percent   Cumulative Percent
338 1 2.5 2.5 22.5
3157 1 2.5 2.5 90.0
Histogram
4. Poverty and population statistics by district in Negeri Sembilan
Frequencies
Statistics
Tahun   Daerah   Miskin   MiskinTegar
N Valid 77 77 73 77
Missing 0 0 4 0
Frequency Table
Tahun
Frequency   Percent   Valid Percent   Cumulative Percent
Daerah
Frequency   Percent   Valid Percent   Cumulative Percent
Miskin
144.000 1 1.3 1.4 21.9
241.000 1 1.3 1.4 60.3
1117.000 1 1.3 1.4 98.6
Total 77 100.0
MiskinTegar
Frequency   Percent   Valid Percent   Cumulative Percent
37 1 1.3 1.3 68.8
Histogram
5. Number of District Police Headquarters, Police Stations, and Police Huts by
District in Negeri Sembilan
Frequencies
Statistics
TAHUN   DAERAH   IBU PEJABAT POLIS KONTINJEN NEGERI   IBU PEJABAT POLIS DAERAH   BALAI POLIS   PONDOK POLIS
N Valid 49 49 49 49 49 49
Missing 0 0 0 0 0 0
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
DAERAH
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
BALAI POLIS
Frequency   Percent   Valid Percent   Cumulative Percent
PONDOK POLIS
Frequency   Percent   Valid Percent   Cumulative Percent
Histogram
6. Primary schools in Negeri Sembilan
Frequencies
Statistics
N Valid 49 49 49 49 49 49 49
Missing 0 0 0 0 0 0 0
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
DAERAH
Frequency   Percent   Valid Percent   Cumulative Percent
Jempol 7 14.3 14.3 28.6
SEKOLAH KEBANGSAAN
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
SEKOLAH KEBANGSAAN KHAS
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
4 4 8.2 8.2 38.8
Histogram
7. Percentage of Women as KSU, TKSU, KP, Director and General Manager of
Statutory Bodies (Badan berkanun)
Frequencies
Statistics
Tahun   Jawatan   Lelaki   Perempuan   Peratusan Perempuan
N Valid 30 30 30 30 30
Missing 0 0 0 0 0
Frequency Table
Tahun
Frequency   Percent   Valid Percent   Cumulative Percent
Jawatan
Frequency   Percent   Valid Percent   Cumulative Percent
Ketua Setiausaha Negara 6 20.0 20.0 40.0
Lelaki
Frequency   Percent   Valid Percent   Cumulative Percent
95 1 3.3 3.3 90.0
Perempuan
Frequency   Percent   Valid Percent   Cumulative Percent
Peratusan Perempuan
Frequency   Percent   Valid Percent   Cumulative Percent
10.0 1 3.3 3.3 36.7
Histogram
8. Year 2021-2022 Total Perak State Civil Servants - Under the Perak State
Government Secretary's Office.
Frequencies
Statistics
N Valid 8 8 8 8 8 8 8 8
Missing 0 0 0 0 0 0 0 0
Frequency Table
Tahun
Senarai
Frequency   Percent   Valid Percent   Cumulative Percent
Melayu
Frequency   Percent   Valid Percent   Cumulative Percent
Cina
Frequency   Percent   Valid Percent   Cumulative Percent
India
Frequency   Percent   Valid Percent   Cumulative Percent
Lain lain
Frequency   Percent   Valid Percent   Cumulative Percent
Lelaki
Frequency   Percent   Valid Percent   Cumulative Percent
Perempuan
Frequency   Percent   Valid Percent   Cumulative Percent
Histogram
9. Position of ISM vacancies by race until 1 November 2021 (Note: ISM is Institut
Sosial Malaysia)
Frequencies
Statistics
JUMLAH
TAHUN PERKHIDMATAN MELAYU LELAKI PEREMPUAN CINA LELAKI PEREMPUAN PEREMPUAN DIISI DI ISM INDIA LELAKI LAIN LAIN LELAKI PEREMPUAN
N Valid 4 4 4 4 4 4 4 4 4 4 4
Missing 0 0 0 0 0 0 0 0 0 0 0
Mean 2020.50 15.50 15.75 .00 .00 .50 34.00 .00 .25 2.00
Std. Deviation .577 12.715 2.986 .000 .000 .577 16.186 .000 .500 .816
Frequency Table
TAHUN
Frequency   Percent   Valid Percent   Cumulative Percent
Total 4 100.0 100.0
KUMPULAN PERKHIDMATAN
Frequency   Percent   Valid Percent   Cumulative Percent
MELAYU LELAKI
Frequency   Percent   Valid Percent   Cumulative Percent
MELAYU PEREMPUAN
Frequency   Percent   Valid Percent   Cumulative Percent
JUMLAH JAWATAN YANG DIISI DI ISM
Frequency   Percent   Valid Percent   Cumulative Percent
Frequency   Percent   Valid Percent   Cumulative Percent
INDIA PEREMPUAN
Frequency   Percent   Valid Percent   Cumulative Percent
Histogram
10. Subethnic Indigenous Population for the year 2010.
Frequencies
Statistics
NEGERI   ETNIK   SUBETNIK   BIL PENDUDUK
Missing 0 0 0 1
Mean 1099.981
Frequency Table
NEGERI
Frequency   Percent   Valid Percent   Cumulative Percent
Valid 1 .6 .6 .6
KEDAH 18 11.0 11.0 22.7
ETNIK
Frequency   Percent   Valid Percent   Cumulative Percent
Valid 1 .6 .6 .6
SUBETNIK
Frequency   Percent   Valid Percent   Cumulative Percent
Valid 1 .6 .6 .6
Jakun 9 5.5 5.5 28.2
BIL PENDUDUK
Frequency   Percent   Valid Percent   Cumulative Percent
9.0 1 .6 .6 59.3
12.0 1 .6 .6 61.1
13.0 1 .6 .6 61.7
15.0 1 .6 .6 63.6
18.0 1 .6 .6 65.4
19.0 1 .6 .6 66.0
20.0 1 .6 .6 66.7
21.0 1 .6 .6 67.3
23.0 1 .6 .6 67.9
30.0 1 .6 .6 71.0
32.0 1 .6 .6 71.6
34.0 1 .6 .6 72.2
38.0 1 .6 .6 72.8
40.0 1 .6 .6 73.5
43.0 1 .6 .6 74.1
48.0 1 .6 .6 74.7
62.0 1 .6 .6 75.3
85.0 1 .6 .6 75.9
123.0 1 .6 .6 76.5
132.0 1 .6 .6 77.2
134.0 1 .6 .6 77.8
163.0 1 .6 .6 78.4
186.0 1 .6 .6 79.0
205.0 1 .6 .6 79.6
208.0 1 .6 .6 80.2
219.0 1 .6 .6 80.9
222.0 1 .6 .6 81.5
225.0 1 .6 .6 82.1
229.0 1 .6 .6 82.7
344.0 1 .6 .6 83.3
356.0 1 .6 .6 84.0
369.0 1 .6 .6 84.6
504.0 1 .6 .6 85.2
530.0 1 .6 .6 85.8
610.0 1 .6 .6 86.4
762.0 1 .6 .6 87.0
811.0 1 .6 .6 87.7
895.0 1 .6 .6 88.3
906.0 1 .6 .6 88.9
1427.0 1 .6 .6 89.5
1615.0 1 .6 .6 90.1
1838.0 1 .6 .6 90.7
2464.0 1 .6 .6 91.4
3454.0 1 .6 .6 92.0
3762.0 1 .6 .6 92.6
4421.0 1 .6 .6 93.2
4769.0 1 .6 .6 93.8
5220.0 1 .6 .6 94.4
5328.0 1 .6 .6 95.1
7091.0 1 .6 .6 95.7
7884.0 1 .6 .6 96.3
11908.0 1 .6 .6 96.9
12055.0 1 .6 .6 97.5
18541.0 1 .6 .6 98.1
18876.0 1 .6 .6 98.8
27125.0 1 .6 .6 99.4
31437.0 1 .6 .6 100.0
Missing System 1 .6
Histogram
DISCUSSION
We were able to summarize the dataset with the aid of descriptive statistics such as
mean, frequency, and standard deviation. The mean represented the average value
of our variable. We were able to comprehend the distribution and frequency of values
within our sample by analyzing frequencies. The standard deviation informed us about
the dispersion of values around the mean. (Srivastav, A. K. 2023)
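The three measures described in this paragraph can be reproduced outside SPSS with a few lines of Python. This is only a minimal sketch: the four year values are hypothetical, chosen so that they agree with the TAHUN statistics reported for Data 9 (N = 4, mean 2020.50, standard deviation .577), and the variable names are illustrative.

```python
import statistics
from collections import Counter

# Hypothetical values: four TAHUN entries chosen to be consistent with the
# Data 9 output (N = 4, Mean 2020.50, Std. Deviation .577).
years = [2020, 2020, 2021, 2021]

mean_year = statistics.mean(years)    # arithmetic average of the values
sd_year = statistics.stdev(years)     # sample standard deviation (n - 1 denominator)
frequencies = Counter(years)          # how many times each value occurs

print(f"Mean: {mean_year:.2f}")
print(f"Std. Deviation: {sd_year:.3f}")
for value, count in sorted(frequencies.items()):
    print(value, count)
```

With these assumed values the mean and standard deviation round to 2020.50 and 0.577, the same figures that appear in the SPSS Statistics table for that dataset.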
Our investigation revealed key demographic indicators in the census data. SPSS
assisted us in identifying patterns by summarizing how the values of each variable
are distributed. With the assistance of SPSS graphs and charts, we enhanced our data
visualization and were better able to identify trends.
For our data analysis, the combination of data.gov.my and SPSS proved to be a
successful strategy. We were able to investigate a variety of demographic and
socioeconomic trends due to the abundance of datasets on data.gov.my. With SPSS's
user-friendly interface and extensive statistical analysis tools, we were able to perform
a comprehensive data analysis and obtain valuable insights.
Overall, our analysis of census data from data.gov.my provided us with useful
information for this assignment. Using descriptive statistics and SPSS graphs, we
identified demographic indicators, observed trends, and drew conclusions from them.
This demographic analysis can serve as a basis for future research and decision-making.
We can conclude that all data taken from the website are valid, as SPSS was able to
read them and produce usable graphs and bar charts. Missing values were detected in
only two variables: four in Miskin from Data 4 (Poverty and population statistics by
district in Negeri Sembilan) and one in BIL PENDUDUK from Data 10 (Subethnic
Indigenous Population for the year 2010).
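Missing values explain why the Percent and Valid Percent columns differ in some of the frequency tables. The sketch below shows one way those columns could be computed. It is not the SPSS implementation: the function name `frequency_table` and the sample data are hypothetical, with the case counts chosen to match the Miskin variable (N Valid 73, Missing 4).

```python
def frequency_table(values):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent).

    None marks a missing case, similar to the "Missing System" row in SPSS.
    """
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = {}
    for v in valid:
        counts[v] = counts.get(v, 0) + 1

    rows, running = [], 0
    for value in sorted(counts):
        freq = counts[value]
        running += freq
        rows.append((
            value,
            freq,
            round(100 * freq / total, 1),          # Percent: share of all cases
            round(100 * freq / len(valid), 1),     # Valid Percent: share of non-missing cases
            round(100 * running / len(valid), 1),  # Cumulative Percent: running valid share
        ))
    return rows


# Hypothetical data: 73 distinct valid values plus 4 missing cases, matching
# the case counts reported for Miskin (N Valid 73, Missing 4).
sample = list(range(1, 74)) + [None] * 4
table = frequency_table(sample)
print(table[0])
```

Under these assumptions a value that occurs once has a Percent of 1.3 (1 out of 77) but a Valid Percent of 1.4 (1 out of 73), which is the pattern visible in the Miskin rows.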
CONCLUSION
In a nutshell, we used the Statistical Package for the Social Sciences (SPSS) to
analyze the census statistics made accessible by data.gov.my. This study provided
valuable insights into demographic indices and socioeconomic tendencies. We were
able to obtain a better understanding of the facts at hand and draw meaningful
conclusions thanks to the use of descriptive statistics and graphical representations.
This assignment emphasizes the importance of data analysis in informing decision-
making and lays the framework for future research targeted at better understanding
the characteristics of populations.
Although our findings are insightful, it is critical to consider the limitations and
potential biases of the data and research approaches used. When interpreting the
results, data quality, sample size, and the range of statistical techniques used must all
be taken into account.
REFERENCES
O’Hare, William P. “The Importance of Census Accuracy: Uses of Census Data.”
SpringerLink, 14 Feb. 2019, https://doi.org.
Kibuacha, Frankline. “How to Determine Sample Size for a Research Study - GeoPoll.”
GeoPoll, 7 Apr. 2021, www.geopoll.com/blog/sample-size-research.
APPENDICES
Appendix C: Examples of the data that can be retrieved from this website