1 - Assignment 01 570
Student declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I understand that
making a false declaration is a form of malpractice.
Student’s signature
When data is presented in a way that has meaning for the audience, information is
created. Data must be processed and structured in order to become information.
Information design is a critical area in both information architecture and human-computer
interaction. It involves presenting data in a meaningful and valuable way (TechTarget
Contributor, 2021).
2. Information
Information consists of inputs that are meaningful to the person receiving them in a
given context. Data is the common term for information that is kept in a computer; once
processed, the resulting data can be regarded as information (TechTarget Contributor,
2021).
3. Knowledge
Knowledge is the understanding of the rules required to interpret information. These
rules may be either subjective or objective, so the same dataset and information may yield
two different pieces of knowledge for two decision makers (Anderson et al., 2019).
4. Transforming Data
a) Into Information
Transforming data into factual information takes five stages: data
collection, organization, processing, aggregation, and reporting. The end result
can be unpredictable if several environmental factors are unfavorable
(Rivery, 2022).
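The five stages can be sketched in code. This is only a minimal illustration, assuming a hypothetical list of employee counts as the collected data. The 0-10 and 11-100 bands come from this report's own example; the two larger bands, the sample values, and all function names are assumptions made for the sketch.

```python
from collections import Counter

# Collection: hypothetical raw employee counts, one per company (illustrative only).
raw_employee_counts = [3, 8, 45, 12, 7, 250, 60, 9, 110, 5]

# Size bands as (label, lowest count, highest count).
# The first two bands appear in this report; the last two are assumed.
SIZE_BANDS = [
    ("0-10 employees", 0, 10),
    ("11-100 employees", 11, 100),
    ("101-200 employees", 101, 200),    # assumed band
    ("over 200 employees", 201, None),  # assumed band, no upper limit
]


def classify(count):
    """Return the size-band label for one employee count."""
    for label, low, high in SIZE_BANDS:
        if count >= low and (high is None or count <= high):
            return label
    raise ValueError(f"employee count out of range: {count}")


def summarize(counts):
    """Organize, process, and aggregate raw counts into companies per band."""
    organized = sorted(counts)                        # organization
    labels = [classify(c) for c in organized]         # processing
    return Counter(labels)                            # aggregation


def report(summary):
    """Reporting: one readable line per size band."""
    return [f"{label}: {summary.get(label, 0)} companies"
            for label, _, _ in SIZE_BANDS]


summary = summarize(raw_employee_counts)
for line in report(summary):
    print(line)
```

The raw numbers carry no meaning on their own; the report lines do, because each count is attached to a category.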
b) Into Knowledge
Turning data into knowledge involves five steps, which also show the
function and significance of semantic data. First, data ingestion: gather and
verify data for processing. Second, data curation: assess, compile, and combine
data to produce information. Third, data discovery: gain insights by using the
curated data and information. Fourth, data insights: integrate data and find
patterns to improve comprehension. Finally, sharing insights: spread knowledge
by sharing information and insights (Marklogic, 2020).
Data: 60; 192; 37; 49; 1 (data, on their own, do not have meaning)
Information:
60: The number of small companies classified as having 0-10 employees. (Information: data (60) +
the classification of the companies + the number of companies)
192: The number of small companies classified as having 11-100 employees. (Information: data (192) +
the classification of the companies + the number of companies)
37: The number of small companies classified as having 11-100 employees. (Information: data (37) +
the classification of the companies + the number of companies)
49: The number of small companies classified as having 11-100 employees. (Information: data (49) +
the classification of the companies + the number of companies)
Knowledge: A company is a legal entity that represents an association of people. The members of the
company share a common purpose and unite to achieve specific stated goals. Readers can use the
numbers to judge the size of a company: micro, small, medium, or large.
The orange area of the pie chart accounts for 15%, which means that Long
Chau contributed 15% of revenue in 2021. The remaining 85% is the revenue
contribution of FPT Shop in 2021.
B. Exploratory
1. Theory
The analysis, summary, and presentation of results relating to a data set
produced from a sample or the complete population are referred to as "descriptive
statistics". The three primary categories of descriptive statistics are measures of
variability, measures of central tendency, and frequency distribution. Descriptive
statistics facilitate data visualization: they allow data to be presented in a
meaningful and understandable way, which in turn makes the data set easier to
interpret (CFI Team, 2020).
2. Application
Figure 4: The corresponding scatter plot gives a positive correlation between car weight and height (MBA
Skool Team, 2013).
The points in the chart above slope upward from left to right: as the value of one
variable increases, the value of the other increases as well, which is a positive correlation.
Here the X axis represents height and the Y axis represents weight, so the greater the
height, the greater the weight tends to be. This positive relationship holds across the
whole data set.
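The direction visible in a scatter plot can be checked numerically with the Pearson correlation coefficient. The sketch below computes it from first principles; the height and weight values are hypothetical and are not read from Figure 4.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 77, 85]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")  # a value close to +1 means an upward-sloping point cloud
```

A positive r corresponds to points rising from left to right, a negative r to points falling, and a value near zero to no clear linear pattern.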
When done correctly, exploratory research can build a solid groundwork for
any subsequent research on the same topic. Exploratory research that is conducted
effectively aids in choosing research design, sample techniques, and data gathering
methods. This also entails a sense of duty on the part of the researcher to attempt a
thorough examination of the problem and focus on accurate reporting of findings
(Voxco, 2021).
Exploratory research provides a deeper grasp of a subject that has not been
thoroughly investigated and satisfies the researcher by revealing facts and
highlighting new problems. It also helps to refine potential research
questions and to select the most effective strategy for achieving the goal
(Voxco, 2021).
b) Cons
Exploratory research yields unreliable findings and is therefore
inconclusive. Such research focuses on developing and grasping a better
understanding of the problem at hand. Effective decision-making cannot be based
on these research findings (Voxco, 2021).
The risk of the sample responses not being representative of the target
audience is increased by the small sample size utilized for exploratory research.
Smaller samples of people, while valuable for a rapid study, can impede a coherent
understanding, which not only degrades the quality of the current research but also
has a negative effect on future research along similar lines (Voxco, 2021).
When knowledge is acquired through secondary sources, it may be out of
date and unlikely to add substantially to our understanding of a problem
today. Under dynamic market conditions, outdated information is neither
actionable nor helpful in providing any sort of clarification (Voxco, 2021).
C. Confirmatory
1. Theory
Exploratory data analysis (EDA) is the first step in the data analysis process.
At this point there are a number of crucial tasks to complete, but they all come
down to deciding how to interpret the data, formulating the questions to ask and
how to frame them, and devising the most effective ways to present and manipulate the
available data in order to extract the crucial insights (Sisense, 2018).
2. Application
Figure 6: Correlations table in the IBM SPSS Statistics Output Viewer (Bradburn, 2021).
The correlation table above shows the relationship between age and cholesterol
levels. “Pearson Correlation” is the Pearson correlation coefficient, denoted by r.
It ranges from -1 to 1, and values close to 1 indicate a strong positive linear
relationship. The correlation between age and blood cholesterol level gives a
coefficient of r = 0.882. “Sig. (2-tailed)” is the p-value, here 0.001, which is
below the conventional 0.05 significance level, so the correlation is statistically
significant. “N” is the number of data pairs in the analysis, which is 10.
Therefore, it can be concluded that there was a strong positive association
between participants' ages and blood cholesterol levels (r = 0.882, p = 0.001).
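The link between r, N, and the significance value can be made concrete with the standard t statistic for a Pearson correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2), with N - 2 degrees of freedom. The sketch below is only an illustration: the r values are hypothetical, and the critical value 2.306 is the usual t-table entry for a two-tailed test at the 0.05 level with 8 degrees of freedom (N = 10).

```python
from math import sqrt

# Two-tailed critical t value at the 0.05 level for 8 degrees of freedom
# (that is, N = 10 pairs), taken from a standard t table.
T_CRITICAL_05_DF8 = 2.306


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if n < 3:
        raise ValueError("need at least 3 data pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def is_significant_n10(r):
    """True if r is significant at the 0.05 level (two-tailed) when N = 10."""
    return abs(correlation_t_statistic(r, 10)) > T_CRITICAL_05_DF8


# Hypothetical coefficients, each assumed to come from 10 data pairs.
for r in (0.1, 0.5, 0.9):
    t = correlation_t_statistic(r, 10)
    print(f"r = {r}: t = {t:.2f}, significant = {is_significant_n10(r)}")
```

With only 10 pairs, a coefficient has to be large in absolute value before it reaches significance, which is why a very small p-value normally goes together with an r close to 1 or -1 at this sample size.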
b) Cons
Confirmatory analysis can give a misleading appearance of precision under
less-than-ideal conditions (Topguypoland, 2023).
IV. Application
A. Inferential Statistics.
The research question posed in this report concerns the average revenue of the service industry in
three regions: the Red River Delta, the North Central & Central Coast, and the Mekong River Delta.
The report focuses on the business population in Vietnam, specifically 339 enterprises across all
industries in these three regions.
A sampling method is the procedure used to select units from the population for a sample
survey (Admin, 2023). The sampling strategy used here is cluster sampling.
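As a rough sketch of the idea behind cluster sampling: the population is divided into naturally occurring groups (clusters), some clusters are drawn at random, and every unit in the drawn clusters is surveyed. The cluster names, enterprise identifiers, and function name below are hypothetical and do not describe how the 339 enterprises were actually selected.

```python
import random


def cluster_sample(clusters, k, seed=None):
    """Randomly pick k clusters and return every unit inside them.

    clusters: dict mapping a cluster name to the list of units it contains.
    """
    if not 0 < k <= len(clusters):
        raise ValueError("k must be between 1 and the number of clusters")
    rng = random.Random(seed)                  # seeded for reproducibility
    chosen = rng.sample(sorted(clusters), k)   # draw whole clusters, not units
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical clusters (for example, provinces) with hypothetical enterprise IDs.
population = {
    "province_A": ["ent_01", "ent_02", "ent_03"],
    "province_B": ["ent_04", "ent_05"],
    "province_C": ["ent_06", "ent_07", "ent_08", "ent_09"],
    "province_D": ["ent_10"],
}

sample = cluster_sample(population, k=2, seed=1)
for name, units in sample.items():
    print(name, units)
```

The design trades some precision for convenience: surveying whole clusters is cheaper than reaching scattered individual units, but units within one cluster tend to resemble each other.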
Mean: the average value of the data. The mean of the figures in this table is 151.6448326.
Median: the middle value of the data when sorted in ascending order. The median in this table is 26.
Mode: the value that appears most often in the data. The mode in this table is 15.
SD (standard deviation): a measure of how far the values are spread around the mean. The standard
deviation of this table is 810.4411635.
Range: the difference between the largest and the smallest value in the table. The range of this table is 16998.
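The five measures above can be reproduced with Python's standard `statistics` module. The values below are hypothetical and are not the report's data set; the sketch also assumes the sample standard deviation (`statistics.stdev`), since the report does not say whether the sample or population formula was used.

```python
import statistics

# Hypothetical values for illustration only; one large value mimics an outlier.
values = [15, 15, 20, 26, 30, 45, 400]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "sd": statistics.stdev(values),      # sample standard deviation (assumed)
    "range": max(values) - min(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

When the mean lies far above the median, as it does in the table described above, a small number of very large values is pulling the mean upward, and the median is usually the better description of a typical observation.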
For qualitative:
The table above describes the distribution of observations by region (variable a2) across three
regions of Vietnam: the Red River Delta, the North Central and Central Coast, and the Mekong River
Delta. The second column of the table is the frequency for each region, and the third column is that
frequency as a percentage of the total. The Red River Delta has the highest share, 43.77%, with a
frequency of 302, roughly twice that of the Mekong River Delta (148). The North Central and Central
Coast has the second-highest share, 34.78%, which is about 9 percentage points lower than the Red
River Delta and 13.35 percentage points higher than the Mekong River Delta. Finally, the Mekong River
Delta has the lowest frequency, 148, and the lowest share of the three regions, 21.43%.
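A frequency table of this kind can be built directly from a list of category labels. The sketch below uses a small hypothetical list of region labels rather than the report's data; the function name is an assumption.

```python
from collections import Counter


def frequency_table(labels):
    """Return (label, frequency, percent) rows, most frequent category first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, count, round(100 * count / total, 2))
            for label, count in counts.most_common()]


# Hypothetical observations: one region label per surveyed unit.
regions = (["Red River Delta"] * 5
           + ["North Central and Central Coast"] * 3
           + ["Mekong River Delta"] * 2)

for label, count, percent in frequency_table(regions):
    print(f"{label}: {count} ({percent}%)")
```

Comparing shares in percentage points (one percentage minus another) keeps statements such as "about 9 points lower" unambiguous.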
C. Measuring association.
Figure 9: The Correlations between Sales (VND) and Hour Operating in a week
Based on the data table above, collected from 339 companies in the North Central & Central
Coast, Red River Delta, and Mekong River Delta, the correlation coefficient between "f2" and
"d2" shows how changes in hours operating per week relate to changes in sales in VND. A
positive correlation indicates that as hours operating per week increase, sales in VND also
tend to increase. A negative correlation, on the other hand, suggests that as hours operating
per week increase, sales in VND tend to decrease. The correlation coefficient between "f2" and
"d2" is -0.00902334. This value is negative but very close to zero, so it indicates virtually
no linear relationship: in this sample, sales do not rise or fall in any consistent way as
operating hours change.
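When reading a coefficient like this one, the sign gives the direction and the absolute value gives the strength. The helper below is a sketch that assumes common rule-of-thumb cut-offs (0.1, 0.3, 0.5); these thresholds are a convention chosen for illustration, not something defined in this report.

```python
def describe_correlation(r):
    """Return (direction, strength) for a correlation coefficient r.

    The cut-offs 0.1 / 0.3 / 0.5 are assumed rule-of-thumb values.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    size = abs(r)
    if size < 0.1:
        strength = "negligible"
    elif size < 0.3:
        strength = "weak"
    elif size < 0.5:
        strength = "moderate"
    else:
        strength = "strong"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength


# The coefficient reported for "f2" and "d2".
print(describe_correlation(-0.00902334))  # ('negative', 'negligible')
```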
V. Conclusion
In December 1999, when SSI Securities Joint Stock Corporation (SSI - HOSE) was established, it
became the first private company in the market, and the smallest, to be given a securities trading
license. With more than 20 years of experience in the Vietnamese financial industry, the firm has
developed into a well-known financial institution with the highest rate of expansion, having increased
its charter capital more than 1,000-fold. Because of its strong financial capacity, sound corporate
governance standards, and skilled personnel, SSI offers a wide range of financial products and services
to its clients while maximizing value for shareholders. The author is an analyst with SSI and prepares
business reports on several Vietnamese companies and industries by applying statistical methods, for
example the evaluation and analysis of business data (financial data, stock market), microeconomics,
current macroeconomic issues, and upcoming trends that are relevant to the research topic. In this
report the author assesses business and economic data and information from published sources, analyzes
and evaluates raw business data, applies statistics to business planning, and communicates conclusions
with the relevant charts and tables. The report also uses clustering. Its first section evaluates the
nature and methodology of business and economic data and information from a variety of published
sources. The evaluation of data from numerous sources using various analytical techniques comes next,
and then graphics are applied to the data file provided to the author for interpretation.
VI. References
Admin, V. (2023) Chi tiết bài học Phương pháp lấy mẫu [Lesson details: Sampling methods], Vimentor, [online] Available at:
https://vimentor.com/vi/lesson/phuong-phap-lay-mau (Accessed May 28, 2023).
Admin (2022) Advantages and Disadvantages of Descriptive Statistics, All Things Statistics, [online] Available at:
https://allthingsstatistics.com/descriptive-statistics/descriptive-statistics-advantages-disadvantages/ (Accessed May 24, 2023).
Anderson, D. R., Sweeney, D. J., Williams, T. A., Camm, J. D. and Cochran, J. J. (2019) Essentials of
Statistics for Business and Economics, 14th ed, Cengage Learning.
Bradburn, S. (2017) How To Perform A Pearson Correlation In SPSS, Top Tip Bio, [online] Available at:
https://toptipbio.com/pearson-correlation-spss/ (Accessed May 25, 2023).
CFI Team (2020) Descriptive Statistics, Corporate Finance Institute, 18th November, [online] Available at:
https://corporatefinanceinstitute.com/resources/data-science/descriptive-statistics/ (Accessed May 24, 2023).
Chiêm, K. (2021) FPT Retail lãi trước thuế 76 tỷ đồng nửa đầu năm, hoàn thành 63% kế hoạch [FPT Retail posts pre-tax
profit of 76 billion dong in the first half of the year, completing 63% of its plan], Ndh.vn, 26th July, [online] Available at:
https://ndh.vn/doanh-nghiep/fpt-retail-lai-truoc-thue-76-ty-dong-nua-dau-nam-hoan-thanh-63-ke-hoach-1296123.html (Accessed May 25, 2023).
Hayes, A. (2023) Descriptive Statistics: Definition, Overview, Types, Example, Investopedia, 21st March, [online] Available at:
https://www.investopedia.com/terms/d/descriptive_statistics.asp (Accessed May 24, 2023).
Laerd Statistics (2023) Pearson’s Product-Moment Correlation in SPSS Statistics: Procedure, assumptions, and output
using a relevant example, [online] Available at:
https://statistics.laerd.com/spss-tutorials/pearsons-product-moment-correlation-using-spss-statistics.php (Accessed May 25, 2023).
MBA Skool Team (2013) Scatter Plot - Meaning & Definition, MBA Skool, 4th May, [online] Available at:
https://www.mbaskool.com/business-concepts/statistics/6823-scatter-plot.html (Accessed May 25, 2023).
Sisense (2017) Exploratory and Confirmatory Analysis: What’s the Difference?, Sisense, [online] Available at:
https://www.sisense.com/blog/exploratory-confirmatory-analysis-whats-difference/ (Accessed May 24, 2023).
TechTarget Contributor (2021) information, TechTarget, 18th May, [online] Available at:
https://www.techtarget.com/searchdatamanagement/definition/information (Accessed May 22, 2023).
Topguypoland (2023) What are some differences between confirmatory analysis and exploratory analysis?, Cross Validated,
[online] Available at:
https://stats.stackexchange.com/questions/89401/what-are-some-differences-between-confirmatory-analysis-and-exploratory-analysis.
Voxco (2021) Exploratory Research: Pros And Cons, Voxco, [online] Available at:
https://www.voxco.com/blog/exploratory-research-pros-and-cons/ (Accessed May 24, 2023).
Zach (2020) Descriptive vs. Inferential Statistics: What’s the Difference?, Statology, [online] Available at:
https://www.statology.org/descriptive-inferential-statistics/ (Accessed May 28, 2023).
Marklogic (2020) Transforming Data into Knowledge with Semantics, MarkLogic, [online] Available at:
https://www.marklogic.com/blog/transforming-data-into-knowledge-with-semantics/ (Accessed July 8, 2023).
Rivery (2022) What Is Data Transformation: A Guide to Basics, Rivery, [online] Available at:
https://rivery.io/data-management-glossary/data-transformation-guide/ (Accessed July 8, 2023).