Sri Ram Week 2 Assignment
Chapter 3
Discussion Question
Question 1
Data is at the center of all analytics because it supplies the critical information that
analysts need for their decision-making processes. Before any analytics can be performed,
data is required to establish a foundation from which patterns, trends, and correlations can
be identified, all of which are fundamental to sound judgment in business (Klee et al., 2021).
Analytical data capabilities allow companies to convert otherwise meaningless data into
business intelligence in order to make critical business decisions and optimise operational
effectiveness. Thus, the part played by data in analytics is irreplaceable, since it is the
data that helps to disclose informative
Analytics without data would be impossible, because data is the foundation on which every
analytical process stands. The analysis of data involves the application of systematic
computational processes that uncover patterns and insights (Klee et al., 2021). Without
data, there is nothing to analyze. The synergy between data and analytics is what allows
organizations to harness the power of information and turn it into competitive advantage,
making a case for
Question 2
In a broad view of business analytics, the inputs to the analytics cycle are data,
technology, and analytical expertise. Data is the basic input, the raw material that is
further processed for analysis (Klee et al., 2021). Technologies, meaning analysis and
number-processing tools, make it possible to convert the initial raw data into
decision-making information. Quantitative thinking, with a close knowledge of statistics,
utilization of systems, and data analysis is
detection and action. Insights, which in this case are the profitable and strategically
important patterns and trends derived from the data, help in decision-making (Klee et al.,
2021). This course of actions, which are driven by information collected from analytical
Question 3
An organization can draw on both internal and external information resources for its
business analytics needs. Internal sources include T-Channel, XYZ reports, and CRM systems,
which provide structured data on customer relationships, inventory, and sales (Klee, Van
Guler, and Durbin, 2021). External sources are principally the data compiled by research
firms, consumer ratings found on social media or on public review sites, and
semi-structured and unstructured data collected by agencies and institutions, including but
not limited to market reports and economic indexes.
Data can be divided into different categories, such as business
structures are usually ordered in a form that simplifies the search process because they are
precisely defined, and the majority of them depend on relational databases (Klee & Moon,
2021). Unstructured data (for instance, text, images, and videos) is information with no
predictable format, and it requires more complicated processing methods to acquire and
store. JSON or XML documents, on the other hand, are known as semi-structured data: they
have some features of structured data and some features of unstructured data. With that, a
huge global business analytics system comes to the market so that people can base their
decisions on
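
The difference between the three data types can be shown with a small Python sketch. The records, field names, and the flatten_review helper below are hypothetical and are not taken from Klee et al.; they only illustrate why semi-structured data needs an extra step before it fits a table, and why free text needs separate processing.

```python
import json

# Structured: every record has the same fixed fields, like a relational table row.
structured_rows = [
    {"customer_id": 1, "region": "West", "sales": 1200.0},
    {"customer_id": 2, "region": "East", "sales": 850.5},
]

# Semi-structured: JSON carries field names, but records may nest or omit fields.
semi_structured = json.loads("""
[
  {"customer_id": 1, "review": {"stars": 4, "tags": ["fast", "friendly"]}},
  {"customer_id": 2, "review": {"stars": 2}}
]
""")

# Unstructured: free text with no field names at all.
unstructured = "Delivery was late, but support solved the problem quickly."


def flatten_review(record):
    """Map one semi-structured record onto a fixed, table-like schema (hypothetical helper)."""
    review = record.get("review", {})
    return {
        "customer_id": record["customer_id"],
        "stars": review.get("stars"),
        "tags": ",".join(review.get("tags", [])),  # a missing list becomes ""
    }


flat_reviews = [flatten_review(r) for r in semi_structured]
word_count = len(unstructured.split())  # text needs its own processing step
```

After flattening, each review has the same three fields and could be stored next to the structured rows, while the free-text comment still has to be analyzed with text-specific methods.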
Question 4
For data to be analytics-ready, the key metrics that companies should prioritize include
accuracy, completeness, consistency, timeliness, and relevance. Accuracy describes how
correctly the data reflects the real-world situation it represents (Klee et al., 2021).
Completeness, meaning that all required data is present and no values are missing, is a
priority for good-quality data. Consistency means that the data is uniform across the
different datasets and systems being checked, which prevents the discrepancies that may
result in incorrect analysis.
Timeliness keeps the data up to date so that it can be used for current analysis and
decision-making (Klee et al., 2021). Relevance emphasizes that the data is suited to the
analysis or business question being studied, which in turn helps to produce relevant
findings. Combined, these metrics yield analytics-ready data that is high-quality and
factual enough to support precise and informative analytics, which are useful to
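
Three of these metrics can be measured directly on a dataset, as the following minimal Python sketch suggests. The records, the allowed country codes, the analysis date, and the one-year freshness threshold are all assumptions made for illustration. Accuracy and relevance are left out because they require a trusted reference source and a specific business question, respectively.

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "country": "US", "email": "a@example.com", "updated": date(2024, 4, 1)},
    {"id": 2, "country": "usa", "email": None, "updated": date(2021, 3, 20)},
    {"id": 3, "country": "US", "email": "c@example.com", "updated": date(2022, 1, 5)},
    {"id": 4, "country": "CA", "email": "d@example.com", "updated": date(2024, 4, 10)},
]

ALLOWED_COUNTRIES = {"US", "CA"}  # assumed coding standard used to check consistency
AS_OF = date(2024, 4, 19)         # assumed analysis date
MAX_AGE_DAYS = 365                # assumed freshness threshold


def share(flags):
    """Fraction of True values in a list of booleans."""
    return sum(flags) / len(flags)


# Completeness: records with no missing values.
completeness = share([all(v is not None for v in r.values()) for r in records])
# Consistency: records whose country code follows the agreed standard.
consistency = share([r["country"] in ALLOWED_COUNTRIES for r in records])
# Timeliness: records updated within the freshness threshold.
timeliness = share([(AS_OF - r["updated"]).days <= MAX_AGE_DAYS for r in records])
```

Scores like these could be tracked over time so that a drop in any one metric is noticed before the data is used for analysis.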
Exercise
Question 12
The dataset found on Data.gov under the title "Electric Vehicle Population Data" offers
complete information about the number of Battery Electric Vehicles (BEVs) and Plug-In Hybrid
Electric Vehicles (PHEVs) registered with the Washington State Department of Licensing
(DOL). Created on April 19, 2024, the dataset contains items like the make, model, and type
of each electric vehicle (Electric Vehicle Population Data, 2024).
This dataset underlies a chart that shows the ten cities in Washington State with the most
electric vehicles. By counting electric vehicles per city, the chart gives a profile of how
the manufacturers' electric cars are distributed across these cities. Cities ranked in the
top ten included Seattle, Olympia, Lacey, Tenino, and Yakima, along with several others not
listed here. The selected cities are those where most electric vehicle registrations occur,
which shows an apparent concentration of electric vehicle adoption among the population in
these cities.
this data set. The graph showed each of the cities as a bar broken down by the brand of
electric vehicle. At first, the chart showed every maker of electric cars, so it was crowded
with data. For better clarity, the chart was limited to the top ten electric car brands.
This was the preferable version. It was less cluttered, and so it was easier to point out
the distribution trends of the most sought-after
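
The counting behind such a chart can be sketched in a few lines of Python. The sample rows below are made up, and the column names City, Make, Model, and Electric Vehicle Type are assumptions about the layout of the CSV export. The real file and the charting tool used for this exercise are not reproduced here.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample in the assumed layout of the CSV export.
SAMPLE_CSV = """City,Make,Model,Electric Vehicle Type
Seattle,TESLA,MODEL Y,Battery Electric Vehicle (BEV)
Seattle,NISSAN,LEAF,Battery Electric Vehicle (BEV)
Seattle,TESLA,MODEL 3,Battery Electric Vehicle (BEV)
Olympia,TESLA,MODEL 3,Battery Electric Vehicle (BEV)
Olympia,CHEVROLET,VOLT,Plug-in Hybrid Electric Vehicle (PHEV)
Lacey,NISSAN,LEAF,Battery Electric Vehicle (BEV)
"""


def top_values(rows, column, n=10):
    """Return the n most frequent values in one column as (value, count) pairs."""
    return Counter(row[column] for row in rows if row[column]).most_common(n)


def make_counts_by_city(rows, cities, makes):
    """Count vehicles per (city, make), keeping only the selected cities and makes."""
    counts = Counter()
    for row in rows:
        if row["City"] in cities and row["Make"] in makes:
            counts[(row["City"], row["Make"])] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
top_cities = top_values(rows, "City", n=2)  # n=10 on the full dataset
top_makes = top_values(rows, "Make", n=2)   # limiting makes keeps the chart readable
chart_data = make_counts_by_city(
    rows, {city for city, _ in top_cities}, {make for make, _ in top_makes}
)
```

Restricting both the cities and the makes before counting is what keeps the chart from being crowded: every bar segment in chart_data belongs to a top city and a top brand.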
Besides this dataset, there is also literature on service region design that helps to
explain the observed trends. It shows that the era of urban electric vehicle sharing
systems depends above all on efficient service region design and on users' acceptance (He
et al., 2017; 651). Study “Improving Accessibility and
published in the Manufacturing & Service Operations Management journal and the focus is
strategic planning for expanding electric vehicle services in urban areas. This observation
is consistent with the fact that advanced cities are the most significant locations for
pure electric vehicles in the world, and it indicates that urban infrastructure and planning
are the
Chapter 4
Discussion Question
Question 1
Data mining is the process of discovering meaningful information that is hidden in large
datasets by using statistics, machine learning, and computational techniques. It takes the
huge volumes of information stored in databases, data warehouses, or other data
repositories and extracts the most important knowledge from them. It is a targeted process
that turns unprocessed data into valuable insights that inform decision-making and
strategic planning (Marconi et al., 2019). Data mining processes reveal hidden patterns,
which in turn help organizations to predict future trends, improve internal processes, and
make strategy development more effective.
Data mining sits at the junction of numerous fields, which makes it hard to settle on a
single name for it. These fields are statistics, computer science, and artificial
intelligence. Each field pays special attention to particular aspects of data mining, so that
(2019)). Furthermore, as data mining methods and applications get updated through the
the new methods. The diversity of the terms shows the richness of its applications, its
great promise, and the constant development of data mining as a field. In addition, the
various terms in use indicate the interdisciplinary nature of data mining as well as the
flexibility of this field of study in tackling the unique problems of different contexts
and industries.
Question 2
The recent rise of data mining as an integral part of business activities can be explained
by a number of crucial factors. First, big data has brought a rapid expansion of the data
created by enterprises, social media, and Internet of Things (IoT) devices, which overwhelms
conventional analytical tools. Data mining techniques allow companies to extract meaningful
information from these vast datasets. This knowledge helps them to make sound decisions and
gain an edge over rivals. In addition, thanks to growing computational power and better
machine learning algorithms, data mining has become more advanced and efficient, and
therefore more widely used.
Furthermore, the growing demand for data-driven decision-making in many fields has added to
the popularity of data mining. Organizations aim to reduce redundancy in their processes,
improve customer satisfaction, and find hidden opportunities with data analytics (Marconi
et al., 2019). Predictive and prescriptive capabilities forecast trends, detect abnormal
behavior, and identify ideal solutions to improve strategic decision-making. In addition,
rules and controls raise the standard required of data analysis, which makes data mining a
necessity for the operation of a modern business.
Question 3
for what its particular needs and objectives are. Identifying the purpose, like realizing clients’
instruments (Marconi et al., 2019). Beyond that, the organization must also consider the
scalability of the software, which lets it handle large amounts of data, and its ability to
work alongside other systems.
Another vital factor is usability, meaning how simple the software is to use and how
difficult it is to learn. For instance, organizations need to evaluate from time to time
whether their team has sufficient skills or whether some training is needed (Marconi et
al., 2019). Moreover, the support and documentation offered with the software and good
customer service from the vendor can positively affect both the implementation of data
mining projects and their long-term success. One should also weigh the costs associated
with its use (i.e., licensing fees and other hidden costs) against the return on investment.
Question 4
Data mining, unlike other analytical instruments, can automatically uncover the hidden
patterns and relationships that exist within large datasets. Unlike conventional
statistical analysis, which primarily tests predefined hypotheses, data mining puts
algorithms to work on the data in order to come up with hypotheses (Marconi et al., 2019).
This exploratory capacity makes data mining a remarkable instrument for discovering
patterns and groupings that are invisible to conventional research. Data mining also
incorporates disciplines like machine learning and artificial intelligence and includes
future tendencies and patterns, whilst business intelligence is all about descriptive
analytics, which is mainly used to summarize historical data. This ability to predict the
future leads organizations to take proactive, data-driven decisions, which they cannot do
with traditional analytical techniques that provide mostly retrospective insights. Because
of its multi-dimensional and deep nature, data mining is extremely effective in the pursuit
of very complex patterns and applications.
Question 5
The most prevalent data mining methods may be divided into four categories:
classification, clustering, association rule mining, and regression. Classification assigns
data to predefined categories based on patterns learned from a given training dataset
(Marconi et al., 2019). Clustering groups data points that share common traits without
relying on predefined labels, thus revealing the true structure of the data. Association
rule mining finds relationships among variables in large datasets by identifying frequent
itemsets and correlations. Regression is the technique used to predict continuous outcomes
from variables which are mapped to one another
Principally, these approaches differ in their basis and in the tactics that they involve.
clustering is aimed at segregating data without using any predefined labels (Marconi et al.,
2019). The purpose of association rule mining is to uncover co-appearances and
co-occurrences among variables, including sequential ones, and to support the predictive
model. These differences allow each technique to take part in different types of data
analysis tasks.
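
The four categories can be made concrete with a small Python sketch. The specific algorithms chosen here (nearest-centroid classification, two-group k-means, pair support counting, and least-squares line fitting) are simple stand-ins picked for illustration, and the data is made up. None of this is taken from Marconi et al.

```python
from collections import Counter
from itertools import combinations


def nearest_centroid_classify(train, point):
    """Classification: assign the label whose training mean is closest to the point."""
    sums, counts = Counter(), Counter()
    for value, label in train:
        sums[label] += value
        counts[label] += 1
    return min(counts, key=lambda lab: abs(point - sums[lab] / counts[lab]))


def two_means(values, iterations=10):
    """Clustering: split unlabeled 1-D values into two groups (k-means with k=2)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        a = [v for v in values if abs(v - low) <= abs(v - high)]
        b = [v for v in values if abs(v - low) > abs(v - high)]
        if not b:  # all values identical, so there is nothing to split
            break
        low, high = sum(a) / len(a), sum(b) / len(b)
    return a, b


def pair_support(baskets):
    """Association rule mining: share of baskets that contain each pair of items."""
    pairs = Counter()
    for basket in baskets:
        pairs.update(combinations(sorted(set(basket)), 2))
    return {pair: n / len(baskets) for pair, n in pairs.items()}


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


label = nearest_centroid_classify([(1, "low"), (2, "low"), (9, "high"), (10, "high")], 8)
clusters = two_means([1, 2, 3, 10, 11, 12])
support = pair_support([["bread", "milk"], ["bread", "milk", "eggs"], ["eggs"]])
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

Only the classifier and the regression receive known answers (labels or y values) to learn from. The clustering and association functions work on the raw values alone, which is the main practical difference between the supervised and unsupervised techniques.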
Exercise
Question 1
Data mining and predictive modeling have recently taken great strides, most particularly
with the incorporation of GANs into advanced schemes like Generative Adversarial
Network reveal a growing trend towards factoring in causal patterns rather than relying on
correlation alone, which does not give a full picture of the data when the distribution
shifts between training and testing data. This limitation is tackled by integrating causal
information into the predictive models in order to raise their robustness and fitness for
purpose
One major advance is the GAN-based causal information learning framework introduced by
Zeng et al. (2024). This novel method, based on cause-and-effect relationships in everyday
domains such as body movement and health, is used for sensing non-linear datasets. By
paying attention to certain at-risk variables and removing confounding bias systematically,
such a system improves the accuracy of existing deep learning technology. Extensive tests
performed in relevant large-scale domains are the best argument for this approach in
comparison with existing models
Such developments are not confined to just one sector; rather, they cut across many fields,
like health and mobility, that have high demands for precise predictions. The capability to
manage distribution shifts and causal data through advanced analytics is indicative of a
huge leap in the field of predictive analytics. Beyond their disaster-related air
pollutants, Zeng et al. (2024) also highlight the need for more research in this field to
identify and assess any other possible improvements to this technique. Therefore, the
combination of causal explanations with machine learning models is one of the major
developments expected at this time for increasing the reliability and use of predictive
analysis.
References
Electric Vehicle Population Data. (2024, April 19). Retrieved from
https://catalog.data.gov/dataset/electric-vehicle-population-data
He, L., Mak, H.-Y., Rong, Y., & Shen, Z.-J. M. (2017). Service Region Design for Urban
Klee, S., Janson, A., & Leimeister, J. M. (2021). How Data Analytics Competencies Can
Foster Business Value- A Systematic Review and Way Forward. Information Systems
https://doi.org/10.1177/0011128718787517
Zeng, J., Zhang, G., Yuan, J., Li, Y., & Jin, D. (2024). Empowering predictive modeling by