Data Analytics

Data Analytics
o The science of analyzing raw data to pull out useful insights so that conclusions can be drawn.

Data Analyst (Math & Stats and Domain knowledge)
o Individuals who collect and interpret raw data and report the conclusions drawn from it.
o Like a person who picks up jigsaw puzzle pieces (raw data) and does the work to produce a clear picture or message out of them.

Business Analytics Applications
o Management of customer relationships
o Financial and marketing activities
o Supply chain management
o Human resource planning
o Pricing decisions
o Sport team game strategies

Importance of Business Analytics
o There is a strong relationship between BA and the profitability, revenue, and shareholder return of businesses.
o Enhances understanding of data
o Vital for businesses to remain competitive
o Enables creation of informative reports

Scope of Business Analytics
Descriptive Analytics – uses data to understand the past and present
Predictive Analytics – analyzes past performance
Prescriptive Analytics – uses optimization techniques

Data – Collected facts and figures
Database – Collection of computer files containing data
Information – Comes from analyzing data
Metrics – Used to quantify performance
Measures – Numerical values of metrics
Discrete Metrics – Involve counting (for example, on time or not on time; the number or proportion that are on time)
Continuous Metrics – Measured on a continuum
Variable – Unit of data collection whose value can vary
Variables – Classified into types according to the level of mathematical scaling

Four Types of Data
1. Categorical (Nominal) Data
   o Categories that cannot be rank ordered
   o Cannot be placed in any order and no judgement can be made
   o No quantitative relationship to one another
   o No mathematical operations can be performed
   o Reflect qualitative differences
   Sub-groups of Qualitative Data
   Dichotomic – If it takes the form of a word with two options (Yes or No)
   Polynomic – If it takes the form of a word with more than two options (Education – Primary School, Secondary School, and College/University)
2. Ordinal Data
   o Can be rank ordered
   o Distance between categories cannot be calculated, but the categories can be ranked above or below each other
   o No fixed units of measurement
   o Can make statistical judgements and perform limited math
3. Interval Data
   o An example of scale data
   o Scale data is in numeric format ($50, $100, $150)
   o Can be measured on a continuous scale
   o The distance between values can be observed and measured, so the data can be placed in rank order
   o Data without a true zero
   o Examples: time; temperature; asking someone to rate their meal on a scale of one to ten
4. Ratio Data
   o Also an example of scale data
   o Measured on a continuous scale and does have a natural zero point
   o Ratios are meaningful
   o Examples: monthly sales, delivery times, weight, age

Methods of Presenting Data
o Textual
o Tabular
o Graphical

Big Data
o Large, hard-to-manage volumes of data – both structured and unstructured – that inundate businesses on a day-to-day basis.
o A term used to describe a collection of data that is huge in size and yet growing exponentially with time.
o A term for data sets that are so large or complex that traditional data processing applications are inadequate for them.

Examples of Big Data
o Transaction processing systems
o Customer databases
o Emails
o Medical records
o Internet clickstream logs
o Mobile apps
o Social networks

Companies using Big Data Analytics
o Amazon
o Apple
o Google
o Spotify
o Facebook
o Instagram
o Starbucks
o Netflix
o American Express
o McDonald’s

Types of Big Data
Structured Data
o The easiest to work with
o Has certain predefined organizational properties and is present in a structured or tabular schema, making it easier to analyze and sort.
o Examples: spreadsheets, Excel files, web logs, medical devices, online forms
Semi-Structured Data
o Data that is not captured or formatted in conventional ways.
o Examples: emails organized by inbox, draft and sent; tweets organized by hashtags; images and videos with tags
Unstructured Data
o Information that either does not have a pre-defined data model or is not organized in a pre-defined manner.
o Typically text-heavy, but may contain data such as dates, numbers, and facts as well.
o Examples: emails, text files, social media, mobile and communications data, media

Characteristics of Big Data
Velocity – the speed at which data is emanating and changes are occurring
Volume – the sheer volume of data being generated every second
Variety – can use structured as well as unstructured data
Veracity – data reliability and trust; verifying and validating data
Value – having access to big data is only useful if we can turn it into value

Types of Big Data Analytics
Descriptive Analytics
o Helps answer questions about what happened
o Summarizes large data sets to describe outcomes to the concerned parties
Diagnostic Analytics
o Helps answer questions about why things happened
o Takes the findings from descriptive analytics and digs deeper to find the cause
Predictive Analytics
o Helps answer questions about what will happen in the future
o Uses historical data to identify trends and determine if they are likely to recur
Prescriptive Analytics
o Helps answer questions about what should be done
o By using insights from predictive analytics, data-driven decisions can be made

Types of Charts/Graphs
Line Graphs
o A line graph is commonly used to display change over time as a series of data points connected by straight line segments on two axes.
Bar Graphs
o A bar graph or bar chart is a chart with rectangular bars with lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.
o Bar graphs are good for plotting data that spans a length of time (for example, comparing achievement between the beginning and the end of the year), or they can be used for comparing different items in a related category (for example, achievement results for different classes).
Pie Graphs
o The pie chart (or circle chart) displays data and statistics in an easy-to-understand "pie-slice" format and illustrates numerical proportion.
Histogram
o A histogram shows continuous data in ordered rectangular columns. Usually, there are no gaps between the columns.
Pictograph
o The pictograph or pictogram is one of the more visually appealing types of graphs and charts. It displays numerical information with the use of icons or picture symbols to represent data sets.
Dot Plot
o A dot chart or dot plot is a statistical chart consisting of data points plotted on a fairly simple scale, typically using filled-in circles.
Pareto Chart
o A Pareto chart is a type of chart that contains both bars and a line graph. It indicates the frequency of defects, as well as their cumulative impact. Pareto charts are useful for finding the defects to prioritize in order to achieve the greatest overall improvement.
Stem and Leaf Plots
o Represent data by separating each data value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit)
Time Series
o A data set composed of quantitative entries taken at regular intervals over a period of time.

Lesson 2

Measure of Central Tendency
o Also referred to as a measure of centre or central location, it is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.
Mean
o The mean represents the average value of the dataset.
Median
o The median is the middle value of the dataset when the dataset is arranged in ascending or descending order.
Mode
o The mode represents the most frequently occurring value in the dataset.

Interpretation of the Measure of Central Tendency
Negatively Skewed
o Mean < Median < Mode
Positively Skewed
o Mean > Median > Mode

Measure of Dispersion
o Measures of dispersion or variability are used to describe the spread or dispersion of the data.
Range
o The range is the difference between the largest and the smallest observation in the data.
o Range = Max. Value - Min. Value
Interquartile Range
o The interquartile range is defined as the difference between the 75th and 25th percentiles (also called the third and first quartiles).
Variance
o Variance measures how far each number in the set is from the mean (average), and thus from every other number in the set.
Standard Deviation
o The most common measure of variation, or spread, is the standard deviation.
o The standard deviation is a number that measures, on average, how far data values are from their mean.
o The standard deviation is always positive or zero.
The Coefficient of Variation
o A statistical measure of the dispersion of data points in a data series around the mean.
o 0%-10% = Very good, >10%-20% = Good, >20%-30% = Acceptable, >30% = Not acceptable
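
The Lesson 2 measures (mean, median, mode, range, interquartile range, variance, standard deviation, and coefficient of variation) can be illustrated with a minimal Python sketch. The sample values and the function names `describe`, `skew_direction`, and `rate_cv` are hypothetical and not part of the notes. The sketch also assumes the sample formulas (dividing by n - 1) for variance and standard deviation and the default quartile method of Python's `statistics.quantiles`; the notes do not say which conventions are intended.

```python
import statistics


def describe(values):
    """Compute the central tendency and dispersion measures for a list of numbers."""
    mean = statistics.mean(values)
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3 (assumed quartile method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    stdev = statistics.stdev(values)  # sample standard deviation (assumption)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),      # Max. Value - Min. Value
        "iqr": q3 - q1,                          # third quartile minus first quartile
        "variance": statistics.variance(values),  # sample variance (assumption)
        "stdev": stdev,
        "cv_percent": stdev / mean * 100,        # coefficient of variation in percent
    }


def skew_direction(mean, median, mode):
    """Apply the rule of thumb from the notes to the three central measures."""
    if mean < median < mode:
        return "negatively skewed"
    if mean > median > mode:
        return "positively skewed"
    return "no clear skew by this rule"


def rate_cv(cv_percent):
    """Map a coefficient of variation (in percent) to the rating bands in the notes."""
    if cv_percent <= 10:
        return "Very good"
    if cv_percent <= 20:
        return "Good"
    if cv_percent <= 30:
        return "Acceptable"
    return "Not acceptable"


# Hypothetical sample, e.g. delivery times in days.
sample = [2, 3, 3, 3, 4, 5, 6, 8, 11]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
print(skew_direction(summary["mean"], summary["median"], summary["mode"]))
print(rate_cv(summary["cv_percent"]))
```

For this sample the mean (5) is greater than the median (4), which is greater than the mode (3), so the rule of thumb classifies it as positively skewed. Switching to `statistics.pvariance` and `statistics.pstdev` would give the population versions instead.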