Data Analyst Cheat Sheet FROM Parth Roy
4. Remove columns you are not going to use in your report. Prefer ‘Remove Other Columns’ above the ‘Remove Columns’ option, for lower risk that structural changes in your data source break the query.
5. Maximize the use of Query Folding for faster and more efficient queries. With Query Folding, multiple transformations are merged as one query and then sent to the source. If ‘View Native Query’ is not available, Query Folding has stopped before that step.
6. In general, prefer “Import” over “DirectQuery”. Unless the amount of data is too large to import, or when there are other requirements (like real-time insights).
7. Use Date.From instead of DateTime.Date to extract a date from another field, and to make sure query folding won’t break. More info on this blog post: http://bit.ly/DateFrom.
8. Turn off ‘Enable Load’ for queries/tables that you don’t need in the Data Model.
9. Re-use Power Query code and lower impact on your data source by using Power BI dataflows.
10. Turn on the Formula Bar so you get familiar with Power Query (M) code.
11. Automatically beautify all column names in a query, e.g. “CustomerName” → “Customer Name”, by using the Power Query function Alex Powers shared on his GitHub repo: http://bit.ly/PQSplitByCase. Note: he also has a function to replace underscores in all column names automatically.

Code examples (don’t forget that Power Query / M is case-sensitive!)
• if T > 0 then A else B
• try A/B otherwise 0
• #table( { "X", "Y" }, { { 1, 2 }, { 3, 4 } } )
• DateTime.LocalNow()
• Date.From( DateTime.LocalNow() )
• Excel.Workbook(Web.Contents("[url]/[filename].xlsx"), null, true)
• #shared to list all functions and get PQ documentation

Resources
• Power Query M Formula Reference: http://bit.ly/PQMReference.
• Repo by Imke Feldmann with a lot of custom Power Query functions: https://github.com/ImkeF/M/.

3. Give measures a prefix (%, #, €).
4. Use abbreviations like YTD, LY, PY, PP as a suffix, to keep the base fields together in the sort order.
5. Hide columns that are needed but are irrelevant for the user.
6. Hide the key at the many side of a many-to-one relation (e.g. [OrderDate] in the ‘Revenue’ table).
7. For each measure column in your data model, make a DAX Calculated Measure instead of using the ‘Default Summarization’, then hide the original column. This way all measures will have the same icon. And it enables you to easily change the calculation in the future (e.g. adding a filter condition). Also, it is easier to reference this measure in other DAX calculations.
8. Always use the table name when you refer to a column, for example: 'Product'[Category].
9. Use DIVIDE() to prevent division by 0, and to improve the speed of your divisions.
10. Use IsInScope to get the right hierarchy level in DAX (read all about it in Kasper de Jonge’s blog: https://bit.ly/KasperOnBIInScope).
11. In DAX: (un)comment DAX lines by pressing Alt + Shift + A or CTRL + /, and Shift + Enter for line breaks.
12. Use aggregations to keep your model small and performant, and still have all detailed data available.
13. Use Tabular Editor to make changes to your Power BI file (currently unsupported by Microsoft). Also, make sure to check out its best-practices analyzer.
14. Avoid bi-directional cross filtering and make use of measure filters http://bit.ly/MeasureFilters.
15. For very large models, group measures or fields in display folders for better usability.
16. Use DAX Studio to capture all DAX queries executed on your Premium Capacity.
17. Keep your PBI desktop file fast and small by using TOP N (http://bit.ly/ImproveReportBuilding) and switch underlying data source in PBI service after publishing (http://bit.ly/ParameterizeDatasource).

Resources
• Increase the readability of your DAX calculations: https://www.daxformatter.com.
• Use DAX Studio to analyze and tune your calculations: http://daxstudio.org.
• Find all about DAX expressions: https://dax.guide.
• Use Tabular Editor to easily build and manage your models: https://tabulareditor.github.io/.

Resources
• OK VIZ Visual reference: https://sqlbi.com/ref/power-bi-visuals-reference.
• SQL Jason Financial Times Visual Vocabulary: https://bit.ly/SQLJasonVisualVocabulary.

I’VE GOT THE POWER BI
Dave Ruijter: linkedin.com/in/daveruijter, twitter.com/daveruijter, https://moderndata.ai/
Marc Lelijveld: linkedin.com/in/marclelijveld, twitter.com/marclelijveld, https://data-marc.com/
TABLEAU CHEAT SHEET
Relevant videos are linked throughout the document. You must be signed in to your Tableau account in order to view the videos.
Workbook Components
Sheet: A sheet is a singular chart or map in Tableau. A sheet is represented in Tableau with this symbol:
Dashboard: A dashboard is a canvas for displaying multiple sheets at a time and allowing them to interact with each
other. A dashboard is represented in Tableau with this symbol:
Container: A container is a layout frame on a dashboard that can house sheets, images, filters/parameters, and text
boxes. Containers can be horizontal (objects placed go side-by-side) or vertical (objects placed are on top of one
another). Double-click any sheet on a dashboard by the center “grip” marks to select the container that the sheet sits in.
Story: A story is a viewing portal that contains a sequence of worksheets or dashboards that work together to convey
information. Each individual sheet in a story is called a story point. A story is represented in Tableau with this symbol:
Workbook: A workbook is the entire Tableau file containing your sheets and dashboards.
Packaged Workbook: A single zip file with a .twbx extension that contains a workbook along with any supporting
local file data sources and background images. Use this format to package your work for sharing with others who don’t
have access to the data.
Getting Started with Dashboards and Stories (6 min)
Building a Dashboard (4 min)
Dashboard Layouts and Formatting (6 min)
Story Points (4 min)
Tableau Interface
Data Pane: The default left pane that lists your open data sources and the dimensions and measures contained in the
selected data sources. Sets and Parameters are also listed here.
Analytics Pane: Clicking the Analytics tab on the left pane will display available analyses for the data displayed on
your sheet. Inapplicable analyses will be grayed out. Analyses include adding constant lines, box plots, trend lines,
forecasts, and reference bands.
Marks Card: The Marks card controls most of the visual elements in a sheet.
Using the Marks card, you can switch between different chart types (bar, line, symbol, filled map, and so on), change
colors and sizes, add labels, change the level of detail, and edit the tooltips.
Rows and Columns Shelves: The Rows shelf and the Columns shelf are where you determine which variables go
on which axis. Put data you want displayed along the X-axis on the Columns shelf and data you want displayed on the
Y-axis on the Rows shelf.
Measure: A variable from the dataset that is meant to be aggregated. (This means it should be a number that it makes
sense to do math with: sum, average, and so on.) Measures are often continuous data. Examples include GPA, sales,
quantity, quota, height, and salary. When a measure is pulled into your sheet, it takes the form of a green pill.
Pill: The visual representation of a data item brought into your sheet. Pills can sit on the rows and columns shelves, the
marks card, and the filters card.
Data Types: Each data field has an icon beside it that indicates its type. Icons exist for: String, Integer,
Geographic Location, Date, Group, Set, Hierarchy, Bin, and Calculated Field.
Getting Started
(4:50 – 7:00)
Filters/Parameters
Filters: A filter is used to limit what data is being displayed on the sheet. Visible controls for a filter on a sheet or
dashboard are called Quick Filters. Each filter is for an individual data field. Both dimensions and measures can be used
as filters.
Parameters: While filters limit the data shown in the view, parameters act as a variable in an equation that can be
controlled by the end user. Parameters only work in conjunction with filters, sets, reference lines, or calculated
fields. Parameters are workbook-wide and can be used in multiple places (e.g. a single parameter can influence multiple
filters and calculated fields across different data sources in the workbook). Parameters are listed separately from
Dimensions and Measures on the data pane.
NOTE: Filters, when layered appropriately, can affect the values displayed in other filters to show only relevant values
(e.g. selecting the ENGR division will cause the major filter to show only ENGR majors). Parameters cannot influence
filters in this manner (e.g. selecting “Undergraduate” through a parameter will not limit the major filter selection to only
undergraduate majors).
Sets: A subset of your data that meets certain conditions based on existing dimensions. Unlike a group, sets only have
two values: IN and OUT. A member is either IN your set, or not (OUT). Like parameters, sets can be used throughout a
workbook on multiple sheets. Also like parameters, sets are listed separately from Dimensions and Measures on the
data pane. Sets can be created by:
• Highlighting multiple header names or data points, then right-clicking and choosing the option to put those
dimension fields in a set.
• Clicking the dimension you want to group in the data pane, then selecting Create > Set…, which gives
greater control over your sets and the ability to create computed sets based on conditions.
Bins: Bins are buckets based on a range of values. While groups and sets are used for grouping dimensions, bins are
used for grouping measures. The created bin will sit in the Dimensions section of the data pane. Bins can be created by
right-clicking on a measure, then selecting Create > Bins…
Hierarchies: Often data sources have related dimensions that have an inherent hierarchy. For example, a data source
may have fields for Country, State, and City. These fields could be grouped into a hierarchy called Location. In this
example, a user can expand Country and break the data down by State and City. Hierarchies can be created by:
• Using the CTRL key, select the dimensions you want in your hierarchy, then right-click and choose Create Hierarchy.
Once the hierarchy is created, it is simple to put into the correct order: just drag and drop the dimensions in the
hierarchy into the correct position.
• Clicking a field and dragging it on top of another field will also create a hierarchy.
Other Terminology
Action: An interaction that you can add to your views. There are three types of action: Filter, Highlight, and URL.
Aggregation: A result of a mathematical operation applied to a measure. Predefined aggregations include summation
and average. You can convert dimensions to measures by aggregating them as a count. For relational data sources, all
measures must be either aggregated or disaggregated (unless they appear on the Filters shelf). Tableau aggregates
measures, usually as a summation, when you place them on a shelf.
Aliases: an alternative name assigned to a dimension member, or to a field name. Aliases can be created by:
• Right-clicking on an individual dimension header and selecting Edit alias…
• Right-clicking on a dimension in the data pane and selecting Aliases…
• Opening Data from the top toolbar, going to your data source, and selecting Edit Aliases…
Calculated Field: A new field that you create by using a formula to modify the existing fields in your data source.
Caption: A description of the current view on the active worksheet. For example, “Sum of Sales for each Market”. You
can automatically generate captions or create your own custom captions. Show and hide the caption by selecting
Worksheet > Show Caption.
Crosstab: A text table view. Use text tables to display the numbers associated with dimension members.
Sheet Description: A thorough summary of the data used in a worksheet including all dimensions and measures
used, a written description of the view (marks, rows, columns), the formulas for all calculated fields used on the sheet,
and the data source details.
Tooltip: Tooltips are text boxes that appear when hovering over a mark on a sheet in order to give more information.
The text and text formatting in them are easily edited through the Marks card.
Shortcuts
Description Windows Mac
Sources:
http://www.dummies.com/programming/big-data/big-data-visualization/tableau-for-dummies-cheat-sheet/
http://onlinehelp.tableau.com/current/pro/desktop/en-us/glossary.html
https://www.tableau.com/learn/training
Population: entire collection of objects or individuals about which information is desired.
➔ easier to take a sample
◆ Sample: part of the population that is selected for analysis
◆ Watch out for:
● Limited sample size that might not be representative of population
◆ Simple Random Sampling: Every possible sample of a certain size has the same chance of being selected

Observational Study: there can always be lurking variables affecting results
➔ e.g., strong positive association between shoe size and intelligence for boys
➔ **should never show causation

Experimental Study: lurking variables can be controlled; can give good evidence for causation

➔ Mean: arithmetic average of data values
◆ **Highly susceptible to extreme values (outliers). Goes towards extreme values
◆ Mean could never be larger or smaller than max/min value but could be the max/min value
➔ Median: in an ordered array, the median is the middle number
◆ **Not affected by extreme values
➔ Quartiles: split the ranked data into 4 equal groups
◆ Box and Whisker Plot
➔ Variance: the average distance squared
$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
◆ $s_x^2$ gets rid of the negative values
◆ units are squared
➔ Standard Deviation: shows variation about the mean
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
◆ highly affected by outliers
◆ has same units as original data
◆ finance = horrible measure of risk (trampoline example)
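The contrast between the mean (pulled towards outliers) and the median (not affected by them) can be checked with a short sketch. The numbers below are hypothetical and only illustrate the definitions; the standard-library `statistics` functions used here divide by n - 1 for the sample variance.

```python
import statistics

data = [2, 4, 4, 5, 7, 9]        # hypothetical sample
with_outlier = data + [100]      # same sample plus one extreme value

def describe(xs):
    """Return mean, median, sample variance and sample standard deviation."""
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "variance": statistics.variance(xs),  # sum of squared deviations / (n - 1)
        "stdev": statistics.stdev(xs),        # square root of the variance
    }

base = describe(data)
shifted = describe(with_outlier)
print(base)
print(shifted)
```

Adding the single value 100 moves the mean from about 5.2 to about 18.7, while the median only moves from 4.5 to 5.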
Binomial Distribution
➔ doing something n times
➔ only 2 outcomes: success or failure
➔ trials are independent of each other
➔ probability remains constant
Binomial Example

➔ Mean for uniform distribution: $E(X) = \frac{a + b}{2}$
➔ Variance for unif. distribution: $Var(X) = \frac{(b - a)^2}{12}$

Normal Distribution
➔ governed by 2 parameters: μ (the mean) and σ (the standard deviation)
➔ $X \sim N(\mu, \sigma^2)$
Sums of Normals
Sums of Normals Example:

Confidence Intervals = tells us how good our estimate is
**Want high confidence, narrow interval
**As confidence increases, interval also increases
A. One Sample Proportion
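The one-sample proportion formula itself is not reproduced here, so the sketch below assumes the common normal-approximation interval, p̂ ± z·sqrt(p̂(1 − p̂)/n). The function name, the sample counts, and the z values (1.96 for about 95%, 2.576 for about 99%) are illustrative assumptions, not taken from the sheet.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in 100 trials
low95, high95 = proportion_ci(60, 100)            # about 95% confidence
low99, high99 = proportion_ci(60, 100, z=2.576)   # about 99% confidence
print(low95, high95)
print(low99, high99)
```

The 99% interval comes out wider than the 95% interval, which is the trade-off noted above: more confidence costs a wider interval.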
➔ If n > 30, we can substitute s for σ so that we get:
Example of Sample Proportion Problem

Hypothesis Testing
➔ Null Hypothesis: $H_0$, a statement of no change and is assumed true until evidence indicates otherwise.
➔ Alternative Hypothesis: $H_a$ is a statement that we are trying to find evidence to support.
➔ Type I error: reject the null hypothesis when the null hypothesis is true. (considered the worst error)
➔ Type II error: do not reject the null hypothesis when the alternative hypothesis is true.
2. Test Statistic Approach (Population Mean)
3. Test Statistic Approach (Population Proportion)
4. P-Values
➔ a number between 0 and 1
➔ the larger the p-value, the more consistent the data is with the null
➔ the smaller the p-value, the more consistent the data is with the alternative
➔ **If P is low (less than 0.05), $H_0$ must go: reject the null hypothesis
Two Sample Hypothesis Tests
1. Comparing Two Proportions (Independent Groups)
➔ Calculate Confidence Interval
➔ Test Statistic for Two Proportions
2. Comparing Two Means (large independent samples n>30)
➔ Calculating Confidence Interval
Matched Pairs
➔ Two samples are DEPENDENT
Example:
Simple Linear Regression
➔ used to predict the value of one variable (dependent variable) on the basis of other variables (independent variables)
➔ $\hat{Y} = b_0 + b_1 X$
➔ Residual: $e = Y - \hat{Y}_{fitted}$
➔ Fitting error: $e_i = Y_i - \hat{Y}_i = Y_i - b_0 - b_1 X_i$
◆ e is the part of Y not related to X
➔ Values of $b_0$ and $b_1$ which minimize the residual sum of squares are:
(slope) $b_1 = r \frac{s_y}{s_x}$
$b_0 = \bar{Y} - b_1 \bar{X}$
➔ Interpretation of slope: for each additional x value (e.g. mile on odometer), the y value decreases/increases by an average of $b_1$
➔ Interpretation of y-intercept: plug in 0 for x and the value you get for $\hat{y}$ is the y-intercept (e.g. for $\hat{y} = 3.25 - 0.0614 x_{SkippedClass}$, a student who skips no classes has a GPA of 3.25.)
➔ **danger of extrapolation: if an x value is outside of our data set, we can't confidently predict the fitted y value

Properties of the Residuals and Fitted Values
1. Mean of the residuals = 0; Sum of the residuals = 0
2. Mean of original values is the same as mean of fitted values: $\bar{Y} = \bar{\hat{Y}}$
3.
4. Correlation Matrix
➔ $corr(\hat{Y}, e) = 0$

A Measure of Fit: $R^2$
➔ Good fit: if SSR is big, SSE is small
➔ SST = SSR, perfect fit
➔ $R^2$: coefficient of determination
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
➔ $R^2$ is between 0 and 1; the closer $R^2$ is to 1, the better the fit
➔ Interpretation of $R^2$: (e.g. 65% of the variation in the selling price is explained by the variation in odometer reading. The remaining 35% is unexplained by this model)
➔ **$R^2$ doesn’t indicate whether model is adequate**
➔ As you add more X’s to model, $R^2$ goes up
➔ Guide to finding SSR, SSE, SST
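The least-squares formulas (slope b1 = r·s_y/s_x, intercept b0 = Ȳ − b1·X̄) and the residual properties can be sketched in plain Python. The data points and the function name `fit_line` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def fit_line(xs, ys):
    """Least-squares line via b1 = r * s_y / s_x and b0 = mean(y) - b1 * mean(x)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)      # this is SST
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)               # sample correlation
    s_x, s_y = math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1))
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    r_squared = 1 - sse / syy                    # R^2 = 1 - SSE / SST
    return b0, b1, residuals, r_squared

xs = [1, 2, 3, 4, 5]                # hypothetical x values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical y values
b0, b1, residuals, r_squared = fit_line(xs, ys)
print(b0, b1, r_squared)
print(sum(residuals))               # close to 0, up to floating-point error
```

For this data the slope works out to 1.99 and the intercept to 0.05, and the residuals sum to zero as the properties state.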
Assumptions of Simple Linear Regression
1. We model the AVERAGE of something rather than something itself
2.

Example of Prediction Intervals:

Standard Errors for $b_1$ and $b_0$
➔ standard errors when noise
➔ $s_{b_0}$: amount of uncertainty in our estimate of $\beta_0$ (small s good, large s bad)
➔ $s_{b_1}$: amount of uncertainty in our estimate of $\beta_1$

Regression Hypothesis Testing
*always a two-sided test
➔ want to test whether slope ($\beta_1$) is needed in our model
➔ $H_0: \beta_1 = 0$ (don’t need x)
$H_a: \beta_1 \neq 0$ (need x)
➔ Need X in the model if:
a. 0 isn’t in the confidence interval
b. |t| > 1.96
c. P-value < 0.05

Test Statistic for Slope/Y-intercept
➔ can only be used if n > 30
➔ if n < 30, use p-values

➔ Multicollinearity
◆ when x variables are highly correlated with each other
◆ $R^2$ > 0.9
◆ pairwise correlation > 0.9
◆ correlate all x variables, include y variable, drop the x variable that is less correlated to y
➔ Outliers
◆ Regression likes to move towards outliers (shows up as $R^2$ being really high)
◆ want to remove outlier that is extreme in both x and y
➔ Nonlinearity (ovtest)
◆ Plotting residuals vs. fitted values will show a relationship if data is nonlinear ($R^2$ also high)
◆ ovtest: a significant test statistic indicates that polynomial terms should be added
◆ $H_0$: data = no transformation
$H_a$: data ≠ no transformation
◆ Log transformation accommodates nonlinearity, reduces right skewness in the Y, eliminates heteroskedasticity
◆ **Only take log of X variable
➔ Normality (sktest)
◆ $H_0$: data = normality
$H_a$: data ≠ normality
◆ don’t want to reject the null hypothesis. P-value should be big
➔ Homoskedasticity (hettest)
◆ $H_0$: data = homoskedasticity
◆ $H_a$: data ≠ homoskedasticity

Summary of Regression Output
Population variance:
$\sigma^2 = \frac{\sum (x - \mu)^2}{n} = \frac{\sum x^2}{n} - \mu^2$
Sample variance:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Standard error of the mean:
$\frac{s}{\sqrt{n}}$
Standard error of the difference between two means:
$s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ or $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
One-sample t-test
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Alternative hypotheses: $\mu > \mu_0$, $\mu < \mu_0$, or $\mu \neq \mu_0$

Example ($H_0: \mu = 0.05$):
Sample   Value
1        0.051
2        0.0505
3        0.049
4        0.0516
5        0.052
6        0.0508
7        0.0506
$\bar{x} = 0.0508$
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(0.051 - 0.0508)^2 + (0.0505 - 0.0508)^2 + \text{etc...}}{6} = 9.15 \times 10^{-7}$
$s = \sqrt{s^2} = 9.56 \times 10^{-4}$
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{0.0508 - 0.05}{9.56 \times 10^{-4} / \sqrt{7}} = 2.17$, with $7 - 1 = 6$ degrees of freedom

Two-sample t-test
$H_0: \mu_1 = \mu_2$; alternative hypotheses: $\mu_1 \neq \mu_2$, $\mu_1 > \mu_2$, or $\mu_1 < \mu_2$
$t = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ or $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Degrees of freedom: $(n_1 - 1) + (n_2 - 1)$

Example ($\alpha = 0.05$): $\bar{x}_1 = 700$, $s_1 = 21$, $n_1 = 12$; $\bar{x}_2 = 668$, $s_2 = 30$, $n_2 = 6$
$H_0: \mu_1 = \mu_2$
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{700 - 668}{\sqrt{\frac{21^2}{12} + \frac{30^2}{6}}} = 2.342$
Degrees of freedom: $(n_1 - 1) + (n_2 - 1) = (12 - 1) + (6 - 1) = 16$
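Both t statistics can be reproduced with a few lines of Python. This is a sketch, not the sheet's own procedure: the seven readings are the values as best recovered from the one-sample example (treat the individual values as an assumption), the two-sample inputs are the summary figures 700/21/12 and 668/30/6, and the function names are illustrative.

```python
import math
import statistics

def one_sample_t(values, mu_0):
    """t = (x_bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)      # sample standard deviation (divides by n - 1)
    return (x_bar - mu_0) / (s / math.sqrt(n)), n - 1

def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t with the unpooled standard error sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se, (n1 - 1) + (n2 - 1)

readings = [0.051, 0.0505, 0.049, 0.0516, 0.052, 0.0508, 0.0506]
t_one, df_one = one_sample_t(readings, 0.05)
t_two, df_two = two_sample_t(700, 21, 12, 668, 30, 6)
print(round(t_one, 2), df_one)    # 2.17 6
print(round(t_two, 3), df_two)    # 2.342 16
```

The resulting t value is then compared against a t table (or a statistics package) at the stated degrees of freedom to obtain the p-value.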
Chi-square goodness-of-fit test
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Degrees of freedom: number of categories $- 1$

Example ($\alpha = 0.05$): observed counts tested against a 9:3:3:1 ratio
Observed:  143     60     55     18
Expected:  155.25  51.75  51.75  17.25
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(143 - 155.25)^2}{155.25} + \frac{(60 - 51.75)^2}{51.75} + \frac{(55 - 51.75)^2}{51.75} + \frac{(18 - 17.25)^2}{17.25} = 2.519$
Degrees of freedom: $4 - 1 = 3$

Example ($\alpha = 0.05$): 25 observations in 9 categories, expected count $25 / 9 = 2.778$ per category
Observed: 2, 4, 4, 3, 3, 3, 2, 3, 1
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(1 - 2.778)^2}{2.778} + 2 \cdot \frac{(2 - 2.778)^2}{2.778} + 4 \cdot \frac{(3 - 2.778)^2}{2.778} + 2 \cdot \frac{(4 - 2.778)^2}{2.778} = 2.72$
Degrees of freedom: $9 - 1 = 8$
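The goodness-of-fit statistic, the sum of (O − E)²/E over all categories, is simple to compute directly. The sketch below uses the observed counts 143, 60, 55, 18 and the 9:3:3:1 ratio from the first example; the function name `chi_square_gof` is illustrative.

```python
def chi_square_gof(observed, ratios):
    """Return expected counts, the chi-square statistic, and degrees of freedom."""
    total = sum(observed)
    ratio_sum = sum(ratios)
    expected = [total * r / ratio_sum for r in ratios]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, stat, len(observed) - 1

expected, chi2, df = chi_square_gof([143, 60, 55, 18], [9, 3, 3, 1])
print(expected)              # [155.25, 51.75, 51.75, 17.25]
print(round(chi2, 2), df)    # 2.52 3
```

A small statistic relative to the chi-square critical value for the given degrees of freedom means the observed counts are consistent with the expected ratio.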
Chi-square test for a contingency table
For a 2 × 2 table with cell counts $n_1$, $n_2$ (first row) and $n_3$, $n_4$ (second row):
                Column 1      Column 2      Total
Row 1           n1            n2            n1 + n2
Row 2           n3            n4            n3 + n4
Total           n1 + n3       n2 + n4       n1 + n2 + n3 + n4
The expected count for the first cell is
$E = \frac{(n_1 + n_3)(n_1 + n_2)}{n_1 + n_2 + n_3 + n_4} = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$
Degrees of freedom: $(r - 1)(c - 1)$
$\chi^2 = \sum \frac{(O - E)^2}{E}$

Example ($\alpha = 0.05$), observed counts:
113   113   110   159   | 495
119   135   172   190   | 616
77    91    86    65    | 319
181   152   124   73    | 530
490   491   492   487   | 1960

Expected count for the first cell:
$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}} = \frac{(495)(490)}{1960} = 123.75$

Expected counts:
123.75   124      124.26   122.99
154      154.31   154.63   153.06
79.75    79.91    80.08    79.26
132.5    132.77   133.04   131.69

$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(113 - 123.75)^2}{123.75} + \frac{(113 - 124)^2}{124} + \frac{(110 - 124.26)^2}{124.26} + \text{etc...} = 91.73$
Degrees of freedom: $(r - 1)(c - 1) = (4 - 1)(4 - 1) = 9$; $p < 0.001$
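Expected counts for a contingency table follow from E = (row total)(column total) / grand total. The sketch below applies that rule to the 4 × 4 observed table as best recovered from the example (treat the individual cell values as an assumption); `expected_counts` is an illustrative name.

```python
def expected_counts(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

observed = [
    [113, 113, 110, 159],
    [119, 135, 172, 190],
    [77, 91, 86, 65],
    [181, 152, 124, 73],
]
expected = expected_counts(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)
print(expected[0][0], df)    # 123.75 9
```

Each expected value is what the cell would hold if the row and column classifications were independent.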
Probability rules
$P(A \text{ and } B) = P(A) \, P(B)$ (A and B independent)
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$

Example:
$0.5 \times 0.5 \times 0.5 = 0.125 = P(A \text{ and } B) = P(\text{2 children = bb and 1st child bb})$
$P(B) = P(\text{1st child = bb}) = 0.5$
$P(\text{2 children = bb} \mid \text{1st child bb}) = P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{0.125}{0.5} = 0.25$

Example:
$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$
$P(S = Z \mid W = Zz) = 0.5$
$P(W = Zz) = 0.7$
$P(S = Z) = (0.7 \times 0.5) + (0.3 \times 1) = 0.65$
$P(W = Zz \mid S = Z) = \frac{0.5 \times 0.7}{0.65} = 0.538$

Binomial distribution
$P(X = m) = \frac{n! \, p^m (1 - p)^{(n - m)}}{m! \, (n - m)!}$
$P(\text{1 boy of 5 children}) = \frac{5! \times 0.5^1 \times 0.5^4}{1! \, (4)!} = 0.15625$
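The Bayes and binomial examples can be checked numerically. This sketch uses the figures from the examples (0.5, 0.7, and 1 boy out of 5 children with p = 0.5); the function names are illustrative.

```python
import math

def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

def binomial_pmf(m, n, p):
    """P(X = m) = n! / (m! (n - m)!) * p^m * (1 - p)^(n - m)."""
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

# Bayes example: P(S = Z | W = Zz) = 0.5 and P(W = Zz) = 0.7
p_s = 0.7 * 0.5 + 0.3 * 1           # P(S = Z) by total probability
posterior = bayes(0.5, 0.7, p_s)    # P(W = Zz | S = Z)

p_one_boy = binomial_pmf(1, 5, 0.5)
print(round(posterior, 3), p_one_boy)   # 0.538 0.15625
```

`math.comb(n, m)` computes n! / (m!(n − m)!) directly, which avoids evaluating the large factorials separately.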
Poisson distribution
$P(X = m) = \frac{e^{-np} (np)^m}{m!}$
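Because the formula is written in terms of np, it appears to be the Poisson approximation to the binomial, which works well when n is large and p is small. The sketch below compares the two; the values n = 1000, p = 0.002, m = 2 are hypothetical and the function names are illustrative.

```python
import math

def poisson_pmf(m, n, p):
    """P(X = m) = exp(-np) * (np)^m / m!"""
    lam = n * p
    return math.exp(-lam) * lam ** m / math.factorial(m)

def binomial_pmf(m, n, p):
    """Exact binomial probability, for comparison."""
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

# Hypothetical rare-event setting: 1000 trials, success probability 0.002
approx = poisson_pmf(2, 1000, 0.002)
exact = binomial_pmf(2, 1000, 0.002)
print(approx, exact)
```

Both probabilities come out close to 0.27, differing only in the fourth decimal place.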
PIVOT TABLE
CHEAT SHEET
INSERTING A PIVOT TABLE
Alt N V T
Alt F5
DRILL DOWN TO AUDIT
NUMBER FORMATTING
FILTERING
CONDITIONAL FORMATTING
PIVOT TABLES