CRISP
1) The 4 stages of analytics:
i. Descriptive = describing trends, i.e. what happened in the past.
ii. Diagnostic = finding the reason/cause behind those trends or past events.
iii. Predictive = using trends and past or current data to make predictions/forecasts
for future events.
iv. Prescriptive = making decisions/recommendations based on the first 3 stages of
analysis.
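2) CRISP-ML(Q) stands for “Cross Industry Standard Process for Machine Learning with Quality
Assurance”. The 6 phases are: (1) business and data understanding, (2) data preparation,
(3) modeling, (4) evaluation, (5) deployment, (6) monitoring and maintenance.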
1) Continuous data consists of numerical values that are measured and can take any value
within a range, so the number of possible values is infinite. Discrete data consists of
separate, exact values that can be counted.
2) The data types covered in data understanding are (1) continuous, (2) discrete,
(3) qualitative vs quantitative, (4) structured vs semi-structured vs unstructured,
(5) big data vs non-big data, (6) cross-sectional vs time series vs longitudinal/panel data,
(7) balanced vs imbalanced data, (8) offline data vs live streaming data.
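As a small illustration of the continuous/discrete and balanced/imbalanced distinctions listed above, here is a minimal Python sketch. All values and names (`heights_cm`, `children`, `labels`, `is_discrete`, `class_shares`) are made up for the example, and the whole-number test is only a rough heuristic, not a formal definition.

```python
from collections import Counter

# Hypothetical sample values, invented purely for illustration.
heights_cm = [162.4, 175.05, 168.333]   # continuous: measured, any value in a range
children = [0, 2, 1]                    # discrete: counted, exact whole values
labels = ["ok"] * 95 + ["fraud"] * 5    # class labels used for a balance check


def is_discrete(values):
    """Rough heuristic: treat data as discrete when every value is a whole number."""
    return all(float(v).is_integer() for v in values)


def class_shares(items):
    """Return the share of each class, to see whether the data is balanced."""
    counts = Counter(items)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


print(is_discrete(heights_cm))  # False
print(is_discrete(children))    # True
print(class_shares(labels))     # {'ok': 0.95, 'fraud': 0.05}
```

A 95/5 split like the one in `labels` is what is usually meant by imbalanced data.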
1) A composite primary key is a primary key made up of a combination of more than one
attribute (column).
2) A primary key identifies each record as distinct, while a unique key only requires that the
values entered are distinct. A table can have only one primary key, but it can have many
unique keys, and a unique key accepts NULL.
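The two answers above can be tried out with a short sketch using Python's built-in `sqlite3` module. The `Enrolment` table, its columns, and the `try_insert` helper are hypothetical names chosen for this example. The behaviour shown is SQLite's, and how many NULLs a unique key accepts can differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE Enrolment (
        StudentID int NOT NULL,
        CourseID  int NOT NULL,
        Email     varchar(255) UNIQUE,      -- unique key: distinct values, NULL allowed
        PRIMARY KEY (StudentID, CourseID)   -- composite primary key
    )
""")


def try_insert(row):
    """Insert one row; return True on success, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enrolment VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, 101, "a@example.com")),  # accepted
    try_insert((1, 102, None)),             # accepted: same student, different course
    try_insert((2, 101, None)),             # accepted: SQLite allows repeated NULLs in a UNIQUE column
    try_insert((1, 101, "b@example.com")),  # rejected: (1, 101) repeats the composite primary key
    try_insert((3, 101, "a@example.com")),  # rejected: email repeats a unique key value
]
print(results)  # [True, True, True, False, False]
```

Neither `StudentID` nor `CourseID` is unique on its own here. Only the pair has to be distinct.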
1) Time series data is composed of a variable recorded over a period of time, showing a
pattern of ascending and descending values. Cross-sectional data uses multiple variables
observed at a single point in time, usually to represent the relationship between them.
2) Live streaming differs from offline processing in that it happens in real time, over a
shorter period, using data that is currently flowing through the system. Offline processing
uses pre-existing data to draw conclusions.
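The difference between the two processing styles above can be sketched in Python as follows. The readings are invented, and `offline_mean` and `streaming_means` are hypothetical helper names. A real streaming system would read from a live source rather than a list.

```python
def offline_mean(readings):
    """Offline (batch): all data already exists, so compute once over the stored list."""
    return sum(readings) / len(readings)


def streaming_means(source):
    """Streaming: update the running mean each time a new value arrives."""
    count, mean = 0, 0.0
    for value in source:
        count += 1
        mean += (value - mean) / count  # incremental mean update
        yield mean


# Hypothetical sensor readings, ordered in time (a small time series).
readings = [10.0, 12.0, 11.0, 15.0]

print(offline_mean(readings))                  # 12.0
print(list(streaming_means(iter(readings))))   # [10.0, 11.0, 11.0, 12.0]
```

The streaming version gives an answer after every reading without storing the history, while the offline version needs the complete data set before it can run.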
1) A constraint in SQL is a rule set on the data in a table. One such constraint is CHECK.
A CHECK constraint specifies the range or type of data allowed in a column, as in this
example, which only permits an age greater than or equal to 18 to be stored:
a. CREATE TABLE Persons (
       ID int NOT NULL,
       LastName varchar(255) NOT NULL,
       FirstName varchar(255),
       Age int,
       CHECK (Age>=18)
   );
2) In simple terms: if the object does not already exist, it will be created, and if it
already exists, it will be altered/updated.
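To see the CHECK constraint from the Persons example above in action, the sketch below runs the same CREATE TABLE statement through Python's built-in `sqlite3` module. The `add_person` helper and the sample names are hypothetical. Other database systems report a violation with their own error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL,
        LastName varchar(255) NOT NULL,
        FirstName varchar(255),
        Age int,
        CHECK (Age>=18)
    )
""")


def add_person(person_id, last_name, first_name, age):
    """Try to store one person; report whether the constraints accepted the row."""
    try:
        conn.execute(
            "INSERT INTO Persons VALUES (?, ?, ?, ?)",
            (person_id, last_name, first_name, age),
        )
        return "stored"
    except sqlite3.IntegrityError:
        return "rejected"


print(add_person(1, "Tan", "Mei", 21))  # stored
print(add_person(2, "Lee", "Sam", 17))  # rejected
```

Only the row with an age of 21 ends up in the table. The row with an age of 17 never gets stored.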
1) Built-in functions are functions that are readily available for use without having to be
defined first.
2) A set in Python is an unordered collection of elements. Sets are mutable, iterable, and
do not contain duplicate elements. A set is mutable while a tuple is not, and a set is
unordered while a tuple is ordered.
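A short Python sketch of both answers above: a few built-in functions, followed by the set and tuple differences. The sample list `values` and the variable names are made up for the example.

```python
values = [3, 1, 3, 2]

# Built-in functions: available without any import or definition.
print(len(values), max(values), sum(values))  # 4 3 9
print(sorted(values))                         # [1, 2, 3, 3]

# A set drops duplicates and can be changed after creation (mutable).
unique = set(values)
unique.add(5)
print(unique == {1, 2, 3, 5})                 # True

# A tuple keeps order and duplicates, supports indexing, and cannot be changed.
fixed = tuple(values)
print(fixed[0])                               # 3

tuple_changed = False
try:
    fixed[0] = 9          # item assignment is not allowed on a tuple
    tuple_changed = True
except TypeError:
    print("tuples are immutable")
```

Because a set is unordered, it has no positions to index, so `unique[0]` would raise a `TypeError`.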