Advance Data Analytics Unit 2
Unit: 2
11/24/2022 2
Evaluation Scheme
4 Elective-II 2 0 0 30 20 50 50 100 2
Elective-II Lab 0 0 2 50 50 1
GRAND TOTAL 250 250 450 200 1150 23
Archana Verma, KCS 058 HCI, Unit-1, 11/24/2022
Evaluation Scheme
IV Introduction to Google SEO, USDAVIS University of California, 14 hrs.
V Google SEO Fundamentals, USDAVIS University of California, 29 hrs.
Autonomous Syllabus
•1 To help students understand digital marketing practices, inclination of digital consumers and role of content marketing.
•2 To provide understanding of the concept of E-commerce and developing marketing strategies in the virtual world.
•3 To impart learning on various digital channels and how to acquire and engage consumers online.
•4 To provide insights on building organizational competency by way of digital marketing practices and cost considerations.
•5 To develop understanding of the latest digital practices for marketing and promotion.
•CO1 It will develop proficiency in interpreting marketing strategies in the digital age and provide fundamental knowledge for working in an online team.
•CO2 It will enable them to develop various online marketing strategies for various marketing-mix measures.
•CO3 It will guide them to use various digital marketing channels for consumer acquisition and engagement.
•CO4 It will help in evaluating the productivity of digital marketing channels for business success.
•CO5 It will prepare candidates for global exposure of digital marketing practices to make them employable in a high growth industry.
Course outcomes PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 2 2 2 2 3 3 3 3 2 3 3
CO2 3 3 3 3 3 3 3 3 3 3 3
CO3 3 3 3 3 3 3 3 3 2 3 3
CO4 3 3 3 3 3 3 3 3 3 3 3
CO5 3 3 3 3 3 3 3 3 2 3 3
• Filters, on the other hand, are very useful in data cleaning when you want to
- find a particular piece of information.
• Another way to change the way you view data is
- by using pivot tables.
• A pivot table is a data summarization tool that is used in
- data processing.
• Pivot tables
-sort,
-reorganize,
- group,
- count,
- total or
- average data stored in the database.
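The grouping, counting, totalling and averaging that a pivot table performs can be sketched outside a spreadsheet. The example below is only an illustration: the `sales` table, its columns and its rows are hypothetical, and Python's built-in `sqlite3` module stands in for a spreadsheet.

```python
import sqlite3

# Hypothetical sales rows: (region, product, amount)
rows = [
    ("East", "chair", 100.0),
    ("East", "desk", 250.0),
    ("West", "chair", 120.0),
    ("West", "chair", 80.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Group by region, then count, total and average the amounts.
# This mirrors a pivot table with "region" as the row field.
pivot = conn.execute(
    """
    SELECT region, COUNT(*), SUM(amount), AVG(amount)
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()

for region, n, total, average in pivot:
    print(region, n, total, average)
```

Each output row is one group, which corresponds to one row of the pivot table.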
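• VLOOKUP
- stands for vertical lookup
- is a function that searches for a value in the first column of a range
- and returns the value from a specified column in the same row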
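The idea behind an exact-match vertical lookup can be sketched in a few lines. The `vlookup` helper and the `products` range below are hypothetical, written only to illustrate the behaviour; a spreadsheet reports an error when nothing matches, whereas this sketch returns `None`.

```python
def vlookup(search_key, table, col_index):
    """Return the value in column `col_index` (1-based) of the first row
    whose first column equals `search_key`, or None if no row matches."""
    for row in table:
        if row[0] == search_key:
            return row[col_index - 1]
    return None


# Hypothetical lookup range: product code, product name, unit price
products = [
    ("P001", "chair", 88.79),
    ("P002", "desk", 799.89),
]

price = vlookup("P002", products, 3)   # price of product P002
missing = vlookup("P999", products, 2)  # no such code in the range
print(price, missing)
```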
• Suppose data is being stored in
- different places, in
- different formats, and
- each location might have
- millions of rows and
- hundreds of related tables.
• Suppose we want to find something specific in
- all this data,
- like how many patients
- with a certain diagnosis
- came in today.
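A question like the one above is where SQL helps. The sketch below assumes a hypothetical `visits` table with `patient_id`, `diagnosis` and `visit_date` columns, and uses Python's built-in `sqlite3` module rather than a production database.

```python
import sqlite3
from datetime import date

today = date.today().isoformat()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (patient_id INTEGER, diagnosis TEXT, visit_date TEXT)"
)
# Hypothetical visit records; the last one is from an earlier date.
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "flu", today),
        (2, "asthma", today),
        (3, "flu", today),
        (4, "flu", "2000-01-01"),
    ],
)

# Count the distinct patients with a given diagnosis who came in today.
(flu_today,) = conn.execute(
    "SELECT COUNT(DISTINCT patient_id) FROM visits "
    "WHERE diagnosis = ? AND visit_date = ?",
    ("flu", today),
).fetchone()
print(flu_today)
```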
Q1. Fill in the blank: To count the total number of spreadsheet values within a
specified range, a data analyst uses the _____ function.
1) SUM
2) WHOLE
3) COUNTA
4) TOTAL
Q2. A data analyst is cleaning a dataset with inconsistent formats and repeated cases.
They use the TRIM function to remove extra spaces from string variables. What other
tools can they use for data cleaning? Select all that apply.
1) Protect sheet
2) Find and replace
3) Import data
4) Remove duplicates
• For example,
- INSERT INTO <table name> (<field names>)
VALUES (<values>)
- UPDATE <table name>
SET <field name> = <value>
WHERE <cond>
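A minimal sketch of both statements, run through Python's built-in `sqlite3` module. The `customer_address` table layout and its values are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_address (customer_id INTEGER, state TEXT)")

# INSERT INTO adds a new row to the table.
conn.execute(
    "INSERT INTO customer_address (customer_id, state) VALUES (?, ?)",
    (1, "OHIO"),
)

# UPDATE ... SET ... WHERE changes a field only in rows matching the condition.
conn.execute(
    "UPDATE customer_address SET state = ? WHERE customer_id = ?",
    ("OH", 1),
)

rows_after = conn.execute(
    "SELECT customer_id, state FROM customer_address"
).fetchall()
print(rows_after)
```

Without the WHERE clause, UPDATE would change the field in every row of the table.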
SELECT
customer_id
FROM
customer_data.customer_address
WHERE TRIM(state) = 'OH'
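To see why TRIM matters here, the following sketch compares the same filter with and without it. The rows are hypothetical and SQLite (via Python's `sqlite3`) stands in for the database used in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_address (customer_id INTEGER, state TEXT)")
# Hypothetical rows: two of the 'OH' values carry extra spaces.
conn.executemany(
    "INSERT INTO customer_address VALUES (?, ?)",
    [(1, "OH"), (2, " OH"), (3, "OH  "), (4, "NY")],
)

# Without TRIM, only the exactly matching value is found.
without_trim = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM customer_address "
        "WHERE state = 'OH' ORDER BY customer_id"
    )
]

# With TRIM, leading and trailing spaces are removed before comparing.
with_trim = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM customer_address "
        "WHERE TRIM(state) = 'OH' ORDER BY customer_id"
    )
]
print(without_trim, with_trim)
```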
• Suppose we have sorted the purchase_price column in descending order with an ORDER BY clause
- 88.79 appears first and then 799.89
- this is wrong
- because the database is not recognizing these as numbers.
- The database thinks they are strings
- but they are actually floats
- it compared the first character of each value
- it found 8 is greater than 7, so it put 88.79 before 799.89
SELECT
CAST(purchase_price AS FLOAT64)
FROM
customer_data.customer_purchase
ORDER BY
CAST(purchase_price AS FLOAT64) DESC
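The sorting problem and its fix can be reproduced in a small sketch. It uses SQLite through Python's `sqlite3`, where the floating-point type is named REAL rather than FLOAT64, and the stored prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# purchase_price is deliberately stored as text to reproduce the problem.
conn.execute("CREATE TABLE customer_purchase (purchase_price TEXT)")
conn.executemany(
    "INSERT INTO customer_purchase VALUES (?)",
    [("88.79",), ("799.89",), ("89.85",)],
)

# Sorting the strings compares them character by character.
as_text = [
    row[0]
    for row in conn.execute(
        "SELECT purchase_price FROM customer_purchase "
        "ORDER BY purchase_price DESC"
    )
]

# Casting to a number first gives the expected numeric order.
as_number = [
    row[0]
    for row in conn.execute(
        "SELECT CAST(purchase_price AS REAL) FROM customer_purchase "
        "ORDER BY CAST(purchase_price AS REAL) DESC"
    )
]
print(as_text)
print(as_number)
```

In the text ordering 799.89 comes last, while in the numeric ordering it comes first.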
• CONCAT function
• CONCAT lets you add strings together to create new text strings
• E.g.
- the product_code is the same,
- but the product color may be different
- We need to separate products by color,
- the first column we want is product_code,
- the other column we want is product_color.
- we want this for chairs
SELECT
CONCAT(product_code, product_color) AS new_prod_code
FROM
customer_data.customer_purchase
WHERE
product = 'chair'
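The same concatenation can be tried locally. This sketch uses SQLite through Python's `sqlite3` and writes the concatenation with the `||` operator, because older SQLite versions do not provide a CONCAT function; the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_purchase "
    "(product_code TEXT, product_color TEXT, product TEXT)"
)
# Hypothetical purchases: both chairs share a product code but differ in color.
conn.executemany(
    "INSERT INTO customer_purchase VALUES (?, ?, ?)",
    [
        ("C100", "black", "chair"),
        ("C100", "white", "chair"),
        ("D200", "oak", "desk"),
    ],
)

# Joining code and color gives each chair variant its own identifier.
new_prod_codes = [
    row[0]
    for row in conn.execute(
        "SELECT product_code || product_color AS new_prod_code "
        "FROM customer_purchase WHERE product = 'chair' "
        "ORDER BY new_prod_code"
    )
]
print(new_prod_codes)
```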
• COALESCE function
- returns the first non-null value in a list of values
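A short sketch of COALESCE filling in missing values. The `products` table is hypothetical: where `product_name` is null, the query falls back to `product_code`. SQLite via Python's `sqlite3` is used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_code TEXT, product_name TEXT)")
# Hypothetical rows: the second product has no name recorded (NULL).
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("C100", "chair"), ("D200", None)],
)

# COALESCE returns the first argument that is not NULL.
product_info = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(product_name, product_code) AS product_info "
        "FROM products ORDER BY product_code"
    )
]
print(product_info)
```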
Q1. In which of the following situations would a data analyst use spreadsheets instead
of SQL? Select all that apply.
1) When using a language to interact with multiple database programs
2) When working with a small dataset
3) When visually inspecting data
4) When working with a dataset with more than 1,000,000 rows
Q2. A data analyst creates many new tables in their company’s database. When the
project is complete, the analyst wants to remove the tables so they don’t clutter the
database. What SQL commands can they use to delete the tables?
• Reports are a super effective way to show your team that you're
- being 100 percent transparent about your data cleaning.
Cleaning and your data expectations
• Reporting helps to
- show stakeholders that you're accountable,
- build trust with your team,
- and make sure you're all on the same page
- about important project details
• Pivot tables
- sort,
- reorganize,
- group,
- count, total or
- average data stored in a database.
SELECT
customer_id,
CASE
WHEN first_name = 'Tnoy' THEN 'Tony'
WHEN first_name = 'Johb' THEN 'John'
ELSE first_name
END AS cleaned_name
FROM
customer_data.customer_name
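The CASE statement above can be tried on a few sample rows. The sketch runs it in SQLite through Python's `sqlite3`; the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_name (customer_id INTEGER, first_name TEXT)")
# Hypothetical rows: two misspelled names and one correct name.
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?)",
    [(1, "Tnoy"), (2, "Johb"), (3, "Mary")],
)

# CASE checks each WHEN condition in order; ELSE keeps the original value.
cleaned = conn.execute(
    """
    SELECT
        customer_id,
        CASE
            WHEN first_name = 'Tnoy' THEN 'Tony'
            WHEN first_name = 'Johb' THEN 'John'
            ELSE first_name
        END AS cleaned_name
    FROM customer_name
    ORDER BY customer_id
    """
).fetchall()
print(cleaned)
```

The query only changes what is returned; the stored `first_name` values stay as they were.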
• A changelog is a file
- containing a chronologically ordered list of
- modifications made to a project.
• Data analysts are counted on to present their findings after a data cleaning effort.
Changelogs help by storing changes chronologically,
providing a real-time account of every modification.
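One way to picture a changelog is as a list of dated entries kept in chronological order. The `log_change` helper and the entries below are hypothetical, a minimal sketch rather than a standard changelog format.

```python
from datetime import date

changelog = []


def log_change(changelog, when, description):
    """Record a modification and keep the entries in chronological order."""
    changelog.append((when, description))
    changelog.sort(key=lambda entry: entry[0])


# Hypothetical cleaning steps, logged out of order on purpose.
log_change(changelog, date(2022, 11, 24), "Removed duplicate customer rows")
log_change(changelog, date(2022, 11, 23), "Trimmed extra spaces in state column")

lines = [f"{when.isoformat()}: {description}" for when, description in changelog]
for line in lines:
    print(line)
```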
• For example, one of the biggest challenges of working with data
- is dealing with errors.
- Some of the most common errors
- involve human mistakes like
- mistyping or
- misspelling,
Q1. Why is it important for a data analyst to document the evolution of a dataset? Select all that
apply.
Q2. Fill in the blank: While cleaning data, documentation is used to track _____. Select all that apply.
1) Bias
2) Deletions
3) Errors
4) Changes
• https://maung-sutikno.medium.com/process-data-from-dirty-to-clean-eb6758190d92
• https://www.youtube.com/watch?v=sNkvWJmucQs
• https://www.youtube.com/watch?v=kCP-H8VRDCw
Q1. What is involved in seeing the big picture when verifying data cleaning?
Select all that apply.
1) Consider the data
2) Consider the business problem
3) Consider the goal
4) Consider the reporting
We have learnt how SQL and spreadsheets are used in data cleaning, which
advanced functions are used in cleaning, how to clean string variables, what
to expect from cleaned data, and the use of documentation, verification and feedback.
Thank You