
Test Your Knowledge Answers

The top three data science programming languages according to the 2020 Kaggle survey are Python, SQL, and R. A tradeoff exists between using a GUI, which is easier but less flexible, and programming languages like Python, which offer more capabilities at the cost of time and effort. The top cloud providers for data science are Amazon (AWS), Google (GCP), and Microsoft (Azure), and data scientists spend a significant portion of their time, often cited as up to 90%, on data cleaning and preparation.

Uploaded by

diogo.brmendes
Copyright
© All Rights Reserved

1. What are the top three data science programming languages, in order, according to the 2020 Kaggle data
science and machine learning survey?

Python, SQL, and R

2. What is the tradeoff between using a GUI vs using a programming language for data science? What were
some of the GUIs for data science that we mentioned?

A GUI makes it easier and faster to carry out data science but lacks flexibility. With a general-purpose
programming language like Python, we can do almost anything, but it can require much more effort and
time.

Some of the GUIs include Excel, Tableau, Alteryx, and RapidMiner.
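
As a rough illustration of the flexibility side of this tradeoff, here is a minimal Python sketch of a custom cleaning rule that could be awkward to express through point-and-click menus. The function name `clean_price` and the sample values are hypothetical, chosen only for this example.

```python
# Hypothetical example: the function name and sample data are illustrative only.

def clean_price(raw):
    """Convert a messy price string to a float, or None if it cannot be parsed."""
    # Drop surrounding whitespace, the currency symbol, and thousands separators.
    text = raw.strip().replace("$", "").replace(",", "")
    try:
        return float(text)
    except ValueError:
        # Anything that still is not a number is treated as missing.
        return None


raw_prices = [" $1,200.50", "300", "n/a", "$45 "]
cleaned = [clean_price(p) for p in raw_prices]
print(cleaned)
```

Writing and testing a rule like this takes more time than clicking a built-in option, but the rule can be changed to fit whatever the data requires.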

3. Who are the top three cloud providers for data science and machine learning according to the Kaggle 2020
survey?

Amazon (AWS), Google (GCP), and Microsoft (Azure)

4. What percent of time do data scientists spend cleaning and preparing data?

According to some surveys, data scientists spend between 25% and 75% of their time on this. Anecdotal
evidence suggests the figure may be closer to 90% or more.

5. What specializations in and around data science did we discuss?

Machine learning engineer

6. What data science project management strategies did we discuss, and which one is the most recent? What
are their acronyms and what do the acronyms stand for?

CRISP-DM – Cross-Industry Standard Process for Data Mining

TDSP – Team Data Science Process (the more recent of the two)

7. What are the steps in the two data science project management strategies we discussed?

CRISP-DM: Business understanding, data understanding, data preparation, modeling, evaluation,
deployment

TDSP: Business understanding, data acquisition and understanding, modeling, deployment, customer
acceptance
