Kaggle State of Machine Learning and Data Science Report 2022

The document summarizes key findings from Kaggle's 2022 Data Science & ML Survey. It found that Python and SQL remain the most common programming skills, while scikit-learn is the most popular machine learning framework. VSCode has become the most used integrated development environment by data scientists. All major cloud computing providers saw growth, and specialized hardware like tensor processing units is gaining traction. The full survey results can be found on Kaggle's website.

Uploaded by Ali Hasan
Copyright © All Rights Reserved

2022 Kaggle Data Science & ML Survey

Data Scientists’ backgrounds, preferred technologies, and techniques

Oct 11–13
In September 2022, Kaggle conducted its sixth
annual industry-wide survey in an attempt to
surface a truly comprehensive view of the state
of data science and machine learning.
23,997 Respondents
173 Countries
~43 Questions
Meet Kaggle

Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science learning goals.

● >10 Million Data Scientists
● 300+ Machine Learning Competitions
● 170k+ Public Datasets
● 750k+ Public Notebooks
Download the full survey
results at:
kaggle.com/kaggle-survey-2022
Today’s Presentation:

Working Data Scientists


01 Demographics

02 Programming

03 Machine Learning

04 Cloud Computing
Demographics
Kaggle DS & ML Survey 2022

The data science industry remains highly gender imbalanced

An increasing number of data scientists are living and working in India and Japan
Panel Questions

1. Do you have any insights on the growth of data science as a career in specific geographic regions?

2. Any unique dynamics or initiatives that accelerate growth?
Programming

Python and SQL remain the two most common programming skills for data scientists

VSCode is now used by over 50% of working data scientists

Colab notebooks are the most popular cloud-based Jupyter notebook environment
Panel Questions

1. Does the shift toward VSCode and Jupyter Notebooks reflect a trend towards choosing IDEs that have the option of being hosted within a web browser? What do you think drives people’s choices of IDE?

2. Why would users be shifting away from desktop apps?
Machine Learning

Scikit-learn is the most popular ML framework while PyTorch has been growing steadily year-over-year

Transformer architectures are becoming more popular for deep learning models (both image and text data)
Panel Questions

1. Do you suppose the popularity of scikit-learn is attributable to its ability to cover so many use cases?

2. Can you speak to the differences in which frameworks are best used for which applications?

3. How fundamental is tabular data in business? Do you see a clear winner in the boosted trees vs. tabular NNs space? Why are boosted trees dominant on Kaggle?
Cloud Computing

All major cloud computing providers saw strong year-over-year growth in 2022

Specialized hardware like Tensor Processing Units (TPUs) is gaining initial traction with Kaggle data scientists
Panel Questions

1. Can you share how users make choices in selecting between various cloud providers?

2. What do you think is driving the growth of accelerators? Are there projects better suited for these more specialized processors?
Download the full survey
results at:
kaggle.com/kaggle-survey-2022
Thank you