10 Automated EDA Tools

The document discusses 10 open-source automated EDA tools that can generate an exploratory data analysis report in seconds to save time. Some of the tools discussed are SweetViz, Pandas-Profiling, DataPrep, AutoViz, D-Tale, dabl, QuickDA, Datatile, Lux, and ExploriPy. These tools typically provide information on missing values, data statistics, correlations and produce various data visualizations in their EDA reports.

Uploaded by sushant jha

10 Automated EDA Tools That Will Save You Hours Of (Tedious) Work

Avi Chawla

avichawla.substack.com
EDA is a vital but time-consuming task in a data project.

Here are 10 open-source tools that generate an EDA report in seconds.

1. SweetViz

In-depth EDA report in two lines of code.
Covers information about missing values, data statistics, etc.
Creates a variety of data visualizations.
Integrates with Jupyter Notebook.
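
The two-line workflow can be sketched roughly as follows, assuming the `sweetviz` and `pandas` packages are installed. The function name `sweetviz_report`, the output path, and the sample file name are placeholders for illustration and do not come from the slides.

```python
def sweetviz_report(df, out_path="sweetviz_report.html"):
    """Write a SweetViz HTML report for a pandas DataFrame and return its path."""
    # Imported lazily so this sketch can be loaded without sweetviz installed.
    import sweetviz as sv

    report = sv.analyze(df)                         # line 1: analyze the DataFrame
    report.show_html(out_path, open_browser=False)  # line 2: write the HTML report
    return out_path


# Example call (hypothetical file name):
#   import pandas as pd
#   sweetviz_report(pd.read_csv("data.csv"))
```

In a Jupyter notebook, `report.show_notebook()` should render the same report inline instead of writing a file.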
2. Pandas-Profiling

Generates a high-level EDA report of your data in no time.
Covers info about missing values, data statistics, correlation, etc.
Produces data alerts.
Plots data feature interactions.
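
A minimal sketch of generating such a report, assuming `pandas` and the profiling package are installed. Recent releases are published as `ydata-profiling`, so the sketch tries that import first and falls back to the older `pandas_profiling` name. The function name, title, and output path are placeholders.

```python
def profiling_report(df, out_path="profile_report.html", title="EDA Report"):
    """Write a profiling report for a pandas DataFrame and return its path."""
    # Lazy imports: the package was renamed, so try the newer name first.
    try:
        from ydata_profiling import ProfileReport
    except ImportError:
        from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title=title)  # builds the report lazily
    profile.to_file(out_path)                 # renders it to an HTML file
    return out_path


# Example call (hypothetical file name):
#   import pandas as pd
#   profiling_report(pd.read_csv("data.csv"))
```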
3. DataPrep

Supports Pandas and Dask DataFrames.
Interactive visualizations.
10x faster than Pandas-based tools.
Covers info about missing values, data statistics, correlation, etc.
Plots data feature interactions.
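
As a rough sketch, assuming the `dataprep` package is installed, a full report can be requested with `create_report` from `dataprep.eda`. The wrapper name `dataprep_report` is a placeholder.

```python
def dataprep_report(df):
    """Build a DataPrep EDA report object for a DataFrame."""
    # Imported lazily so this sketch can be loaded without dataprep installed.
    from dataprep.eda import create_report

    # The returned report object can be displayed in a notebook.
    return create_report(df)


# Example call (hypothetical file name):
#   import pandas as pd
#   report = dataprep_report(pd.read_csv("data.csv"))
```

The same module also exposes narrower helpers such as `plot`, `plot_missing`, and `plot_correlation` for when a full report is more than needed.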
4. AutoViz

Supports CSV, TXT, and JSON.
Interactive Bokeh charts.
Covers info about missing values, data statistics, correlation, etc.
Presents data cleaning suggestions.
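
A hedged sketch of pointing AutoViz at a file, assuming the `autoviz` package is installed. The parameter names (`sep`, `depVar`, `chart_format`, `verbose`) are used here on the assumption that they match the installed version, and they may differ between releases. The wrapper name and file name are placeholders.

```python
def autoviz_charts(csv_path, target=""):
    """Run AutoViz on a delimited file, optionally with a target column."""
    # Imported lazily so this sketch can be loaded without autoviz installed.
    from autoviz.AutoViz_Class import AutoViz_Class

    av = AutoViz_Class()
    return av.AutoViz(
        csv_path,
        sep=",",                # column delimiter of the input file
        depVar=target,          # target column name, or "" for none
        chart_format="bokeh",   # assumed option for interactive Bokeh charts
        verbose=0,
    )


# Example call (hypothetical file and column names):
#   autoviz_charts("data.csv", target="label")
```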
5. D-Tale

Runs common Pandas operations with no code.
Exports code of analysis.
Integrates with Jupyter Notebook.
Covers info about missing values, data statistics, correlation, etc.
Highlights duplicates, outliers, etc.
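
A minimal sketch of starting a D-Tale session, assuming the `dtale` package is installed. The wrapper name `dtale_session` is a placeholder.

```python
def dtale_session(df):
    """Start a D-Tale session for a pandas DataFrame and return the instance."""
    # Imported lazily so this sketch can be loaded without dtale installed.
    import dtale

    # In a notebook, displaying the returned instance embeds the interactive grid.
    return dtale.show(df)


# Example call (hypothetical file name):
#   import pandas as pd
#   session = dtale_session(pd.read_csv("data.csv"))
#   session.open_browser()  # open the session in a browser tab
```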
6. dabl

Primarily provides visualizations.
Covers a wide range of plots.
Target distribution.
Scatter pair plots.
Histograms.
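
A rough sketch of producing these plots, assuming the `dabl` and `matplotlib` packages are installed. The wrapper name and the column name in the example call are placeholders.

```python
def dabl_plots(df, target_col):
    """Draw dabl's overview plots of the features against a target column."""
    # Imported lazily so this sketch can be loaded without dabl installed.
    import dabl
    import matplotlib.pyplot as plt

    dabl.plot(df, target_col=target_col)  # target distribution plus feature plots
    plt.show()                            # needed when running outside a notebook


# Example call (hypothetical file and column names):
#   import pandas as pd
#   dabl_plots(pd.read_csv("data.csv"), target_col="label")
```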
7. QuickDA

Gets an overview report of the dataset.
Covers info about missing values, data statistics, correlation, etc.
Produces data alerts.
Plots data feature interactions.
8. Datatile

Extends Pandas' describe().
Mostly statistical information.
Provides column stats.
Counts, missing, etc.
Column datatype.
Column type count.
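
To make the listed statistics concrete, here is a plain-Python sketch of the kind of per-column summary described above (counts, missing values, datatype, and a count of columns per type). It does not use Datatile's API. The function names and sample rows are made up for illustration.

```python
from collections import Counter


def column_summary(rows):
    """Summarize each column of a list of dict rows; None marks a missing value."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    summary = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        types = {type(v).__name__ for v in present}
        if len(types) == 1:
            dtype = next(iter(types))
        else:
            dtype = "mixed" if types else "empty"
        summary[name] = {
            "count": len(present),                   # non-missing values
            "missing": len(values) - len(present),   # missing values
            "unique": len(set(present)),             # distinct non-missing values
            "dtype": dtype,                          # Python type name of the column
        }
    return summary


def type_counts(summary):
    """Count how many columns fall under each datatype."""
    return dict(Counter(info["dtype"] for info in summary.values()))


# Hypothetical sample data.
rows = [
    {"age": 31, "city": "Pune", "score": 7.5},
    {"age": None, "city": "Delhi", "score": 8.0},
    {"age": 45, "city": "Pune", "score": None},
]
summary = column_summary(rows)
```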


9. Lux

Integrates with Jupyter Notebook.
Provides visualization recommendations.
Supports EDA on a subset of columns.
Exports code of analysis.
10. ExploriPy

Covers info about missing values, data statistics, correlation, etc.
Performs statistical testing.
Column type-wise distribution.
Continuous
Categorical

Hope that helped.

Check out my daily newsletter to learn something new about Python and Data Science every day.

avichawla.substack.com

Connect with me on LinkedIn.

https://www.linkedin.com/in/avi-chawla
