Codecademy Plan En. v. 1.1
05/June/2024
Objective
Within the framework of the Health Innovation Initiatives for HIV and EPI project, this study plan aims to
provide a structured path for Python beginners, empowering users to operate, adapt, and maintain the
scripts that support data processing steps in an automated, systematic way. The plan consists of three
steps:
During the first step, users are required to follow the recommended sequence of courses to build a solid
foundation in Python, with a focus on general programming, data science, and data visualization. Upon
completion of the indicated courses, users will have the level of understanding needed for a successful
knowledge transfer of data processing automation.
The second step consists of a series of virtual webinars, scheduled on a regular basis (e.g. monthly), to
transfer knowledge of the scripts and automation steps that support predefined data processing tasks.
Completion of the first step is a prerequisite for taking full advantage of, and fully grasping, the technical
aspects of the second step.
The third step is suggested as a self-paced, optional continuation of training courses on Codecademy, an
online platform designed as a hands-on training resource for users of all levels. A list of users has been
uploaded to Codecademy, together with suggested coursework.
Study Tips
1. Time Organization
● Set up a weekly study schedule, reserving 3 to 5 hours per week for the courses.
2. Additional Resources
● Take advantage of Codecademy's discussion forums to ask questions and interact with
other students.
● Consult the official documentation for Python, Pandas, NumPy, and Matplotlib to learn
more.
Course Sequence
1. Learn Python 3
● Content: Covers variables, data types, control structures (loops and conditionals), functions,
lists, and dictionaries. This course is ideal for building a solid foundation in Python
programming syntax and best practices.
● Content: Introduction to Python, pandas, Jupyter Notebook, and data manipulation. Students will
learn to load, manipulate, and analyze data sets, laying a foundation for more advanced work in
data science.
● Content: Handling missing data, formatting data, and cleaning data. This course is crucial for
ensuring the data are accurate and usable for subsequent analysis.
● Content: This course introduces the basics of data visualization in Python, focusing on
libraries like Matplotlib and Seaborn. Students learn to create various types of charts and
graphs, understand their use cases, and customize them to convey data insights effectively.
● Content: This course explores how to build persuasive data visualizations that effectively
communicate arguments and insights. Students learn advanced techniques for designing
visuals that support data-driven storytelling and decision-making.
● Content: Focuses on the Seaborn library, an extension of Matplotlib that provides a high-level
interface for drawing attractive and informative statistical graphics. Students learn to create
complex visualizations and gain insights into data patterns and trends.
● Content: This course covers essential data manipulation skills using Python libraries such as
pandas. Students learn techniques for importing, cleaning, transforming, and analyzing data
to prepare it for visualization and further analysis.
● Content: This course teaches exploratory data analysis (EDA) techniques using Python. Students
learn to summarize the main characteristics of data sets, uncover patterns, spot anomalies, test
hypotheses, and check assumptions through graphical and quantitative methods; a brief
illustrative sketch of this kind of workflow follows the course list.
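To give a concrete taste of the workflow these courses cover, the following minimal Python sketch loads a
data set with pandas, handles missing values, and draws a simple chart with Matplotlib and Seaborn. The
file and column names (surveillance_data.csv, month, cases) are illustrative assumptions, not part of the
curriculum:

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Load a data set into a pandas DataFrame (file name is illustrative only)
    df = pd.read_csv("surveillance_data.csv")

    # Clean: drop fully empty rows and fill missing case counts with zero
    df = df.dropna(how="all")
    df["cases"] = df["cases"].fillna(0)

    # Quick exploratory summary of the numeric columns
    print(df.describe())

    # Visualize case counts per month with Seaborn on top of Matplotlib
    sns.lineplot(data=df, x="month", y="cases")
    plt.title("Reported cases per month")
    plt.tight_layout()
    plt.show()

This load, clean, summarize, and plot pattern is the kind of workflow the courses prepare users to carry
out on their own data sets.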
Forthcoming
After completing the listed courses, users are free to explore the other Python courses available on
Codecademy. Their accounts will remain active for another year, allowing them to deepen and expand the
knowledge they have acquired. The platform offers a variety of resources that build on the completion of
the short courses. Two main paths are available to strengthen and solidify technical knowledge in
programming:
Skill paths: groupings of 5-6 courses that usually end with the submission of a project. After completion,
users strengthen their technical skills, gain additional insights into data processing, and earn a
certificate. A suggested list is outlined below:
● Content: This path covers the fundamentals of Python, including variables, data types, control
structures, functions, and essential libraries like pandas and NumPy. Ideal for building a
strong programming foundation for data science. By the end of this path, you will be able to
write Python scripts to manipulate data and perform basic analyses.
● Duration: 15 hours
● Content: Focuses on data manipulation and analysis using Python, primarily with the pandas
library. It includes techniques for cleaning, transforming, and analyzing data sets, providing a
solid basis for data analysis.
● Content: Teaches how to create compelling data visualizations using Matplotlib and Seaborn.
Students learn to turn data into informative and engaging visual insights, essential for
communicating results effectively.
● Content: Covers fundamental statistical concepts and their application using Python. This path
includes probability, hypothesis testing, and statistical modeling, equipping students with
essential skills for data analysis; a brief sketch of a simple hypothesis test follows this list.
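As a small illustration of what the statistics material covers, the sketch below runs an independent
two-sample t-test with SciPy. The numbers are made up purely for demonstration, and SciPy itself is an
assumption here, since the path's exact tooling is not listed:

    import numpy as np
    from scipy import stats

    # Two illustrative samples (made-up values, for demonstration only)
    group_a = np.array([12.1, 11.8, 12.5, 12.0, 11.9, 12.3])
    group_b = np.array([11.2, 11.5, 11.0, 11.4, 11.6, 11.3])

    # Independent two-sample t-test: do the group means differ?
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

    # A small p-value (for example, below 0.05) suggests the observed
    # difference is unlikely to be due to chance alone.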
Career paths: a longer and more advanced curriculum, usually targeting individuals who would like to
advance in their professional careers and obtain stronger technical knowledge. Completing a career path
demands a stronger commitment to the required coursework. A couple of suggestions include:
1. Data Analyst
● Content: Provides comprehensive training in data analysis, covering SQL, Python, data
visualization, and statistical analysis. The path includes practical projects to apply skills
learned, preparing students for data analyst roles.
2. Data Engineer
● Content: Focuses on data engineering skills such as SQL, Python, data warehousing, ETL
processes, and big data tools. Emphasis is on building and maintaining efficient data pipelines
and infrastructure; a minimal ETL sketch follows below.
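To make the ETL idea concrete, the following minimal sketch extracts records from a CSV file, applies a
simple transformation with pandas, and loads the result into a local SQLite database. The file, table, and
database names are illustrative assumptions only:

    import sqlite3
    import pandas as pd

    # Extract: read raw records from a CSV file (file name is illustrative)
    raw = pd.read_csv("raw_records.csv")

    # Transform: normalize column names and drop duplicate rows
    raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
    clean = raw.drop_duplicates()

    # Load: write the cleaned table into a local SQLite database
    with sqlite3.connect("warehouse.db") as conn:
        clean.to_sql("records", conn, if_exists="replace", index=False)

    print(f"Loaded {len(clean)} rows into warehouse.db")

A production pipeline would add scheduling, logging, and error handling; the sketch only shows the basic
extract-transform-load shape that the Data Engineer path teaches in depth.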