Codecademy Plan En. v. 1.1
05/June/2024
Objective
Within the framework of the Health Innovation Initiatives for HIV and EPI project, this study plan aims to
provide a structured path for Python beginners, empowering users to operate, adapt, and maintain the
scripts that support data processing steps in an automated, systematic way. The plan consists of three
steps:
During the first step, users are required to follow the recommended sequence of courses to build a solid
foundation in Python, with a focus on general programming, data science, and data visualization. Upon
completion of the indicated courses, users will have the level of understanding needed for a successful
knowledge transfer of data processing automation.
The second step consists of a series of virtual webinars, scheduled on a regular basis (e.g. monthly), to
transfer knowledge of the scripts and automation steps that support predefined data processing tasks.
Completion of the first step is a prerequisite for taking full advantage of, and fully grasping, the technical
aspects of the second step.
The third step is suggested as a self-paced, optional continuation of training courses on Codecademy, an
online platform designed as a hands-on training resource for users of all levels. A list of users has been
uploaded to Codecademy, together with suggested coursework.
Study Tips
1. Time Organization
● Set up a weekly study schedule, reserving 3 to 5 hours per week for the courses.
2. Additional Resources
● Take advantage of Codecademy's discussion forums to ask questions and interact with
other students.
● Consult the official documentation for Python, Pandas, NumPy, and Matplotlib to learn
more.
Course Sequence
1. Learn Python 3
● Content: Covers variables, data types, control structures (loops and conditionals), functions,
lists, and dictionaries. This course is ideal for building a solid foundation in Python
programming syntax and best practices.
● Content: Introduction to Python, pandas, Jupyter Notebook, and data manipulation. Students will
learn to load, manipulate, and analyze data sets, laying a foundation for more advanced work in
data science.
● Content: Handling missing data, formatting data, and cleaning data. This course is crucial for
ensuring the data are accurate and usable for subsequent analysis.
● Content: This course introduces the basics of data visualization in Python, focusing on
libraries like Matplotlib and Seaborn. Students learn to create various types of charts and
graphs, understand their use cases, and customize them to convey data insights effectively.
● Content: This course explores how to build persuasive data visualizations that effectively
communicate arguments and insights. Students learn advanced techniques for designing
visuals that support data-driven storytelling and decision-making.
● Content: Focuses on the Seaborn library, an extension of Matplotlib that provides a high-level
interface for drawing attractive and informative statistical graphics. Students learn to create
complex visualizations and gain insights into data patterns and trends.
● Content: This course covers essential data manipulation skills using Python libraries such as
pandas. Students learn techniques for importing, cleaning, transforming, and analyzing data
to prepare it for visualization and further analysis.
● Content: This course teaches exploratory data analysis (EDA) techniques using Python. Students
learn to summarize the main characteristics of data sets, uncover patterns, spot anomalies, test
hypotheses, and check assumptions through graphical and quantitative methods; a brief
illustrative sketch of this kind of workflow follows the course list.
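To give a concrete taste of the workflow these courses cover, the following minimal Python sketch loads a
data set with pandas, handles missing values, and draws a simple chart with Matplotlib and Seaborn. The
file and column names (surveillance_data.csv, month, cases) are illustrative assumptions, not part of the
curriculum:

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Load a data set into a pandas DataFrame (file name is illustrative only)
    df = pd.read_csv("surveillance_data.csv")

    # Clean: drop fully empty rows and fill missing case counts with zero
    df = df.dropna(how="all")
    df["cases"] = df["cases"].fillna(0)

    # Quick exploratory summary of the numeric columns
    print(df.describe())

    # Visualize case counts per month with Seaborn on top of Matplotlib
    sns.lineplot(data=df, x="month", y="cases")
    plt.title("Reported cases per month")
    plt.tight_layout()
    plt.show()

This load, clean, summarize, and plot pattern is the kind of workflow the courses prepare users to carry
out on their own data sets.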
Forthcoming
After completing the listed courses, users are free to explore the other Python courses available on
Codecademy. Their accounts will remain active for another year, allowing them to deepen and expand the
knowledge they have acquired. The platform offers a variety of resources that build on the completion of
the short courses. Two main paths are available to strengthen and solidify technical knowledge in
programming:
Skill paths: groupings of 5-6 courses that usually end with the submission of a project. After completion,
users strengthen their technical skills, gain additional insights into data processing, and earn a
certificate. A suggested list is outlined below:
● Content: This path covers the fundamentals of Python, including variables, data types, control
structures, functions, and essential libraries like pandas and NumPy. Ideal for building a
strong programming foundation for data science. By the end of this path, you will be able to
write Python scripts to manipulate data and perform basic analyses.
● Duration: 15 hours
● Content: Focuses on data manipulation and analysis using Python, primarily with the pandas
library. It includes techniques for cleaning, transforming, and analyzing data sets, providing a
solid basis for data analysis.
● Content: Teaches how to create compelling data visualizations using Matplotlib and Seaborn.
Students learn to turn data into informative and engaging visual insights, essential for
communicating results effectively.
● Content: Covers fundamental statistical concepts and their application using Python. This path
includes probability, hypothesis testing, and statistical modeling, equipping students with
essential skills for data analysis; a brief sketch of a simple hypothesis test follows this list.
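As a small illustration of what the statistics material covers, the sketch below runs an independent
two-sample t-test with SciPy. The numbers are made up purely for demonstration, and SciPy itself is an
assumption here, since the path's exact tooling is not listed:

    import numpy as np
    from scipy import stats

    # Two illustrative samples (made-up values, for demonstration only)
    group_a = np.array([12.1, 11.8, 12.5, 12.0, 11.9, 12.3])
    group_b = np.array([11.2, 11.5, 11.0, 11.4, 11.6, 11.3])

    # Independent two-sample t-test: do the group means differ?
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

    # A small p-value (for example, below 0.05) suggests the observed
    # difference is unlikely to be due to chance alone.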
Career paths: a longer and more advanced curriculum, usually targeting individuals who would like to
advance in their professional careers and obtain stronger technical knowledge. Completing a career path
demands a stronger commitment to the required coursework. A couple of suggestions include:
1. Data Analyst
● Content: Provides comprehensive training in data analysis, covering SQL, Python, data
visualization, and statistical analysis. The path includes practical projects to apply skills
learned, preparing students for data analyst roles.
2. Data Engineer
● Content: Focuses on data engineering skills such as SQL, Python, data warehousing, ETL
processes, and big data tools. Emphasis is on building and maintaining efficient data pipelines
and infrastructure; a minimal ETL sketch follows below.
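To make the ETL idea concrete, the following minimal sketch extracts records from a CSV file, applies a
simple transformation with pandas, and loads the result into a local SQLite database. The file, table, and
database names are illustrative assumptions only:

    import sqlite3
    import pandas as pd

    # Extract: read raw records from a CSV file (file name is illustrative)
    raw = pd.read_csv("raw_records.csv")

    # Transform: normalize column names and drop duplicate rows
    raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
    clean = raw.drop_duplicates()

    # Load: write the cleaned table into a local SQLite database
    with sqlite3.connect("warehouse.db") as conn:
        clean.to_sql("records", conn, if_exists="replace", index=False)

    print(f"Loaded {len(clean)} rows into warehouse.db")

A production pipeline would add scheduling, logging, and error handling; the sketch only shows the basic
extract-transform-load shape that the Data Engineer path teaches in depth.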