Must Know Python Topics For Data Engineers

The document outlines key topics related to Python programming and data handling, including data types, file handling, and libraries such as Pandas and NumPy. It covers automation, testing, object-oriented programming, and packaging, along with cloud integration and database management. Additionally, it addresses code quality, data modeling, and CI/CD practices.

Uploaded by

Nayan M Gowda
Copyright
© All Rights Reserved

Topic                Description

Python Fundamentals  Data types, loops, functions, error handling, comprehensions
File Handling        Read/write text, CSV, JSON, Parquet, compressed formats
Working with         Pandas, NumPy, SQLAlchemy, Requests, PyYAML, PySpark
Data Cleaning        Nulls, regex, datetime handling, data type conversions
Automation           ETL pipelines, file watchers, job scheduling, Airflow
Testing &            unittest, pytest, logging, pdb/ipdb for debugging
Object-Oriented      Classes, inheritance, dataclasses, abstraction
Packaging            venv, pipenv, poetry, setup.py, requirements.txt
Parallelism          threading, multiprocessing, asyncio, profiling tools
Cloud & Ex           Boto3 for AWS, REST APIs, cloud SDKs
Databases            SQL (psycopg2, SQLAlchemy), NoSQL (MongoDB, Redis), transactions
Visualization        matplotlib, seaborn, plotly, PDF/HTML reports
Logging &            Standard logging module, log levels, log rotation, structured logs
Config & P           Using YAML/JSON configs, argparse, dotenv
Code Quality         flake8, black, isort, pre-commit hooks
Typing & St          typing module, mypy, type annotations
Data Modeling        Star/Snowflake schemas, ER diagrams, data dictionaries
CI/CD & De           GitHub Actions, Docker, deployment automation
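
Several rows of the table (file handling, data cleaning, error handling, dataclasses, logging, type annotations) can be seen working together in one small example. The following is only a minimal sketch using the standard library; the `Order` record, the `RAW_CSV` sample data, and the function names are hypothetical and chosen purely for illustration, not taken from the source.

```python
import csv
import io
import json
import logging
import re
from dataclasses import asdict, dataclass
from datetime import date, datetime
from typing import Iterator, List, Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl_sketch")

# Hypothetical input; in practice this would be read from a file or an API.
RAW_CSV = """order_id,customer,amount,order_date
1001, Alice ,19.99,2024-01-05
1002,Bob,,2024-01-06
1003,carol,abc,2024-01-07
1004,Dave,42.50,not-a-date
"""

# Regex check for a plain non-negative decimal number.
AMOUNT_RE = re.compile(r"^\d+(\.\d+)?$")


@dataclass
class Order:
    order_id: int
    customer: str
    amount: Optional[float]  # None marks a missing (null) value
    order_date: date


def parse_orders(text: str) -> Iterator[Order]:
    """Yield cleaned Order records, logging and skipping unparseable rows."""
    for row in csv.DictReader(io.StringIO(text)):
        try:
            raw_amount = row["amount"].strip()
            if raw_amount and not AMOUNT_RE.match(raw_amount):
                raise ValueError(f"bad amount: {raw_amount!r}")
            yield Order(
                order_id=int(row["order_id"]),
                customer=row["customer"].strip().title(),
                amount=float(raw_amount) if raw_amount else None,
                order_date=datetime.strptime(
                    row["order_date"].strip(), "%Y-%m-%d"
                ).date(),
            )
        except ValueError as exc:
            logger.warning("skipping row %s: %s", row.get("order_id"), exc)


def to_json(orders: List[Order]) -> str:
    """Serialize orders to JSON; dates are written as ISO strings."""
    return json.dumps([asdict(o) for o in orders], default=str, indent=2)


orders = list(parse_orders(RAW_CSV))
print(to_json(orders))
```

Rows with a malformed amount or date are logged and skipped, while an empty amount is kept as a null. In a real pipeline the same shape would more likely be built on Pandas or PySpark, with the input path supplied through a config file or `argparse`.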
