LO2a) - Introduction To Data Engineering
Himanshu Patel
https://www.mathsisfun.com/data/data.html
• Data engineering is critical wherever data is involved, yet few people can
accurately describe what data engineers do.
• Data drives the operations of businesses small and large. Businesses use data
to provide answers to relevant inquiries that range from consumer interest to
product viability.
• Without a doubt, data is an important part of scaling your business and gaining
valuable insights. And this makes data engineering just as important.
• In March 2019, about 6,500 LinkedIn users listed their title as “data
engineer”. They reported a wide variety of skills, including Python, SQL,
and Java.
Finding the best practices for refining your software development life
cycle
Bringing data together into one place via data integration tools
1. Extract: sensors wait for upstream data sources to generate data (an
upstream source could be machine- or user-generated logs, a copy of a
relational database, an external dataset, etc.).
2. Transform: apply business logic and perform actions such as filtering,
grouping, and aggregation to translate raw data into analysis-ready
datasets.
3. Load: load the processed data and transport it to a final destination.
Often, this dataset can be either
a. consumed directly by end-users, or
b. treated as yet another upstream dependency of another ETL job, forming the
so-called data lineage.
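The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the sample log data, the function names (extract, transform, load, run_etl), the table name purchases_by_customer, and the use of an in-memory SQLite database as the "final destination" are all assumptions made for the example.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Hypothetical raw input standing in for an upstream source
# (for example, user-generated logs exported as CSV).
RAW_LOGS = """customer,action,amount
alice,purchase,30
bob,view,0
alice,purchase,20
bob,purchase,15
"""


def extract(raw_text):
    """Extract: read the rows produced by the upstream source."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: filter to purchases, then group and sum per customer."""
    totals = defaultdict(int)
    for row in rows:
        if row["action"] == "purchase":
            totals[row["customer"]] += int(row["amount"])
    return sorted(totals.items())


def load(records, conn):
    """Load: write the analysis-ready records to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases_by_customer "
        "(customer TEXT PRIMARY KEY, total INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO purchases_by_customer VALUES (?, ?)", records
    )
    conn.commit()


def run_etl(raw_text, conn):
    """Chain the three steps: extract, then transform, then load."""
    load(transform(extract(raw_text)), conn)


# Assumed destination: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
run_etl(RAW_LOGS, conn)
result = conn.execute(
    "SELECT customer, total FROM purchases_by_customer ORDER BY customer"
).fetchall()
print(result)
```

The resulting table can be queried directly by an end-user, or read by a second job as its own upstream source, which is how a chain of ETL jobs builds up data lineage.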
Artificial Intelligence and Data Analytics
source: https://www.youtube.com/watch?v=qWru-b6m030&feature=emb_title
Summary
• Data and its various types
• Data Engineering and its importance
• Data Science and its importance
• Database, Data Warehouse, Data Lake
• Cloud vs On-premise data storage
• ETL