3. Once you have the data from the source, perform the following steps:
a. First, get the data from the source.
b. Using Spark, flatten all the columns from the source.
i. Flatten the column names: if the source has nested columns, unnest
them.
Example:
"test": "ha",
"Feature": [
{"Type": "abc",
"Name": "abc"},
{"Type": "pqr",
"Name": "pqr"}
]
After flattening the above JSON, the target table should contain the
following columns:
test, feature_type, feature_name
c. Store the flattened data in the target location:
earthquakeanalysis/raw/<date in YYYYMMDD>/<target file>.parquet
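The flattening and naming rules in step 3 can be sketched in plain Python as follows. This is only an illustration of the expected column names and row expansion, not the Spark job itself; in Spark the same result is commonly obtained by applying `explode` to the array column and selecting the nested fields under aliased names. The helper names `flatten` and `raw_target_path`, the file name `earthquakes`, and the run date are assumptions made for the example.

```python
from datetime import date


def flatten(record, parent=""):
    """Return a list of flat rows (dicts) for one nested record.

    Nested keys are joined with "_" and lower-cased. Each element of a
    list of dicts becomes its own row, similar in spirit to Spark's explode.
    """
    rows = [{}]
    for key, value in record.items():
        name = f"{parent}_{key}".lower() if parent else key.lower()
        if isinstance(value, dict):
            children = flatten(value, name)
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            children = [row for item in value for row in flatten(item, name)]
        else:
            children = [{name: value}]
        # Combine every row built so far with the rows produced by this key.
        rows = [{**left, **right} for left in rows for right in children]
    return rows


def raw_target_path(run_date, file_name):
    """Build earthquakeanalysis/raw/<YYYYMMDD>/<file_name>.parquet."""
    return f"earthquakeanalysis/raw/{run_date:%Y%m%d}/{file_name}.parquet"


# The example record from step 3b.
example = {
    "test": "ha",
    "Feature": [
        {"Type": "abc", "Name": "abc"},
        {"Type": "pqr", "Name": "pqr"},
    ],
}

for row in flatten(example):
    print(row)
# {'test': 'ha', 'feature_type': 'abc', 'feature_name': 'abc'}
# {'test': 'ha', 'feature_type': 'pqr', 'feature_name': 'pqr'}

# "earthquakes" and the date are placeholder values, not from the document.
print(raw_target_path(date(2024, 10, 19), "earthquakes"))
# earthquakeanalysis/raw/20241019/earthquakes.parquet
```

Producing one output row per array element keeps the target table strictly tabular, which is what the column list test, feature_type, feature_name implies.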
4. Once this is done, the questions for generating the Analysis Layer will be shared with you.
"mag": 0.89,
"place": "6 km NW of The Geysers, CA",
"time": 1729308248850,
"updated": 1729308343908,
"tz": null,
"url": "https://earthquake.usgs.gov/earthquakes/eventpage/nc75076006",
"detail": "https://earthquake.usgs.gov/earthquakes/feed/v1.0/detail/nc75076006.geojson",
"felt": null,
"cdi": null,
"mmi": null,
"alert": null,
"status": "automatic",
"tsunami": 0,
"sig": 12,
"net": "nc",
"code": "75076006",
"ids": ",nc75076006,",
"sources": ",nc,",
"types": ",nearby-cities,origin,phase-data,",
"nst": 9,
"dmin": 0.01303,
"rms": 0.02,
"gap": 77,
"magType": "md",
"type": "earthquake",
"title": "M 0.9 - 6 km NW of The Geysers, CA",
"geometry": {
"longitude": -122.813163757324,
"latitude": 38.8125,
"depth": 3.25999999046326
}
BQ Table: earthquake_db.earthquake_data
Cloud Composer
Historical load - Manual; this is a one-time activity
Daily Load -
- Ingestion - Transformation - BQ load
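The daily load order can be sketched as three stages chained in sequence. This is a minimal outline, assuming each stage is passed in as a plain function; `run_daily_load` and the stand-in stage functions are hypothetical names used only to show the call order. In Cloud Composer each stage would presumably become its own task, which is not shown here.

```python
from datetime import date

BQ_TABLE = "earthquake_db.earthquake_data"  # table name given in the document


def run_daily_load(run_date, ingest, transform, load):
    """Run ingestion, transformation and the BQ load, in that order."""
    raw = ingest(run_date)
    rows = transform(raw)
    return load(rows, BQ_TABLE)


# Stand-in stages (hypothetical); real ones would call the source API,
# the Spark flattening job and the BigQuery load respectively.
def fake_ingest(run_date):
    return [{"mag": 0.89, "geometry": {"depth": 3.25999999046326}}]


def fake_transform(raw):
    return [{"mag": r["mag"], "geometry_depth": r["geometry"]["depth"]} for r in raw]


def fake_load(rows, table):
    return f"{len(rows)} row(s) -> {table}"


print(run_daily_load(date(2024, 10, 19), fake_ingest, fake_transform, fake_load))
# 1 row(s) -> earthquake_db.earthquake_data
```

Passing the stages in as arguments keeps the ordering logic testable without access to the source, Spark, or BigQuery.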