Python Capstone Implementation
import pandas as pd
data = pd.read_excel("water_leak_detection_1000_rows.xlsx")
data.head()
Timestamp Sensor_ID Pressure (bar) Flow Rate (L/s) Temperature (°C) Leak Status Burst Status
data.shape
(1000, 7)
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Timestamp 1000 non-null datetime64[ns]
1 Sensor_ID 1000 non-null object
2 Pressure (bar) 1000 non-null float64
3 Flow Rate (L/s) 1000 non-null float64
4 Temperature (°C) 1000 non-null float64
5 Leak Status 1000 non-null int64
6 Burst Status 1000 non-null int64
dtypes: datetime64[ns](1), float64(3), int64(2), object(1)
memory usage: 54.8+ KB
data.isnull().sum()
Timestamp 0
Sensor_ID 0
Pressure (bar) 0
Flow Rate (L/s) 0
Temperature (°C) 0
Leak Status 0
Burst Status 0
dtype: int64
data.describe()
https://colab.research.google.com/drive/1sY2gcm-d9PRkpgIuJGIhjFr75I0Q6pxx#scrollTo=2V9uqbNf7fqu&printMode=true 1/4
7/6/25, 3:03 PM Python Capstone Implementation - Colab
Timestamp Pressure (bar) Flow Rate (L/s) Temperature (°C) Leak Status Burst Status
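import matplotlib.pyplot as plt
import seaborn as sns

# Correlate the numeric columns only (Sensor_ID is a string, Timestamp a datetime)
data_numeric = data.select_dtypes(include=['number'])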
corr_matrix = data_numeric.corr()
plt.figure(figsize=(10, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
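The heatmap shows every pairwise correlation, but for leak detection the column of interest is the one for Leak Status. The sketch below ranks the other numeric columns by the absolute value of their correlation with that target. The helper name `rank_by_target_corr` and the small `sample` frame are assumptions made for illustration; the values are invented and are not taken from the Excel file.

```python
import pandas as pd

# Hypothetical stand-in for `data`; the values are illustrative only.
sample = pd.DataFrame({
    "Pressure (bar)": [3.6, 3.4, 2.1, 3.5, 1.9, 3.3],
    "Flow Rate (L/s)": [90.0, 110.0, 210.0, 95.0, 230.0, 120.0],
    "Temperature (°C)": [15.0, 18.0, 16.0, 17.0, 15.5, 16.5],
    "Leak Status": [0, 0, 1, 0, 1, 0],
})

def rank_by_target_corr(df, target):
    """Return each numeric column's correlation with `target`, strongest first."""
    corr = df.select_dtypes(include=["number"]).corr()
    # Drop the target's correlation with itself, then sort by magnitude.
    return corr[target].drop(target).sort_values(key=abs, ascending=False)

ranking = rank_by_target_corr(sample, "Leak Status")
print(ranking)
```

On the real dataset the same call would be `rank_by_target_corr(data, "Leak Status")`, and likewise for "Burst Status".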
plt.figure(figsize=(10, 6))
sns.histplot(data['Flow Rate (L/s)'], kde=True, bins=30, color='skyblue')
plt.title('Distribution of Flow Rate (L/s)')
plt.xlabel('Flow Rate (L/s)')
plt.ylabel('Count')
plt.grid(True)
plt.show()
import pandas as pd
data = pd.read_excel("water_leak_detection_1000_rows.xlsx")
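The labelled output that follows this cell ("First 5 rows:", "Column names:", "Data types:", "Null values:", "Summary statistics:") suggests the rest of the cell printed each overview in turn. The original print statements are not in the export, so the block below is a reconstruction under that assumption and not the notebook's actual code. The helper name `inspect` and the `demo` frame are hypothetical; `demo` reuses the first three rows shown in the output so the block runs without the Excel file.

```python
import pandas as pd

def inspect(df):
    """Print the labelled overview sections for a DataFrame."""
    print("First 5 rows:")
    print(df.head())
    print("Column names:")
    print(df.columns.tolist())
    print("Data types:")
    print(df.dtypes)
    print("Null values:")
    print(df.isnull().sum())
    print("Summary statistics:")
    print(df.describe())

# Small stand-in built from the first rows of the printed output.
demo = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2024-01-01 00:00:00",
        "2024-01-01 00:05:00",
        "2024-01-01 00:10:00",
    ]),
    "Sensor_ID": ["S007", "S007", "S002"],
    "Pressure (bar)": [3.694814, 2.587125, 2.448965],
    "Flow Rate (L/s)": [77.515218, 179.926422, 210.130823],
})

inspect(demo)
```

With the real file loaded, `inspect(data)` would be the equivalent call.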
First 5 rows:
Timestamp Sensor_ID Pressure (bar) Flow Rate (L/s) \
0 2024-01-01 00:00:00 S007 3.694814 77.515218
1 2024-01-01 00:05:00 S007 2.587125 179.926422
2 2024-01-01 00:10:00 S002 2.448965 210.130823
3 2024-01-01 00:15:00 S009 2.936844 141.777934
4 2024-01-01 00:20:00 S003 3.073693 197.484633
Column names:
['Timestamp', 'Sensor_ID', 'Pressure (bar)', 'Flow Rate (L/s)', 'Temperature (°C)', 'Leak Status', 'Burst Status']
Data types:
Timestamp datetime64[ns]
Sensor_ID object
Pressure (bar) float64
Flow Rate (L/s) float64
Temperature (°C) float64
Leak Status int64
Burst Status int64
dtype: object
Null values:
Timestamp 0
Sensor_ID 0
Pressure (bar) 0
Flow Rate (L/s) 0
Temperature (°C) 0
Leak Status 0
Burst Status 0
dtype: int64
Summary statistics:
Timestamp Pressure (bar) Flow Rate (L/s) \
count 1000 1000.000000 1000.000000
mean 2024-01-02 17:37:30 3.220696 125.038082
min 2024-01-01 00:00:00 0.910977 50.654490
25% 2024-01-01 20:48:45 2.859332 87.946866
50% 2024-01-02 17:37:30 3.265711 124.106896
75% 2024-01-03 14:26:15 3.607196 162.086708
max 2024-01-04 11:15:00 3.995364 331.754081
std NaN 0.488997 44.121419
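The Flow Rate maximum (331.75 L/s) sits far above the 75th percentile (162.09 L/s), which is worth checking before modelling. The sketch below applies Tukey's 1.5 × IQR rule to the quartiles printed above. Using this rule is an assumption made here for illustration, not a step from the notebook, and `tukey_upper_fence` is a hypothetical helper name.

```python
# Quartiles and maximum for Flow Rate (L/s), copied from the summary above.
q1, q3, max_flow = 87.946866, 162.086708, 331.754081

def tukey_upper_fence(q1, q3, k=1.5):
    """Upper outlier fence: Q3 + k * IQR, where IQR = Q3 - Q1."""
    return q3 + k * (q3 - q1)

fence = tukey_upper_fence(q1, q3)
print(f"IQR = {q3 - q1:.2f} L/s, upper fence = {fence:.2f} L/s")
print("max exceeds fence:", max_flow > fence)
```

The fence works out to about 273.30 L/s, so the maximum would be flagged under this rule. Whether such readings are sensor errors or genuine burst events is something the Burst Status column would have to confirm.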