
Analysis of 2nd Question Using Python

Uploaded by

sinhaakash662

Importing the required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

QUESTION 2

QUESTION 2(1)
Importing the data from the source. The workbook contains multiple sheets, so I used sheet_name to load the
right dataset.

df = pd.read_excel(r"C:\Users\sinha\Downloads\data careerera\may_generator2023.xlsx", sheet_name ="Planned")

df

      Entity ID  Entity Name                      Plant ID  Plant Name                       Google Map  Bing Map  Plant State  County      Balancing Authority Code  Sector            ...  Net Summer Capacity (MW)  Net Winter Capacity (MW)  Technology
0     64516      Azimuth 180 Solar Electric, LLC  65164.0   Grinnell College                 Map         Map       IA           Poweshiek   MISO                      IPP Non-CHP       ...  3.9                       3.9                       Solar Photovoltaic
1     49893      Invenergy Services LLC           65522.0   Number Three Wind Project        Map         Map       NY           Lewis       NYIS                      IPP Non-CHP       ...  106.0                     106                       Onshore Wind Turbine
2     65277      Leyline Renewable Capital        66115.0   HCE Amelia Solar I, LLC          Map         Map       VA           Amelia      PJM                       IPP Non-CHP       ...  5.0                       5                         Solar Photovoltaic
3     65277      Leyline Renewable Capital        66116.0   HCE Amelia Solar II, LLC         Map         Map       VA           Amelia      PJM                       IPP Non-CHP       ...  5.0                       5                         Solar Photovoltaic
4     65277      Leyline Renewable Capital        66117.0   HCE Millboro Springs Solar, LLC  Map         Map       VA           Bath        PJM                       IPP Non-CHP       ...  5.0                       5                         Solar Photovoltaic
...   ...        ...                              ...       ...                              ...         ...       ...          ...         ...                       ...               ...  ...                       ...                       ...
1523  40575      Utah Associated Mun Power Sys    61075.0   UAMPS Carbon Free Power Plant    Map         Map       ID           Bonneville  PACE                      Electric Utility  ...  72.7                      72.7                      Nuclear
1524  40575      Utah Associated Mun Power Sys    61075.0   UAMPS Carbon Free Power Plant    Map         Map       ID           Bonneville  PACE                      Electric Utility  ...  72.7                      72.7                      Nuclear
1525  40575      Utah Associated Mun Power Sys    61075.0   UAMPS Carbon Free Power Plant    Map         Map       ID           Bonneville  PACE                      Electric Utility  ...  72.7                      72.7                      Nuclear
1526  NaN        NaN                              NaN       NaN                              NaN         NaN       NaN          NaN         NaN                       NaN               ...  NaN                       NaN                       NaN
1527  NOTES:\nCapacity from facilities with a total ...  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN

1528 rows × 23 columns
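The last two rows of the preview (a blank row and a NOTES footer) are not generator records. A minimal sketch of one way to drop them, assuming that every real record carries a Plant ID; the miniature DataFrame below is hypothetical and only mimics the shape of the sheet:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the "Planned" sheet: two data rows, a blank row,
# and a footer row whose only value is the NOTES text in the first column.
sample = pd.DataFrame({
    "Entity ID": [64516, 49893, np.nan, "NOTES: ..."],
    "Plant ID": [65164.0, 65522.0, np.nan, np.nan],
    "Plant State": ["IA", "NY", np.nan, np.nan],
})

# Keep only rows that carry a Plant ID; the blank and NOTES rows have none.
clean = sample.dropna(subset=["Plant ID"]).reset_index(drop=True)
print(len(sample), len(clean))
```

On the real sheet the same call would be applied to df before counting or filtering, so that the footer rows cannot affect the results.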

df.info() # information about the data


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1528 entries, 0 to 1527
Data columns (total 23 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Entity ID 1527 non-null object
1 Entity Name 1526 non-null object
2 Plant ID 1526 non-null float64
3 Plant Name 1526 non-null object
4 Google Map 1526 non-null object
5 Bing Map 1526 non-null object
6 Plant State 1526 non-null object
7 County 1521 non-null object
8 Balancing Authority Code 1521 non-null object
9 Sector 1526 non-null object
10 Generator ID 1524 non-null object
11 Unit Code 3 non-null object
12 Nameplate Capacity (MW) 1526 non-null float64
13 Net Summer Capacity (MW) 1526 non-null float64
14 Net Winter Capacity (MW) 1526 non-null object
15 Technology 1526 non-null object
16 Energy Source Code 1526 non-null object
17 Prime Mover Code 1526 non-null object
18 Planned Operation Month 1526 non-null float64
19 Planned Operation Year 1526 non-null float64
20 Status 1526 non-null object
21 Latitude 1526 non-null float64
22 Longitude 1526 non-null float64
dtypes: float64(7), object(16)
memory usage: 274.7+ KB
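In the summary above, Net Winter Capacity (MW) is stored as object while the other capacity columns are float64, which usually means at least one cell is not a plain number. A small sketch of how such a column could be converted with pd.to_numeric; the sample values, including the non-numeric entry, are assumptions, since the actual offending cells are not shown:

```python
import pandas as pd

# Hypothetical values: a single non-numeric cell is enough to make the
# whole column `object` instead of float64.
winter = pd.Series([3.9, 106, "not available", 5],
                   name="Net Winter Capacity (MW)", dtype=object)

# errors="coerce" turns anything that cannot be parsed into NaN.
winter_numeric = pd.to_numeric(winter, errors="coerce")
print(winter_numeric.dtype)
print(winter_numeric.isna().sum())
```

Checking how many values become NaN after the conversion is a quick way to see how many cells were not numeric in the first place.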

df.duplicated().sum() # check for duplicate rows
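For reference, a minimal sketch of what the duplicate check counts, using a hypothetical three-row frame (the Generator ID values are made up): duplicated() flags a row only when every column matches an earlier row.

```python
import pandas as pd

# Hypothetical rows: the third row repeats the second one exactly.
sample = pd.DataFrame({
    "Plant ID": [66115.0, 66116.0, 66116.0],
    "Generator ID": ["PV1", "PV1", "PV1"],
})

n_dupes = sample.duplicated().sum()   # rows identical to an earlier row
deduped = sample.drop_duplicates()    # keeps the first copy of each row
print(n_dupes, len(deduped))
```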

df[(df["Planned Operation Year"]<=2025)].count() # count the rows with a planned operation year of 2025 or earlier

Entity ID 1314
Entity Name 1314
Plant ID 1314
Plant Name 1314
Google Map 1314
Bing Map 1314
Plant State 1314
County 1313
Balancing Authority Code 1313
Sector 1314
Generator ID 1312
Unit Code 2
Nameplate Capacity (MW) 1314
Net Summer Capacity (MW) 1314
Net Winter Capacity (MW) 1314
Technology 1314
Energy Source Code 1314
Prime Mover Code 1314
Planned Operation Month 1314
Planned Operation Year 1314
Status 1314
Latitude 1314
Longitude 1314
dtype: int64

(df['Status'].str.contains('Under construction Planned', case=False)).count() # .count() returns the number of non-null entries, not the number of matches

1526
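The value 1526 equals the number of non-null Status entries reported by df.info(), because count() counts entries rather than True values. A short sketch of the difference between count() and sum() on a boolean mask, and between a literal phrase and a regex alternation; the three status labels are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical status labels, plus one missing value.
status = pd.Series(["Under construction", "Planned", np.nan])

# Literal phrase: no single label contains the whole string.
literal = status.str.contains("Under construction Planned", case=False, na=False)

# Regex alternation: matches labels containing either phrase.
either = status.str.contains("Under construction|Planned", case=False, na=False)

print(literal.count())  # number of entries in the mask
print(literal.sum())    # number of True values
print(either.sum())
```

Here sum() is the call that answers "how many rows match", while count() only reports the size of the mask.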

number_of_plant_to_establish = df[(df["Planned Operation Year"]<=2025) & df['Status'].str.contains('Under constructio


# Count the plants that are under construction and due to be established by 2025

number_of_plant_to_establish.count()
Entity ID 1314
Entity Name 1314
Plant ID 1314
Plant Name 1314
Google Map 1314
Bing Map 1314
Plant State 1314
County 1313
Balancing Authority Code 1313
Sector 1314
Generator ID 1312
Unit Code 2
Nameplate Capacity (MW) 1314
Net Summer Capacity (MW) 1314
Net Winter Capacity (MW) 1314
Technology 1314
Energy Source Code 1314
Prime Mover Code 1314
Planned Operation Month 1314
Planned Operation Year 1314
Status 1314
Latitude 1314
Longitude 1314
dtype: int64

The output above shows that 1314 plant entries (rows) are planned to be operating by 2025.
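The preview of the sheet shows several rows sharing one Plant ID (rows 1523 to 1525, for example), so the number of rows and the number of distinct plants can differ. A minimal sketch of counting both, with hypothetical years and a hypothetical mix of Plant IDs:

```python
import pandas as pd

# Hypothetical sample: plant 66117 appears twice (two generators).
planned = pd.DataFrame({
    "Plant ID": [66115.0, 66116.0, 66117.0, 66117.0, 61075.0],
    "Planned Operation Year": [2023.0, 2023.0, 2024.0, 2024.0, 2030.0],
})

by_2025 = planned[planned["Planned Operation Year"] <= 2025]

n_rows = len(by_2025)                      # generator-level rows
n_plants = by_2025["Plant ID"].nunique()   # distinct plants
print(n_rows, n_plants)
```

Which of the two figures is the right answer depends on whether the question asks about generators or about plants.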

QUESTION 2(2)
Importing the data from the source again. The workbook contains multiple sheets, so I used
sheet_name to load the right dataset; here the sheet name is Operating.
df2 = pd.read_excel(r"C:\Users\sinha\Downloads\data careerera\may_generator2023.xlsx", sheet_name ="Operating")

df2

       Entity ID  Entity Name                     Plant ID  Plant Name             Google Map  Bing Map  Plant State  County          Balancing Authority Code  Sector            ...  Nameplate Energy Capacity (MWh)  DC Net Capacity (MW)
0      63560      TDX Sand Point Generating, LLC  1.0       Sand Point             Map         Map       AK           Aleutians East  NaN                       Electric Utility  ...
1      63560      TDX Sand Point Generating, LLC  1.0       Sand Point             Map         Map       AK           Aleutians East  NaN                       Electric Utility  ...
2      63560      TDX Sand Point Generating, LLC  1.0       Sand Point             Map         Map       AK           Aleutians East  NaN                       Electric Utility  ...
3      63560      TDX Sand Point Generating, LLC  1.0       Sand Point             Map         Map       AK           Aleutians East  NaN                       Electric Utility  ...
4      63560      TDX Sand Point Generating, LLC  1.0       Sand Point             Map         Map       AK           Aleutians East  NaN                       Electric Utility  ...
...    ...        ...                             ...       ...                    ...         ...       ...          ...             ...                       ...               ...  ...                              ...
25449  64449      Texas Microgrid, LLC            66614.0   OTHERCitizensHospital  Map         Map       TX           Victoria        ERCO                      IPP Non-CHP       ...
25450  64449      Texas Microgrid, LLC            66614.0   OTHERCitizensHospital  Map         Map       TX           Victoria        ERCO                      IPP Non-CHP       ...
25451  61980      Valta Energy                    66622.0   9150 Hermosa Solar GM  Map         Map       CA           San Bernardino  CISO                      IPP Non-CHP       ...                                   1.8
25452  NaN        NaN                             NaN       NaN                    NaN         NaN       NaN          NaN             NaN                       NaN               ...  NaN                              NaN
25453  NOTES:\nCapacity from facilities with a total ...  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN

25454 rows × 33 columns


df2.columns # list the column names

Index(['Entity ID', 'Entity Name', 'Plant ID', 'Plant Name', 'Google Map',
'Bing Map', 'Plant State', 'County', 'Balancing Authority Code',
'Sector', 'Generator ID', 'Unit Code', 'Nameplate Capacity (MW)',
'Net Summer Capacity (MW)', 'Net Winter Capacity (MW)', 'Technology',
'Energy Source Code', 'Prime Mover Code', 'Operating Month',
'Operating Year', 'Planned Retirement Month', 'Planned Retirement Year',
'Status', 'Nameplate Energy Capacity (MWh)', 'DC Net Capacity (MW)',
'Planned Derate Year', 'Planned Derate Month',
'Planned Derate of Summer Capacity (MW)', 'Planned Uprate Year',
'Planned Uprate Month', 'Planned Uprate of Summer Capacity (MW)',
'Latitude', 'Longitude'],
dtype='object')

(df2['Entity Name'].str.contains('connecticut', case=False)).sum()

53

(df2['Plant Name'].str.contains('hydro power', case=False)).sum()

df2['Status'].str.contains('operating', case=False).sum()

23184

hydro_power = df2[(df2['Entity Name'].str.contains('connecticut', case=True) &
                   df2['Plant Name'].str.contains('hydro power plant', case=True) &
                   df2['Status'].str.contains('operating', case=True))]
# Select the rows whose Entity Name contains 'connecticut', whose Plant Name contains
# 'hydro power plant' and whose Status contains 'operating' (all matches are case-sensitive).
# The result gives an idea of how many hydro power plants are active in Connecticut.

hydro_power

Entity ID  Entity Name  Plant ID  Plant Name  Google Map  Bing Map  Plant State  County  Balancing Authority Code  Sector  ...  Nameplate Energy Capacity (MWh)  DC Net Capacity (MW)  Planned Derate Year  Planned Derate Month  Planned Derate of Summer Capacity (MW)  Planned Uprate Year

0 rows × 33 columns
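One possible reason for the empty result is case sensitivity: with case=True, a lowercase pattern such as 'connecticut' cannot match a capitalised name, even though the same pattern with case=False returned 53 matches earlier. A small sketch of the effect, using hypothetical entity names:

```python
import pandas as pd

# Hypothetical entity names, written with a capital letter as in the sheet.
names = pd.Series(["Connecticut Example Power Co", "Other Example Utility"])

sensitive = names.str.contains("connecticut", case=True).sum()
insensitive = names.str.contains("connecticut", case=False).sum()
print(sensitive, insensitive)
```

The same applies to the 'hydro power plant' and 'operating' patterns, so each condition is worth checking on its own before the three are combined.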

Importing the data from the source again. The workbook contains multiple sheets, so I used
sheet_name to load the right dataset; here the sheet name is Operating_PR.
df3= pd.read_excel(r"C:\Users\sinha\Downloads\data careerera\may_generator2023.xlsx", sheet_name ="Operating_PR")

df3
     Entity ID  Entity Name           Plant ID  Plant Name                Google Map  Bing Map  Plant State  County        Balancing Authority Code  Sector              ...  Nameplate Energy Capacity (MWh)  DC Net Capacity (MW)  Planned Derate Year
0    56545      Pattern Operators LP  61014.0   Pattern Santa Isabel LLC  Map         Map       PR           Santa Isabel  NaN                       IPP Non-CHP         ...
1    60671      EcoElectrica LP       61034.0   EcoElectrica              Map         Map       PR           Penuelas      NaN                       Industrial CHP      ...
2    60671      EcoElectrica LP       61034.0   EcoElectrica              Map         Map       PR           Penuelas      NaN                       Industrial CHP      ...
3    60671      EcoElectrica LP       61034.0   EcoElectrica              Map         Map       PR           Penuelas      NaN                       Industrial CHP      ...
4    60673      AES ILUMINA, LLC      61036.0   AES ILUMINA               Map         Map       PR           Guayama       NaN                       IPP Non-CHP         ...                                   23.7
...  ...        ...                   ...       ...                       ...         ...       ...          ...           ...                       ...                 ...  ...                              ...                   ...
209  65407      Lilly del Caribe      66319.0   Lilly del Caribe PR01     Map         Map       PR           NaN           NaN                       Industrial Non-CHP  ...
210  65407      Lilly del Caribe      66319.0   Lilly del Caribe PR01     Map         Map       PR           NaN           NaN                       Industrial Non-CHP  ...
211  65407      Lilly del Caribe      66319.0   Lilly del Caribe PR01     Map         Map       PR           NaN           NaN                       Industrial Non-CHP  ...
212  NaN        NaN                   NaN       NaN                       NaN         NaN       NaN          NaN           NaN                       NaN                 ...  NaN                              NaN                   NaN
213  NOTES:\nCapacity from facilities with a total ...  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN

214 rows × 33 columns

(df3['Entity Name'].str.contains('connecticut', case=False)).sum()

hydro_power1 = df3[(df3['Entity Name'].str.contains('connecticut', case=True) &
                    df3['Plant Name'].str.contains('hydro power plant', case=True) &
                    df3['Status'].str.contains('operating', case=True))]
# Select the rows whose Entity Name contains 'connecticut', whose Plant Name contains
# 'hydro power plant' and whose Status contains 'operating' (all matches are case-sensitive).
# The result gives an idea of how many hydro power plants are active in Connecticut.

hydro_power1.count()

Entity ID 0
Entity Name 0
Plant ID 0
Plant Name 0
Google Map 0
Bing Map 0
Plant State 0
County 0
Balancing Authority Code 0
Sector 0
Generator ID 0
Unit Code 0
Nameplate Capacity (MW) 0
Net Summer Capacity (MW) 0
Net Winter Capacity (MW) 0
Technology 0
Energy Source Code 0
Prime Mover Code 0
Operating Month 0
Operating Year 0
Planned Retirement Month 0
Planned Retirement Year 0
Status 0
Nameplate Energy Capacity (MWh) 0
DC Net Capacity (MW) 0
Planned Derate Year 0
Planned Derate Month 0
Planned Derate of Summer Capacity (MW) 0
Planned Uprate Year 0
Planned Uprate Month 0
Planned Uprate of Summer Capacity (MW) 0
Latitude 0
Longitude 0
dtype: int64
Every count above is zero, so this filter finds no active hydro power plant in Connecticut.
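An alternative way to ask the same question is to filter on the state and technology columns instead of searching names. The sketch below is only an illustration under stated assumptions: the helper count_operating_hydro is hypothetical, and the state code 'CT', the technology label and the status labels in the sample frames are assumed rather than taken from the workbook.

```python
import pandas as pd

def count_operating_hydro(frame, state_code):
    """Count rows in `frame` that are operating hydro units in one state."""
    in_state = frame["Plant State"] == state_code
    is_hydro = frame["Technology"].str.contains("hydro", case=False, na=False)
    is_operating = frame["Status"].str.contains("operating", case=False, na=False)
    return int((in_state & is_hydro & is_operating).sum())

# Hypothetical stand-ins for the Operating and Operating_PR sheets.
operating = pd.DataFrame({
    "Plant State": ["CT", "CT", "AK", None],
    "Technology": ["Conventional Hydroelectric", "Solar Photovoltaic",
                   "Conventional Hydroelectric", None],
    "Status": ["(OP) Operating", "(OP) Operating", "(OP) Operating", None],
})
operating_pr = pd.DataFrame({
    "Plant State": ["PR"],
    "Technology": ["Solar Photovoltaic"],
    "Status": ["(OP) Operating"],
})

n_main = count_operating_hydro(operating, "CT")
n_pr = count_operating_hydro(operating_pr, "CT")
n_all = count_operating_hydro(pd.concat([operating, operating_pr]), "CT")
print(n_main, n_pr, n_all)
```

Passing na=False keeps the blank and NOTES footer rows from producing missing values in the mask.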

QUESTION 2(3)
# 'Plant State' holds two-letter codes (IA, NY, VA, ...), so Alabama is 'AL'
alabama_solar_df = df[(df['Plant State'] == 'AL') & (df['Technology'] == 'Solar Photovoltaic')]

# Group and aggregate data by Planned Operation Year
grouped_data = alabama_solar_df.groupby('Planned Operation Year')['Nameplate Capacity (MW)'].sum().reset_index()

# Plot the data as a line chart (plt.hist takes bins, not y values, and has no marker option)
plt.figure(figsize=(10, 6))
plt.plot(grouped_data['Planned Operation Year'], grouped_data['Nameplate Capacity (MW)'], marker='o')
plt.title('Evolution of Solar Capacity in Alabama')
plt.xlabel('Year')
plt.ylabel('Solar Capacity (MW)')
plt.grid(True)
plt.show()
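If "evolution" is read as the running total of capacity rather than the capacity added in each year, the yearly sums can be accumulated before plotting. A minimal sketch with hypothetical capacity figures:

```python
import pandas as pd

# Hypothetical planned solar units: year of operation and nameplate capacity.
solar = pd.DataFrame({
    "Planned Operation Year": [2023.0, 2023.0, 2024.0, 2025.0],
    "Nameplate Capacity (MW)": [80.0, 20.0, 50.0, 100.0],
})

# Capacity added in each year, then the running total across years.
per_year = solar.groupby("Planned Operation Year")["Nameplate Capacity (MW)"].sum()
cumulative = per_year.cumsum()
print(per_year.tolist())
print(cumulative.tolist())
```

Either series can be passed to plt.plot with its index as the x values, depending on which reading of the question is intended.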
