Fundamental - Python
NUMPY
PANDAS
pd.Series(data, index)
pd.DataFrame(data, index, columns)
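A minimal sketch of the two constructors above. The labels and values are made up for illustration and are not from the notes.

```python
import pandas as pd

# Hypothetical sample data, chosen only to show the constructor arguments.
s = pd.Series(data=[10, 20, 30], index=["a", "b", "c"])

df = pd.DataFrame(
    data=[[1, 2], [3, 4], [5, 6]],
    index=["a", "b", "c"],
    columns=["W", "X"],
)

print(s["b"])    # value stored under label "b"
print(df.shape)  # (number of rows, number of columns)
```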
df[column_name] - returns the column as a Series
A DataFrame is a collection of Series that share the same index
df[[col1, col2]] - returns the listed columns as a DataFrame
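To illustrate the difference between single and double brackets, here is a small sketch on a hypothetical DataFrame (the column names W, X, Y are assumptions):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"W": [1, 2], "X": [3, 4], "Y": [5, 6]}, index=["a", "b"])

one_col = df["W"]          # single label -> Series
two_cols = df[["W", "Y"]]  # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__)
```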
df.drop(column_name, axis=1) - returns a copy of the DataFrame without that column
The original df is not changed. To drop the column from df itself, use the
inplace=True argument (or assign the result back to df)
axis = 0 - rows
axis = 1 - columns
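The drop behaviour can be sketched as follows, assuming a small made-up DataFrame:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"W": [1, 2], "X": [3, 4]}, index=["a", "b"])

without_x = df.drop("X", axis=1)  # new DataFrame; df itself still has "X"
without_a = df.drop("a", axis=0)  # drops the row labelled "a"

# inplace=True modifies df itself and returns None.
df.drop("X", axis=1, inplace=True)
print(list(df.columns))
```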
df.loc['row name'] - access rows by index label
df.iloc[row position] - access rows by integer position
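A short sketch of label-based versus position-based access, using hypothetical labels:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"W": [1, 2, 3], "X": [4, 5, 6]}, index=["a", "b", "c"])

row_by_label = df.loc["b"]       # row whose index label is "b"
row_by_pos = df.iloc[1]          # second row, counting from 0
subset = df.loc[["a", "c"], "X"] # rows "a" and "c" of column "X"

print(row_by_label)
```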
df > 0 - returns a DataFrame of boolean values
df[df > 0] - returns the df with the values greater than 0; values that are not
greater than 0 become NaN
& - to combine conditions we use the '&' operator instead of 'and', because 'and'
can only compare single boolean values, not a whole Series of them
df[(df['W'] > 0) & (df['Y'] > 1)] - returns only the rows where both conditions
are True
| - pipe operator for the 'or' operation
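The filtering rules above can be sketched on a small made-up DataFrame. The columns W and Y follow the notes; the values are assumptions.

```python
import pandas as pd

# Hypothetical values for illustration.
df = pd.DataFrame({"W": [1, -2, 3], "Y": [2, 5, -1]}, index=["a", "b", "c"])

mask = df > 0       # DataFrame of True/False
masked = df[df > 0] # same shape as df, NaN where the condition is False

# Each condition needs its own parentheses before combining with & or |.
both = df[(df["W"] > 0) & (df["Y"] > 1)]
either = df[(df["W"] > 0) | (df["Y"] > 1)]

print(both)
print(either)
```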
df.reset_index() - the old index is moved into a separate column and the new index
runs from 0 to the last row
df.set_index(column_name) - makes that column the index; not permanent until we use
the inplace argument (or assign the result back to df)
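As a rough illustration of both calls, assuming a hypothetical "State" column:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"W": [1, 2], "State": ["CA", "NY"]}, index=["a", "b"])

reset = df.reset_index()          # old index becomes a column named "index"
by_state = df.set_index("State")  # "State" becomes the index of the result

# Neither call changed df, because inplace=True was not used.
print(list(df.index))
```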
df.loc[outer index label].loc[inner index label] - used in multi-level DataFrames
df.xs(key, level='level name') - cross-section: rows matching key at that index level
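A minimal sketch of multi-level access. The level names "Group" and "Num" and the values are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical two-level index for illustration.
index = pd.MultiIndex.from_tuples(
    [("G1", 1), ("G1", 2), ("G2", 1), ("G2", 2)],
    names=["Group", "Num"],
)
df = pd.DataFrame({"A": [10, 20, 30, 40]}, index=index)

inner = df.loc["G1"].loc[2]    # outer label first, then inner label
cross = df.xs(1, level="Num")  # rows with Num == 1 from every group

print(cross)
```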
df.dropna() - drops rows that contain null values; axis=1 drops columns instead
- thresh=2: keeps only rows that have at least 2 non-NaN values
df.fillna(value) - fills null values (for example with the mean of a column)
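The missing-data calls can be sketched like this, on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with some missing values.
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan],
    "B": [5.0, np.nan, np.nan],
    "C": [1.0, 2.0, 3.0],
})

rows = df.dropna()               # keep rows with no NaN at all
cols = df.dropna(axis=1)         # keep columns with no NaN at all
at_least_2 = df.dropna(thresh=2) # keep rows with at least 2 non-NaN values

# Fill the gaps in column "A" with that column's mean.
filled = df["A"].fillna(df["A"].mean())
print(filled)
```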
df.groupby(column) - groups rows by the column's values so that aggregate functions
can be applied per group
groupby().describe() - gives summary statistics per group (count, mean, std, min,
25%, 50%, 75%, max)
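A small groupby sketch, assuming hypothetical "Company" and "Sales" columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Company": ["GOOG", "GOOG", "MSFT", "MSFT"],
    "Sales": [200, 120, 340, 124],
})

by_company = df.groupby("Company")
mean_sales = by_company["Sales"].mean()   # one mean per company
summary = by_company["Sales"].describe()  # one row of statistics per company

print(mean_sales)
print(summary)
```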
df.transpose() - swaps rows and columns
joining: combining the columns of differently indexed DataFrames into a single
result DataFrame
df1.join(df2, how=__)
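To show how the how argument changes the result, here is a sketch with two made-up DataFrames whose indexes only partly overlap:

```python
import pandas as pd

# Hypothetical data for illustration.
left = pd.DataFrame({"A": [1, 2, 3]}, index=["K0", "K1", "K2"])
right = pd.DataFrame({"B": [10, 30, 40]}, index=["K0", "K2", "K3"])

left_join = left.join(right)                # default how="left": keep left's index
inner_join = left.join(right, how="inner")  # only labels present in both
outer_join = left.join(right, how="outer")  # all labels from either side

print(outer_join)
```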
CSV
pd.read_csv() - read a csv file
pd.read_html() - read the tables in an html page (see HTML below)
df.to_csv(__, index=False) - writes the csv without the index column, so reading the
file back gives a fresh default index
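A round-trip sketch of the CSV calls. It writes to an in-memory buffer instead of a file so that it runs anywhere; a file path would work the same way.

```python
import io

import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # the labels "x" and "y" are not written
csv_text = buffer.getvalue()

restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```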
EXCEL
pandas can only import data from excel sheets; it cannot import tables or pictures
each excel sheet is read as a separate DataFrame
pd.read_excel(__, sheet_name=__)
df.to_excel(__, sheet_name=__)
HTML
pandas tries to find each table element in the html and converts it to a DataFrame; read_html returns a list of DataFrames
data = pd.read_html(__)
SQL
from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:')
df.to_sql('__', engine)
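A sketch of writing a DataFrame to the in-memory SQLite engine and reading it back. The table name "my_table" and the data are assumptions, and the sqlalchemy package has to be installed.

```python
import pandas as pd
from sqlalchemy import create_engine

# Temporary in-memory database, as in the notes above.
engine = create_engine("sqlite:///:memory:")

# Hypothetical data and table name for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df.to_sql("my_table", engine, index=False)

restored = pd.read_sql("SELECT * FROM my_table", engine)
print(restored)
```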
MATPLOTLIB - (https://matplotlib.org/)
Types of Plots:
1. Functional:
plt.plot(x, y)
[plt.xlabel, plt.ylabel, plt.title]
plt.subplot(rows, columns, plot number) - to create multiple plots on the same
canvas; plot number selects which plot the following calls refer to
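A minimal sketch of the functional style with two plots on one canvas. The data and labels are made up, and the "Agg" backend is only chosen so that the example also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig = plt.figure()

plt.subplot(1, 2, 1)  # 1 row, 2 columns, first plot
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("left plot")

plt.subplot(1, 2, 2)  # same grid, second plot
plt.plot(y, x)
plt.title("right plot")
```

In an interactive session, plt.show() would display the canvas.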
plot appearances
1. color = 'purple' <-- color of the plot
2. linewidth = 0.3 / lw = 0.3 <-- width of the line
3. alpha = 0.5 <-- transparency of the line
4. linestyle = '--' / ls = '--' <-- style of the line
5. marker = 'o' <-- marks each data point on the line (markersize to
specify the size)
6. markerfacecolor = 'yellow' <-- colour of the marker
7. markeredgewidth = 3 <-- changes the width of the marker outline
8. markeredgecolor = 'green' <-- changes the colour of the marker border
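The appearance arguments listed above can be combined in a single plt.plot call. This is a sketch on made-up data, again using the "Agg" backend as an assumption so it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig = plt.figure()
(line,) = plt.plot(
    x,
    y,
    color="purple",
    linewidth=0.3,
    alpha=0.5,
    linestyle="--",
    marker="o",
    markersize=8,
    markerfacecolor="yellow",
    markeredgewidth=3,
    markeredgecolor="green",
)
```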