IML Project

This document is a printed Colab (Jupyter) notebook that predicts whether Bitcoin's closing price will rise the next day, using three classifiers: Logistic Regression, SVC, and XGBoost. It covers data preprocessing, exploratory data analysis, and model training, and it reports each model's training and validation ROC AUC scores (labeled "Accuracy" in the notebook output). Price-trend, distribution, and correlation-heatmap plots support the analysis.

Uploaded by

Punani Nikul


10/19/24, 10:30 PM bit_coin_prediction.ipynb - Colab

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier
from sklearn import metrics

import warnings
warnings.filterwarnings('ignore')

df = pd.read_csv('bitcoin.csv')
df.head()

         Date        Open        High         Low       Close   Adj Close    Volume
0  2014-09-17  465.864014  468.174011  452.421997  457.334015  457.334015  21056800
1  2014-09-18  456.859985  456.859985  413.104004  424.440002  424.440002  34483200
2  2014-09-19  424.102997  427.834991  384.532013  394.795990  394.795990  37919700
3  2014-09-20  394.673004  423.295990  389.882996  408.903992  408.903992  36863600
4  2014-09-21  408.084991  412.425995  393.181000  398.821014  398.821014  26580100

df.shape

(2713, 7)

df.describe()

               Open          High           Low         Close     Adj Close        Volume
count   2713.000000   2713.000000   2713.000000   2713.000000   2713.000000  2.713000e+03
mean   11311.041069  11614.292482  10975.555057  11323.914637  11323.914637  1.470462e+10
std    16106.428891  16537.390649  15608.572560  16110.365010  16110.365010  2.001627e+10
min      176.897003    211.731003    171.509995    178.102997    178.102997  5.914570e+06
25%      606.396973    609.260986    604.109985    606.718994    606.718994  7.991080e+07
50%     6301.569824   6434.617676   6214.220215   6317.609863   6317.609863  5.098183e+09
75%    10452.399414  10762.644531  10202.387695  10462.259766  10462.259766  2.456992e+10
max    67549.734375  68789.625000  66382.062500  67566.828125  67566.828125  3.509679e+11

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2713 entries, 0 to 2712
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype
 0   Date       2713 non-null   object
 1   Open       2713 non-null   float64
 2   High       2713 non-null   float64
 3   Low        2713 non-null   float64
 4   Close      2713 non-null   float64
 5   Adj Close  2713 non-null   float64
 6   Volume     2713 non-null   int64
dtypes: float64(5), int64(1), object(1)
memory usage: 148.5+ KB

plt.figure(figsize=(15, 5))
plt.plot(df['Close'])
plt.title('Bitcoin Close price.', fontsize=15)
plt.ylabel('Price in dollars.')
plt.show()

https://colab.research.google.com/drive/10aiAnjw847cPy6koQiQw6ayhIhipEdyw?authuser=0#scrollTo=RmkFxGnnuCM_&printMode=true 1/6

df[df['Close'] == df['Adj Close']].shape, df.shape

((2713, 7), (2713, 7))

df = df.drop(['Adj Close'], axis=1)

df.isnull().sum()

Date      0
Open      0
High      0
Low       0
Close     0
Volume    0

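features = ['Open', 'High', 'Low', 'Close']

# Plot the distribution of each price column in a 2x2 grid
plt.figure(figsize=(20, 10))
for i, col in enumerate(features):
    plt.subplot(2, 2, i + 1)
    # distplot is deprecated in recent seaborn releases;
    # histplot with kde=True draws a comparable histogram plus density curve
    sb.histplot(df[col], kde=True)
plt.show()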


# Box plot of each price column in a 2x2 grid
plt.figure(figsize=(20, 10))
for i, col in enumerate(features):
    plt.subplot(2, 2, i + 1)
    sb.boxplot(x=df[col])
plt.show()


splitted = df['Date'].str.split('-', expand=True)

df['year'] = splitted[0].astype('int')
df['month'] = splitted[1].astype('int')
df['day'] = splitted[2].astype('int')

# Convert the 'Date' column to datetime objects

df['Date'] = pd.to_datetime(df['Date'])

df.head()

        Date        Open        High         Low       Close    Volume  year  month  day
0 2014-09-17  465.864014  468.174011  452.421997  457.334015  21056800  2014      9   17
1 2014-09-18  456.859985  456.859985  413.104004  424.440002  34483200  2014      9   18
2 2014-09-19  424.102997  427.834991  384.532013  394.795990  37919700  2014      9   19
3 2014-09-20  394.673004  423.295990  389.882996  408.903992  36863600  2014      9   20
4 2014-09-21  408.084991  412.425995  393.181000  398.821014  26580100  2014      9   21

# Average the numeric columns per year and plot the yearly mean prices
data_grouped = df.groupby('year').mean(numeric_only=True)
plt.figure(figsize=(20, 10))
for i, col in enumerate(['Open', 'High', 'Low', 'Close']):
    plt.subplot(2, 2, i + 1)
    data_grouped[col].plot.bar()
plt.show()

df['is_quarter_end'] = np.where(df['month']%3==0,1,0)
df.head()


        Date        Open        High         Low       Close    Volume  year  month  day  is_quarter_end
0 2014-09-17  465.864014  468.174011  452.421997  457.334015  21056800  2014      9   17               1
1 2014-09-18  456.859985  456.859985  413.104004  424.440002  34483200  2014      9   18               1
2 2014-09-19  424.102997  427.834991  384.532013  394.795990  37919700  2014      9   19               1
3 2014-09-20  394.673004  423.295990  389.882996  408.903992  36863600  2014      9   20               1
4 2014-09-21  408.084991  412.425995  393.181000  398.821014  26580100  2014      9   21               1
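The expression `df['month']%3==0` sets `is_quarter_end` to 1 for every day in March, June, September, and December, which is why all five September rows carry the flag. The sketch below contrasts that month-based flag with a stricter calendar check, using only the standard library. The function names `in_quarter_end_month` and `is_last_day_of_quarter` are hypothetical and do not appear in the notebook.

```python
import calendar
from datetime import date


def in_quarter_end_month(d: date) -> int:
    """Mirror the notebook's flag: 1 for any day in Mar, Jun, Sep, or Dec."""
    return 1 if d.month % 3 == 0 else 0


def is_last_day_of_quarter(d: date) -> int:
    """Stricter variant (an assumption, not the notebook's definition):
    1 only on the final calendar day of a quarter."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return 1 if d.month % 3 == 0 and d.day == last_day else 0


if __name__ == "__main__":
    for d in (date(2014, 9, 17), date(2014, 9, 30), date(2014, 10, 1)):
        print(d, in_quarter_end_month(d), is_last_day_of_quarter(d))
```

Which definition is more useful as a feature is a modeling choice; the notebook itself uses the month-based one.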

df['open-close'] = df['Open'] - df['Close']

df['low-high'] = df['Low'] - df['High']
df['target'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)
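The `target` column is 1 when the next day's close is higher than the current day's close. A minimal pure-Python sketch of the same labeling rule, run on the five Close values printed by `df.head()`, is shown here; the helper name `next_day_up` is hypothetical. In this toy list the last value has no following day, so it is labeled 0, which is also what the comparison against the NaN produced by `shift(-1)` yields for the final row of the full dataset.

```python
def next_day_up(closes):
    """Return 1 where the next close is higher than the current one, else 0.

    The final element has no following value, so it is labeled 0.
    """
    labels = []
    for i, close in enumerate(closes):
        has_next = i + 1 < len(closes)
        labels.append(1 if has_next and closes[i + 1] > close else 0)
    return labels


# Close prices of the first five rows shown by df.head()
closes = [457.334015, 424.440002, 394.795990, 408.903992, 398.821014]
print(next_day_up(closes))  # [0, 0, 1, 0, 0]
```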

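# value_counts() sorts by frequency, so take the labels from its index
target_counts = df['target'].value_counts()
plt.pie(target_counts.values,
        labels=target_counts.index, autopct='%1.1f%%')
plt.show()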

plt.figure(figsize=(10, 10))

# Only highly correlated features matter here, so the heatmap
# shows a boolean mask of pairs with correlation above 0.9.
sb.heatmap(df.corr() > 0.9, annot=True, cbar=False)
plt.show()


features = df[['open-close', 'low-high', 'is_quarter_end']]
target = df['target']

# Standardize the features (the scaler is fit on all rows, before the split)
scaler = StandardScaler()
features = scaler.fit_transform(features)

# Hold out 10% of the rows, chosen at random, for validation
X_train, X_valid, Y_train, Y_valid = train_test_split(
    features, target, test_size=0.1, random_state=2022)
print(X_train.shape, X_valid.shape)

(2441, 3) (272, 3)
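The split above is random and the scaler is fit on every row before splitting. For time-ordered data, one alternative is to hold out the most recent rows and compute the scaling statistics from the training rows only. The sketch below illustrates that idea with the standard library on toy values. It is not the notebook's method, and the helper names (`chronological_split`, `fit_standardizer`, `standardize`) and the toy rows are assumptions made for illustration.

```python
from statistics import mean, pstdev


def chronological_split(rows, valid_fraction=0.1):
    """Split time-ordered rows so the validation set is the most recent part."""
    n_valid = max(1, int(len(rows) * valid_fraction))
    return rows[:-n_valid], rows[-n_valid:]


def fit_standardizer(train_rows):
    """Compute per-column mean and population std from training rows only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply previously fitted statistics to any set of rows."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy rows: [open-close, low-high], oldest first (made-up values)
rows = [[1.0, -2.0], [3.0, -4.0], [5.0, -6.0], [7.0, -8.0], [100.0, -50.0]]
train, valid = chronological_split(rows, valid_fraction=0.2)
means, stds = fit_standardizer(train)
train_scaled = standardize(train, means, stds)
valid_scaled = standardize(valid, means, stds)
print(len(train), len(valid), means)
```

Because the statistics come from the training rows alone, the validation rows do not influence the scaling applied to the training set.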

models = [LogisticRegression(), SVC(kernel='poly', probability=True), XGBClassifier()]

# Fit each model and score it. The printed values are ROC AUC scores
# computed from predicted probabilities, despite the "Accuracy" labels.
for model in models:
    model.fit(X_train, Y_train)

    print(f'{model} : ')
    print('Training Accuracy : ', metrics.roc_auc_score(Y_train, model.predict_proba(X_train)[:, 1]))
    print('Validation Accuracy : ', metrics.roc_auc_score(Y_valid, model.predict_proba(X_valid)[:, 1]))
    print()

LogisticRegression() :
Training Accuracy : 0.5272712493564907
Validation Accuracy : 0.5187429004165088

SVC(kernel='poly', probability=True) :
Training Accuracy : 0.4828745224483161
Validation Accuracy : 0.5278844593498134

XGBClassifier(base_score=None, booster=None, callbacks=None,
              colsample_bylevel=None, colsample_bynode=None,
              colsample_bytree=None, device=None, early_stopping_rounds=None,
              enable_categorical=False, eval_metric=None, feature_types=None,
              gamma=None, grow_policy=None, importance_type=None,
              interaction_constraints=None, learning_rate=None, max_bin=None,
              max_cat_threshold=None, max_cat_to_onehot=None,
              max_delta_step=None, max_depth=None, max_leaves=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              multi_strategy=None, n_estimators=None, n_jobs=None,
              num_parallel_tree=None, random_state=None, ...) :
Training Accuracy : 0.9229563497439509
Validation Accuracy : 0.46156758803483533
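The values printed above come from `metrics.roc_auc_score`, so they are ROC AUC scores rather than accuracies. ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, and 0.5 corresponds to random ranking. The sketch below computes it by direct pair counting and contrasts it with thresholded accuracy on made-up scores; the helper names `roc_auc` and `accuracy` and the toy data are assumptions for illustration, not part of the notebook.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half. This pairwise form is O(n_pos * n_neg), so it is
    meant for illustration on small inputs only.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Share of labels matched after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Made-up labels and scores
y_true = [0, 0, 1, 1]
scores = [0.6, 0.7, 0.8, 0.9]
print(roc_auc(y_true, scores))   # 1.0: every positive outranks every negative
print(accuracy(y_true, scores))  # 0.5: both negatives cross the 0.5 threshold
```

Read this way, validation scores between roughly 0.46 and 0.53 are close to random ranking, and the XGBoost gap between 0.92 on training data and 0.46 on validation data points to overfitting.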

from sklearn.metrics import ConfusionMatrixDisplay

ConfusionMatrixDisplay.from_estimator(models[0], X_valid, Y_valid)

plt.show()
