BitcoinDataAnalysisCaseStudy - Google Colab

The document discusses merging Bitcoin trading data from 2017-2019 into a single dataset and transforming the minute-level data into daily aggregates. Key steps include: 1) Importing and merging the minute-level Bitcoin price and volume data from 2017, 2018, and 2019. 2) Converting the data types and checking for duplicates and null values. 3) Sorting the data chronologically by date and setting date as the index. 4) Resampling the minute-level data to daily averages and sums to refine the analysis and make trends more clear.



"An Analysis of Bitcoin Trading Data from 2017-2019: A Brief Case Study Demonstrating Expertise
in Predictive Modeling of Closing Prices" This case study aims to present a streamlined approach
for modeling Bitcoin trading data on a minute-by-minute basis, spanning the years 2017 to 2019.
The primary objective is to develop a straightforward yet effective model to forecast Bitcoin's
closing prices, showcasing both a deep understanding of the cryptocurrency market and proficiency
in data analysis techniques.

from google.colab import drive

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

"Initially, we commence by integrating the datasets from 2017, 2018, and 2019 into a singular
comprehensive dataset."

import pandas as pd

# File paths
file_2017 = '/content/drive/MyDrive/BTC-2017min.csv'
file_2018 = '/content/drive/MyDrive/BTC-2018min.csv'
file_2019 = '/content/drive/MyDrive/BTC-2019min.csv'

# Load the datasets
data_2017 = pd.read_csv(file_2017)
data_2018 = pd.read_csv(file_2018)
data_2019 = pd.read_csv(file_2019)

# Merge the datasets
merged_data = pd.concat([data_2017, data_2018, data_2019])

# Save the merged dataset
merged_data.to_csv('/content/drive/My Drive/BTC_2017-2019_merged.csv', index=False)

Data exploration and data cleaning

Column Data Types:

merged_data.dtypes

unix int64
date object
symbol object
open float64
high float64
low float64
close float64
Volume BTC float64
Volume USD float64
dtype: object

Changing data types for usability:

# Convert 'unix' to datetime (assuming 'unix' is in seconds)
merged_data['unix'] = pd.to_datetime(merged_data['unix'], unit='s')

# Convert 'date' to datetime
merged_data['date'] = pd.to_datetime(merged_data['date'])

# Convert 'symbol' to string
merged_data['symbol'] = merged_data['symbol'].astype('string')

# Check the data types again
merged_data.dtypes

unix datetime64[ns]
date datetime64[ns]
symbol string
open float64
high float64
low float64
close float64
Volume BTC float64
Volume USD float64
dtype: object

Check whether 'unix' and 'date' are the same:

# Compare 'unix' and 'date'

# Create a new column 'is_same' to check if 'unix' and 'date' are the same (up to seconds)
merged_data['is_same'] = merged_data['unix'].dt.floor('S') == merged_data['date'].dt.floor('S')

# Check the comparison results
print(merged_data[['unix', 'date', 'is_same']].head())

# Unique values
merged_data['is_same'].unique()

                 unix                date  is_same
0 2017-12-31 23:59:00 2017-12-31 23:59:00     True
1 2017-12-31 23:58:00 2017-12-31 23:58:00     True
2 2017-12-31 23:57:00 2017-12-31 23:57:00     True
3 2017-12-31 23:56:00 2017-12-31 23:56:00     True
4 2017-12-31 23:55:00 2017-12-31 23:55:00     True

array([ True])

Since 'unix' and 'date' are an exact match, drop 'unix':

# Drop the 'unix' and 'is_same' columns
merged_data = merged_data.drop(columns=['unix', 'is_same'])
merged_data.head()

                 date   symbol      open      high       low     close  Volume BTC    Volume USD
0 2017-12-31 23:59:00  BTC/USD  13913.28  13913.28  13867.18  13880.00    0.591748   8213.456549
1 2017-12-31 23:58:00  BTC/USD  13913.26  13953.83  13884.69  13953.77    1.398784  19518.309658

Value Counts for symbol:

merged_data['symbol'].value_counts()

BTC/USD 1576797
Name: symbol, dtype: Int64

Unique Values in a Column:

merged_data['symbol'].nunique()

Correlation Matrix: To check the correlation between different numerical columns:

merged_data.corr()


<ipython-input-54-cc54846d37e8>:1: FutureWarning: The default value of numeric_only in DataFrame.corr is deprecated. In a future version, it will default to False. Select only valid columns or specify the value of numeric_only to silence this warning.
  merged_data.corr()
                open      high       low     close  Volume BTC  Volume USD
open        1.000000  0.999997  0.999996  0.999995    0.027315    0.252212
high        0.999997  1.000000  0.999994  0.999997    0.028008    0.253042
low         0.999996  0.999994  1.000000  0.999996    0.026497    0.251236
close       0.999995  0.999997  0.999996  1.000000    0.027233    0.252119
Volume BTC  0.027315  0.028008  0.026497  0.027233    1.000000    0.831629
Volume USD  0.252212  0.253042  0.251236  0.252119    0.831629    1.000000
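
The FutureWarning above is raised because corr() is being asked to handle the non-numeric 'date' and 'symbol' columns. A minimal sketch of silencing it by restricting the correlation to numeric columns, assuming a pandas version that accepts the numeric_only keyword:

# Compute the correlation over numeric columns only to avoid the FutureWarning
merged_data.corr(numeric_only=True)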

Sample of Data:

merged_data.sample(5)

                       date   symbol     open     high      low    close  Volume BTC   Volume USD
213716  2017-08-05 14:03:00  BTC/USD  3156.07  3156.07  3156.07  3156.07    0.000000     0.000000
174849  2018-09-01 13:50:00  BTC/USD  7041.85  7051.72  7039.02  7051.72    1.109430  7823.389085

Check for duplicate and null entries in the data:

merged_data.duplicated().sum()

merged_data.isnull().sum()

date 0
symbol 0
open 0
high 0
low 0
close 0
Volume BTC 0
Volume USD 0
dtype: int64


merged_data['date'].head(5)

0 2017-12-31 23:59:00
1 2017-12-31 23:58:00
2 2017-12-31 23:57:00
3 2017-12-31 23:56:00
4 2017-12-31 23:55:00
Name: date, dtype: datetime64[ns]

merged_data['date'].nunique()

1576797

"Next, we proceed to meticulously organize the minutely data in chronological order. This sorting
by date is crucial for maintaining the integrity of the time series."

merged_data_sorted = merged_data.sort_values(by='date')

"we then transform the minutely data into a daily format. This conversion is aimed at refining the
analysis process. Aggregating the data on a daily basis allows for a clearer, more manageable
overview of trends and patterns, which is particularly beneficial for more effective and insightful
analysis."

import pandas as pd

# Convert 'date' to datetime if not already done
merged_data_sorted['date'] = pd.to_datetime(merged_data_sorted['date'])

# Set the 'date' column as the index
merged_data_sorted.set_index('date', inplace=True)

# Resample to daily data and aggregate
daily_data = merged_data_sorted.resample('D').agg({
    'open': 'mean',        # mean of open prices
    'high': 'mean',        # mean of high prices
    'low': 'mean',         # mean of low prices
    'close': 'mean',       # mean of close prices
    'Volume BTC': 'sum',   # sum of BTC volumes
    'Volume USD': 'sum'    # sum of USD volumes
})
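
Averaging the intraday opens, highs, lows, and closes is one way to build daily bars; a common alternative, sketched below under a hypothetical daily_ohlc name, keeps conventional OHLC semantics by taking the first open, highest high, lowest low, and last close of each day.

# Alternative daily aggregation with conventional OHLC semantics (a sketch, not used above)
daily_ohlc = merged_data_sorted.resample('D').agg({
    'open': 'first',       # first open of the day
    'high': 'max',         # highest high of the day
    'low': 'min',          # lowest low of the day
    'close': 'last',       # last close of the day
    'Volume BTC': 'sum',   # total BTC volume
    'Volume USD': 'sum'    # total USD volume
})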

Reset the index and inspect the data:


daily_data.reset_index(inplace=True)
daily_data.head(5)

         date         open         high          low        close   Volume BTC    Volume USD
0  2017-01-01   977.256602   977.385233   977.132620   977.276060  6850.593309  6.765936e+06
1  2017-01-02  1012.267604  1012.517181  1011.988826  1012.273903  8167.381030  8.276031e+06
2  2017-01-03  1020.001535  1020.226840  1019.794437  1020.040472  9089.658025  9.276735e+06

The next step is to store this refined dataset. We accomplish this by exporting 'daily_data' to a CSV file.

daily_data.to_csv('/content/drive/MyDrive/daily_data.csv', index = False)

Adding a moving average (or moving mean) to your dataset is a common technique in time series
analysis, especially in financial data analysis. It helps in smoothing out short-term fluctuations and
highlighting longer-term trends or cycles.

import pandas as pd

# Load the daily dataset
daily_data = pd.read_csv('/content/drive/MyDrive/daily_data.csv')

# Set 'date' as the index
daily_data.set_index('date', inplace=True)

# Choose a window size for the moving average: 20 days
window_size = 20

# Calculate the moving average of the 'close' price
daily_data['moving_average_close'] = daily_data['close'].rolling(window=window_size).mean()

# daily_data now has an additional column with the 20-day moving average of the close price
daily_data.reset_index(inplace=True)

# The first 19 rows are null because of the 20-day window;
# fill them with the close price as a default value
daily_data['moving_average_close'].fillna(daily_data['close'], inplace=True)

print(daily_data.head(25))  # Display the first 25 rows to see some of the moving average values

date open high low close \


0 2017-01-01 977.256602 977.385233 977.132620 977.276060
1 2017-01-02 1012.267604 1012.517181 1011.988826 1012.273903

https://colab.research.google.com/drive/1YzxhLK6bzcIR-j5btF8wVutr8BZsToAR#scrollTo=bC29Dc8VaDPR&printMode=true 6/10
1/13/24, 9:58 PM Untitled0.ipynb - Colaboratory
2 2017-01-03 1020.001535 1020.226840 1019.794437 1020.040472
3 2017-01-04 1076.558840 1077.271167 1075.572542 1076.553639
4 2017-01-05 1043.608646 1044.905549 1042.094125 1043.547951
5 2017-01-06 934.455278 935.419188 933.269312 934.416729
6 2017-01-07 869.618951 870.700465 868.904215 869.738333
7 2017-01-08 914.224917 914.637931 913.597944 913.966083
8 2017-01-09 893.495403 893.856319 893.047132 893.471535
9 2017-01-10 902.637313 902.858104 902.343042 902.638375
10 2017-01-11 846.285701 847.306472 845.153167 846.173313
11 2017-01-12 782.970292 783.542347 782.386604 782.961688
12 2017-01-13 807.222361 807.674451 806.778986 807.177507
13 2017-01-14 827.433958 827.573681 827.271819 827.412431
14 2017-01-15 817.127076 817.298757 816.911528 817.081007
15 2017-01-16 827.983313 828.105757 827.839722 827.977958
16 2017-01-17 876.174264 876.640868 875.760896 876.181472
17 2017-01-18 885.380229 885.695486 885.050083 885.345653
18 2017-01-19 893.326090 893.591750 893.053882 893.294389
19 2017-01-20 895.566618 895.695965 895.415535 895.552688
20 2017-01-21 917.654201 917.797965 917.532965 917.679382
21 2017-01-22 922.678097 922.897590 922.452764 922.689111
22 2017-01-23 920.178285 920.327340 919.997403 920.188507
23 2017-01-24 907.212306 907.418618 906.928014 907.186326
24 2017-01-25 894.505562 894.616382 894.376438 894.509347

Volume BTC Volume USD moving_average_close


0 6850.593309 6.765936e+06 977.276060
1 8167.381030 8.276031e+06 1012.273903
2 9089.658025 9.276735e+06 1020.040472
3 21562.456972 2.347651e+07 1076.553639
4 36018.861120 3.619081e+07 1043.547951
5 27916.703099 2.553144e+07 934.416729
6 20401.113591 1.761907e+07 869.738333
7 8937.492708 8.164011e+06 913.966083
8 8716.182941 7.782149e+06 893.471535
9 8535.521688 7.706384e+06 902.638375
10 35893.768368 2.945219e+07 846.173313
11 17400.141555 1.363246e+07 782.961688
12 11409.520330 9.224971e+06 807.177507
13 6614.718992 5.469742e+06 827.412431
14 4231.463903 3.454909e+06 817.081007
15 6166.043977 5.107435e+06 827.977958
16 12264.169385 1.077497e+07 876.181472
17 11181.898878 9.830026e+06 885.345653
18 11094.603298 9.928565e+06 893.294389
19 6618.627764 5.915721e+06 905.154059
20 5865.632031 5.373761e+06 902.174225
21 7166.665479 6.566289e+06 897.694986
22 3514.741429 3.234650e+06 892.702387
23 9405.046565 8.497003e+06 884.234022
24 5291.554742 4.725942e+06 876.782092
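
Filling the first 19 rows with the raw close is one way to handle the leading nulls; another option, sketched below with a hypothetical moving_average_close_alt column, is to let the window shrink at the start via min_periods, so the early values are partial-window means rather than copies of the close.

# Alternative handling of the leading nulls: allow a shrinking window at the start (a sketch)
daily_data['moving_average_close_alt'] = (
    daily_data['close'].rolling(window=window_size, min_periods=1).mean()
)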

daily_data['date'].head()

0 2017-01-01
1 2017-01-02

https://colab.research.google.com/drive/1YzxhLK6bzcIR-j5btF8wVutr8BZsToAR#scrollTo=bC29Dc8VaDPR&printMode=true 7/10
1/13/24, 9:58 PM Untitled0.ipynb - Colaboratory
2 2017-01-03
3 2017-01-04
4 2017-01-05
Name: date, dtype: object
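
Note that after the CSV round trip the 'date' column comes back with dtype object (plain strings). A minimal sketch of converting it back to datetime so date-based operations keep working; re-reading the file with parse_dates=['date'] would be an equivalent option at load time.

# Convert 'date' back to datetime64 after the CSV round trip (a sketch)
daily_data['date'] = pd.to_datetime(daily_data['date'])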

daily_data.head()

         date         open         high          low        close   Volume BTC    Volume USD  moving_average_close
0  2017-01-01   977.256602   977.385233   977.132620   977.276060  6850.593309  6.765936e+06            977.276060
1  2017-01-02  1012.267604  1012.517181  1011.988826  1012.273903  8167.381030  8.276031e+06           1012.273903
2  2017-01-03  1020.001535  1020.226840  1019.794437  1020.040472  9089.658025  9.276735e+06           1020.040472

"After preparing and saving the daily data, we shift our focus to developing a predictive model. For
this purpose, a Linear Regression model is chosen due to its effectiveness in capturing linear
relationships between variables. In this context, the model will be employed to understand and
predict the closing prices of Bitcoin based on the daily data trends observed over the previous
years."


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
import numpy as np

# Separate features and target
X = daily_data[['open', 'high', 'low', 'Volume BTC', 'Volume USD', 'moving_average_close']]
y = daily_data['close']

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict and evaluate
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
rmse = np.sqrt(mean_squared_error(y_test, predictions))

print(f'Mean Absolute Error: {mae}')
print(f'Root Mean Squared Error: {rmse}')

Mean Absolute Error: 0.33095838248461557
Root Mean Squared Error: 0.540670528986689
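
To see how each input contributes to the fitted linear relationship, the learned coefficients can be paired with the feature names; a minimal sketch, assuming the model and X defined above:

# Pair the learned coefficients with their feature names (a sketch)
coef_table = pd.Series(model.coef_, index=X.columns)
print(coef_table)
print('Intercept:', model.intercept_)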

Scaling the entire dataset is an important preprocessing step, especially in regression analysis
where features might have different scales and units. This can significantly impact the performance
of many machine learning algorithms, including linear regression.


from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

# Initialize the StandardScaler
scaler = StandardScaler()

# Scale the training data and apply the same transformation to the test data
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Initialize the Linear Regression model
linear_model = LinearRegression()
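
The fitting and evaluation of the scaled model are not shown above; a minimal sketch of that step, mirroring the earlier evaluation and assuming the same MAE/RMSE metrics are intended:

# Fit the Linear Regression model on the scaled features and evaluate (a sketch of the next step)
linear_model.fit(X_train_scaled, y_train)
scaled_predictions = linear_model.predict(X_test_scaled)

scaled_mae = mean_absolute_error(y_test, scaled_predictions)
scaled_rmse = np.sqrt(mean_squared_error(y_test, scaled_predictions))

print(f'Mean Absolute Error (scaled features): {scaled_mae}')
print(f'Root Mean Squared Error (scaled features): {scaled_rmse}')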

