Project Documentation

The document discusses preparing a machine learning model to predict gold prices. It imports relevant libraries for data manipulation, visualization and modeling. It then describes collecting gold price data, exploring the data structure and missing values. A heatmap is constructed to understand correlations between variables like GLD, SPX and USO. Finally, the distribution of GLD prices is visualized.

Uploaded by

Uzair Ahmad


PROJECT NAME: GOLD PRICE PREDICTION

Group Members

Abdul Tawwab
M Rizwan
Uzair Ahmad

Importing the Libraries

• import numpy as np:

Imports the NumPy library, which is commonly used for numerical
operations and array manipulations.

• import pandas as pd:

Imports the Pandas library, which is widely used for data manipulation and
analysis. It provides data structures like DataFrames for efficient handling of
structured data.

• import matplotlib.pyplot as plt:

Imports the pyplot module of the Matplotlib library, which is used for
creating various types of visualizations, such as plots and charts.

• import seaborn as sns:

Imports the Seaborn library, which is built on top of Matplotlib and provides
a high-level interface for drawing attractive and informative statistical
graphics.

• from sklearn.model_selection import train_test_split:

Imports the train_test_split function from scikit-learn, which is used for
splitting datasets into training and testing sets.

• from sklearn.ensemble import RandomForestRegressor:

Imports the RandomForestRegressor class from scikit-learn, which is a
machine learning model for regression tasks based on the random forest
algorithm.

• from sklearn import metrics:

Imports the metrics module from scikit-learn, which includes various
functions for evaluating the performance of machine learning models.
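
Collected in one place, the imports described above would look roughly as follows. The version printout at the end is an addition for illustration only and is not part of the original project.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn import metrics

# Illustrative extra: print the installed versions, which helps when
# someone else tries to reproduce the results.
for name, module in [("numpy", np), ("pandas", pd), ("seaborn", sns)]:
    print(name, module.__version__)
```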

Data Collection and Processing

gold_data.head():

Calls the head() method on the gold_data DataFrame to display the first
five rows of the DataFrame. This is a quick way to inspect the structure and
content of the loaded data.
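
As a minimal sketch of head(), the snippet below builds a small stand-in for gold_data. The column names follow the variables that appear later in this document (SPX, GLD, USO, SLV, EUR/USD), but the six rows of values are made up for illustration; in the project itself gold_data is assumed to have been loaded from the data file beforehand.

```python
import pandas as pd

# Hypothetical sample rows: the values are invented for illustration only.
gold_data = pd.DataFrame({
    "SPX": [1447.2, 1447.2, 1411.6, 1416.2, 1390.2, 1409.1],
    "GLD": [84.9, 85.6, 85.1, 84.8, 86.8, 86.6],
    "USO": [78.5, 78.4, 77.3, 75.5, 76.1, 75.3],
    "SLV": [15.2, 15.3, 15.2, 15.1, 15.6, 15.5],
    "EUR/USD": [1.47, 1.47, 1.48, 1.47, 1.56, 1.47],
})

first_rows = gold_data.head()   # first five rows by default
print(first_rows)

first_three = gold_data.head(3)  # pass n to change the number of rows
print(first_three)
```

The related tail() method works the same way for the last rows of the DataFrame.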

gold_data.describe():

The describe() method in Pandas generates descriptive statistics of the numerical
columns in a DataFrame. When applied to a DataFrame like gold_data, it provides
the count, mean, standard deviation, minimum, 25th percentile (Q1), median
(50th percentile or Q2), 75th percentile (Q3), and maximum for each numeric column.

By using gold_data.describe(), you get an overview of the central tendency,
dispersion, and shape of the distribution of each numeric column in the
DataFrame. This can be helpful for understanding the basic statistics of the
dataset and identifying potential outliers or patterns in the numerical data.

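
The following sketch shows what describe() returns. The small DataFrame is a hypothetical stand-in for gold_data with made-up values, so only the structure of the result is meaningful, not the numbers.

```python
import pandas as pd

# Hypothetical sample rows: the values are invented for illustration only.
gold_data = pd.DataFrame({
    "SPX": [1447.2, 1447.2, 1411.6, 1416.2, 1390.2, 1409.1],
    "GLD": [84.9, 85.6, 85.1, 84.8, 86.8, 86.6],
    "USO": [78.5, 78.4, 77.3, 75.5, 76.1, 75.3],
})

summary = gold_data.describe()
print(summary)

# The result is itself a DataFrame: rows are statistics, columns are variables.
print(list(summary.index))
print(summary.loc["mean", "GLD"])  # a single statistic for a single column
```

Because the summary is an ordinary DataFrame, individual statistics can be selected with .loc as shown in the last line.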
# number of rows and columns

gold_data.shape

The shape attribute in Pandas is used to determine the dimensions of a
DataFrame. It returns a tuple where the first element is the number of rows, and
the second element is the number of columns.

So, when you execute gold_data.shape, it will output a tuple representing the
dimensions of the DataFrame gold_data. For example, if the DataFrame has 100
rows and 5 columns, the output would be (100, 5). This information is useful
for understanding the size and structure of the dataset.
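
A short sketch of shape on a hypothetical stand-in for gold_data (six made-up rows, five columns):

```python
import pandas as pd

# Hypothetical sample rows: the values are invented for illustration only.
gold_data = pd.DataFrame({
    "SPX": [1447.2, 1447.2, 1411.6, 1416.2, 1390.2, 1409.1],
    "GLD": [84.9, 85.6, 85.1, 84.8, 86.8, 86.6],
    "USO": [78.5, 78.4, 77.3, 75.5, 76.1, 75.3],
    "SLV": [15.2, 15.3, 15.2, 15.1, 15.6, 15.5],
    "EUR/USD": [1.47, 1.47, 1.48, 1.47, 1.56, 1.47],
})

print(gold_data.shape)  # (6, 5)

# shape is a plain tuple, so it can be unpacked directly.
n_rows, n_cols = gold_data.shape
print(f"{n_rows} rows, {n_cols} columns")
```

Note that shape is an attribute, not a method, so it is written without parentheses.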

# checking the number of missing values

gold_data.isnull().sum()

gold_data.isnull(): This part of the code creates a boolean DataFrame of the same
shape as gold_data, where each element is True if the corresponding element in
gold_data is missing (NaN or None), and False otherwise.

.sum(): This part sums up the True values along each column. Since True is treated
as 1 and False as 0 in numerical operations, the result is a Series that shows the
total number of missing values for each column.

The output will be a Series where the index represents column names, and the
values represent the count of missing values in each column. This information is
valuable for understanding the completeness of the dataset and deciding how to
handle missing values during data preprocessing.
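
To make the two steps concrete, the sketch below uses a tiny hypothetical DataFrame with a few values deliberately set to NaN. The data are made up; only the mechanics of isnull() and sum() are the point.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values inserted on purpose.
gold_data = pd.DataFrame({
    "SPX": [1447.2, np.nan, 1411.6, 1416.2],
    "GLD": [84.9, 85.6, 85.1, 84.8],
    "USO": [78.5, 78.4, np.nan, np.nan],
})

missing_mask = gold_data.isnull()    # boolean DataFrame, True where a value is missing
missing_counts = missing_mask.sum()  # column-wise count of True values
print(missing_counts)

total_missing = int(missing_counts.sum())  # total over all columns
print(total_missing)
```

Here SPX has one missing value, GLD has none and USO has two, so the total is three.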

# constructing a heatmap to understand the correlation

plt.figure(figsize=(8, 8))
sns.heatmap(correlation, cbar=True, square=True, fmt='.1f', annot=True,
            annot_kws={'size': 8}, cmap='plasma')

plt.figure(figsize=(8, 8)): This line creates a Matplotlib figure with a specified size
of 8x8 inches for the heatmap.

sns.heatmap(): This function from Seaborn is used to create a heatmap. Here it
visualizes the correlation matrix of the dataset.

correlation: The correlation matrix (a 2D array or DataFrame) containing the
correlation coefficients between the different variables. It is assumed to have
been computed from gold_data before this call.

cbar=True: Displays a colorbar beside the heatmap to indicate the mapping of
colors to correlation values.

square=True: Sets an equal aspect ratio so that each cell of the heatmap is square.

fmt='.1f': Formats the values in the heatmap with one decimal place.

annot=True: Displays the correlation values on the heatmap.

annot_kws={'size': 8}: Sets the font size of the annotations to 8.

cmap='plasma': Specifies the color map to be used. In this case, 'plasma' is
chosen.
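
The document does not show how correlation is obtained. One plausible way, assumed here, is DataFrame.corr(), which computes pairwise Pearson correlations between numeric columns. The sketch below uses a hypothetical stand-in for gold_data with made-up values and selects a non-interactive Matplotlib backend so that it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample rows: the values are invented for illustration only.
gold_data = pd.DataFrame({
    "SPX": [1447.2, 1447.2, 1411.6, 1416.2, 1390.2, 1409.1],
    "GLD": [84.9, 85.6, 85.1, 84.8, 86.8, 86.6],
    "USO": [78.5, 78.4, 77.3, 75.5, 76.1, 75.3],
    "SLV": [15.2, 15.3, 15.2, 15.1, 15.6, 15.5],
    "EUR/USD": [1.47, 1.47, 1.48, 1.47, 1.56, 1.47],
})

# Assumed step: pairwise Pearson correlation between all numeric columns.
correlation = gold_data.corr()

plt.figure(figsize=(8, 8))
ax = sns.heatmap(correlation, cbar=True, square=True, fmt=".1f", annot=True,
                 annot_kws={"size": 8}, cmap="plasma")
# In an interactive session, plt.show() would display the figure.
```

If the real dataset also contains non-numeric columns, those would need to be dropped or excluded before calling corr().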

# correlation values of GLD

print(correlation['GLD'])

correlation['GLD']: Selects the column labeled 'GLD' from the correlation
DataFrame, that is, the correlation coefficients between 'GLD' and every
variable in the dataset (including 'GLD' itself).

print(): This function is used to display the selected correlation values.

The output is a Series in which the index holds the variable names and the
values are the correlation coefficients with the 'GLD' variable.

This information is valuable for understanding how strongly each variable is
correlated with the 'GLD' variable. Positive values indicate positive
correlation, negative values indicate negative correlation, and values closer
to 0 indicate weaker correlation.

SPX 0.049345
GLD 1.000000
USO -0.186360
SLV 0.866632
EUR/USD -0.024375
Name: GLD, dtype: float64

In this output, SLV is strongly positively correlated with GLD (about 0.87),
USO is weakly negatively correlated (about -0.19), and SPX and EUR/USD are
close to zero.
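
To rank the variables by how strongly they move with GLD, the coefficients can be sorted by absolute value. The sketch below re-creates the printed output as a Series rather than recomputing it from the dataset; the variable names gld_corr and ranked are chosen here for illustration.

```python
import pandas as pd

# Correlation coefficients with GLD, copied from the output shown above.
gld_corr = pd.Series(
    {"SPX": 0.049345, "GLD": 1.000000, "USO": -0.186360,
     "SLV": 0.866632, "EUR/USD": -0.024375},
    name="GLD",
)

# Drop the trivial self-correlation, then sort by strength regardless of sign.
ranked = gld_corr.drop("GLD").abs().sort_values(ascending=False)
print(ranked)
```

The resulting order is SLV, USO, SPX, EUR/USD, which matches the reading of the raw output: only SLV shows a strong linear relationship with GLD.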

# checking the distribution of the GLD price

sns.distplot(gold_data['GLD'], color='green')

gold_data['GLD']: Extracts the 'GLD' column from the gold_data DataFrame,
representing the gold prices. This is the variable whose distribution is
plotted.

sns.distplot(): This Seaborn function is used to create a distribution plot,
which combines a histogram with a kernel density estimate (KDE) curve. It
provides a visual representation of the distribution of a univariate dataset.
Note that distplot() has been deprecated since seaborn 0.11; histplot() and
displot() are its replacements.

color='green': Specifies the color of the plot. In this case, the color is
set to green.

The resulting plot will show the distribution of gold prices, helping to
visualize the frequency and pattern of different price levels. The histogram
provides information about the density of prices in various ranges, and the
KDE curve offers a smooth estimate of the probability density function. This
can be useful for understanding the central tendency and spread of the gold
prices in the dataset.
