
Task 5 Sales Prediction Using Machine Learning

Sales Prediction of a Company Using an ML Algorithm

Uploaded by

jadhavvikram863
7/13/23, 2:16 AM    Sales Prediction Project Task 2 - Jupyter Notebook

Project Report: Sales Prediction using Python

Submitted by: Mr. Omkar Balwant Jadhav

Introduction

The objective of this project is to develop a machine learning model that predicts sales from advertising expenditure on three channels: TV, Radio, and Newspaper. The dataset consists of historical data that includes the advertising expenditure on each channel and the corresponding sales figures.

Collecting and Filtering Data

• The dataset contains 200 records with four data columns: TV, Radio, Newspaper, and Sales. The CSV file also carries a row-counter column, which pandas loads as "Unnamed: 0"; this is why data.shape reports 5 columns.
• Before proceeding with the analysis, it is important to perform exploratory data analysis to understand the data distribution, identify any missing values, and check for correlations between variables.
• The data should be preprocessed by handling missing values (if any), handling outliers, and scaling the features (if required).

In [36]:
    import pandas as pd
    data = pd.read_csv("C:\\Users\\Omkar\\Downloads\\Sales Prediction Data.csv")

In [37]:
    data.shape

Out[37]:
    (200, 5)

In [38]:
    data.head()

Out[38]:
       Unnamed: 0     TV  Radio  Newspaper  Sales
    0           1  230.1   37.8       69.2   22.1
    1           2   44.5   39.3       45.4   10.4
    2           3   17.2   45.9       69.3    9.3
    3           4  151.5   41.3       58.5   18.5
    4           5  180.8   10.8       58.4   12.9

Exploratory Data Analysis

Exploratory Data Analysis (EDA) refers to the critical process of performing initial investigations on data in order to discover patterns, spot anomalies, test hypotheses, and check assumptions with the help of summary statistics and graphical representations. It is good practice to understand the data first and to gather as many insights from it as possible. EDA is all about making sense of the data in hand.
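The checks listed under "Collecting and Filtering Data" (missing values, correlations between variables) could look roughly like the sketch below. It is an illustration only: it uses just the five rows displayed by data.head() as a stand-in for the full 200-row file, and dropping the "Unnamed: 0" column before analysis is an assumption about good practice, not a step the report itself performs.

```python
import pandas as pd

# Stand-in data: the five rows shown by data.head() in this report.
# The full 200-row CSV is not reproduced here.
data = pd.DataFrame({
    "Unnamed: 0": [1, 2, 3, 4, 5],
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.4, 69.3, 58.5, 58.4],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

# Assumption: "Unnamed: 0" is only a row counter, so it is left out of the analysis.
features = data.drop(columns=["Unnamed: 0"])

# Number of missing values in each column.
missing = features.isnull().sum()

# Pearson correlation of each advertising channel with Sales.
corr_with_sales = features.corr()["Sales"].drop("Sales")

print(missing)
print(corr_with_sales.sort_values(ascending=False))
```

With only five rows the correlation values are not meaningful in themselves; on the full dataset the same two calls give a quick view of which channels move together with sales.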
In [39]:
    data.info()

    RangeIndex: 200 entries, 0 to 199
    Data columns (total 5 columns):
     #   Column      Non-Null Count  Dtype
    ---  ------      --------------  -----
     0   Unnamed: 0  200 non-null    int64
     1   TV          200 non-null    float64
     2   Radio       200 non-null    float64
     3   Newspaper   200 non-null    float64
     4   Sales       200 non-null    float64
    dtypes: float64(4), int64(1)
    memory usage: 7.9 KB

In [40]:
    data.describe()

Out[40]:
           Unnamed: 0          TV       Radio   Newspaper       Sales
    count  200.000000  200.000000  200.000000  200.000000  200.000000
    mean   100.500000  147.042500   23.264000   30.554000   14.022500
    std     57.879185   85.854236   14.848809   21.778621    5.217457
    min      1.000000    0.700000    0.000000    0.300000    1.600000
    25%     50.750000   74.375000    9.875000   12.750000   10.375000
    50%    100.500000  149.750000   22.900000   25.750000   12.900000
    75%    150.250000  218.825000   36.525000   45.100000   17.400000
    max    200.000000  298.400000   49.600000  114.000000   27.000000

In [41]:
    data.isnull().sum()

Out[41]:
    Unnamed: 0    0
    TV            0
    Radio         0
    Newspaper     0
    Sales         0
    dtype: int64

Data Visualization

In [42]:
    import matplotlib.pyplot as plt

    # Scatter plot of TV vs Sales
    plt.scatter(data['TV'], data['Sales'])
    plt.xlabel('TV')
    plt.ylabel('Sales')
    plt.title('TV vs Sales')
    plt.show()

    # Scatter plot of Radio vs Sales
    plt.scatter(data['Radio'], data['Sales'])
    plt.xlabel('Radio')
    plt.ylabel('Sales')
    plt.title('Radio vs Sales')
    plt.show()

    # Scatter plot of Newspaper vs Sales
    plt.scatter(data['Newspaper'], data['Sales'])
    plt.xlabel('Newspaper')
    plt.ylabel('Sales')
    plt.title('Newspaper vs Sales')
    plt.show()

[Figure: scatter plot "TV vs Sales"; x-axis: TV, y-axis: Sales]

[Figure: scatter plot "Radio vs Sales"; x-axis: Radio, y-axis: Sales]
[Figure: scatter plot "Newspaper vs Sales"; x-axis: Newspaper, y-axis: Sales]

In [43]:
    import matplotlib.pyplot as plt

    # Set the figure size
    plt.figure(figsize=(10, 6))

    # Plot the bar plots for the 'TV', 'Radio', and 'Newspaper' columns
    plt.bar(data['TV'], data['Sales'], color='red', alpha=0.5, label='TV')
    plt.bar(data['Radio'], data['Sales'], color='green', alpha=0.5, label='Radio')
    plt.bar(data['Newspaper'], data['Sales'], color='blue', alpha=0.5, label='Newspaper')

    # Add labels and a title to the plot
    plt.xlabel('Advertisement Medium')
    plt.ylabel('Sales')
    plt.title('Sales vs Advertisement Medium')

    # Add a legend
    plt.legend()

    # Show the plot
    plt.show()

[Figure: bar plot "Sales vs Advertisement Medium"; x-axis: Advertisement Medium, y-axis: Sales; legend: TV, Radio, Newspaper]

Machine Learning Process

Step 1: Import the necessary libraries

In [44]:
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error, r2_score

Step 2: Load and preprocess the data

In [45]:
    data = pd.read_csv("C:\\Users\\Omkar\\Downloads\\Sales Prediction Data.csv")

    # Split the data into features (X) and target variable (y).
    # Note: X still contains the 'Unnamed: 0' row-counter column.
    X = data.drop('Sales', axis=1)
    y = data['Sales']

    # Split into training and test sets. This call is not visible in the
    # extracted notebook; test_size and random_state are assumed values.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Out[45]:
    [200 rows x 5 columns]

Step 3: Train the model

• Several regression models can be considered for sales prediction, such as linear regression, decision tree regression, random forest regression, and support vector regression.
• The dataset can be split into training and testing sets using techniques like k-fold cross-validation or a simple train-test split.
• The selected regression model can then be trained on the training set.

In [46]:
    # Create a linear regression model
    model = LinearRegression()

    # Train the model on the training data
    model.fit(X_train, y_train)

Out[46]:
    LinearRegression()

Step 4: Evaluate the model

• The trained model needs to be evaluated to assess its performance and generalization capabilities.
• Evaluation metrics like mean squared error (MSE), root mean squared error (RMSE), and R-squared (R2) can be used to quantify the performance of the model.
• Comparing the model's performance on the training and testing sets can help identify overfitting or underfitting issues.

In [47]:
    # Make predictions on the test data
    y_pred = model.predict(X_test)

    # Calculate evaluation metrics
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    print("Mean Squared Error:", mse)
    print("R-squared:", r2)

    Mean Squared Error: 3.1990044685889063
    R-squared: 0.898648915141708

Step 5: Predict sales for new data

• Once the best model is selected and trained, it can be deployed for real-world predictions.
• New data points can be provided as input to the trained model to predict sales based on the advertising expenditures on TV, Radio, and Newspaper.
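Steps 3 to 5 mention a train-test split, k-fold cross-validation, RMSE, a train-versus-test comparison, and prediction on new data, not all of which appear in the notebook cells. The sketch below shows one way these pieces might fit together. Everything specific in it is an assumption made for illustration: the data is synthetic (generated with NumPy, not the report's CSV), the coefficients used to generate it are hypothetical, and test_size=0.2, random_state=42, and cv=5 are arbitrary choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the advertising data (assumption: not the report's CSV).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "Radio": rng.uniform(0, 50, n),
    "Newspaper": rng.uniform(0, 100, n),
})
# Hypothetical relationship, chosen only so that there is a signal to fit.
y = 3.0 + 0.05 * X["TV"] + 0.2 * X["Radio"] + rng.normal(0, 1.0, n)

# Simple train-test split: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)


def evaluate(X_part, y_part):
    """Return (MSE, RMSE, R2) of the fitted model on one data split."""
    pred = model.predict(X_part)
    mse = mean_squared_error(y_part, pred)
    return mse, np.sqrt(mse), r2_score(y_part, pred)


# A large gap between training and test scores would suggest overfitting.
train_mse, train_rmse, train_r2 = evaluate(X_train, y_train)
test_mse, test_rmse, test_r2 = evaluate(X_test, y_test)
print(f"train: RMSE={train_rmse:.3f}  R2={train_r2:.3f}")
print(f"test:  RMSE={test_rmse:.3f}  R2={test_r2:.3f}")

# 5-fold cross-validation as an alternative to a single split.
cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("cross-validated R2 per fold:", cv_r2)

# New data must use the same column names as the training features.
new_data = pd.DataFrame({
    "TV": [100.0, 250.0],
    "Radio": [20.0, 40.0],
    "Newspaper": [30.0, 10.0],
})
predictions = model.predict(new_data)
print("Predictions:", predictions)
```

Because RMSE is the square root of MSE, it is expressed in the same units as the target, which usually makes it easier to interpret than MSE.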
In [48]:
    # Create a new DataFrame for new data
    new_data = pd.DataFrame({'SepalLengthcm': [5.2, 6.1, 4.9],
                             'SepalWidthcm': [3.1, 2.8, 3.5],
                             'PetalLengthcm': [1.7, 4.7, 1.5],
                             'PetalWidthcm': [0.5, 1.6, 0.4]})

    # Predict sales for the new data
    predictions = model.predict(new_data)

    print("Predictions:", predictions)

    Predictions: [3.37175054 3.93001816 3.35128999]

    C:\Users\Omkar\anaconda3\lib\site-packages\sklearn\base.py:493: FutureWarning: The feature names should match those that were passed during fit. Starting version 1.2, an error will be raised.
    Feature names unseen at fit time:
    - PetalLengthcm
    - PetalWidthcm
    - SepalLengthcm
    - SepalWidthcm
    Feature names seen at fit time, yet now missing:
    - Newspaper
    - Radio
    - TV
    - Unnamed: 0
      warnings.warn(message, FutureWarning)

Note: the new DataFrame above uses column names from an Iris dataset rather than the features the model was trained on (Unnamed: 0, TV, Radio, Newspaper), which is what the warning reports. The printed predictions are therefore not meaningful sales estimates; new data must supply the same columns that were used during fitting.

Conclusion

• Summarize the key findings of the project, including the best-performing model, important features affecting sales, and the model's predictive capabilities.
• Highlight any insights or recommendations based on the analysis that could help optimize advertising strategies and maximize sales.
