CE802 Pilot

This document outlines a machine learning project to predict hotel profitability. A hotel chain manager provided historical data on hotel location characteristics and financial performance to train a model. Linear regression is proposed as the most suitable algorithm to analyze the relationship between independent variables like population, income, and marketing and the dependent variable of profit. The model will be tested on accuracy and speed before being deployed. Key features in the data like location demographics and brand presence must be included to ensure quality predictions.

Uploaded by

fidel munywoki
Assignment: Design and Application of a Machine Learning System for a Practical Problem

University of Essex
School of Computer Science and Electronic Engineering
CE802 Machine Learning

Prof. Luca Citi

[Student Name ]
[Student Number ]

PROBLEM STATEMENT

A manager of a large chain of hotels wants to investigate the feasibility
of using machine learning to predict whether a new hotel opened in a given
location would bring a profit or a loss. The manager has historical data
on the successful and unsuccessful hotels opened under the chain’s brand,
and she provides this data together with geographical and socio-economic
information about the locations and their neighbourhoods.

This information is used to find a solution to the problem using machine
learning and data science methods.

INTRODUCTION

Of the algorithms available, the most relevant are those that can give a
two-way outcome, that is, a profit or a loss scenario. Classification,
regression, clustering, and rule mining are some of the prediction models
used in machine learning. For this problem, regression analysis tops the
list (Bonaccorso, 2017).

Regression analysis describes in depth the relationship between a set of
independent variables and a dependent variable, which makes it easier to
draw a correct conclusion. The analysis produces a regression equation
whose coefficients represent the relationship between each independent
variable and the dependent variable (Freund, Wilson, & Sa, 2006).
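
As an illustrative sketch, a regression equation of the kind described
above can be written in general form. The symbols are assumptions
introduced here rather than notation from the cited text: y stands for
the hotel's profit, x_1 to x_p for the independent variables, beta_0 for
the intercept, beta_1 to beta_p for the coefficients, and epsilon for the
error term.

```latex
\[
  y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
\]
```

Under this form, each coefficient beta_j is the expected change in y for
a one-unit change in x_j when the other variables are held fixed.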

Given a dataset, a predictive modelling technique can be used to analyse
the relationship between the dependent and independent variables. This
modelling technique is regression analysis. Depending on the trend in the
data, different types of regression analysis can be used, according to
whether the target and independent variables show a linear or non-linear
relationship, provided the target variable contains continuous values.
The analysis can be used to determine predictor strength, forecast trends
and time series, and examine cause-and-effect relationships (Montgomery,
Peck, & Vining, 2021).

These characteristics are key to determining whether a new hotel will be
profitable or not. This will be done by determining a line of best fit,
which supports a conclusion and also shows the trend. A linear regression
line fitted through the data points summarises and visualises how hotel
profit varies with the predictors. However, performance and correctness
will determine the model finally selected: if linear regression performs
poorly, a different model will be chosen. During analysis, adjustments
can be made to arrive at the best prediction model.
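
The line of best fit mentioned above can be sketched for a single
predictor using the closed-form least-squares formulas. This is a minimal
illustration, not the assignment's implementation: the function name
fit_line, the toy figures, and the rule that a positive predicted profit
means "profit" are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: local population vs. profit, in arbitrary units
population = [10.0, 20.0, 30.0, 40.0, 50.0]
profit = [-30.0, -10.0, 10.0, 30.0, 50.0]

slope, intercept = fit_line(population, profit)

# Predict the profit for a new location and map it to a two-way outcome
predicted = slope * 35.0 + intercept
label = "profit" if predicted > 0 else "loss"
print(slope, intercept, predicted, label)
```

Thresholding the predicted profit at zero is one simple way to turn a
continuous regression output into the profit-or-loss answer the manager
asks for.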

Key informative features that the dataset must include are, for example,
the population of the selected locations, the income level of the
residents, and the brand marketing of the hotels in that area. These are
just some of the important features that determine the correctness of
the results obtained from the dataset. Quality data means quality
prediction results.

The algorithm needs to be trained using the most efficient procedure or
procedures. The most convenient method for model training in this
scenario is linear regression, a supervised machine learning algorithm.
In this case, the training data pair each input with its expected
result, the profit, and the model is trained to reproduce that result.
Linear regression is preferred for finding the relationship between
variables and for forecasting. The number of independent variables can
vary according to the dataset provided.
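
Supervised training with several independent variables, as described
above, might look like the following sketch. It solves the least-squares
normal equations in plain Python. The three feature columns (population,
income, marketing), the toy values, and the helper names solve,
fit_linear_regression, and predict are assumptions for illustration only.
The toy targets were generated from an exact linear rule so that the
recovered coefficients are easy to check.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular; no check is made for a zero pivot.
    """
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Return [intercept, b1, ..., bp] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, row):
    """Predicted profit for one location."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical training data: (population, income, marketing) -> profit.
# Targets follow profit = -100 + 2*population + 1*income + 3*marketing.
features = [
    (10.0, 20.0, 5.0), (20.0, 25.0, 3.0), (30.0, 30.0, 8.0),
    (40.0, 22.0, 6.0), (15.0, 40.0, 2.0), (50.0, 35.0, 9.0),
]
profits = [-45.0, -26.0, 14.0, 20.0, -24.0, 62.0]

coeffs = fit_linear_regression(features, profits)
new_location = (25.0, 30.0, 4.0)
estimate = predict(coeffs, new_location)
outcome = "profit" if estimate > 0 else "loss"
print(coeffs, estimate, outcome)
```

Adding or removing a feature only changes the length of each row, which
matches the point that the number of independent variables can vary with
the dataset.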

K-Nearest Neighbour (k-NN) is another procedure that can be used to train
the model, as it helps forecast a hotel’s profit by matching its recent
trend to the historical earnings or profit margins of “neighbour” hotels.
If a neighbour hotel is performing well, which factors are the key
contributors to its success? Can they be applied to the new hotel to
achieve the same performance, and how can the outcome be improved? The
k-NN algorithm is reported to perform well against alternative models
when applied in training (Soucy & Mineau, 2001).
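
The neighbour-matching idea above can be sketched as k-NN regression,
where the predicted profit is the mean profit of the k most similar
existing hotels. The function name knn_predict, the two features
(population, income), the toy values, and the choice of Euclidean
distance with k = 3 are assumptions made for this example.

```python
import math


def knn_predict(train_rows, train_profits, query, k=3):
    """Predict profit as the mean profit of the k nearest hotels."""
    # Pair each training hotel's distance to the query with its profit
    distances = sorted(
        (math.dist(row, query), profit)
        for row, profit in zip(train_rows, train_profits)
    )
    nearest = distances[:k]
    return sum(profit for _, profit in nearest) / len(nearest)


# Hypothetical existing hotels: (population, income) -> profit
hotels = [(10.0, 20.0), (12.0, 22.0), (11.0, 19.0),
          (40.0, 50.0), (42.0, 48.0), (41.0, 52.0)]
hotel_profits = [-20.0, -10.0, -15.0, 30.0, 40.0, 35.0]

rich_area = knn_predict(hotels, hotel_profits, (39.0, 49.0))
poor_area = knn_predict(hotels, hotel_profits, (11.0, 21.0))
print(rich_area, poor_area)
```

In practice the features would normally be scaled to comparable ranges
first, since Euclidean distance is otherwise dominated by whichever
feature has the largest numeric values.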

After development, the model should be tested to ensure that it works
correctly. For this reason, system performance is evaluated before
deployment, for assurance. The system can be measured in two ways. First,
by classification accuracy, which is the ratio of correct predictions to
the total number of input samples. Second, by the execution speed of the
system. Further adjustments can be made according to the results (Saxena,
Celaya, Saha, Saha, & Goebel, 2009).
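
Both measurements described above can be sketched in a few lines.
Accuracy follows the definition given in the text; the helper names
accuracy and timed, and the example labels, are assumptions for
illustration.

```python
import time


def accuracy(true_labels, predicted_labels):
    """Ratio of correct predictions to the total number of samples."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def timed(fn, *args):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical held-out labels and model predictions
y_true = ["profit", "loss", "profit", "profit", "loss"]
y_pred = ["profit", "loss", "loss", "profit", "loss"]

score, elapsed = timed(accuracy, y_true, y_pred)
print(f"accuracy={score:.2f}, elapsed={elapsed:.6f}s")
```

A single timing is noisy, so a fuller evaluation would repeat the call
several times and report an average.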
References

1. Bonaccorso, G. (2017). Machine learning algorithms. Packt Publishing
   Ltd.
2. Freund, R. J., Wilson, W. J., & Sa, P. (2006). Regression analysis.
   Elsevier.
3. Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). Introduction
   to linear regression analysis. John Wiley & Sons.
4. Soucy, P., & Mineau, G. W. (2001, November). A simple KNN algorithm
   for text categorization. In Proceedings 2001 IEEE International
   Conference on Data Mining (pp. 647-648). IEEE.
5. Saxena, A., Celaya, J., Saha, B., Saha, S., & Goebel, K. (2009,
   March). Evaluating algorithm performance metrics tailored for
   prognostics. In 2009 IEEE Aerospace conference (pp. 1-13). IEEE.
