Machine Learning Project PDF
# music_data = pd.read_csv('music.csv')
# X = music_data.drop(columns=['genre'])
# y = music_data['genre']
# model = DecisionTreeClassifier()
# model.fit(X, y)
predictions
---------------------------------------------------------------------------
~\AppData\Local\Temp\ipykernel_65448\591308401.py in <module>
10 # model.fit(X, y)
11
14 predictions
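import pandas as pd
from joblib import dump, load
from sklearn.tree import DecisionTreeClassifier

# Train the model and save it to disk
music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']
model = DecisionTreeClassifier()
model.fit(X, y)
dump(model, 'music-recommender.joblib')
# Reload the saved model and predict with it
model = load('music-recommender.joblib')
# The predict call is missing from this export; [21, 1] (age, gender) is an assumed input
predictions = model.predict([[21, 1]])
print(predictions)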
['HipHop']
C:\Users\kenny.ralph\Anaconda3\lib\site-packages\sklearn\base.py:450: UserWarning: X does not have valid feature names, but DecisionTreeClassifier was fitted with feature names
warnings.warn(
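The UserWarning above is raised because the model was fitted on a DataFrame, which carries column names, but predict was then given a plain nested list. One way to avoid it, sketched below, is to wrap the new sample in a DataFrame that reuses the training columns. The rows here are a made-up stand-in for music.csv, and the sample [21, 1] is an assumed input rather than one taken from the notebook.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up rows standing in for music.csv (assumed columns: age, gender, genre)
music_data = pd.DataFrame({
    "age":    [20, 25, 26, 30, 20, 25, 26, 30, 31, 35],
    "gender": [1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "genre":  ["HipHop", "HipHop", "Jazz", "Jazz", "Dance",
               "Dance", "Acoustic", "Acoustic", "Classical", "Classical"],
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier()
model.fit(X, y)

# Reuse the training column names so the input matches what the model was fitted on
sample = pd.DataFrame([[21, 1]], columns=X.columns)
predictions = model.predict(sample)
print(predictions)
```

Because the sample now has the same feature names as the training data, scikit-learn has nothing to warn about.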
import pandas as pd
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']
model = DecisionTreeClassifier()
model.fit(X, y)

# Write the fitted tree to a file as Graphviz DOT source
tree.export_graphviz(model, out_file='music-recommender.dot',
                     feature_names=['age', 'gender'],
                     class_names=sorted(y.unique()),
                     label='all',
                     rounded=True,
                     filled=True)
music-recommender.dot:
digraph Tree {
node [shape=box, style="filled, rounded", color="black", fontname="helvetica"] ;
edge [fontname="helvetica"] ;
0 [label="age <= 30.5\ngini = 0.778\nsamples = 18\nvalue = [3, 6, 3, 3, 3]\nclass = Classical", fillcolor="#e5fad7"] ;
1 [label="gender <= 0.5\ngini = 0.75\nsamples = 12\nvalue = [3, 0, 3, 3, 3]\nclass = Acoustic", fillcolor="#ffffff"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="age <= 25.5\ngini = 0.5\nsamples = 6\nvalue = [3, 0, 3, 0, 0]\nclass = Acoustic", fillcolor="#ffffff"] ;
1 -> 2 ;
3 [label="gini = 0.0\nsamples = 3\nvalue = [0, 0, 3, 0, 0]\nclass = Dance", fillcolor="#39e5c5"] ;
2 -> 3 ;
4 [label="gini = 0.0\nsamples = 3\nvalue = [3, 0, 0, 0, 0]\nclass = Acoustic", fillcolor="#e58139"] ;
2 -> 4 ;
5 [label="age <= 25.5\ngini = 0.5\nsamples = 6\nvalue = [0, 0, 0, 3, 3]\nclass = HipHop", fillcolor="#ffffff"] ;
1 -> 5 ;
6 [label="gini = 0.0\nsamples = 3\nvalue = [0, 0, 0, 3, 0]\nclass = HipHop", fillcolor="#3c39e5"] ;
5 -> 6 ;
7 [label="gini = 0.0\nsamples = 3\nvalue = [0, 0, 0, 0, 3]\nclass = Jazz", fillcolor="#e539c0"] ;
5 -> 7 ;
8 [label="gini = 0.0\nsamples = 6\nvalue = [0, 6, 0, 0, 0]\nclass = Classical", fillcolor="#7be539"] ;
0 -> 8 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
}
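The gini figures in the node labels can be checked by hand. Gini impurity is one minus the sum of squared class proportions, so it can be recomputed from each node's value list. A small sketch follows; the helper name gini is ours, not part of scikit-learn.

```python
def gini(counts):
    """Gini impurity of a node, given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Per-class counts copied from the "value = [...]" entries in the DOT file
root = gini([3, 6, 3, 3, 3])   # node 0
left = gini([3, 0, 3, 3, 3])   # node 1
leaf = gini([0, 6, 0, 0, 0])   # node 8, a pure leaf

print(round(root, 3), round(left, 3), round(leaf, 3))
```

The rounded results match the 0.778, 0.75 and 0.0 shown for nodes 0, 1 and 8.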
The script uses a decision tree classifier algorithm to train the music recommendation model, using the 'age' and 'gender' columns as input
features to predict the music genre.
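To make the learned rules concrete, the tree in music-recommender.dot can be read as nested conditionals. The sketch below transcribes its thresholds by hand. The function name predict_genre is hypothetical, and it assumes gender is encoded as 0 or 1, as the gender <= 0.5 split suggests.

```python
def predict_genre(age, gender):
    """Hand transcription of the exported tree; not generated by scikit-learn."""
    if age <= 30.5:                 # node 0
        if gender <= 0.5:           # node 1
            # node 2: its leaves are Dance (node 3) and Acoustic (node 4)
            return "Dance" if age <= 25.5 else "Acoustic"
        # node 5: its leaves are HipHop (node 6) and Jazz (node 7)
        return "HipHop" if age <= 25.5 else "Jazz"
    return "Classical"              # node 8

print(predict_genre(21, 1))
```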
1. Import the required libraries: pandas and scikit-learn (sklearn).
2. Read the music data set from a .csv file into a pandas DataFrame.
3. Define the input features (X) and the target variable (y) for the model. X is every column of the DataFrame except 'genre', and y is the 'genre' column.
4. Create an instance of the DecisionTreeClassifier class and fit it to the data.
5. Export the fitted model to a .dot file, which can be visualized with Graphviz to show the structure of the decision tree.
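The steps above can be tried end to end without music.csv. The sketch below uses a handful of made-up rows (an assumption, not the real data set) and passes out_file=None so that export_graphviz returns the DOT source as a string instead of writing a file.

```python
import pandas as pd
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# Made-up rows standing in for music.csv (assumed columns: age, gender, genre)
music_data = pd.DataFrame({
    "age":    [20, 25, 20, 25, 31, 35],
    "gender": [1, 1, 0, 0, 1, 0],
    "genre":  ["HipHop", "HipHop", "Dance", "Dance", "Classical", "Classical"],
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier()
model.fit(X, y)

# With out_file=None the DOT source is returned rather than written to disk
dot_source = tree.export_graphviz(model, out_file=None,
                                  feature_names=list(X.columns),
                                  class_names=sorted(y.unique()),
                                  label="all",
                                  rounded=True,
                                  filled=True)
print(dot_source.splitlines()[0])
```

A .dot file written by the original script can be rendered with the Graphviz command-line tool, for example `dot -Tpng music-recommender.dot -o music-recommender.png`, provided Graphviz is installed.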