Stochastic Gradient Descent: It is another variant of the Gradient Descent algorithm used for optimizing machine learning models iteratively. It addresses the computational inefficiency of Batch Gradient Descent when dealing with large datasets. In SGD, instead of using the entire dataset for each iteration, only a single randomly selected example is used to calculate the gradient and update the model parameters. This random selection introduces randomness into the optimization process, hence the term "stochastic" in Stochastic Gradient Descent.

The path taken by Stochastic Gradient Descent looks as follows:

[Figure: the noisy path taken by Stochastic Gradient Descent toward the minimum.]

Note: SGD is noisier than Batch Gradient Descent and often requires more iterations to reach the minima due to its randomness. Despite this, it is computationally more efficient, making it a preferred choice over Batch Gradient Descent in most scenarios for optimizing learning algorithms.

Advantages of Stochastic Gradient Descent:

In Stochastic Gradient Descent (SGD), learning happens on every example, which gives it a few advantages over other gradient descent variants:

- Memory Efficiency: Because it updates the parameters one training example at a time, SGD is memory-efficient and can handle datasets that are too large to fit in memory at once.
- Speed: It is relatively faster to compute than Batch Gradient Descent and Mini-Batch Gradient Descent because it uses only one example to update the parameters.
- Computational Efficiency: By using a single example, the computational cost per iteration is significantly reduced compared to Batch Gradient Descent, which requires processing the entire dataset.
- Avoidance of Local Minima: Due to the noisy updates in SGD, it has the ability to escape shallow local minima and move toward the global minimum.

Let's proceed to build an approximation class that will assist us in determining the beta values (coefficients and intercept) using Stochastic Gradient Descent for our Multiple Linear Regression model. I will use the Diabetes dataset to create our own GDRegressor and validate it against sklearn's SGDRegressor.

Importing Dataset

In this implementation I am using the Diabetes data from sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html

```python
from sklearn.datasets import load_diabetes

inputs, target = load_diabetes(return_X_y=True)
print('inputs.shape:', inputs.shape)
print('target.shape:', target.shape)
```

```
inputs.shape: (442, 10)
target.shape: (442,)
```

Splitting data into train and test datasets

```python
from sklearn.model_selection import train_test_split

train_inputs, test_inputs, train_target, test_target = train_test_split(
    inputs, target, test_size=0.2, random_state=...)  # the random_state value is cut off in the source

print('train_inputs:', train_inputs)
# (the remaining print statements are truncated in the source)
```

Note: Data Preprocessing. Ensure that data preprocessing steps, such as normalization or standardization, are performed; discrepancies in data preprocessing can impact model convergence. Here I am not applying data standardization because the dataset is already on a similar range across all the axes.
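If the features were on very different scales, a standardization step could be added before fitting. The following is a minimal sketch using scikit-learn's StandardScaler (not part of the original notebook, since the Diabetes features are already on a similar scale); the scaler is fit on the training split only so that no test-set statistics leak into training.

```python
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training split only, then reuse it for the test split
scaler = StandardScaler()
train_inputs_scaled = scaler.fit_transform(train_inputs)
test_inputs_scaled = scaler.transform(test_inputs)
```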
Since Stochastic Gradient Descent requires values for the learning rate and the number of epochs, I am first applying sklearn's SGDRegressor as a reference implementation for our model.

```python
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import r2_score

reg = SGDRegressor(max_iter=500, learning_rate='constant', eta0=0.01)
reg.fit(train_inputs, train_target)
```

```
SGDRegressor(learning_rate='constant', max_iter=500)
```

```python
y_pred = reg.predict(test_inputs)
print("SGDRegressor Coefficients:", reg.coef_)
print("\n")
print("SGDRegressor Intercept:", reg.intercept_)
```

```
SGDRegressor Coefficients: [  53.00997234 -126.73396481  412.23622602  277.2729001   -23.54477637
  -66.58596436 -195.90104992  147.23460275  315.68136954  143.74333886]

SGDRegressor Intercept: [148.50841551]
```

```python
r2_score(test_target, y_pred)
```

```
0.4508313570919322
```

Building our own Stochastic Gradient Descent Class

```python
import numpy as np
```

I am using the first derivative of the Mean Squared Error (MSE) to drive convergence in the Stochastic Gradient Descent (SGD) algorithm. The first derivative, often referred to as the gradient, indicates the direction and magnitude of the steepest ascent of the cost function. By updating the model parameters (coefficients and intercept) in the opposite direction of the gradient, I aim to minimize the cost function.

Stochastic Gradient Descent Algorithm

1. Initialize Parameters: Randomly initialize the parameters of the model. Determine the number of iterations (epochs) and the learning rate for updating the parameters.
2. Stochastic Gradient Descent Loop: Repeat the following steps until the model converges or reaches the maximum number of iterations:
   a. Iterate over randomly selected training examples from the training dataset to introduce randomness.
   b. Compute the gradient of the cost function with respect to the model parameters using the current training example.
   c. Update the model parameters by taking a step in the direction of the negative gradient, scaled by the learning rate.
   d. Evaluate the convergence criteria, such as the change in the cost function between iterations.
3. Return Optimized Parameters: Once the convergence criteria are met or the maximum number of iterations is reached, return the optimized model parameters.
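For reference, the per-example update in step 2 can be written down explicitly. This restatement is my own (not from the source); it uses the squared error of a single example, and its signs match the intercept and coefficient derivatives computed in the class below.

$$\hat{y}_i = \mathbf{x}_i^\top \boldsymbol{\beta} + \beta_0, \qquad L_i = (y_i - \hat{y}_i)^2$$

$$\frac{\partial L_i}{\partial \beta_0} = -2\,(y_i - \hat{y}_i), \qquad \frac{\partial L_i}{\partial \boldsymbol{\beta}} = -2\,(y_i - \hat{y}_i)\,\mathbf{x}_i$$

$$\beta_0 \leftarrow \beta_0 - \eta\,\frac{\partial L_i}{\partial \beta_0}, \qquad \boldsymbol{\beta} \leftarrow \boldsymbol{\beta} - \eta\,\frac{\partial L_i}{\partial \boldsymbol{\beta}}$$

Here $\eta$ is the learning rate, $\mathbf{x}_i$ is the randomly selected training example, $\beta_0$ is the intercept, and $\boldsymbol{\beta}$ are the coefficients. The class below implements exactly these updates.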
```python
class StochasticGDRegressor():

    def __init__(self, learning_rate=0.01, epochs=200):
        self.coeff = None
        self.intept = None
        self.learning_rate = learning_rate
        self.epochs = epochs

    # creating "fit" function
    def fit(self, train_inputs, train_target):
        # In Multiple Linear Regression, it is advisable to choose the starting point of intercept = 0 and coef = 1
        # starting with initializing intercept = 0
        self.intept = 0
        # starting with initializing coefficients = 1
        self.coeff = np.ones(train_inputs.shape[1])  # train_inputs.shape[1] is the number of features

        # starting iteration loop
        for i in range(self.epochs):
            for j in range(train_inputs.shape[0]):
                # Fetching an index randomly
                idx = np.random.randint(0, train_inputs.shape[0])

                # Calculating the prediction and the derivative of the intercept
                y_hat = np.dot(train_inputs[idx], self.coeff) + self.intept
                intercept_derivative = -2 * np.mean(train_target[idx] - y_hat)

                # Updating the intercept
                self.intept = self.intept - (self.learning_rate * intercept_derivative)

                # Calculating the derivative of the coefficients
                coeff_derivative = -2 * np.dot((train_target[idx] - y_hat), train_inputs[idx])

                # Updating the coefficients
                self.coeff = self.coeff - (self.learning_rate * coeff_derivative)

    @property
    def coefficients(self):
        if self.coeff is not None:
            return self.coeff
        else:
            print('Model not fitted yet.')
            return None

    @property
    def intercept(self):
        if self.intept is not None:
            return self.intept
        else:
            print('Model not fitted yet.')
            return None

    # creating 'predict' function
    def predict(self, test_inputs):
        return np.dot(test_inputs, self.coeff) + self.intept

    # R2 - scoring for metric evaluation
    def score(self, test_inputs, test_target):
        predictions = self.predict(test_inputs)
        return r2_score(test_target, predictions)
```

```python
sgd = StochasticGDRegressor(learning_rate=0.01, epochs=500)
sgd.fit(train_inputs, train_target)
print("StochasticGDRegressor Coefficients:", sgd.coefficients)
print("\n")
print("StochasticGDRegressor Intercept:", sgd.intercept)
```

```
StochasticGDRegressor Coefficients: [  43.00297577 -237.60014313   62.78410347  333.84767943 -125.88091605
  106.29575328 -205.01352308  153.03909436  423.57085728   55.00051245]

StochasticGDRegressor Intercept: 247.41704853952255
```

Additionally, using the R2 score as a regression metric is a good choice for evaluating the performance of the model. R2 measures the proportion of the variance in the dependent variable that is predictable from the independent variables; a higher R2 score indicates better predictive performance.

```python
r2 = sgd.score(test_inputs, test_target)
print(f"R2 score on test data: {r2}")
```

```
R2 score on test data: 0.44582679448963036
```

The slight difference in performance between sklearn's SGDRegressor model and my custom Stochastic Gradient Descent (SGD) class implementation can be attributed to several factors:

1. Hyperparameter Tuning: The performance of SGDRegressor in sklearn may be influenced by its default hyperparameter settings, which are tuned to work well across many datasets.
2. Random Initialization: My custom SGD implementation employs a random index for gradient descent updates, introducing randomness that can lead to different convergence paths and variations in the final model parameters.
3. Convergence Criteria: Differences in the number of epochs and convergence criteria could contribute to performance variations between the two implementations.
4. Learning Rate Schedule: sklearn's SGDRegressor utilizes a default learning rate schedule, while I have not experimented with various learning rate schedules in my custom class.
5. Regularization: sklearn's SGDRegressor may include regularization terms by default, whereas my custom class does not currently implement any form of regularization.

By systematically evaluating these factors, you can identify the specific reasons behind the performance differences and refine your custom SGD implementation accordingly.
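To illustrate factors 4 and 5, here is one way the per-example update could be extended with an inverse-scaling learning-rate schedule and an L2 (ridge) penalty. This is a sketch of my own: the helper name `sgd_step_with_schedule_and_l2` and the decay and penalty values are illustrative assumptions, not the exact behaviour of sklearn's SGDRegressor.

```python
import numpy as np

def sgd_step_with_schedule_and_l2(coeff, intercept, x_i, y_i,
                                  eta0=0.01, power_t=0.25, alpha=0.0001, t=1):
    """One SGD update on a single example with a decaying learning rate
    and an L2 penalty on the coefficients (illustrative sketch)."""
    eta = eta0 / (t ** power_t)              # inverse-scaling schedule: eta shrinks as t grows
    y_hat = np.dot(x_i, coeff) + intercept
    error = y_i - y_hat
    # Gradients of (y_i - y_hat)^2 + alpha * ||coeff||^2 w.r.t. intercept and coefficients
    intercept_grad = -2 * error
    coeff_grad = -2 * error * x_i + 2 * alpha * coeff
    return coeff - eta * coeff_grad, intercept - eta * intercept_grad
```

Inside the fit loop of the custom class, `t` could simply be the running count of updates performed so far.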
Disadvantages of Stochastic Gradient Descent (SGD)

1. Noisy Updates: The updates in SGD are noisy and have high variance, which can make the optimization process less stable and potentially lead to oscillations around the minimum.
2. Slow Convergence: Convergence in SGD may be slower because it updates the parameters for each training example individually, requiring more iterations to reach the minima.
3. Sensitivity to Learning Rate: The choice of learning rate is crucial in SGD. A high rate may cause overshooting, while a low rate can result in slow convergence, impacting the algorithm's performance.

[Figure: loss curves for three learning rates. "Too low": a small learning rate requires many updates before reaching the minimum point. "Just right": the optimal learning rate swiftly reaches the minimum point. "Too high": too large a learning rate causes drastic updates which lead to divergent behavior.]

4. Less Accuracy: The noisy updates may prevent SGD from converging to the exact global minimum, yielding suboptimal solutions. Techniques like learning rate scheduling and momentum-based updates can help mitigate this issue (see the momentum sketch at the end of this post).

Difference between Batch Gradient Descent and Stochastic Gradient Descent

| Batch Gradient Descent | Stochastic Gradient Descent |
| --- | --- |
| Computes the gradient using the entire training dataset for every update. | Computes the gradient using a single randomly selected training example for every update. |
| Each update is expensive, making it computationally inefficient for large datasets. | Each update is cheap, making it computationally efficient and scalable to large datasets. |
| Updates are smooth, so the cost decreases steadily toward the minimum. | Updates are noisy, so the cost fluctuates and more iterations may be needed to reach the minima. |
| Can get stuck in local minima for non-convex cost functions. | The noise in the updates can help it escape shallow local minima. |

Note: I have built a custom class to facilitate a better understanding of Stochastic Gradient Descent. For the development of your own models, I would recommend utilizing the scikit-learn library.

Stay tuned for Polynomial Regression, and don't forget to star this GitHub repository for more such content and consider sharing it with others.
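As referenced under disadvantage 4, a momentum term is one common way to smooth the noisy SGD updates. The sketch below is my own illustration (the function name and the momentum value of 0.9 are assumptions, not from the source): a velocity vector accumulates an exponentially decaying average of past gradients, and the parameters move along that smoothed direction instead of the raw gradient.

```python
import numpy as np

def momentum_update(params, grad, velocity, learning_rate=0.01, momentum=0.9):
    """One momentum-SGD step: blend the previous velocity with the new gradient,
    then move the parameters along the smoothed direction."""
    velocity = momentum * velocity - learning_rate * grad
    return params + velocity, velocity

# Usage sketch: keep one velocity array per parameter array, updated every iteration
coeff = np.ones(10)
velocity = np.zeros_like(coeff)
grad = np.random.randn(10)          # stand-in for a per-example gradient
coeff, velocity = momentum_update(coeff, grad, velocity)
```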
