Machine Learning (VI-Sem.) – Unit IV

Q.1. Explain in detail about recurrent neural network.
Ans. A recurrent neural network (RNN) is an extension of a conventional feedforward neural network, which is able to handle a variable-length sequence input. The RNN handles the variable-length sequence by having a recurrent hidden state whose activation at each time step depends on that of the previous time step.

More formally, given a sequence x = (x_1, x_2, ..., x_T), the RNN updates its recurrent hidden state h_t by

    h_t = 0,                  if t = 0
    h_t = φ(h_{t-1}, x_t),    otherwise                ...(i)

where φ is a non-linear function such as the composition of a logistic sigmoid with an affine transformation. Optionally, the RNN may have an output y = (y_1, y_2, ..., y_T), which may again be of variable length.

Traditionally, the update of the recurrent hidden state in equation (i) is implemented as

    h_t = g(W x_t + U h_{t-1}),                        ...(ii)

where g is a smooth, bounded function such as a logistic sigmoid function or a hyperbolic tangent function.

A generative RNN outputs a probability distribution over the next element of the sequence, given its current state h_t, and this generative model can capture a distribution over sequences of variable length by using a special output symbol to represent the end of the sequence. The sequence probability can be decomposed into

    p(x_1, ..., x_T) = p(x_1) p(x_2 | x_1) p(x_3 | x_1, x_2) ... p(x_T | x_1, ..., x_{T-1})    ...(iii)

Q.2. Briefly explain the term long short-term memory unit.
Ans. The long short-term memory (LSTM) unit was initially proposed by Hochreiter and Schmidhuber. Since then, a number of minor modifications to the original LSTM unit have been made.

Unlike the recurrent unit, which simply computes a weighted sum of the input signal and applies a non-linear function, each j-th LSTM unit maintains a memory c_t^j at time t. The output h_t^j, or the activation, of the LSTM unit is then

    h_t^j = o_t^j tanh(c_t^j),

where o_t^j is an output gate that modulates the amount of memory content exposure. The output gate is computed by

    o_t^j = σ(W_o x_t + U_o h_{t-1} + V_o c_t)^j,

where σ is a logistic sigmoid function and V_o is a diagonal matrix.

The memory cell c_t^j is updated by partially forgetting the existing memory and adding a new memory content c̃_t^j:

    c_t^j = f_t^j c_{t-1}^j + i_t^j c̃_t^j,

where the new memory content is

    c̃_t^j = tanh(W_c x_t + U_c h_{t-1})^j.

The extent to which the existing memory is forgotten is modulated by a forget gate f_t^j, and the degree to which the new memory content is added to the memory cell is modulated by an input gate i_t^j. The gates are computed by

    f_t^j = σ(W_f x_t + U_f h_{t-1} + V_f c_{t-1})^j,
    i_t^j = σ(W_i x_t + U_i h_{t-1} + V_i c_{t-1})^j.

Note that V_f and V_i are diagonal matrices.

Unlike the traditional recurrent unit, which overwrites its content at each time step (equation (ii)), an LSTM unit is able to decide whether to keep the existing memory via the introduced gates. Intuitively, if the LSTM unit detects an important feature from an input sequence at an early stage, it easily carries this information (the existence of the feature) over a long distance, hence capturing potential long-distance dependencies.

Here, i, f and o are the input, forget and output gates, respectively; c and c̃ denote the memory cell and the new memory cell content, as shown in fig. 4.1 (Long Short-term Memory).
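To make the gate equations in Q.2 concrete, here is a minimal single-step LSTM sketch in NumPy. It is an illustration written from the formulas above, not the notes' own code; the parameter dictionary `p`, the omission of bias terms, and the use of vectors `v_f`, `v_i`, `v_o` in place of the diagonal matrices V_f, V_i, V_o are simplifying assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following the equations in Q.2.

    p holds the parameters: W_* are (hidden, input) and U_* are
    (hidden, hidden) matrices; v_* are vectors standing in for the
    diagonal matrices V_f, V_i, V_o. Bias terms are omitted.
    """
    # forget and input gates peek at the previous memory cell c_{t-1}
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["v_f"] * c_prev)
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["v_i"] * c_prev)
    # new candidate memory content  c~_t = tanh(W_c x_t + U_c h_{t-1})
    c_tilde = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev)
    # partially forget the old memory and add the new content
    c_t = f_t * c_prev + i_t * c_tilde
    # output gate peeks at the updated cell c_t and exposes part of it
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["v_o"] * c_t)
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# tiny usage example with random parameters (hypothetical sizes)
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
p = {k: rng.standard_normal((n_hid, n_in)) for k in ("W_f", "W_i", "W_c", "W_o")}
p.update({k: rng.standard_normal((n_hid, n_hid)) for k in ("U_f", "U_i", "U_c", "U_o")})
p.update({k: rng.standard_normal(n_hid) for k in ("v_f", "v_i", "v_o")})
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):   # a length-5 input sequence
    h, c = lstm_step(x, h, c, p)
print(h)
```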
Q.3. What do you mean by gated recurrent unit? Explain.
Ans. A gated recurrent unit (GRU) was proposed by Cho et al. to make each recurrent unit adaptively capture dependencies of different time scales. Similarly to the LSTM unit, the GRU has gating units that modulate the flow of information inside the unit, however, without having a separate memory cell.

The activation h_t^j of the GRU at time t is a linear interpolation between the previous activation h_{t-1}^j and the candidate activation h̃_t^j:

    h_t^j = (1 - z_t^j) h_{t-1}^j + z_t^j h̃_t^j,

where an update gate z_t^j decides how much the unit updates its activation, or content. The update gate is computed by

    z_t^j = σ(W_z x_t + U_z h_{t-1})^j.

This procedure of taking a linear sum between the existing state and the newly computed state is similar to the LSTM unit. The GRU, however, does not have any mechanism to control the degree to which its state is exposed, but exposes the whole state each time.

The candidate activation is computed similarly to the traditional recurrent unit,

    h̃_t^j = tanh(W x_t + U(r_t ⊙ h_{t-1}))^j,

where r_t is a set of reset gates and ⊙ is an element-wise multiplication. When the reset gate is close to 0, it makes the unit act as if it is reading the first symbol of an input sequence, allowing it to forget the previously computed state.

The reset gate r_t^j is computed similarly to the update gate:

    r_t^j = σ(W_r x_t + U_r h_{t-1})^j.

Here, r and z are the reset and update gates, and h and h̃ are the activation and the candidate activation. The gated recurrent unit is shown in fig. 4.2.

Q.4. How can we use SMT to find synonyms?
Ans. The word "ship" in a particular context can be translated to another word when "ship" is synonymous with the word "transport". So, in our example above, a query such as "how to ship a box" might have the same translation as "how to transport a box".

The search might then be expanded to include both queries – "how to ship a box" as well as "how to transport a box".

A machine translation system may also collect information about words in the same language, to learn about how those words might be related.

Q.5. Write short note on beam search.
Ans. Neural sequence models are widely used to model time-series data. Equally ubiquitous is the usage of beam search (BS) as an approximate inference algorithm to decode output sequences from these models. BS explores the search space in a greedy left-to-right fashion, retaining only the top-B candidates, which results in sequences that differ only slightly from each other.

The most prevalent method for approximate decoding is BS, which stores the top-B highly scoring candidates at each time step, where B is known as the beam width. Let us denote the set of B solutions held by BS at the start of time step t. At each time step, BS considers all possible single-token extensions of these beams and selects the B most likely extensions. (A small decoding sketch is given below, after Q.7.)

Q.6. What are the disadvantages of beam search?
Ans. Disadvantages of beam search are as follows –
(i) Near-identical beams make BS a computationally wasteful algorithm, with essentially the same computation being repeated for no significant gains in performance.
(ii) Loss mismatch, i.e. improvements in posterior probabilities do not necessarily correspond to improvements in task-specific metrics. It is common practice to deliberately throttle BS to become a poorer optimization algorithm by using reduced beam widths. This treatment of an optimization algorithm as a hyper-parameter is not only intellectually unsatisfying but also has a significant practical side-effect – it leads to the generation of largely bland, generic, and "safe" outputs, e.g. always saying "I don't know" in conversation models.
(iii) Most importantly, lack of diversity in the decoded solutions is fundamentally crippling in AI problems with significant ambiguity – e.g. there are multiple ways of describing an image or responding in a conversation that are correct, and it is important to capture this ambiguity by finding several diverse solutions.

Q.7. Explain the term BLEU score.
Ans. BLEU (BiLingual Evaluation Understudy) is an algorithm that was proposed to evaluate how accurate a machine-translated text is. Here, the same approach is used to evaluate the quality of the text response that we generate.

These are the BLEU scores computed from n-gram precisions; note that the score is higher for lower n, and in the worst case it is zero for 4-grams when no 4-gram of the candidate appears in the reference sentences. This is the general methodology.

As mentioned earlier, the BLEU score helps us to determine the next step for our model. As depicted in fig. 4.3, the methodology behind using the BLEU score is to improve our model: a low score indicates that the performance may not be as good as expected, and so we need to improve our model.
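As a concrete companion to Q.7, the sketch below computes clipped (modified) n-gram precisions and a simple geometric-mean BLEU for a single candidate/reference pair. It follows the standard BLEU definition rather than anything given in these notes; the `modified_precision` and `bleu` helpers, the brevity penalty, and the toy sentences are assumptions for illustration only. With this pair, the 1- and 2-gram precisions are positive while the 3- and 4-gram precisions are zero, matching the "higher score for lower n" behaviour described above.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: candidate n-gram counts are clipped
    by how often each n-gram occurs in the reference."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(cand.values()), 1)

def bleu(candidate, reference, max_n=4):
    """Geometric mean of the 1..max_n precisions times a brevity penalty.
    Returns 0 if any precision is 0 (e.g. no matching 4-gram)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

cand = "how to transport a box".split()
ref = "how to ship a box".split()
print([round(modified_precision(cand, ref, n), 2) for n in (1, 2, 3, 4)])
print(round(bleu(cand, ref), 3))
```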
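Returning to Q.5, the decoding loop can be sketched as follows. The `log_prob_fn` callback, the tiny bigram table, the beam width B and the end-of-sequence token are hypothetical stand-ins for a real neural sequence model; the sketch only illustrates the "keep the top-B single-token extensions at each step" idea.

```python
import math

def beam_search(log_prob_fn, vocab, B=3, max_len=10, eos="<eos>"):
    """Greedy left-to-right beam search keeping the top-B candidates.

    log_prob_fn(prefix, token) -> log p(token | prefix) is assumed to be
    supplied by a sequence model; each beam is a (tokens, score) pair.
    """
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == eos:      # finished beam is kept as-is
                candidates.append((tokens, score))
                continue
            for tok in vocab:                     # all single-token extensions
                candidates.append((tokens + [tok],
                                   score + log_prob_fn(tokens, tok)))
        # retain only the B highest-scoring extensions
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:B]
        if all(t and t[-1] == eos for t, _ in beams):
            break
    return beams

# toy "model": a fixed bigram table (purely hypothetical numbers)
table = {"<s>": {"how": 0.7, "to": 0.2, "<eos>": 0.1},
         "how": {"to": 0.9, "<eos>": 0.1},
         "to":  {"ship": 0.5, "transport": 0.4, "<eos>": 0.1},
         "ship": {"<eos>": 1.0}, "transport": {"<eos>": 1.0}}

def log_prob_fn(prefix, token):
    prev = prefix[-1] if prefix else "<s>"
    return math.log(table.get(prev, {}).get(token, 1e-9))

vocab = ["how", "to", "ship", "transport", "<eos>"]
for tokens, score in beam_search(log_prob_fn, vocab, B=2):
    print(tokens, round(score, 2))
```

On this toy table the two surviving beams are "how to ship" and "how to transport", which also illustrates the near-identical-beams problem listed under Q.6.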
Disadvantages of reinforcement learning are as follows –
(i) It only provides enough to meet up the minimum behaviour.

Q.15. What are the differences between reinforcement learning and supervised learning?
Ans. Differences between reinforcement learning and supervised learning are as follows –

S.No. | Reinforcement Learning | Supervised Learning
(i) | Reinforcement learning is all about making decisions sequentially. In simple words, we can say that the output depends on the state of the current input, and the next input depends on the output of the previous input. | In supervised learning, the decision is made on the initial input, or the input given at the start.
(ii) | In reinforcement learning the decisions are dependent, so we give labels to sequences of dependent decisions. Example – Chess game. | In supervised learning the decisions are independent of each other, so labels are given to each decision. Example – Object recognition.

Q.16. Write the various practical applications of reinforcement learning.
Ans. Various practical applications of reinforcement learning are as follows –
(i) Reinforcement learning can be used in robotics for industrial automation.

Q.17. Explain the term Markov decision process (MDP).
Ans. Markov decision processes model time-discrete stochastic state-transition automata. An MDP = (S, A, P, R) consists of a set of states S, a set of actions A, the expected (immediate) rewards R_{s,s'}^a received when moving from state s to state s' by executing action a, and transition probabilities P. The probability that in state s action a takes the agent to state s' is given by P_{s,s'}^a.

At each point in time the MDP is in some state s_t. An agent chooses an available action a_t, which takes the MDP to the next state s_{t+1}, and the agent receives a scalar reward (or reinforcement) r_{t+1} ∈ R on arriving in state s_{t+1}.

The behaviour of the agent that interacts with the MDP is modelled as a stochastic policy π(s, a), which specifies a probability distribution over actions for each state s; π(s, a) denotes the probability of choosing action a in state s.

Q.18. Explain the term Bellman equation.
Ans. Assume, hypothetically, that the agent knows the model of the environment, i.e. it has knowledge of the reward and transition probabilities. Because we care only about the rewards the agent can collect, we define for each state s the maximum expected discounted sum of collected rewards

    V*(s) = max_π E[Σ_{t≥0} γ^t r_{t+1} | s_0 = s, π],

which we call the optimal value of the state. This function is the unique solution to the Bellman equation –

    V*(s) = max_{a∈A} ( E[r | s, a] + γ Σ_{s'∈S} P(s' | s, a) V*(s') ).

The intuition behind this equation is that the optimal value of a state is the expected immediate reward plus the expected discounted value of the next state. Note that if the agent was given the optimal value function, it would retrieve the optimal policy π* by

    π*(s) = argmax_{a∈A} ( E[r | s, a] + γ Σ_{s'∈S} P(s' | s, a) V*(s') ).

However, the model of the environment is completely unknown to the agent in most problems considered in reinforcement learning. There are several general directions for finding the optimal policy without the model: learning the value function and using it to derive an optimal policy (model-based learning), and learning the optimal policy directly (model-free learning).
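As a small numerical illustration of Q.18, the sketch below applies one Bellman optimality backup to a value function and extracts the corresponding greedy policy. The 3-state, 2-action MDP (arrays `P`, `R`) and the discount γ = 0.9 are made-up values chosen only for the example.

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP, chosen only for illustration:
# P[a, s, s'] = P(s' | s, a), R[s, a] = E[r | s, a], discount gamma.
P = np.array([[[0.8, 0.2, 0.0],
               [0.0, 0.9, 0.1],
               [0.1, 0.0, 0.9]],
              [[0.1, 0.9, 0.0],
               [0.0, 0.2, 0.8],
               [0.0, 0.1, 0.9]]])
R = np.array([[0.0, 1.0],
              [0.0, 2.0],
              [5.0, 0.0]])
gamma = 0.9

def bellman_backup(V):
    """One Bellman optimality backup:
    Q(s, a) = E[r | s, a] + gamma * sum_s' P(s' | s, a) V(s')."""
    return R + gamma * (P @ V).T          # shape (|S|, |A|)

def greedy_policy(V):
    """pi*(s) = argmax_a ( E[r | s, a] + gamma * sum_s' P(s' | s, a) V(s') )."""
    return bellman_backup(V).argmax(axis=1)

V = np.zeros(3)                            # any value function will do here
print(bellman_backup(V).max(axis=1))       # improved value estimates
print(greedy_policy(V))                    # greedy action in each state
```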
Q.19. Briefly describe the terms value iteration and policy iteration.
Ans. Value Iteration – To find the optimal policy, we can use the optimal value function, and there is an iterative algorithm called value iteration that has been shown to converge to the correct V* values. Its pseudocode is shown in fig. 4.6.

    Initialize V(s) to arbitrary values
    Repeat
        For all s ∈ S
            For all a ∈ A
                Q(s, a) ← E[r | s, a] + γ Σ_{s'∈S} P(s' | s, a) V(s')
            V(s) ← max_a Q(s, a)
    Until V(s) converge
    Fig. 4.6

We say that the values have converged if the maximum value difference between two iterations is less than a certain threshold δ:

    max_{s∈S} |V^(l+1)(s) − V^(l)(s)| < δ,

where l is the iteration counter. Because we care only about the actions with the maximum value, it is possible that the policy converges to the optimal policy even before the values converge to their optimal values. Each iteration is O(|S|²|A|); however, frequently there is only a small number k < |S| of next possible states, so the complexity decreases to O(k|S||A|).

Policy Iteration – In policy iteration, we store and update the policy rather than doing this indirectly over the values. The policy iteration algorithm is shown in fig. 4.7.

    Initialize a policy π′ arbitrarily
    Repeat
        π ← π′
        Compute the values using π by solving the linear equations
            V^π(s) = E[r | s, π(s)] + γ Σ_{s'∈S} P(s' | s, π(s)) V^π(s')
        Improve the policy at each state
            π′(s) ← argmax_a ( E[r | s, a] + γ Σ_{s'∈S} P(s' | s, a) V^π(s') )
    Until π = π′
    Fig. 4.7

The idea is to start with a policy and improve it repeatedly until there is no change. The value function can be calculated by solving the linear equations. We then check whether we can improve the policy by taking these values into account. This step is guaranteed to improve the policy, and when no improvement is possible, the policy is guaranteed to be optimal. Each iteration of this algorithm takes O(|A||S|² + |S|³) time, which is more than that of value iteration, but policy iteration needs fewer iterations than value iteration.
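A runnable version of the fig. 4.6 pseudocode might look like the following sketch; the randomly generated toy MDP, the threshold `delta`, and the returned greedy policy are assumptions added for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, delta=1e-6):
    """Value iteration as in fig. 4.6.

    P[a, s, s'] = P(s' | s, a), R[s, a] = E[r | s, a].
    Repeats Q(s, a) = R[s, a] + gamma * sum_s' P(s'|s, a) V(s') and
    V(s) = max_a Q(s, a) until max_s |V_new(s) - V(s)| < delta.
    """
    V = np.zeros(R.shape[0])                    # arbitrary initial values
    while True:
        Q = R + gamma * (P @ V).T               # shape (|S|, |A|)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < delta:   # convergence test from the notes
            return V_new, Q.argmax(axis=1)      # V* and a greedy policy
        V = V_new

# toy MDP (hypothetical): 4 states, 2 actions, random transition rows
rng = np.random.default_rng(1)
P = rng.random((2, 4, 4))
P /= P.sum(axis=2, keepdims=True)               # each row sums to 1
R = rng.random((4, 2))
V_star, pi_star = value_iteration(P, R)
print(np.round(V_star, 3), pi_star)
```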
Actor-critic Method – The policy is known as the actor π : S → A, which makes decisions without the need for optimization procedures on a value function, mapping a representation of the states to action-selection probabilities. The value function is known as the critic Q : S × A → R, which estimates the expected return to reduce variance and accelerate learning, mapping states to the expected cumulative future reward.

Fig. 4.8 (Actor-critic Network) shows an architecture design: the actor and the critic are two separate networks that share a common observation (feature extraction). At each step, the action selected by the actor network is also an input to the critic network. In the process of policy improvement, the critic network estimates the state-action value of the current policy by DQN, and then the actor network updates its policy in a direction that improves the Q-value. Compared with pure policy-gradient methods, which do not have a value function, using a critic network to evaluate the current policy is more conducive to convergence and stability. The better the value evaluation is, the lower the learning variance, so it is important and helpful for obtaining a better policy.

Policy-gradient-based actor-critic algorithms are useful in real-world applications because they can search for optimal policies using low-variance gradient estimates. Lillicrap et al. present the DDPG algorithm, which combines the actor-critic approach with insights from DQN, to solve continuous control tasks with a deterministic policy.
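The architecture sketched around fig. 4.8 (shared observation, an actor head producing action-selection probabilities, and a critic that also receives the chosen action) could look roughly like the following PyTorch module. This is a generic sketch, not the notes' or DDPG's implementation; the layer sizes, the discrete action space, the one-hot action encoding, and the absence of a training loop are all assumptions.

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Actor and critic as two heads over a shared feature extractor,
    with the actor's chosen action fed into the critic (cf. fig. 4.8)."""

    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # actor: pi(a | s), a distribution over actions
        self.actor = nn.Sequential(nn.Linear(hidden, n_actions),
                                   nn.Softmax(dim=-1))
        # critic: Q(s, a), takes the state features plus a one-hot action
        self.critic = nn.Sequential(nn.Linear(hidden + n_actions, hidden),
                                    nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, obs):
        feat = self.features(obs)
        probs = self.actor(feat)                 # action-selection probabilities
        action = torch.multinomial(probs, 1)     # sample an action from pi
        one_hot = nn.functional.one_hot(action.squeeze(-1),
                                        probs.shape[-1]).float()
        q_value = self.critic(torch.cat([feat, one_hot], dim=-1))
        return action, probs, q_value

# usage with a made-up observation batch
net = ActorCritic(obs_dim=8, n_actions=4)
obs = torch.randn(2, 8)
action, probs, q = net(obs)
print(action.shape, probs.shape, q.shape)   # (2, 1) (2, 4) (2, 1)
```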