Dlvu Lecture12
Peter Bloem
Deep Learning 2020
dlvu.github.io
THE PLAN
[Figure: a sequence-to-sequence (s2s) layer mapping inputs to outputs; horizontal axis: time]
RECAP: SEQUENCE-TO-SEQUENCE LAYERS
[Figure: hidden states h0, h1, h2, h3, h4]
them slow. Convolutions don’t have this drawback (we can compute each output vector in parallel if we want to), but the downside is that they are limited in how far back they can look into the sequence.
[Figure labels: sequential processing; finite “memory”]
Self-attention is another sequence-to-sequence
layer, and one which provides us with the best of
both worlds: parallel processing and a potentially
infinite memory.
SELF-ATTENTION
$$y_i = \sum_j w_{ij} x_j \quad \text{with} \quad \sum_j w_{ij} = 1$$

$$w'_{ij} = x_i^T x_j$$

$$w_{ij} = \frac{\exp w'_{ij}}{\sum_j \exp w'_{ij}}$$
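The three formulas above (weighted sum, dot-product raw weights, softmax) can be sketched directly in plain Python. This is a minimal illustration, not the lecture's own code: the helper names `softmax`, `dot` and `simple_self_attention` and the toy inputs are assumptions made for the example.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtracting the max improves numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def simple_self_attention(xs):
    """xs: list of t input vectors (lists of floats). Returns t output vectors."""
    ys = []
    for x_i in xs:
        raw = [dot(x_i, x_j) for x_j in xs]   # w'_ij = x_i^T x_j
        w = softmax(raw)                      # w_ij, sums to 1 over j
        # y_i = sum_j w_ij x_j, computed one coordinate k at a time
        y_i = [sum(w_j * x_j[k] for w_j, x_j in zip(w, xs))
               for k in range(len(x_i))]
        ys.append(y_i)
    return ys

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy inputs, chosen arbitrarily
ys = simple_self_attention(xs)
```

Note that nothing here is learned: the weights are computed entirely from the inputs themselves.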
[Figure: for a sequence of six inputs, the output y3 is the sum of x1, ..., x6 weighted by w31, ..., w36; the weights are obtained by a softmax over the dot products of x3 with each input.]
Vectorized: $W' = X^T X$, $W = \mathrm{softmax}(W')$, $Y = W X^T$
TAKE NOTE
In simple self-attention, $w_{ii}$ (the weight from $x_i$ to $y_i$) usually has the most weight. This is not a big problem, but we’ll allow this to change later.
$$Y = W X^T$$
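The matrix form can be sketched in NumPy as follows. The sketch assumes that the input vectors are the columns of X (so that X^T X contains all pairwise dot products) and that the softmax is applied row-wise; the function name `self_attention`, the seed and the sizes are illustrative choices, not taken from the slides.

```python
import numpy as np

def self_attention(X):
    """X: (d, t) array whose columns are the input vectors x_1..x_t.
    Returns a (t, d) array whose row i is the output y_i."""
    W_raw = X.T @ X                                   # W' = X^T X, shape (t, t)
    W_raw = W_raw - W_raw.max(axis=1, keepdims=True)  # stability; softmax is unchanged
    W = np.exp(W_raw)
    W = W / W.sum(axis=1, keepdims=True)              # row-wise softmax
    return W @ X.T                                    # Y = W X^T

rng = np.random.default_rng(0)   # arbitrary seed
X = rng.normal(size=(4, 6))      # d = 4 features, t = 6 time steps
Y = self_attention(X)
```

Under this convention each row of the result is one output vector, and all outputs are computed with two matrix multiplications rather than a loop over time steps.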
TAKE NOTE
Self-attention is permutation equivariant.
For any permutation p of the input: p(sa(X)) = sa(p(X)), where sa denotes the self-attention layer.
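The equivariance property can be checked numerically. The sketch below assumes the simple (parameter-free) self-attention with the inputs stored as the columns of X and the outputs returned as rows; the function name `sa` follows the notation above, while the seed and sizes are arbitrary.

```python
import numpy as np

def sa(X):
    """Simple self-attention. X: (d, t), inputs as columns; returns (t, d), outputs as rows."""
    raw = X.T @ X
    raw = raw - raw.max(axis=1, keepdims=True)  # stability; softmax is unchanged
    W = np.exp(raw)
    W = W / W.sum(axis=1, keepdims=True)        # row-wise softmax
    return W @ X.T

rng = np.random.default_rng(1)   # arbitrary seed
X = rng.normal(size=(3, 5))      # d = 3, t = 5
p = rng.permutation(5)           # a random reordering of the 5 positions

lhs = sa(X)[p]       # p(sa(X)): permute the outputs
rhs = sa(X[:, p])    # sa(p(X)): permute the inputs, then apply self-attention
equivariant = np.allclose(lhs, rhs)
```

Reordering the inputs only reorders the outputs in the same way, which means this layer by itself has no notion of position in the sequence.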
A LITTLE MORE INTUITION: DOT PRODUCTS
To build some intuition for why self-attention works, we need to look into how dot products function. To do so, we’ll leave the realm of sequence learning for a while and dip our toes briefly into the pool of recommendation.
[Figure: users and movies linked by “likes”; feature labels: likes romance, likes action, likes comedy]
score = u1 m1 + u2 m2 + u3 m3
Note that we’re not just taking into account the
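The score formula is just a dot product between a user vector and a movie vector. A small sketch with made-up numbers: the three features are taken to be (romance, action, comedy), and all values below are hypothetical, chosen only to show how the signs interact.

```python
def score(user, movie):
    """Dot product u1*m1 + u2*m2 + u3*m3 of a user and a movie feature vector."""
    return sum(u * m for u, m in zip(user, movie))

# Hypothetical hand-picked features: (romance, action, comedy).
# Positive means "likes" / "contains", negative means "dislikes" / "lacks".
user = [1.0, -1.0, 0.5]
romcom = [1.0, -0.5, 1.0]
action_movie = [-1.0, 1.0, 0.0]

s_romcom = score(user, romcom)        # every term has matching signs
s_action = score(user, action_movie)  # the nonzero terms have opposite signs
```

Each term is positive when the user's preference and the movie's content have the same sign and negative when they disagree, so the sum is large exactly when user and movie match on the features that matter.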
nt
as
le
inputs
no
of words model. In this case, the word terrible
to
th
rib
ra
w
au
r
te
st
re
<latexit sha1_base64="wdldC1bqSTRlBfD7X34fN+OypEA=">AAANuXicfZfbbts2HMbV7tRl9ZZud9uNsKDAMASBlDg+YChQS7LRi7XNgpy6OAgompY10yJBUo5dQcDu9ia73V5nbzPKdmSJoqwrgt/Hzz/+KcqkT3HIhWX99+TpJ59+9vkXz77c++p54+tv9l98e8VJzCC6hAQTduMDjnAYoUsRCoxuKENg5mN07U/dTL+eI8ZDEl2IJUV3MxBE4TiEQMiu+/3vhzBIhv7YXKb3Q4EWIhGIsVAOT+/3D6wja/WY1Ya9aRwYm+fs/sXzv/aGIwLjGYoExIDzW9ui4i4BTIRQJu4NY44ogFMQoNtYjDt3SRjRWKAIpuZLqY1jbApiZqDmKGQICryUDQBZKBNMOAEMQAnI98pRHEVghvjhaB5Svm7yebBuCCAnc5csVrVKSwOTgAE6CeGiRJaAGZ8BMal08uXML3eiGCM2n5U7M0rJqDgXiMGQZzU4k4V5T7Py8wtyttEnSzpBEU+TmOG0OFAKckHQWA5cNTkSMU1Wk5FrPuWvBIvRYdZc9b3yAJueo9GhzCl1lHHGmABR7vKVaSzGstZZvSL0AMlsBqJRMqRpsn5LhodHqRRfmj6Wfp8ANio7z9MkGWZl9H3zXFpL4ruC+E4V+wWxv/kRgkfmmDBzLl8Jwrgpjaa0sBAiXh59mY8em5dq9FVBvFLF64J4rYp+XFDjijovqPOK+lBQH1R1URAXqrgsiMtKriioQlU/FsSPqnhTEG9U8UNB/KCKvyuxcnXkLlrKjY/G8hu0eueSKUyTNxdvf02T7urZvCkxMu2yEfqPxpNBq91tp6qMH/XmoGM7XlXPDW2nZ7s6Q+7otV3L629ZjhVvDm1ZrX6vpUZBvNU7XXdQ1bewVq/tORrDlnbgNvvtTfkQihRrkPvcbqtZKUuQ53QdxzntVvXc4DRdt3OsMeQO1/O8nrtCoTGjGCle+mhstU6tahTNgzpWq9nT6NsFsDqOU4GlBRZn4NievWIRCGDFKfKXxe30un01SGzr7/Rct7KAolD+jmt7TY1hy3rqnfZPViSEgShQq0Ly8rXanZOOGkXyoEFbrmCFhWx/adB1rHaFhRRYBo7rZIUt70S5yW7tu2T9Qd7uO/PANtNUNcuNppqzvfdoVrxYY8b1bq19l18/ANfCV2cKYV081ITDWhioY4H18FALD3fBB1V/UBcfaMKDWphAxxLUwwda+GAXPK36aV081YTTWhiqY6H18FQLT3fBi6pf1MULTbiohRE6FlEPL7TwYhc8qfpJXTzRhJNaGKJjIfXwRAtPyvDyzyM75gNsZsdkgs0w2hwMSt8smh0fplCeJNfu9cQ9JK8LDL2Vx4r38owL5Bnv52QIWDALo1ReH4LhYdbaZQSLR6NsyauLrV5Uqo2r4yP79Mj6rXnw2tlcYp4ZPxg/Gj8ZttE2XhtvjDPj0oDGn8bfxj/Gv41fGqAxafyxtj59shnznVF6Gvx/zFMNMQ==</latexit>
vterrible vnot
<latexit sha1_base64="A9NwmAiYJPzpQe9MqsgsZWOdhMQ=">AAAN0nicfZdNb9s2HMbV7q3L6jXdjrsICwoMQxBIieMXDAVqSTZ6WNssyFsXBQFF04pgSiQoyrEj6DDssMu+wD7NrttH2LcZZTuyRFHWieDz8NGPf5I25VEcxNww/nvy9JNPP/v8i2df7nz1vPX1i92X31zEJGEQnUOCCbvyQIxwEKFzHnCMrihDIPQwuvSmdq5fzhCLAxKd8QVFNyHwo2ASQMBF1+3ukQu91PUm+iy7dTma85QjxgIxPNPdn/SaGhGe3e7uGQfG8tHrDXPd2NPWz8nty+d/7LhjApMQRRxiEMfXpkH5TQoYD6B41Y6bxIgCOAU+uk74pHeTBhFNOIpgpr8S2iTBOid6PgN9HDAEOV6IBoAsEAk6vAMMQEEe71SjYhSBEMX741lA41UznvmrBgdiljfpfFnErDIw9RmgdwGcV8hSEMYh4He1zngRetVOlGDEZmG1M6cUjJJzjhgM4rwGJ6IwH2i+LvEZOVnrdwt6h6I4SxOGs/JAIYiVQhMxcNmMEU9oupyM2AzT+DVnCdrPm8u+1w5g01M03hc5lY4qzgQTwKtdnjSN+UTUOq9XhO4hCUMQjVOXZulqg7j7B5kQX+keFn6PADauOk+zNHXzMnqefiqsFfF9SXwvi8OSOFy/hOCxPiFMn4ktQVisC6MuLCyAKK6OPi9GT/RzOfqiJF7I4mVJvJRFLympSU2dldRZTb0vqfeyOi+Jc1lclMRFLZeXVC6rDyXxQRavSuKVLH4siR9l8VcpVqyOOEULcfDRRPw4LfdcOoVZ+vbs3c9Z2l8+652SIN2sGqH3aDwadbr9bibL+FFvj3qm5dT1wtC1BqatMhSOQdc2nOGG5VDyFtCG0RkOOnIUxBu917dHdX0Dawy6jqUwbGhHdnvYXZcPoUiy+oXP7nfatbL4RU7fsqzjfl0vDFbbtnuHCkPhsB3HGdhLFJowipHkpY/GTufYqEfRIqhndNoDhb5ZAKNnWTVYWmKxRpbpmEsWjgCWnLzYLHZv0B/KQXxTf2tg27UF5KXy92zTaSsMG9Zj53h4tCQhDES+XBVSlK/T7R315ChSBI26YgVrLGTzplHfMro1FlJiGVm2lRe2ehLFIbs2b9LVD/Lm3Ol7pp5lslkcNNmcn71Hs+TFCjNudivt2/zqAbgRvj5TCJvioSIcNsJAFQtshodKeLgN3q/7/aZ4XxHuN8L4Kha/Gd5Xwvvb4GndT5viqSKcNsJQFQtthqdKeLoNntf9vCmeK8J5IwxXsfBmeK6E59vgSd1PmuKJIpw0whAVC2mGJ0p4UoUXfx75NR9gPb8mE6wH0fpiUPnNovn1YQrFTXLlXk3cQeJzgaF34lrxQdxxgbjj/Zi6gPlhEGXi88F39/PWNiOYPxpFS3y6mPKHSr1xcXhgHh8Yv7T33ljrj5hn2nfa99oPmql1tTfaW+1EO9eg9pf2t/aP9m/rrPXQ+q31+8r69Ml6zLda5Wn9+T+c/xeW</latexit>
⇥ ⇥ ⇥ ⇥ ⇥ ⇥
<latexit sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit> <latexit 
sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit> <latexit 
sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit> <latexit 
sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit> <latexit 
sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit> <latexit 
sha1_base64="VVOj9rnMzMPj3kMAppAgd9+7MrY=">AAANo3icfZdbb9s2HMXV7tZl9dpuj3sRFgQYhiCQEl8fCtSSbBTY2mZBbltsFBRNy5opkaAox66g573udftm+zajfJElirKeCJ7Dox//FG3SpdiPuGH89+TpZ59/8eVXz74++uZ549sXL199dxuRmEF0Awkm7N4FEcJ+iG64zzG6pwyBwMXozp3bmX63QCzySXjNVxSNA+CF/tSHgIuu2xH3AxR9fHlsnBnrR682zG3jWNs+lx9fPf/raDQhMA5QyCEGUfRgGpSPE8C4DzFKj0ZxhCiAc+Chh5hPu+PED2nMUQhT/URo0xjrnOgZkj7xGYIcr0QDQOaLBB3OAAOQC/CjclSEQiCITycLn0abZrTwNg0OxKzHyXJdlbQ0MPEYoDMfLktkCQiiAPBZpTNaBW65E8UYsUVQ7swoBaPkXCIG/SirwaUozAeaFTq6JpdbfbaiMxRGaRIznBYHCgExhqZi4LoZIR7TZD0Zsbrz6DVnMTrNmuu+1w5g8ys0ORU5pY4yzhQTwMtdrjSN5VTUOqtXiB4hCQIQTpIRTZMRR0uejE7PUiGe6C4WfpcANik7r9IkGWVldF39SlhL4vuC+F4WBwVxsH0JwRN9Spi+EJ8EYZEujLqwMB+iqDz6Jh891W/k6NuCeCuLdwXxThbduKDGFXVRUBcV9bGgPsrqsiAuZXFVEFeVXF5Quax+KoifZPH+0Et/P/TSP6RYsTpiF63ExkdT8Wuz/uaSOUyTt9fvfk2T3vrZfikx0s2yEbo748Ww3el1UlnGO7057JqWU9VzQ8fqm7bKkDv6HdtwBnuWc8mbQxtGe9Bvy1EQ7/Vuzx5W9T2s0e84lsKwpx3azUFnWz6EQsnq5T67125WyuLlOT3Lslq9qp4brKZtd88VhtxhO47Tt9coNGYUI8lLd8Z2u2VUo2ge1DXazb5C3y+A0bWsCiwtsFhDy3TMNQtHAEtOnn8sdrffG8hBfF9/q2/blQXkhfJ3bdNpKgx71pbTGlysSQgDoSdXheTla3e6F105iuRBw45YwQoL2b9p2LOMToWFFFiGlm1lhS3vRLHJHsxxsvlB3u87/djU01Q2i40mm7O9tzNLXqww43q30n7Irx6Aa+GrM4WwLh4qwmEtDFSxwHp4qISHh+C9qt+ri/cU4V4tjKdi8erhPSW8dwieVv20Lp4qwmktDFWx0Hp4qoSnh+B51c/r4rkinNfCcBULr4fnSnh+CJ5U/aQunijCSS0MUbGQeniihCdlePHnkR3zAdazYzLBuh9uDwal3yyaHR/mUJwkN+7NxB0krgsMvRPHig/ijAvEGe/nZASYF/hhKq4P3ug0ax0yguXOKFri6mLKF5Vq4/b8zGydGb81j99Y20vMM+0H7UftJ83UOtob7a12qd1oUPtT+1v7R/u3cdL4pXHVuN5Ynz7ZjvleKz2N8f+4KwRK</latexit>
BELLS AND WHISTLES: STANDARD SELF-ATTENTION
• scaled dot product
• key, value and query transformations
• multi-head attention
Standard self-attention adds some bells and whistles to this basic framework. We’ll discuss the three most important additions.
SCALED SELF-ATTENTION
$w'_{ij} = \frac{{x_i}^T x_j}{\sqrt{k}}$ (k: input dimension)
Scaled self-attention is very simple: instead of using the raw dot product, we use the dot product divided by the square root of the input dimension. This ensures that the input and output of the self-attention operation have similar variance.
Why $\sqrt{k}$? Imagine a vector in $\mathbb{R}^k$ with all values equal to $c$. Its Euclidean length is $\sqrt{k}\,c$. Therefore, we are dividing out the amount by which the increase in dimension increases the length of the average vector. Transformer models usually apply normalization at every layer, so we can usually assume that the input is standard-normally distributed.
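To make the variance argument concrete, here is a minimal NumPy sketch. It assumes standard-normally distributed inputs, as the notes suggest; the function name `dot_product_spread`, the sample size and the seed are illustrative choices, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

def dot_product_spread(k, n=2000):
    """Return the standard deviation of raw and scaled dot products
    between n pairs of standard-normal vectors of dimension k."""
    x = rng.standard_normal((n, k))
    y = rng.standard_normal((n, k))
    raw = (x * y).sum(axis=1)   # x_i^T x_j for each pair
    scaled = raw / np.sqrt(k)   # divide by sqrt(k)
    return raw.std(), scaled.std()

for k in (4, 64, 1024):
    raw_std, scaled_std = dot_product_spread(k)
    print(f"k={k:5d}  raw std={raw_std:7.2f}  scaled std={scaled_std:5.2f}")

# The vector with all values c has Euclidean length sqrt(k) * c.
k, c = 16, 2.0
print(np.linalg.norm(np.full(k, c)), np.sqrt(k) * c)
```

The spread of the raw dot products should grow roughly like the square root of k, while the scaled version stays close to one regardless of the dimension.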
[Figure: the sentence "this restaurant was not too terrible", with input vectors labeled as the query, the key and the value]
… input vector.
• the key: the input vector that the query is matched against to determine the weight.
[Figure: a dictionary with key/value pairs a: 1, b: 2, c: 3; the expression d['b'] = 3, with 'b' marked as the query]
ATTENTION AS A SOFT DICTIONARY
Attention is a soft dictionary
• key, query and value are vectors
• every key matches the query to some extent as determined by their dot-product
Self-attention
Attention with keys, queries and values from the same set.
If the dot product of only one query/key pair is non-zero, we recover the operation of a normal dictionary.
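The soft-dictionary view can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the lecture's own code: the one-hot keys, the values 1, 2, 3 and the helper `soft_lookup` are made up for the example. Under a softmax, a dot product of zero still receives some weight, so in this sketch the hard lookup is approached by making one dot product much larger than the rest.

```python
import numpy as np

def softmax(z):
    z = z - z.max()   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def soft_lookup(query, keys, values):
    """Weighted average of the values; the weight of each value is the
    softmax of the dot product between the query and its key."""
    weights = softmax(keys @ query)
    return weights @ values, weights

# Hypothetical toy dictionary: one-hot keys for 'a', 'b', 'c' with values 1, 2, 3.
keys = np.eye(3)
values = np.array([1.0, 2.0, 3.0])

# Query matching key 'c': every key contributes to some extent.
soft_out, soft_w = soft_lookup(keys[2], keys, values)
# Same query, scaled up so that one dot product dominates: close to d['c'].
hard_out, hard_w = soft_lookup(50.0 * keys[2], keys, values)

print(soft_w, soft_out)
print(hard_w, hard_out)
```

In the first lookup the result is a blend of all three values; in the second it is practically the single value stored under the matching key.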
KEY, QUERY AND VALUE TRANSFORMATIONS
Introduce matrices K, Q, V for linear transforms and associated biases
$k_i = K x_i + b_k$
$q_i = Q x_i + b_q$
$v_i = V x_i + b_v$
To give the self-attention some more flexibility in determining its behavior, we multiply each input vector by three different k-by-k parameter matrices, which gives us a different vector to act as key, query and value.
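A minimal sketch of self-attention with these transformations, in NumPy. Several things here are assumptions: the biases `bq` and `bv` are introduced by analogy with $b_k$, the weights are a row-wise softmax over scaled query/key dot products, and the output is the weighted sum of the values (the usual formulation). All shapes and the random initialization are illustrative.

```python
import numpy as np

def softmax_rows(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_attention(x, K, Q, V, bk, bq, bv):
    """x: (t, k) array holding one input vector x_i per row."""
    k = x.shape[1]
    keys = x @ K.T + bk                    # k_i = K x_i + b_k
    queries = x @ Q.T + bq                 # q_i = Q x_i + b_q
    values = x @ V.T + bv                  # v_i = V x_i + b_v
    raw = queries @ keys.T / np.sqrt(k)    # w'_ij = q_i^T k_j / sqrt(k)
    w = softmax_rows(raw)                  # each row sums to one
    return w @ values                      # y_i = sum_j w_ij v_j

rng = np.random.default_rng(0)
t, k = 5, 8                                # sequence length, input dimension
x = rng.standard_normal((t, k))
K, Q, V = (rng.standard_normal((k, k)) / np.sqrt(k) for _ in range(3))
bk, bq, bv = np.zeros(k), np.zeros(k), np.zeros(k)

y = self_attention(x, K, Q, V, bk, bq, bv)
print(y.shape)
```

Because the same input plays all three roles, separate matrices let the layer learn a different representation for matching (key and query) than for the content it passes on (value).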
[Figure: multi-head attention. Labels: split, W1, W2, head 1 input, head 2 input, head 1 key, head 1 query, head 1 value; weights $w'_{ij} = \frac{{x_i}^T x_j}{\sqrt{k}}$]
… input vector, and then apply some smaller transformations to turn this into a key, query and value. However, since these are all linear transformations, we can compose them into three larger transformations, and compute the keys, …
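The composition argument can be checked numerically. The sketch below is hedged: it assumes two heads, that each head first projects the input with its own matrix (`W[0]`, `W[1]`, standing in for W1 and W2) to a vector of size k / heads, and that a smaller per-head matrix then produces the key. Only the keys are shown; queries and values would work the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
k, heads = 8, 2
s = k // heads                     # size of each head's input (an assumption)

x = rng.standard_normal(k)         # one input vector
W = [rng.standard_normal((s, k)) for _ in range(heads)]    # project x to each head's input
Kh = [rng.standard_normal((s, s)) for _ in range(heads)]   # smaller per-head key transformations

# Two-step version: project to the head input, then apply the head's key matrix.
keys_two_step = [Kh[h] @ (W[h] @ x) for h in range(heads)]

# Composed version: one larger matrix computes the keys of all heads at once.
K_big = np.vstack([Kh[h] @ W[h] for h in range(heads)])    # shape (heads * s, k)
keys_composed = np.split(K_big @ x, heads)

for a, b in zip(keys_two_step, keys_composed):
    print(np.allclose(a, b))
```

In practice this means a multi-head layer can be implemented with three full-size matrix multiplications, after which the resulting vectors are cut into one chunk per head.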
transformer:
Any sequence-based model that primarily uses self-attention to propagate
information along the time dimension.
more broadly:
Any model that primarily uses self-attention to propagate information
between the basic units of our instances.
pixels -> image transformer
graph nodes -> graph transformer
TRANSFORMER BLOCK
[Figure: transformer block, with labels: self-attention, layer normalization, res]
from torch import nn

class Block(nn.Module):
    def forward(self, x):
        y = self.attention(x)    # self-attention
        x = x + y                # residual connection
        y = self.layernorm(x)    # layer normalization
        y = self.linear(y)       # feed-forward layer, applied per token
        return x + y             # residual connection
The basic building block of transformer models is usually a simple transformer block. … feed-forward layer applied individually to each token in the sequence, and a layer normalization and residual connection for each.
Note that the self-attention is the only operation in the block that propagates information across the time dimension. The other layers operate on each token independently.
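The claim that only self-attention mixes information across tokens can be tested directly. The following is a simplified NumPy sketch, not the lecture's implementation: the attention has no learned parameters, the layer normalization has no learned rescaling, and the feed-forward layer is a hypothetical two-layer MLP with weights `W1` and `W2`.

```python
import numpy as np

rng = np.random.default_rng(0)
t, k = 4, 6                                   # sequence length and embedding size (arbitrary)
W1 = rng.standard_normal((k, 4 * k)) * 0.1    # hypothetical feed-forward weights
W2 = rng.standard_normal((4 * k, k)) * 0.1

def attention(x):
    """Basic self-attention without learned parameters, to keep the sketch short."""
    raw = x @ x.T / np.sqrt(x.shape[1])
    w = np.exp(raw - raw.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)
    return w @ x

def layernorm(x, eps=1e-5):
    """Normalize each token (row) over its features; no learned rescaling here."""
    mu = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def feedforward(x):
    """Two-layer MLP with a ReLU, applied to every token separately."""
    return np.maximum(x @ W1, 0.0) @ W2

def block(x):
    x = x + attention(x)                      # self-attention + residual
    return x + feedforward(layernorm(x))      # layer norm, feed-forward + residual

x = rng.standard_normal((t, k))
x_changed = x.copy()
x_changed[0] += 1.0                           # perturb only the first token

# Per-token layers: the other rows are unaffected by the change to row 0.
same = np.allclose(feedforward(layernorm(x))[1:],
                   feedforward(layernorm(x_changed))[1:])
# Self-attention: the change to row 0 leaks into the other rows.
mixed = not np.allclose(attention(x)[1:], attention(x_changed)[1:])
print(same, mixed, block(x).shape)
```

Perturbing one token leaves the per-token layers' outputs for all other tokens untouched, while the attention outputs for those tokens do change.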
[Figure: input tensor with axes batch, time and input features]
$\{y^{bt}\}_{b,t}$: output vectors (one per timestep $t$ and batch instance $b$) in $\mathbb{R}^d$
$\gamma, \beta$: learnable parameter vectors
mean over token: $\mu^{bt} = \frac{1}{d} \sum_i x_i^{bt}$
variance over token: $\sigma^{bt} = \frac{1}{d} \sum_i (x_i^{bt} - \mu^{bt})^2$
standardize: $\hat{x}^{bt} = \frac{x^{bt} - \mu^{bt}}{\sqrt{\sigma^{bt} + \epsilon}}$
rescale: $y^{bt} = \gamma \odot \hat{x}^{bt} + \beta$
Note that this does not propagate information across the time dimension. That is still reserved for the self-attention only.
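The four steps (mean, variance, standardize, rescale) translate almost line by line into NumPy. This is a sketch under assumptions: the rescaling with the learnable vectors is taken to be elementwise, the value of `eps` is an arbitrary small constant, and the tensor layout (batch, time, d) is chosen for the example.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """x: (batch, time, d). Every token vector x[b, t] is normalized on its
    own, over its d features."""
    mu = x.mean(axis=-1, keepdims=True)                    # mean over token
    sigma = ((x - mu) ** 2).mean(axis=-1, keepdims=True)   # variance over token
    x_hat = (x - mu) / np.sqrt(sigma + eps)                # standardize
    return gamma * x_hat + beta                            # rescale (elementwise, assumed)

rng = np.random.default_rng(0)
b, t, d = 2, 3, 16
x = 5.0 + 3.0 * rng.standard_normal((b, t, d))
y = layer_norm(x, gamma=np.ones(d), beta=np.zeros(d))

print(y.mean(axis=-1).round(6))   # per-token means
print(y.var(axis=-1).round(3))    # per-token variances

# Changing one token leaves the normalized output of all other tokens unchanged.
x2 = x.copy()
x2[0, 0] += 10.0
y2 = layer_norm(x2, gamma=np.ones(d), beta=np.zeros(d))
print(np.allclose(y[0, 1:], y2[0, 1:]), np.allclose(y[1], y2[1]))
```

Since the statistics are computed per token, the operation needs no information from other timesteps or other instances in the batch.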
While layer normalization tends to work a little …
[Figure: a stack of transformer blocks on top of input embeddings; inputs: h e l l o ! !]
MASKING: MAKING SELF-ATTENTION CAUSAL
The solution is simple: when we compute the
attention weights, we mask out any attention from