Attorney Docket No. TESLPO0S / P0920-2NUS.
APPLICATION FOR UNITED STATES PATENT
A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM
WITH VARIABLE LATENCY MEMORY ACCESS
By Inventor(s):
Emil Talpes
San Mateo, CA
Peter Bannon
Woodside, CA
Kevin Hurd
Redwood City, CA
Assignee: Tesla, Inc.
VAN PELT, YI & JAMES LLP
10050 N. Foothill Blvd., Suite 200
Cupertino, CA 95014
Telephone (408) 973-2585
A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM
WITH VARIABLE LATENCY MEMORY ACCESS
CROSS REFERENCE TO OTHER APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application No.
62/635,399 entitled A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM WITH
VARIABLE LATENCY MEMORY ACCESS filed February 26, 2018, and this application
claims priority to U.S. Provisional Patent Application No. 62/625,251 entitled VECTOR
COMPUTATIONAL UNIT filed February 1, 2018, and this application claims priority to U.S.
Provisional Patent Application No. 62/536,399 entitled ACCELERATED MATHEMATICAL
ENGINE filed July 24, 2017, and this application is a continuation-in-part of co-pending U.S.
Patent Application No. 15/710,433 entitled ACCELERATED MATHEMATICAL ENGINE
filed September 20, 2017, which claims priority to U.S. Provisional Patent Application No.
62/536,399 entitled ACCELERATED MATHEMATICAL ENGINE filed July 24, 2017, all of
which are incorporated herein by reference for all purposes.
BACKGROUND OF THE INVENTION
[0002] Performing inference on a machine learning model typically requires retrieving
data from memory and applying one or more computational array operations on the data.
Applications of machine learning, such as those targeting self-driving and driver-assisted
automobiles, often utilize computational array operations to calculate matrix and vector results.
These operations require loading data, such as captured sensor data, and performing image
processing to identify key features, such as lane markers and other objects in a scene.
Traditionally, these operations may be implemented using a generic microprocessor system that
loads the computation data from memory before performing a computational array instruction.
While the data is loading, the microprocessor system often sits idle. The software platform
running these applications will initiate the computational array instruction once the data has
completed loading. The length of stalls and the time required to synchronize the computational
operation with the retrieved data can be particularly long when accessing variable latency
memory. Stalls and synchronization efforts by the software platform reduce the efficiency of the
microprocessor system and result in higher power consumption and lower throughput.
Therefore, there exists a need for a microprocessor system with increased throughput that
performs array computational operations using variable latency memory access.