This library is part of my lecture "Software Basics for High Performance Computing" (MATH9367) at Ulm University:
http://www.mathematik.uni-ulm.de/~lehn/ulmBLAS
http://www.mathematik.uni-ulm.de/~lehn/sghpc
And yes, I am particularly proud of the section demonstrating how to achieve peak performance for the matrix-matrix product:
http://www.mathematik.uni-ulm.de/~lehn/sghpc/gemm/index.html
Note: Further development will take place in ulmBLAS-core.