User profiles for Ian Karlin
Ian Karlin · Lawrence Livermore National Laboratory · Verified email at colorado.edu · Cited by 2327
[PDF] Lulesh 2.0 updates and changes
The Livermore Unstructured Lagrange Explicit Shock Hydrodynamics (LULESH) proxy
application [1] is being developed as part of the NNSA Advanced Scientific Computing (ASC) …
Exploring traditional and emerging parallel programming models using a proxy application
Parallel machines are becoming more complex with increasing core counts and more
heterogeneous architectures. However, the commonly used parallel programming models, C/C++ …
The design, deployment, and evaluation of the CORAL pre-exascale systems
…, B Hanson, B Hartner, I Karlin… - … Conference for High …, 2018 - ieeexplore.ieee.org
CORAL, the Collaboration of Oak Ridge, Argonne and Livermore, is fielding two similar IBM
systems, Summit and Sierra, with NVIDIA GPUs that will replace the existing Titan and …
Lulesh programming model and performance ports overview
… was done by Ian Karlin and Jim McGraw. Jeff Keasler and Ian Karlin created the pure C
version of the code. Ian Karlin is responsible for the transactional memory and critical section …
DataRaceBench: a benchmark suite for systematic evaluation of data race detection tools
Data races in multi-threaded parallel applications are notoriously damaging while extremely
difficult to detect. Many tools have been developed to help programmers find data races. …
High-performance tensor contractions for GPUs
We present a computational framework for high-performance tensor contractions on GPUs.
High-performance is difficult to obtain using existing libraries, especially for many …
Efficient exascale discretizations: High-order finite element methods
Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms
used in many large-scale applications. These architectures favor algorithms that expose …
Fast multi-parameter performance modeling
Tuning large applications requires a clever exploration of the design and configuration
space. Especially on supercomputers, this space is so large that its exhaustive traversal via …
Predicting the performance impact of different fat-tree configurations
The fat-tree topology is one of the most commonly used network topologies in HPC systems.
Vendors support several options that can be configured when deploying fat-tree networks …
Enabling rapid COVID-19 small molecule drug design through scalable deep learning of generative models
We improved the quality and reduced the time to produce machine learned models for use
in small molecule antiviral design. Our globally asynchronous multi-level parallel training …