-
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models
Authors:
Gabriela Ben Melech Stan,
Estelle Aflalo,
Raanan Yehezkel Rohekar,
Anahita Bhiwandiwalla,
Shao-Yen Tseng,
Matthew Lyle Olson,
Yaniv Gurwicz,
Chenfei Wu,
Nan Duan,
Vasudev Lal
Abstract:
In the rapidly evolving landscape of artificial intelligence, multi-modal large language models are emerging as a significant area of interest. These models, which combine various forms of data input, are becoming increasingly popular. However, understanding their internal mechanisms remains a complex task. Numerous advancements have been made in the field of explainability tools and mechanisms, yet there is still much to explore. In this work, we present a novel interactive application aimed towards understanding the internal mechanisms of large vision-language models. Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer, and assess the efficacy of the language model in grounding its output in the image. With our application, a user can systematically investigate the model and uncover system limitations, paving the way for enhancements in system capabilities. Finally, we present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
Submitted 24 June, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Authors:
Agneet Chatterjee,
Gabriela Ben Melech Stan,
Estelle Aflalo,
Sayak Paul,
Dhruba Ghosh,
Tejas Gokhale,
Ludwig Schmidt,
Hannaneh Hajishirzi,
Vasudev Lal,
Chitta Baral,
Yezhou Yang
Abstract:
One of the key shortcomings in current text-to-image (T2I) models is their inability to consistently generate images which faithfully follow the spatial relationships specified in the text prompt. In this paper, we offer a comprehensive investigation of this limitation, while also developing datasets and methods that achieve state-of-the-art performance. First, we find that current vision-language datasets do not represent spatial relationships well enough; to alleviate this bottleneck, we create SPRIGHT, the first spatially-focused, large scale dataset, by re-captioning 6 million images from 4 widely used vision datasets. Through a 3-fold evaluation and analysis pipeline, we find that SPRIGHT largely improves upon existing datasets in capturing spatial relationships. To demonstrate its efficacy, we leverage only ~0.25% of SPRIGHT and achieve a 22% improvement in generating spatially accurate images while also improving the FID and CMMD scores. Secondly, we find that training on images containing a large number of objects results in substantial improvements in spatial consistency. Notably, we attain state-of-the-art on T2I-CompBench with a spatial score of 0.2133, by fine-tuning on <500 images. Finally, through a set of controlled experiments and ablations, we document multiple findings that we believe will enhance the understanding of factors that affect spatial consistency in text-to-image models. We publicly release our dataset and model to foster further research in this area.
Submitted 1 April, 2024;
originally announced April 2024.
-
Hot-LEGO: Architect Microfluidic Cooling Equipped 3DICs with Pre-RTL Thermal Simulation
Authors:
Runxi Wang,
Jun-Han Han,
Mircea Stan,
Xinfei Guo
Abstract:
Microfluidic cooling has been recognized as one of the most promising solutions to achieve efficient thermal management for three-dimensional integrated circuits (3DICs). It enables more opportunities to architect 3DICs with different die configurations. It becomes increasingly important to perform thermal analysis in the early design phases to validate the architectural design decisions. This is even more critical for 3DICs equipped with microfluidic cooling, as the embedded cooling structures greatly influence the performance, power, and reliability of the stacked system. We exploited the existing architectural simulators and developed a Pre-register-transfer-level (Pre-RTL) thermal simulation methodology named Hot-LEGO that integrates these tools with their latest features such as support for microfluidic cooling and 3DIC stacking configurations. This methodology differs from existing ones by looking into the design granularity at a much finer level, which enables the exploration of unique architecture combinations across the vertical stack. Though architectural-level simulators are not designed for signoff-calibre accuracy, they offer speed and agility, which are imperative for early design space exploration. We claim that this ongoing work will speed up the co-design cycle of microfluidic cooling and offer a portable methodology for architects to perform exhaustive search for the optimal microarchitecture solutions in 3DICs.
Submitted 29 March, 2024;
originally announced March 2024.
-
Learning Long Sequences in Spiking Neural Networks
Authors:
Matei Ioan Stan,
Oliver Rhodes
Abstract:
Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient computations. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on modern sequential tasks, as they inherit limitations from recurrent neural networks (RNNs), with the added challenge of training with non-differentiable binary spiking activations. However, a recent renewed interest in efficient alternatives to Transformers has given rise to state-of-the-art recurrent architectures named state space models (SSMs). This work systematically investigates, for the first time, the intersection of state-of-the-art SSMs with SNNs for long-range sequence modelling. Results suggest that SSM-based SNNs can outperform the Transformer on all tasks of a well-established long-range sequence modelling benchmark. It is also shown that SSM-based SNNs can outperform current state-of-the-art SNNs with fewer parameters on sequential image classification. Finally, a novel feature mixing layer is introduced, improving SNN accuracy while challenging assumptions about the role of binary activations in SNNs. This work paves the way for deploying powerful SSM-based architectures, such as large language models, to neuromorphic hardware for energy-efficient long-range sequence modelling.
Submitted 14 December, 2023;
originally announced January 2024.
-
LDM3D-VR: Latent Diffusion Model for 3D VR
Authors:
Gabriela Ben Melech Stan,
Diana Wofk,
Estelle Aflalo,
Shao-Yen Tseng,
Zhipeng Cai,
Michael Paulitsch,
Vasudev Lal
Abstract:
Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.
Submitted 6 November, 2023;
originally announced November 2023.
-
LDM3D: Latent Diffusion Model for 3D
Authors:
Gabriela Ben Melech Stan,
Diana Wofk,
Scottie Fox,
Alex Redden,
Will Saxton,
Jean Yu,
Estelle Aflalo,
Shao-Yen Tseng,
Fabio Nonato,
Matthias Muller,
Vasudev Lal
Abstract:
This research paper proposes a Latent Diffusion Model for 3D (LDM3D) that generates both image and depth map data from a given text prompt, allowing users to generate RGBD images from text prompts. The LDM3D model is fine-tuned on a dataset of tuples containing an RGB image, depth map and caption, and validated through extensive experiments. We also develop an application called DepthFusion, which uses the generated RGB images and depth maps to create immersive and interactive 360-degree-view experiences using TouchDesigner. This technology has the potential to transform a wide range of industries, from entertainment and gaming to architecture and design. Overall, this paper presents a significant contribution to the field of generative AI and computer vision, and showcases the potential of LDM3D and DepthFusion to revolutionize content creation and digital experiences. A short video summarizing the approach can be found at https://t.ly/tdi2.
Submitted 21 May, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
MuMUR: Multilingual Multimodal Universal Retrieval
Authors:
Avinash Madasu,
Estelle Aflalo,
Gabriela Ben Melech Stan,
Shachar Rosenman,
Shao-Yen Tseng,
Gedas Bertasius,
Vasudev Lal
Abstract:
Multi-modal retrieval has seen tremendous progress with the development of vision-language models. However, further improving these models requires additional labelled data, which is a huge manual effort. In this paper, we propose a framework, MuMUR, that utilizes knowledge transfer from a multilingual model to boost the performance of multi-modal (image and video) retrieval. We first use state-of-the-art machine translation models to construct pseudo ground-truth multilingual visual-text pairs. We then use this data to learn a joint vision-text representation where English and non-English text queries are represented in a common embedding space based on pretrained multilingual models. We evaluate our proposed approach on a diverse set of retrieval datasets: five video retrieval datasets (MSRVTT, MSVD, DiDeMo, Charades and MSRVTT multilingual) and two image retrieval datasets (Flickr30k and Multi30k). Experimental results demonstrate that our approach achieves state-of-the-art results on all video retrieval datasets, outperforming previous models. Additionally, MuMUR significantly outperforms other models on multilingual video retrieval. We also observe that MuMUR exhibits strong performance on image retrieval. This demonstrates the universal ability of MuMUR to perform retrieval across all visual inputs (image and video) and text inputs (monolingual and multilingual).
Submitted 19 September, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Flame Stability Analysis of Flame Spray Pyrolysis by Artificial Intelligence
Authors:
Jessica Pan,
Joseph A. Libera,
Noah H. Paulson,
Marius Stan
Abstract:
Flame spray pyrolysis (FSP) is a process used to synthesize nanoparticles through the combustion of an atomized precursor solution; this process has applications in catalysts, battery materials, and pigments. Current limitations revolve around understanding how to consistently achieve a stable flame and the reliable production of nanoparticles. Machine learning and artificial intelligence algorithms that detect unstable flame conditions in real time may be a means of streamlining the synthesis process and improving FSP efficiency. In this study, the FSP flame stability is first quantified by analyzing the brightness of the flame's anchor point. This analysis is then used to label data for both unsupervised and supervised machine learning approaches. The unsupervised learning approach allows for autonomous labelling and classification of new data by representing data in a reduced dimensional space and identifying combinations of features that most effectively cluster it. The supervised learning approach, on the other hand, requires human labeling of training and test data, but is able to classify multiple objects of interest (such as the burner and pilot flames) within the video feed. The accuracy of each of these techniques is compared against the evaluations of human experts. Both the unsupervised and supervised approaches can track and classify FSP flame conditions in real time to alert users of unstable flame conditions. This research has the potential to autonomously track and manage flame spray pyrolysis as well as other flame technologies by monitoring and classifying the flame stability.
Submitted 22 October, 2020;
originally announced November 2020.
-
Towards Online Steering of Flame Spray Pyrolysis Nanoparticle Synthesis
Authors:
Maksim Levental,
Ryan Chard,
Joseph A. Libera,
Kyle Chard,
Aarthi Koripelly,
Jakob R. Elias,
Marcus Schwarting,
Ben Blaiszik,
Marius Stan,
Santanu Chaudhuri,
Ian Foster
Abstract:
Flame Spray Pyrolysis (FSP) is a manufacturing technique to mass produce engineered nanoparticles for applications in catalysis, energy materials, composites, and more. FSP instruments are highly dependent on a number of adjustable parameters, including fuel injection rate, fuel-oxygen mixtures, and temperature, which can greatly affect the quality, quantity, and properties of the yielded nanoparticles. Optimizing FSP synthesis requires monitoring, analyzing, characterizing, and modifying experimental conditions. Here, we propose a hybrid CPU-GPU Difference of Gaussians (DoG) method for characterizing the volume distribution of unburnt solution, so as to enable near-real-time optimization and steering of FSP experiments. Comparisons against standard implementations show our method to be an order of magnitude more efficient. This surrogate signal can be deployed as a component of an online end-to-end pipeline that maximizes the synthesis yield.
Submitted 16 October, 2020;
originally announced October 2020.
-
An Experimentally Driven Automated Machine Learned Inter-Atomic Potential for a Refractory Oxide
Authors:
Ganesh Sivaraman,
Leighanne Gallington,
Anand Narayanan Krishnamoorthy,
Marius Stan,
Gabor Csanyi,
Alvaro Vazquez-Mayagoitia,
Chris J. Benmore
Abstract:
Understanding the structure and properties of refractory oxides is critical for high temperature applications. In this work, a combined experimental and simulation approach uses an automated closed loop via an active-learner, which is initialized by X-ray and neutron diffraction measurements, and sequentially improves a machine-learning model until the experimentally predetermined phase space is covered. A multi-phase potential is generated for a canonical example of the archetypal refractory oxide, HfO2, by drawing a minimum number of training configurations from room temperature to the liquid state at ~2900 °C. The method significantly reduces model development time and human effort.
Submitted 8 September, 2020;
originally announced September 2020.
-
Temporal Memory with Magnetic Racetracks
Authors:
Hamed Vakili,
Mohammad Nazmus Sakib,
Samiran Ganguly,
Mircea Stan,
Matthew W. Daniels,
Advait Madhavan,
Mark D. Stiles,
Avik W. Ghosh
Abstract:
Race logic is a relative timing code that represents information in a wavefront of digital edges on a set of wires in order to accelerate dynamic programming and machine learning algorithms. Skyrmions, bubbles, and domain walls are mobile magnetic configurations (solitons) with applications for Boolean data storage. We propose to use current-induced displacement of these solitons on magnetic racetracks as a native temporal memory for race logic computing. Locally synchronized racetracks can spatially store relative timings of digital edges and provide non-destructive read-out. The linear kinematics of skyrmion motion, the tunability and low-voltage asynchronous operation of the proposed device, and the elimination of any need for constant skyrmion nucleation make these magnetic racetracks a natural memory for low-power, high-throughput race logic applications.
Submitted 21 May, 2020;
originally announced May 2020.
-
Memristive Learning Cellular Automata: Theory and Applications
Authors:
Rafailia-Eleni Karamani,
Iosif-Angelos Fyrigos,
Vasileios Ntinas,
Orestis Liolis,
Giorgos Dimitrakopoulos,
Mustafa Altun,
Andrew Adamatzky,
Mircea R. Stan,
Georgios Ch. Sirakoulis
Abstract:
Memristors are novel non-volatile devices that manage to combine storing and processing capabilities in the same physical place. Their nanoscale dimensions and low power consumption enable the further design of various nanoelectronic processing circuits and corresponding computing architectures, like neuromorphic, in-memory, unconventional, etc. One of the possible ways to exploit the memristor's advantages is by combining them with Cellular Automata (CA). CA constitute a well-known non-von Neumann computing architecture that is based on the local interconnection of simple identical cells forming N-dimensional grids. These local interconnections allow the emergence of global and complex phenomena. In this paper, we propose a hybridization of the original CA definition coupled with a memristor-based implementation, and, more specifically, we focus on Memristive Learning Cellular Automata (MLCA), which have the ability to learn while also using simple identical interconnected cells and taking advantage of the memristor devices' inherent variability. The proposed MLCA circuit-level implementation is applied to optimal detection of edges in image processing through a series of SPICE simulations, proving its robustness and efficacy.
Submitted 15 March, 2020;
originally announced March 2020.
-
Machine Learning Inter-Atomic Potentials Generation Driven by Active Learning: A Case Study for Amorphous and Liquid Hafnium dioxide
Authors:
Ganesh Sivaraman,
Anand Narayanan Krishnamoorthy,
Matthias Baur,
Christian Holm,
Marius Stan,
Gabor Csányi,
Chris Benmore,
Álvaro Vázquez-Mayagoitia
Abstract:
We propose a novel active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP). Our active learning scheme consists of an unsupervised machine learning (ML) scheme coupled to a Bayesian optimization technique that evaluates the GAP model. We apply this scheme to a Hafnium dioxide (HfO2) dataset generated from a melt-quench ab initio molecular dynamics (AIMD) protocol. Our results show that the active learning scheme, with no prior knowledge of the dataset, is able to extract a configuration that reaches the required energy fit tolerance. Further, molecular dynamics (MD) simulations performed using this active learned GAP model on 6144-atom systems of amorphous and liquid state elucidate the structural properties of HfO2 with near ab initio precision and quench rates (i.e. 1.0 K/ps) not accessible via AIMD. The melt and amorphous x-ray structural factors generated from our simulation are in good agreement with experiment. Additionally, the calculated diffusion constants are in good agreement with previous ab initio studies.
Submitted 22 October, 2019;
originally announced October 2019.
-
Reservoir Computing based Neural Image Filters
Authors:
Samiran Ganguly,
Yunfei Gu,
Yunkun Xie,
Mircea R. Stan,
Avik W. Ghosh,
Nibir K. Dhar
Abstract:
Clean images are an important requirement for machine vision systems to recognize visual features correctly. However, the environment, optics, and electronics of physical imaging systems can introduce extreme distortions and noise in the acquired images. In this work, we explore the use of reservoir computing, a dynamical neural network model inspired by biological systems, in creating dynamic image filtering systems that extract signal from noise using inverse modeling. We discuss the possibility of implementing these networks in hardware close to the sensors.
Submitted 7 September, 2018;
originally announced September 2018.
-
Hardware based Spatio-Temporal Neural Processing Backend for Imaging Sensors: Towards a Smart Camera
Authors:
Samiran Ganguly,
Yunfei Gu,
Mircea R. Stan,
Avik W. Ghosh
Abstract:
In this work we show how we can build a technology platform for cognitive imaging sensors using recent advances in recurrent neural network architectures and training methods inspired by biology. We demonstrate learning and processing tasks specific to imaging sensors, including enhancement of sensitivity and signal-to-noise ratio (SNR) purely through neural filtering beyond the fundamental limits of sensor materials, and inferencing and spatio-temporal pattern recognition capabilities of these networks with applications in object detection, motion tracking and prediction. We then show designs of unit hardware cells built using complementary metal-oxide semiconductor (CMOS) and emerging materials technologies for ultra-compact and energy-efficient embedded neural processors for smart cameras.
Submitted 22 March, 2018;
originally announced March 2018.
-
Tolerating Soft Errors in Processor Cores Using CLEAR (Cross-Layer Exploration for Architecting Resilience)
Authors:
Eric Cheng,
Shahrzad Mirkhani,
Lukasz G. Szafaryn,
Chen-Yong Cher,
Hyungmin Cho,
Kevin Skadron,
Mircea R. Stan,
Klas Lilja,
Jacob A. Abraham,
Pradip Bose,
Subhasish Mitra
Abstract:
We present CLEAR (Cross-Layer Exploration for Architecting Resilience), a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). This is also referred to as cross-layer resilience. In this paper, we focus on radiation-induced soft errors in processor cores. We address both single-event upsets (SEUs) and single-event multiple upsets (SEMUs) in terrestrial environments. Our framework automatically and systematically explores the large space of comprehensive resilience techniques and their combinations across various layers of the system stack (586 cross-layer combinations in this paper), derives cost-effective solutions that achieve resilience targets at minimal costs, and provides guidelines for the design of new resilience techniques. Our results demonstrate that a carefully optimized combination of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provides a highly cost-effective soft error resilience solution for general-purpose processor cores. For example, a 50x improvement in silent data corruption rate is achieved at only 2.1% energy cost for an out-of-order core (6.1% for an in-order core) with no speed impact. However, (application-aware) selective circuit-level hardening alone, guided by a thorough analysis of the effects of soft errors on application benchmarks, provides a cost-effective soft error resilience solution as well (with ~1% additional energy cost for a 50x improvement in silent data corruption rate).
Submitted 28 September, 2017;
originally announced September 2017.
-
CLEAR: Cross-Layer Exploration for Architecting Resilience - Combining Hardware and Software Techniques to Tolerate Soft Errors in Processor Cores
Authors:
Eric Cheng,
Shahrzad Mirkhani,
Lukasz G. Szafaryn,
Chen-Yong Cher,
Hyungmin Cho,
Kevin Skadron,
Mircea R. Stan,
Klas Lilja,
Jacob A. Abraham,
Pradip Bose,
Subhasish Mitra
Abstract:
We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). This is also referred to as cross-layer resilience. In this paper, we focus on radiation-induced soft errors in processor cores. We address both single-event upsets (SEUs) and single-event multiple upsets (SEMUs) in terrestrial environments. Our framework automatically and systematically explores the large space of comprehensive resilience techniques and their combinations across various layers of the system stack (586 cross-layer combinations in this paper), derives cost-effective solutions that achieve resilience targets at minimal costs, and provides guidelines for the design of new resilience techniques. We demonstrate the practicality and effectiveness of our framework using two diverse designs: a simple, in-order processor core and a complex, out-of-order processor core. Our results demonstrate that a carefully optimized combination of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provides a highly cost-effective soft error resilience solution for general-purpose processor cores. For example, a 50x improvement in silent data corruption rate is achieved at only 2.1% energy cost for an out-of-order core (6.1% for an in-order core) with no speed impact. However, selective circuit-level hardening alone, guided by a thorough analysis of the effects of soft errors on application benchmarks, provides a cost-effective soft error resilience solution as well (with ~1% additional energy cost for a 50x improvement in silent data corruption rate).
Submitted 23 June, 2016; v1 submitted 11 April, 2016;
originally announced April 2016.