Showing 1–23 of 23 results for author: Papka, M

Searching in archive cs.
  1. arXiv:2407.16871  [pdf, other]

    cs.HC cs.LG

    Trust Your Gut: Comparing Human and Machine Inference from Noisy Visualizations

    Authors: Ratanond Koonchanok, Michael E. Papka, Khairi Reda

    Abstract: People commonly utilize visualizations not only to examine a given dataset, but also to draw generalizable conclusions about the underlying models or phenomena. Prior research has compared human visual inference to that of an optimal Bayesian agent, with deviations from rational analysis viewed as problematic. However, human reliance on non-normative heuristics may prove advantageous in certain ci…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: To appear in IEEE Transactions on Visualization and Computer Graphics (Proceedings of IEEE VIS'24)

  2. arXiv:2406.14452  [pdf, other]

    cs.HC

    Science in a Blink: Supporting Ensemble Perception in Scalar Fields

    Authors: Victor A. Mateevitsi, Michael E. Papka, Khairi Reda

    Abstract: Visualizations support rapid analysis of scientific datasets, allowing viewers to glean aggregate information (e.g., the mean) within split-seconds. While prior research has explored this ability in conventional charts, it is unclear if spatial visualizations used by computational scientists afford a similar ensemble perception capacity. We investigate people's ability to estimate two summary stat…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: To appear in Proceedings of the 2024 IEEE Visualization Conference (VIS'24)

  3. arXiv:2404.17619  [pdf, other]

    cs.HC cs.GR

    VisAnywhere: Developing Multi-platform Scientific Visualization Applications

    Authors: Thomas Marrinan, Madeleine Moeller, Alina Kanayinkal, Victor A. Mateevitsi, Michael E. Papka

    Abstract: Scientists often explore and analyze large-scale scientific simulation data by leveraging two- and three-dimensional visualizations. The data and tasks can be complex and therefore best supported using myriad display technologies, from mobile devices to large high-resolution display walls to virtual reality headsets. Using a simulation of neuron connections in the human brain, we present our work…

    Submitted 26 April, 2024; originally announced April 2024.

  4. MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes

    Authors: Xiaolong Ma, Feng Yan, Lei Yang, Ian Foster, Michael E. Papka, Zhengchun Liu, Rajkumar Kettimuthu

    Abstract: First-come first-serve scheduling can result in a substantial fraction (up to 10%) of transiently idle nodes on supercomputers. Recognizing that such unfilled nodes are well-suited for deep neural network (DNN) training, due to the flexible nature of DNN training tasks, Liu et al. proposed that re-scaling DNN training tasks to fit gaps in schedules be formulated as a mixed-integer linear programming (MIL…

    Submitted 24 April, 2024; originally announced April 2024.

  5. MRSch: Multi-Resource Scheduling for HPC

    Authors: Boyang Li, Yuping Fan, Matthew Dearing, Zhiling Lan, Paul Rich, William Allcock, Michael Papka

    Abstract: Emerging workloads in high-performance computing (HPC) are embracing significant changes, such as having diverse resource requirements instead of being CPU-centric. This advancement forces cluster schedulers to consider multiple schedulable resources during decision-making. Existing scheduling studies rely on heuristic or optimization methods, which are limited by an inability to adapt to new scen…

    Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  6. Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

    Authors: Boyang Li, Zhiling Lan, Michael E. Papka

    Abstract: In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promising outcomes. However, a significant challenge arises from the lack of interpretability in deep neural networks (DNN), rendering them as black-box models to system managers. This lack of model interpret…

    Submitted 24 March, 2024; originally announced March 2024.

  7. Color Maker: a Mixed-Initiative Approach to Creating Accessible Color Maps

    Authors: Amey Salvi, Kecheng Lu, Michael E. Papka, Yunhai Wang, Khairi Reda

    Abstract: Quantitative data is frequently represented using color, yet designing effective color mappings is a challenging task, requiring one to balance perceptual standards with personal color preference. Current design tools either overwhelm novices with complexity or offer limited customization options. We present ColorMaker, a mixed-initiative approach for creating colormaps. ColorMaker combines fluid…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: To appear at the ACM CHI '24 Conference on Human Factors in Computing Systems

  8. Scaling Computational Fluid Dynamics: In Situ Visualization of NekRS using SENSEI

    Authors: Victor A. Mateevitsi, Mathis Bode, Nicola Ferrier, Paul Fischer, Jens Henrik Göbbert, Joseph A. Insley, Yu-Hsiang Lan, Misun Min, Michael E. Papka, Saumil Patel, Silvio Rizzi, Jonathan Windgassen

    Abstract: In the realm of Computational Fluid Dynamics (CFD), the demand for memory and computation resources is extreme, necessitating the use of leadership-scale computing platforms for practical domain sizes. This intensive requirement renders traditional checkpointing methods ineffective due to the significant slowdown in simulations while saving state data to disk. As we progress towards exascale and G…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

  9. arXiv:2310.04610  [pdf, other]

    cs.AI cs.LG

    DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

    Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, et al. (67 additional authors not shown)

    Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique…

    Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  10. arXiv:2310.04607  [pdf, other]

    cs.PF cs.AI cs.AR cs.LG

    A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators

    Authors: Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Siddhisanket Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael E. Papka

    Abstract: Artificial intelligence (AI) methods have become critical in scientific applications to help accelerate scientific discovery. Large language models (LLMs) are being considered as a promising approach to address some of the challenging problems because of their superior generalization capabilities across domains. The effectiveness of the models and the accuracy of the applications are contingent upo…

    Submitted 6 October, 2023; originally announced October 2023.

  11. arXiv:2306.09457  [pdf, other]

    cs.HC cs.CV

    A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems

    Authors: Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael E. Papka, Kwan-Liu Ma

    Abstract: The ability to monitor and interpret hardware system events and behaviors is crucial to improving the robustness and reliability of these systems, especially in a supercomputing facility. The growing complexity and scale of these systems demand an increase in monitoring data collected at multiple fidelity levels and varying temporal resolutions. In this work, we aim to build a holistic analyti…

    Submitted 15 June, 2023; originally announced June 2023.

  12. arXiv:2304.10516  [pdf, other]

    cs.DC cs.AI

    Distributed Neural Representation for Reactive in situ Visualization

    Authors: Qi Wu, Joseph A. Insley, Victor A. Mateevitsi, Silvio Rizzi, Michael E. Papka, Kwan-Liu Ma

    Abstract: Implicit neural representations (INRs) have emerged as a powerful tool for compressing large-scale volume data. This opens up new possibilities for in situ visualization. However, the efficient application of INRs to distributed data remains an underexplored area. In this work, we develop a distributed volumetric neural representation and optimize it for in situ visualization. Our technique elimin…

    Submitted 20 July, 2024; v1 submitted 27 March, 2023; originally announced April 2023.

  13. arXiv:2204.05128  [pdf, other]

    cs.DC

    Linking Scientific Instruments and HPC: Patterns, Technologies, Experiences

    Authors: Rafael Vescovi, Ryan Chard, Nickolaus Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Michael E. Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard, Ian Foster

    Abstract: Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, such as by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Such online analyses require methods for configuring and running hi…

    Submitted 22 August, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

  14. arXiv:2201.06098  [pdf, other]

    cs.CV

    An Edge Map based Ensemble Solution to Detect Water Level in Stream

    Authors: Pratool Bharti, Priyanjani Chandra, Michael E. Papka, David Koop

    Abstract: Flooding is one of the most dangerous weather events today. Between 2015 and 2019, flooding caused, on average, more than 130 deaths every year in the USA alone. The devastating nature of floods necessitates the continuous monitoring of water levels in rivers and streams to detect incoming floods. In this work, we have designed and implemented an efficient vision-based ensemble solution to…

    Submitted 16 January, 2022; originally announced January 2022.

  15. arXiv:2109.05412  [pdf, other]

    cs.DC

    Hybrid Workload Scheduling on HPC Systems

    Authors: Yuping Fan, Paul Rich, William Allcock, Michael Papka, Zhiling Lan

    Abstract: Traditionally, on-demand, rigid, and malleable applications have been scheduled and executed on separate systems. The ever-growing workload demands and rapidly developing HPC infrastructure have triggered interest in converging these applications on a single HPC system. Although allocating the hybrid workloads within one system could potentially improve system efficiency, it is difficult to balance t…

    Submitted 11 September, 2021; originally announced September 2021.

  16. arXiv:2106.12091  [pdf, other]

    cs.DC cs.LG

    BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes

    Authors: Zhengchun Liu, Rajkumar Kettimuthu, Michael E. Papka, Ian Foster

    Abstract: Supercomputer FCFS-based scheduling policies result in many transient idle nodes, a phenomenon that is only partially alleviated by backfill scheduling methods that promote small jobs to run before large jobs. Here we describe how to realize a novel use for these otherwise wasted resources, namely, deep neural network (DNN) training. This important workload is easily organized as many small fragme…

    Submitted 22 June, 2021; originally announced June 2021.

  17. arXiv:2105.06571  [pdf, other]

    cs.DC

    Toward Real-time Analysis of Experimental Science Workloads on Geographically Distributed Supercomputers

    Authors: Michael Salim, Thomas Uram, J. Taylor Childers, Venkat Vishwanath, Michael E. Papka

    Abstract: Massive upgrades to science infrastructure are driving data velocities upwards while stimulating adoption of increasingly data-intensive analytics. While next-generation exascale supercomputers promise strong support for I/O-intensive workflows, HPC remains largely untapped by live experiments, because data transfers and disparate batch-queueing policies are prohibitive when faced with scarce inst…

    Submitted 2 July, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

  18. arXiv:2102.06243  [pdf, other]

    cs.DC cs.AI cs.LG

    Deep Reinforcement Agent for Scheduling in HPC

    Authors: Yuping Fan, Zhiling Lan, Taylor Childers, Paul Rich, William Allcock, Michael E. Papka

    Abstract: The cluster scheduler is crucial in high-performance computing (HPC). It determines which user jobs should be allocated to available system resources, and when. Existing cluster scheduling heuristics are developed by human experts based on their experience with specific HPC systems and workloads. However, the increasing complexity of computing systems and the highly dynamic nature of application worklo…

    Submitted 19 April, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: Accepted by IPDPS 2021

    Journal ref: 35th IEEE International Parallel & Distributed Processing Symposium (2021)

  19. Scheduling Beyond CPUs for HPC

    Authors: Yuping Fan, Zhiling Lan, Paul Rich, William E. Allcock, Michael E. Papka, Brian Austin, David Paul

    Abstract: High performance computing (HPC) is undergoing significant changes. The emerging HPC applications comprise both compute- and data-intensive applications. To meet the intense I/O demand from emerging data-intensive applications, burst buffers are deployed in production systems. Existing HPC schedulers are mainly CPU-centric. The extreme heterogeneity of hardware devices, combined with workload chan…

    Submitted 9 December, 2020; originally announced December 2020.

    Comments: Accepted by HPDC 2019

    Journal ref: Proceedings of the 28th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'19), 2019

  20. arXiv:1909.08704  [pdf, other]

    cs.DC

    Balsam: Automated Scheduling and Execution of Dynamic, Data-Intensive HPC Workflows

    Authors: Michael A. Salim, Thomas D. Uram, J. Taylor Childers, Prasanna Balaprakash, Venkatram Vishwanath, Michael E. Papka

    Abstract: We introduce the Balsam service to manage high-throughput task scheduling and execution on supercomputing systems. Balsam allows users to populate a task database with a variety of tasks ranging from simple independent tasks to dynamic multi-task workflows. With abstractions for the local resource scheduler and MPI environment, Balsam dynamically packages tasks into ensemble jobs and manages their…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: SC '18: 8th Workshop on Python for High-Performance and Scientific Computing (PyHPC 2018)

  21. arXiv:1708.01658  [pdf, ps, other]

    cs.DL

    Exploring Features for Predicting Policy Citations

    Authors: Christian Bailey, Bharat Kale, Jamieson Walker, Harish Varma Siravuri, Hamed Alhoori, Michael E. Papka

    Abstract: In this study we performed an initial investigation and evaluation of altmetrics and their relationship with public policy citation of research papers. We examined methods for using altmetrics and other data to predict whether a research paper is cited in public policy and applied receiver operating characteristic curve on various feature groups in order to evaluate their potential usefulness. Fro…

    Submitted 15 June, 2017; originally announced August 2017.

    Comments: 2 pages, accepted to JCDL '17

  22. Predicting Research that will be Cited in Policy Documents

    Authors: Bharat Kale, Harish Varma Siravuri, Hamed Alhoori, Michael E. Papka

    Abstract: Scientific publications and other genres of research output are increasingly being cited in policy documents. Citations in documents of this nature could be considered a critical indicator of the significance and societal impact of the research output. In this study, we built classification models that predict whether a particular research work is likely to be cited in a public policy document bas…

    Submitted 13 June, 2017; originally announced June 2017.

    Comments: 2 page extended abstract submitted for ACM WebSci'17 conference

  23. arXiv:1511.07312  [pdf, other]

    hep-ph cs.DC physics.comp-ph

    Adapting the serial Alpgen event generator to simulate LHC collisions on millions of parallel threads

    Authors: J. T. Childers, T. D. Uram, T. J. LeCompte, M. E. Papka, D. P. Benjamin

    Abstract: As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application th…

    Submitted 23 November, 2015; originally announced November 2015.

    Comments: 13 pages, 7 figures, publication