Search | arXiv e-print repository

Optical Networks

Authors: Varsha Lohani, Anjali Sharma, Yatindra Nath Singh, Kumari Akansha, Baljinder Singh Heera, Pallavi Athe

Abstract: Optical networks play a crucial role in todays digital topography, enabling the high-speed and reliable transmission of vast amounts of data over optical fibre for long distances. This paper provides an overview of optical networks, especially emphasising on their evolution with time. Optical networks play a crucial role in todays digital topography, enabling the high-speed and reliable transmission of vast amounts of data over optical fibre for long distances. This paper provides an overview of optical networks, especially emphasising on their evolution with time. △ Less

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.01085 [pdf, other]

Effect of Fog Particle Size Distribution on 3D Object Detection Under Adverse Weather Conditions

Authors: Ajinkya Shinde, Gaurav Sharma, Manisha Pattanaik, Sri Niwas Singh

Abstract: LiDAR-based sensors employing optical spectrum signals play a vital role in providing significant information about the target objects in autonomous driving vehicle systems. However, the presence of fog in the atmosphere severely degrades the overall system's performance. This manuscript analyzes the role of fog particle size distributions in 3D object detection under adverse weather conditions. W… ▽ More LiDAR-based sensors employing optical spectrum signals play a vital role in providing significant information about the target objects in autonomous driving vehicle systems. However, the presence of fog in the atmosphere severely degrades the overall system's performance. This manuscript analyzes the role of fog particle size distributions in 3D object detection under adverse weather conditions. We utilise Mie theory and meteorological optical range (MOR) to calculate the attenuation and backscattering coefficient values for point cloud generation and analyze the overall system's accuracy in Car, Cyclist, and Pedestrian case scenarios under easy, medium and hard detection difficulties. Gamma and Junge (Power-Law) distributions are employed to mathematically model the fog particle size distribution under strong and moderate advection fog environments. Subsequently, we modified the KITTI dataset based on the backscattering coefficient values and trained it on the PV-RCNN++ deep neural network model for Car, Cyclist, and Pedestrian cases under different detection difficulties. The result analysis shows a significant variation in the system's accuracy concerning the changes in target object dimensionality, the nature of the fog environment and increasing detection difficulties, with the Car exhibiting the highest accuracy of around 99% and the Pedestrian showing the lowest accuracy of around 73%. △ Less

Submitted 2 August, 2024; originally announced August 2024.

arXiv:2407.14933 [pdf, other]

Consent in Crisis: The Rapid Decline of the AI Data Commons

Authors: Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang , et al. (24 additional authors not shown)

Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how co… ▽ More General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how codified data use preferences are changing over time. We observe a proliferation of AI-specific clauses to limit use, acute differences in restrictions on AI developers, as well as general inconsistencies between websites' expressed intentions in their Terms of Service and their robots.txt. We diagnose these as symptoms of ineffective web protocols, not designed to cope with the widespread re-purposing of the internet for AI. Our longitudinal analyses show that in a single year (2023-2024) there has been a rapid crescendo of data restrictions from web sources, rendering ~5%+ of all tokens in C4, or 28%+ of the most actively maintained, critical sources in C4, fully restricted from use. For Terms of Service crawling restrictions, a full 45% of C4 is now restricted. If respected or enforced, these restrictions are rapidly biasing the diversity, freshness, and scaling laws for general-purpose AI systems. We hope to illustrate the emerging crises in data consent, for both developers and creators. The foreclosure of much of the open web will impact not only commercial AI, but also non-commercial AI and academic research. △ Less

Submitted 24 July, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

Comments: 41 pages (13 main), 5 figures, 9 tables

arXiv:2407.12875 [pdf, other]

ChatBCG: Can AI Read Your Slide Deck?

Authors: Nikita Singh, Rob Balian, Lukas Martinelli

Abstract: Multimodal models like GPT4o and Gemini Flash are exceptional at inference and summarization tasks, which approach human-level in performance. However, we find that these models underperform compared to humans when asked to do very specific 'reading and estimation' tasks, particularly in the context of visual charts in business decks. This paper evaluates the accuracy of GPT 4o and Gemini Flash-1.… ▽ More Multimodal models like GPT4o and Gemini Flash are exceptional at inference and summarization tasks, which approach human-level in performance. However, we find that these models underperform compared to humans when asked to do very specific 'reading and estimation' tasks, particularly in the context of visual charts in business decks. This paper evaluates the accuracy of GPT 4o and Gemini Flash-1.5 in answering straightforward questions about data on labeled charts (where data is clearly annotated on the graphs), and unlabeled charts (where data is not clearly annotated and has to be inferred from the X and Y axis). We conclude that these models aren't currently capable of reading a deck accurately end-to-end if it contains any complex or unlabeled charts. Even if a user created a deck of only labeled charts, the model would only be able to read 7-8 out of 15 labeled charts perfectly end-to-end. For full list of slide deck figures visit https://www.repromptai.com/chat_bcg △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: for full list of figures visit https://www.repromptai.com/chat_bcg

arXiv:2407.09721 [pdf, other]

Purrfect Pitch: Exploring Musical Interval Learning through Multisensory Interfaces

Authors: Sam Chin, Cathy Mengying Fang, Nikhil Singh, Ibrahim Ibrahim, Joe Paradiso, Pattie Maes

Abstract: We introduce Purrfect Pitch, a system consisting of a wearable haptic device and a custom-designed learning interface for musical ear training. We focus on the ability to identify musical intervals (sequences of two musical notes), which is a perceptually ambiguous task that usually requires strenuous rote training. With our system, the user would hear a sequence of two tones while simultaneously… ▽ More We introduce Purrfect Pitch, a system consisting of a wearable haptic device and a custom-designed learning interface for musical ear training. We focus on the ability to identify musical intervals (sequences of two musical notes), which is a perceptually ambiguous task that usually requires strenuous rote training. With our system, the user would hear a sequence of two tones while simultaneously receiving two corresponding vibrotactile stimuli on the back. Providing haptic feedback along the back makes the auditory distance between the two tones more salient, and the back-worn design is comfortable and unobtrusive. During training, the user receives multi-sensory feedback from our system and inputs their guessed interval value on our web-based learning interface. They see a green (otherwise red) screen for a correct guess with the correct interval value. Our study with 18 participants shows that our system enables novice learners to identify intervals more accurately and consistently than those who only received audio feedback, even after the haptic feedback is removed. We also share further insights on how to design a multisensory learning system. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08989 [pdf, other]

Robustness of LLMs to Perturbations in Text

Authors: Ayush Singh, Navpreet Singh, Shubham Vatsal

Abstract: Having a clean dataset has been the foundational assumption of most natural language processing (NLP) systems. However, properly written text is rarely found in real-world scenarios and hence, oftentimes invalidates the aforementioned foundational assumption. Recently, Large language models (LLMs) have shown impressive performance, but can they handle the inevitable noise in real-world data? This… ▽ More Having a clean dataset has been the foundational assumption of most natural language processing (NLP) systems. However, properly written text is rarely found in real-world scenarios and hence, oftentimes invalidates the aforementioned foundational assumption. Recently, Large language models (LLMs) have shown impressive performance, but can they handle the inevitable noise in real-world data? This work tackles this critical question by investigating LLMs' resilience against morphological variations in text. To that end, we artificially introduce varying levels of noise into a diverse set of datasets and systematically evaluate LLMs' robustness against the corrupt variations of the original text. Our findings show that contrary to popular beliefs, generative LLMs are quiet robust to noisy perturbations in text. This is a departure from pre-trained models like BERT or RoBERTa whose performance has been shown to be sensitive to deteriorating noisy text. Additionally, we test LLMs' resilience on multiple real-world benchmarks that closely mimic commonly found errors in the wild. With minimal prompting, LLMs achieve a new state-of-the-art on the benchmark tasks of Grammar Error Correction (GEC) and Lexical Semantic Change (LSC). To empower future research, we also release a dataset annotated by humans stating their preference for LLM vs. human-corrected outputs along with the code to reproduce our results. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: 8 pages, 1 figure, 6 tables, updated with results also from GPT-4, LLaMa-3

ACM Class: I.7; I.2.7; I.2.4

arXiv:2407.07818 [pdf, other]

The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others

Authors: Daniel Sikar, Artur Garcez, Robin Bloomfield, Tillman Weyde, Kaleem Peeroo, Naman Singh, Maeve Hutchinson, Dany Laksono, Mirela Reljan-Delaney

Abstract: This study introduces the Misclassification Likelihood Matrix (MLM) as a novel tool for quantifying the reliability of neural network predictions under distribution shifts. The MLM is obtained by leveraging softmax outputs and clustering techniques to measure the distances between the predictions of a trained neural network and class centroids. By analyzing these distances, the MLM provides a comp… ▽ More This study introduces the Misclassification Likelihood Matrix (MLM) as a novel tool for quantifying the reliability of neural network predictions under distribution shifts. The MLM is obtained by leveraging softmax outputs and clustering techniques to measure the distances between the predictions of a trained neural network and class centroids. By analyzing these distances, the MLM provides a comprehensive view of the model's misclassification tendencies, enabling decision-makers to identify the most common and critical sources of errors. The MLM allows for the prioritization of model improvements and the establishment of decision thresholds based on acceptable risk levels. The approach is evaluated on the MNIST dataset using a Convolutional Neural Network (CNN) and a perturbed version of the dataset to simulate distribution shifts. The results demonstrate the effectiveness of the MLM in assessing the reliability of predictions and highlight its potential in enhancing the interpretability and risk mitigation capabilities of neural networks. The implications of this work extend beyond image classification, with ongoing applications in autonomous systems, such as self-driving cars, to improve the safety and reliability of decision-making in complex, real-world environments. △ Less

Submitted 13 August, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 9 pages, 7 figures, 1 table

arXiv:2407.06910 [pdf, other]

Fine-grained large-scale content recommendations for MSX sellers

Authors: Manpreet Singh, Ravdeep Pasricha, Ravi Prasad Kondapalli, Kiran R, Nitish Singh, Akshita Agarwalla, Manoj R, Manish Prabhakar, Laurent Boué

Abstract: One of the most critical tasks of Microsoft sellers is to meticulously track and nurture potential business opportunities through proactive engagement and tailored solutions. Recommender systems play a central role to help sellers achieve their goals. In this paper, we present a content recommendation model which surfaces various types of content (technical documentation, comparison with competito… ▽ More One of the most critical tasks of Microsoft sellers is to meticulously track and nurture potential business opportunities through proactive engagement and tailored solutions. Recommender systems play a central role to help sellers achieve their goals. In this paper, we present a content recommendation model which surfaces various types of content (technical documentation, comparison with competitor products, customer success stories etc.) that sellers can share with their customers or use for their own self-learning. The model operates at the opportunity level which is the lowest possible granularity and the most relevant one for sellers. It is based on semantic matching between metadata from the contents and carefully selected attributes of the opportunities. Considering the volume of seller-managed opportunities in organizations such as Microsoft, we show how to perform efficient semantic matching over a very large number of opportunity-content combinations. The main challenge is to ensure that the top-5 relevant contents for each opportunity are recommended out of a total of $\approx 40,000$ published contents. We achieve this target through an extensive comparison of different model architectures and feature selection. Finally, we further examine the quality of the recommendations in a quantitative manner using a combination of human domain experts as well as by using the recently proposed "LLM as a judge" framework. △ Less

Submitted 9 July, 2024; originally announced July 2024.

Journal ref: Microsoft Journal of Applied Research, Volume 21, 2024

arXiv:2406.18899 [pdf, other]

Autonomous Control of a Novel Closed Chain Five Bar Active Suspension via Deep Reinforcement Learning

Authors: Nishesh Singh, Sidharth Ramesh, Abhishek Shankar, Jyotishka Duttagupta, Leander Stephen D'Souza, Sanjay Singh

Abstract: Planetary exploration requires traversal in environments with rugged terrains. In addition, Mars rovers and other planetary exploration robots often carry sensitive scientific experiments and components onboard, which must be protected from mechanical harm. This paper deals with an active suspension system focused on chassis stabilisation and an efficient traversal method while encountering unavoi… ▽ More Planetary exploration requires traversal in environments with rugged terrains. In addition, Mars rovers and other planetary exploration robots often carry sensitive scientific experiments and components onboard, which must be protected from mechanical harm. This paper deals with an active suspension system focused on chassis stabilisation and an efficient traversal method while encountering unavoidable obstacles. Soft Actor-Critic (SAC) was applied along with Proportional Integral Derivative (PID) control to stabilise the chassis and traverse large obstacles at low speeds. The model uses the rover's distance from surrounding obstacles, the height of the obstacle, and the chassis' orientation to actuate the control links of the suspension accurately. Simulations carried out in the Gazebo environment are used to validate the proposed active system. △ Less

Submitted 4 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 15 pages, 11 figures

ACM Class: I.2.9

arXiv:2406.15199 [pdf, other]

On the Computing and Communication Tradeoff in Reasoning-Based Multi-User Semantic Communications

Authors: Nitisha Singh, Christo Kurisummoottil Thomas, Walid Saad, Emilio Calvanese Strinati

Abstract: Semantic communication (SC) is recognized as a promising approach for enabling reliable communication with minimal data transfer while maintaining seamless connectivity for a group of wireless users. Unlocking the advantages of SC for multi-user cases requires revisiting how communication and computing resources are allocated. This reassessment should consider the reasoning abilities of end-users,… ▽ More Semantic communication (SC) is recognized as a promising approach for enabling reliable communication with minimal data transfer while maintaining seamless connectivity for a group of wireless users. Unlocking the advantages of SC for multi-user cases requires revisiting how communication and computing resources are allocated. This reassessment should consider the reasoning abilities of end-users, enabling receiving nodes to fill in missing information or anticipate future events more effectively. Yet, state-of-the-art SC systems primarily focus on resource allocation through compression based on semantic relevance, while overlooking the underlying data generation mechanisms and the tradeoff between communications and computing. Thus, they cannot help prevent a disruption in connectivity. In contrast, in this paper, a novel framework for computing and communication resource allocation is proposed that seeks to demonstrate how SC systems with reasoning capabilities at the end nodes can improve reliability in an end-to-end multi-user wireless system with intermittent communication links. Towards this end, a novel reasoning-aware SC system is proposed for enabling users to utilize their local computing resources to reason the representations when the communication links are unavailable. To optimize communication and computing resource allocation in this system, a noncooperative game is formulated among multiple users whose objective is to maximize the effective semantic information (computed as a product of reliability and semantic information) while controlling the number of semantically relevant links that are disrupted. Simulation results show that the proposed reasoning-aware SC system results in at least a $16.6\%$ enhancement in throughput and a significant improvement in reliability compared to classical communications systems that do not incorporate reasoning. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: 7 pages, 5 figures, in submission to IEEE GLOBECOM

arXiv:2406.05923 [pdf, other]

Contrastive Learning from Synthetic Audio Doppelgangers

Authors: Manuel Cherep, Nikhil Singh

Abstract: Learning robust audio representations currently demands extensive datasets of real-world sound recordings. By applying artificial transformations to these recordings, models can learn to recognize similarities despite subtle variations through techniques like contrastive learning. However, these transformations are only approximations of the true diversity found in real-world sounds, which are gen… ▽ More Learning robust audio representations currently demands extensive datasets of real-world sound recordings. By applying artificial transformations to these recordings, models can learn to recognize similarities despite subtle variations through techniques like contrastive learning. However, these transformations are only approximations of the true diversity found in real-world sounds, which are generated by complex interactions of physical processes, from vocal cord vibrations to the resonance of musical instruments. We propose a solution to both the data scale and transformation limitations, leveraging synthetic audio. By randomly perturbing the parameters of a sound synthesizer, we generate audio doppelgängers-synthetic positive pairs with causally manipulated variations in timbre, pitch, and temporal envelopes. These variations, difficult to achieve through transformations of existing audio, provide a rich source of contrastive information. Despite the shift to randomly generated synthetic data, our method produces strong representations, competitive with real data on standard audio classification benchmarks. Notably, our approach is lightweight, requires no data storage, and has only a single hyperparameter, which we extensively analyze. We offer this method as a complement to existing strategies for contrastive learning in audio, using synthesized sounds to reduce the data burden on practitioners. △ Less

Submitted 9 June, 2024; originally announced June 2024.

Comments: 17 pages, 6 figures

arXiv:2406.04145 [pdf, other]

Every Answer Matters: Evaluating Commonsense with Probabilistic Measures

Authors: Qi Cheng, Michael Boratko, Pranay Kumar Yelugam, Tim O'Gorman, Nalini Singh, Andrew McCallum, Xiang Lorraine Li

Abstract: Large language models have demonstrated impressive performance on commonsense tasks; however, these tasks are often posed as multiple-choice questions, allowing models to exploit systematic biases. Commonsense is also inherently probabilistic with multiple correct answers. The purpose of "boiling water" could be making tea and cooking, but it also could be killing germs. Existing tasks do not capt… ▽ More Large language models have demonstrated impressive performance on commonsense tasks; however, these tasks are often posed as multiple-choice questions, allowing models to exploit systematic biases. Commonsense is also inherently probabilistic with multiple correct answers. The purpose of "boiling water" could be making tea and cooking, but it also could be killing germs. Existing tasks do not capture the probabilistic nature of common sense. To this end, we present commonsense frame completion (CFC), a new generative task that evaluates common sense via multiple open-ended generations. We also propose a method of probabilistic evaluation that strongly correlates with human judgments. Humans drastically outperform strong language model baselines on our dataset, indicating this approach is both a challenging and useful evaluation of machine common sense. △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: ACL 2024 Camera Ready

arXiv:2406.00294 [pdf, other]

Creative Text-to-Audio Generation via Synthesizer Programming

Authors: Manuel Cherep, Nikhil Singh, Jessica Shand

Abstract: Neural audio synthesis methods now allow specifying ideas in natural language. However, these methods produce results that cannot be easily tweaked, as they are based on large latent spaces and up to billions of uninterpretable parameters. We propose a text-to-audio generation method that leverages a virtual modular sound synthesizer with only 78 parameters. Synthesizers have long been used by ski… ▽ More Neural audio synthesis methods now allow specifying ideas in natural language. However, these methods produce results that cannot be easily tweaked, as they are based on large latent spaces and up to billions of uninterpretable parameters. We propose a text-to-audio generation method that leverages a virtual modular sound synthesizer with only 78 parameters. Synthesizers have long been used by skilled sound designers for media like music and film due to their flexibility and intuitive controls. Our method, CTAG, iteratively updates a synthesizer's parameters to produce high-quality audio renderings of text prompts that can be easily inspected and tweaked. Sounds produced this way are also more abstract, capturing essential conceptual features over fine-grained acoustic details, akin to how simple sketches can vividly convey visual concepts. Our results show how CTAG produces sounds that are distinctive, perceived as artistic, and yet similarly identifiable to recent neural audio synthesis models, positioning it as a valuable and complementary tool. △ Less

Submitted 1 June, 2024; originally announced June 2024.

Comments: Accepted to ICML 2024

arXiv:2405.09781 [pdf, other]

An Independent Implementation of Quantum Machine Learning Algorithms in Qiskit for Genomic Data

Authors: Navneet Singh, Shiva Raj Pokhrel

Abstract: In this paper, we explore the power of Quantum Machine Learning as we extend, implement and evaluate algorithms like Quantum Support Vector Classifier (QSVC), Pegasos-QSVC, Variational Quantum Circuits (VQC), and Quantum Neural Networks (QNN) in Qiskit with diverse feature mapping techniques for genomic sequence classification. In this paper, we explore the power of Quantum Machine Learning as we extend, implement and evaluate algorithms like Quantum Support Vector Classifier (QSVC), Pegasos-QSVC, Variational Quantum Circuits (VQC), and Quantum Neural Networks (QNN) in Qiskit with diverse feature mapping techniques for genomic sequence classification. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: 2 pager extended abstract

Journal ref: SIGCOMM 2024, Sydney Australia

arXiv:2404.19345 [pdf, other]

Connecting physics to systems with modular spin-circuits

Authors: Kemal Selcuk, Saleh Bunaiyan, Nihal Sanjay Singh, Shehrin Sayed, Samiran Ganguly, Giovanni Finocchio, Supriyo Datta, Kerem Y. Camsari

Abstract: An emerging paradigm in modern electronics is that of CMOS + $\sf X$ requiring the integration of standard CMOS technology with novel materials and technologies denoted by $\sf X$. In this context, a crucial challenge is to develop accurate circuit models for $\sf X$ that are compatible with standard models for CMOS-based circuits and systems. In this perspective we present physics-based, experime… ▽ More An emerging paradigm in modern electronics is that of CMOS + $\sf X$ requiring the integration of standard CMOS technology with novel materials and technologies denoted by $\sf X$. In this context, a crucial challenge is to develop accurate circuit models for $\sf X$ that are compatible with standard models for CMOS-based circuits and systems. In this perspective we present physics-based, experimentally benchmarked modular circuit models that can be used to evaluate a class of CMOS + $\sf X$ systems, where $\sf X$ denotes magnetic and spintronic materials and phenomena. This class of materials is particularly challenging because they go beyond conventional charge-based phenomena and involve the spin degree of freedom which involves non-trivial quantum effects. Starting from density matrices $-$ the central quantity in quantum transport $-$ using well-defined approximations, it is possible to obtain spin-circuits that generalize ordinary circuit theory to 4-component currents and voltages (1 for charge and 3 for spin). With step-by-step examples that progressively go higher in the computing stack, we illustrate how the spin-circuit approach can be used to start from the physics of magnetism and spintronics to enable accurate system-level evaluations. We believe the core approach can be extended to include other quantum degrees of freedom like valley and pseudospins starting from corresponding density matrices. △ Less

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.18713 [pdf, other]

Adaptive Reinforcement Learning for Robot Control

Authors: Yu Tang Liu, Nilaksh Singh, Aamir Ahmad

Abstract: Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited, due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt policy in response to dif… ▽ More Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited, due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt policy in response to different tasks and environmental conditions. The approach is validated through the blimp control challenge, where multitasking capabilities and environmental adaptability are essential. The agent is trained using a custom, highly parallelized simulator built on IsaacGym. We perform zero-shot transfer to fly the blimp in the real world to solve various tasks. We share our code at \url{https://github.com/robot-perception-group/adaptive\_agent/}. △ Less

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.01704 [pdf, other]

Application of S-band for Protection in Multi-band Flexible-Grid Optical Networks

Authors: Varsha Lohani, Anjali Sharma, Yatindra Nath Singh

Abstract: The core network is experiencing bandwidth capacity constraints as internet traffic grows. As a result, the notion of a Multi-band flexible-grid optical network was established to increase the lifespan of an optical core network. In this paper, we use the C+L band for working traffic transmission and the S-band for protection against failure. Furthermore, we compare the proposed method with the ex… ▽ More The core network is experiencing bandwidth capacity constraints as internet traffic grows. As a result, the notion of a Multi-band flexible-grid optical network was established to increase the lifespan of an optical core network. In this paper, we use the C+L band for working traffic transmission and the S-band for protection against failure. Furthermore, we compare the proposed method with the existing ones. △ Less

Submitted 2 April, 2024; originally announced April 2024.

Comments: First Draft

arXiv:2403.09806 [pdf, other]

xLP: Explainable Link Prediction for Master Data Management

Authors: Balaji Ganesan, Matheen Ahmed Pasha, Srinivasa Parkala, Neeraj R Singh, Gayatri Mishra, Sumit Bhatia, Hima Patel, Somashekar Naganna, Sameep Mehta

Abstract: Explaining neural model predictions to users requires creativity. Especially in enterprise applications, where there are costs associated with users' time, and their trust in the model predictions is critical for adoption. For link prediction in master data management, we have built a number of explainability solutions drawing from research in interpretability, fact verification, path ranking, neu… ▽ More Explaining neural model predictions to users requires creativity. Especially in enterprise applications, where there are costs associated with users' time, and their trust in the model predictions is critical for adoption. For link prediction in master data management, we have built a number of explainability solutions drawing from research in interpretability, fact verification, path ranking, neuro-symbolic reasoning and self-explaining AI. In this demo, we present explanations for link prediction in a creative way, to allow users to choose explanations they are more comfortable with. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures, NeurIPS 2020 Competition and Demonstration Track. arXiv admin note: text overlap with arXiv:2012.05516

arXiv:2402.19371 [pdf]

OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models

Authors: Jenish Maharjan, Anurag Garikipati, Navan Preet Singh, Leo Cyrus, Mayank Sharma, Madalina Ciobanu, Gina Barnes, Rahul Thapa, Qingqing Mao, Ritankar Das

Abstract: LLMs have become increasingly capable at accomplishing a range of specialized-tasks and can be utilized to expand equitable access to medical knowledge. Most medical LLMs have involved extensive fine-tuning, leveraging specialized medical data and significant, thus costly, amounts of computational power. Many of the top performing LLMs are proprietary and their access is limited to very few resear… ▽ More LLMs have become increasingly capable at accomplishing a range of specialized-tasks and can be utilized to expand equitable access to medical knowledge. Most medical LLMs have involved extensive fine-tuning, leveraging specialized medical data and significant, thus costly, amounts of computational power. Many of the top performing LLMs are proprietary and their access is limited to very few research groups. However, open-source (OS) models represent a key area of growth for medical LLMs due to significant improvements in performance and an inherent ability to provide the transparency and compliance required in healthcare. We present OpenMedLM, a prompting platform which delivers state-of-the-art (SOTA) performance for OS LLMs on medical benchmarks. We evaluated a range of OS foundation LLMs (7B-70B) on four medical benchmarks (MedQA, MedMCQA, PubMedQA, MMLU medical-subset). We employed a series of prompting strategies, including zero-shot, few-shot, chain-of-thought (random selection and kNN selection), and ensemble/self-consistency voting. We found that OpenMedLM delivers OS SOTA results on three common medical LLM benchmarks, surpassing the previous best performing OS models that leveraged computationally costly extensive fine-tuning. The model delivers a 72.6% accuracy on the MedQA benchmark, outperforming the previous SOTA by 2.4%, and achieves 81.7% accuracy on the MMLU medical-subset, establishing itself as the first OS LLM to surpass 80% accuracy on this benchmark. Our results highlight medical-specific emergent properties in OS LLMs which have not yet been documented to date elsewhere, and showcase the benefits of further leveraging prompt engineering to improve the performance of accessible LLMs for medical applications. △ Less

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.12336 [pdf, other]

Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

Authors: Christian Schlarmann, Naman Deep Singh, Francesco Croce, Matthias Hein

Abstract: Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly used for various real-world tasks. Prior work has shown that these models are highly vulnerable to adversarial attacks on the vision modality. These attacks can be leveraged to spread fake information or defraud users, and thus pose a significant risk, which makes the robustness of large multi-modal foundation model… ▽ More Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly used for various real-world tasks. Prior work has shown that these models are highly vulnerable to adversarial attacks on the vision modality. These attacks can be leveraged to spread fake information or defraud users, and thus pose a significant risk, which makes the robustness of large multi-modal foundation models a pressing problem. The CLIP model, or one of its variants, is used as a frozen vision encoder in many large vision-language models (LVLMs), e.g. LLaVA and OpenFlamingo. We propose an unsupervised adversarial fine-tuning scheme to obtain a robust CLIP vision encoder, which yields robustness on all vision down-stream tasks (LVLMs, zero-shot classification) that rely on CLIP. In particular, we show that stealth-attacks on users of LVLMs by a malicious third party providing manipulated images are no longer possible once one replaces the original CLIP model with our robust one. No retraining or fine-tuning of the down-stream LVLMs is required. The code and robust models are available at https://github.com/chs20/RobustVLM △ Less

Submitted 5 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: ICML 2024 Oral

arXiv:2402.05963 [pdf, other]

Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences

Authors: Nikhil Kumar Singh, Indranil Saha

Abstract: Efficient utilization of the replay buffer plays a significant role in the off-policy actor-critic reinforcement learning (RL) algorithms used for model-free control policy synthesis for complex dynamical systems. We propose a method for achieving sample efficiency, which focuses on selecting unique samples and adding them to the replay buffer during the exploration with the goal of reducing the b… ▽ More Efficient utilization of the replay buffer plays a significant role in the off-policy actor-critic reinforcement learning (RL) algorithms used for model-free control policy synthesis for complex dynamical systems. We propose a method for achieving sample efficiency, which focuses on selecting unique samples and adding them to the replay buffer during the exploration with the goal of reducing the buffer size and maintaining the independent and identically distributed (IID) nature of the samples. Our method is based on selecting an important subset of the set of state variables from the experiences encountered during the initial phase of random exploration, partitioning the state space into a set of abstract states based on the selected important state variables, and finally selecting the experiences with unique state-reward combination by using a kernel density estimator. We formally prove that the off-policy actor-critic algorithm incorporating the proposed method for unique experience accumulation converges faster than the vanilla off-policy actor-critic algorithm. Furthermore, we evaluate our method by comparing it with two state-of-the-art actor-critic RL algorithms on several continuous control benchmarks available in the Gym environment. Experimental results demonstrate that our method achieves a significant reduction in the size of the replay buffer for all the benchmarks while achieving either faster convergent or better reward accumulation compared to the baseline algorithms. △ Less

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.03704 [pdf, other]

WhisperFuzz: White-Box Fuzzing for Detecting and Locating Timing Vulnerabilities in Processors

Authors: Pallavi Borkar, Chen Chen, Mohamadreza Rostami, Nikhilesh Singh, Rahul Kande, Ahmad-Reza Sadeghi, Chester Rebeiro, Jeyavijayan Rajendran

Abstract: Timing vulnerabilities in processors have emerged as a potent threat. As processors are the foundation of any computing system, identifying these flaws is imperative. Recently fuzzing techniques, traditionally used for detecting software vulnerabilities, have shown promising results for uncovering vulnerabilities in large-scale hardware designs, such as processors. Researchers have adapted black-b… ▽ More Timing vulnerabilities in processors have emerged as a potent threat. As processors are the foundation of any computing system, identifying these flaws is imperative. Recently fuzzing techniques, traditionally used for detecting software vulnerabilities, have shown promising results for uncovering vulnerabilities in large-scale hardware designs, such as processors. Researchers have adapted black-box or grey-box fuzzing to detect timing vulnerabilities in processors. However, they cannot identify the locations or root causes of these timing vulnerabilities, nor do they provide coverage feedback to enable the designer's confidence in the processor's security. To address the deficiencies of the existing fuzzers, we present WhisperFuzz--the first white-box fuzzer with static analysis--aiming to detect and locate timing vulnerabilities in processors and evaluate the coverage of microarchitectural timing behaviors. WhisperFuzz uses the fundamental nature of processors' timing behaviors, microarchitectural state transitions, to localize timing vulnerabilities. WhisperFuzz automatically extracts microarchitectural state transitions from a processor design at the register-transfer level (RTL) and instruments the design to monitor the state transitions as coverage. Moreover, WhisperFuzz measures the time a design-under-test (DUT) takes to process tests, identifying any minor, abnormal variations that may hint at a timing vulnerability. WhisperFuzz detects 12 new timing vulnerabilities across advanced open-sourced RISC-V processors: BOOM, Rocket Core, and CVA6. Eight of these violate the zero latency requirements of the Zkt extension and are considered serious security vulnerabilities. Moreover, WhisperFuzz also pinpoints the locations of the new and the existing vulnerabilities. △ Less

Submitted 14 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: Accepted to USENIX Sec'24

arXiv:2402.00356 [pdf, other]

IoT in the Cloud: Exploring Security Challenges and Mitigations for a Connected World

Authors: Nivedita Singh, Rajkumar Buyya, Hyoungshich Kim

Abstract: The Internet of Things (IoT) has seen remarkable advancements in recent years, leading to a paradigm shift in the digital landscape. However, these technological strides have introduced new challenges, particularly in cybersecurity. IoT devices, inherently connected to the internet, are susceptible to various forms of attacks. Moreover, IoT services often handle sensitive user data, which could be… ▽ More The Internet of Things (IoT) has seen remarkable advancements in recent years, leading to a paradigm shift in the digital landscape. However, these technological strides have introduced new challenges, particularly in cybersecurity. IoT devices, inherently connected to the internet, are susceptible to various forms of attacks. Moreover, IoT services often handle sensitive user data, which could be exploited by malicious actors or unauthorized service providers. As IoT ecosystems expand, the convergence of traditional and cloud-based systems presents unique security threats in the absence of uniform regulations. Cloud-based IoT systems, enabled by Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS) models, offer flexibility and scalability but also pose additional security risks. The intricate interaction between these systems and traditional IoT devices demands comprehensive strategies to protect data integrity and user privacy. This paper highlights the pressing security concerns associated with the widespread adoption of IoT devices and services. We propose viable solutions to bridge the existing security gaps while anticipating and preparing for future challenges. Our approach entails a comprehensive exploration of the key security challenges that IoT services are currently facing. We also suggest proactive strategies to mitigate these risks, thereby strengthening the overall security of IoT devices and services. △ Less

Submitted 26 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.16838 [pdf, other]

A Complete Fragment of LTL(EB)

Authors: Flavio Ferrarotti, Peter Rivière, Klaus-Dieter Schewe, Neeraj Kumar Singh, Yamine Aït Ameur

Abstract: The verification of liveness conditions is an important aspect of state-based rigorous methods. This article investigates this problem in a fragment $\square$LTL of the logic LTL(EB), the integration of the UNTIL-fragment of Pnueli's linear time temporal logic (LTL) and the logic of Event-B, in which the most commonly used liveness conditions can be expressed. For this fragment a sound set of deri… ▽ More The verification of liveness conditions is an important aspect of state-based rigorous methods. This article investigates this problem in a fragment $\square$LTL of the logic LTL(EB), the integration of the UNTIL-fragment of Pnueli's linear time temporal logic (LTL) and the logic of Event-B, in which the most commonly used liveness conditions can be expressed. For this fragment a sound set of derivation rules is developed, which is also complete under mild restrictions for Event-B machines. △ Less

Submitted 30 January, 2024; originally announced January 2024.

Comments: 22 pages

MSC Class: 68Q60; 68N30

arXiv:2401.05826 [pdf, other]

Crumbled Cookie Exploring E-commerce Websites Cookie Policies with Data Protection Regulations

Authors: Nivedita Singh, Yejin Do, Yongsang Yu. Imane Fouad, Jungrae Kim, Hyoungshick Kim

Abstract: Despite stringent data protection regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and other country-specific regulations, many websites continue to use cookies to track user activities. Recent studies have revealed several data protection violations, resulting in significant penalties, especially for multinational corporations. Motivat… ▽ More Despite stringent data protection regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and other country-specific regulations, many websites continue to use cookies to track user activities. Recent studies have revealed several data protection violations, resulting in significant penalties, especially for multinational corporations. Motivated by the question of why these data protection violations continue to occur despite strong data protection regulations, we examined 360 popular e-commerce websites in multiple countries to analyze whether they comply with regulations to protect user privacy from a cookie perspective. △ Less

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.04732 [pdf, other]

A case study of Generative AI in MSX Sales Copilot: Improving seller productivity with a real-time question-answering system for content recommendation

Authors: Manpreet Singh, Ravdeep Pasricha, Nitish Singh, Ravi Prasad Kondapalli, Manoj R, Kiran R, Laurent Boué

Abstract: In this paper, we design a real-time question-answering system specifically targeted for helping sellers get relevant material/documentation they can share live with their customers or refer to during a call. Taking the Seismic content repository as a relatively large scale example of a diverse dataset of sales material, we demonstrate how LLM embeddings of sellers' queries can be matched with the… ▽ More In this paper, we design a real-time question-answering system specifically targeted for helping sellers get relevant material/documentation they can share live with their customers or refer to during a call. Taking the Seismic content repository as a relatively large scale example of a diverse dataset of sales material, we demonstrate how LLM embeddings of sellers' queries can be matched with the relevant content. We achieve this by engineering prompts in an elaborate fashion that makes use of the rich set of meta-features available for documents and sellers. Using a bi-encoder with cross-encoder re-ranker architecture, we show how the solution returns the most relevant content recommendations in just a few seconds even for large datasets. Our recommender system is deployed as an AML endpoint for real-time inferencing and has been integrated into a Copilot interface that is now deployed in the production version of the Dynamics CRM, known as MSX, used daily by Microsoft sellers. △ Less

Submitted 4 January, 2024; originally announced January 2024.

Journal ref: Microsoft Journal of Applied Research, Volume 20, 2024

arXiv:2312.13193 [pdf]

HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments

Authors: Neeraj Kumar Singh, Koyel Ghosh, Joy Mahapatra, Utpal Garain, Apurbalal Senapati

Abstract: Warning: This paper contains examples of the language that some people may find offensive. Detecting and reducing hateful, abusive, offensive comments is a critical and challenging task on social media. Moreover, few studies aim to mitigate the intensity of hate speech. While studies have shown that context-level semantics are crucial for detecting hateful comments, most of this research focuses… ▽ More Warning: This paper contains examples of the language that some people may find offensive. Detecting and reducing hateful, abusive, offensive comments is a critical and challenging task on social media. Moreover, few studies aim to mitigate the intensity of hate speech. While studies have shown that context-level semantics are crucial for detecting hateful comments, most of this research focuses on English due to the ample datasets available. In contrast, low-resource languages, like Indian languages, remain under-researched because of limited datasets. Contrary to hate speech detection, hate intensity reduction remains unexplored in high-resource and low-resource languages. In this paper, we propose a novel end-to-end model, HCDIR, for Hate Context Detection, and Hate Intensity Reduction in social media posts. First, we fine-tuned several pre-trained language models to detect hateful comments to ascertain the best-performing hateful comments detection model. Then, we identified the contextual hateful words. Identification of such hateful words is justified through the state-of-the-art explainable learning model, i.e., Integrated Gradient (IG). Lastly, the Masked Language Modeling (MLM) model has been employed to capture domain-specific nuances to reduce hate intensity. We masked the 50\% hateful words of the comments identified as hateful and predicted the alternative words for these masked terms to generate convincing sentences. An optimal replacement for the original hate comments from the feasible sentences is preferred. Extensive experiments have been conducted on several recent datasets using automatic metric-based evaluation (BERTScore) and thorough human evaluation. To enhance the faithfulness in human evaluation, we arranged a group of three human annotators with varied expertise. △ Less

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2311.11565 [pdf, other]

Protected Working Groups-based Resilient Resource Provisioning in MCF-enabled SDM-EONs

Authors: Anjali Sharma, Varsha Lohani, Yatindra Nath Singh

Abstract: Space Division Multiplexed- Elastic Optical Networks using Multicore Fibers are a promising and viable solution to meet the increasing heterogeneous bandwidth demands. The extra capacity gained due to spatial parameters in SDM-EONs could encounter detrimental losses if any link fails and timely restoration is not done. This paper proposes a Protected and Unprotected Working Core Groups assignment… ▽ More Space Division Multiplexed- Elastic Optical Networks using Multicore Fibers are a promising and viable solution to meet the increasing heterogeneous bandwidth demands. The extra capacity gained due to spatial parameters in SDM-EONs could encounter detrimental losses if any link fails and timely restoration is not done. This paper proposes a Protected and Unprotected Working Core Groups assignment (PWCG/ UPWCG) scheme for differentiated connection requests in multicore fiber-enabled SDM-EONs. A PWCG is inherently protected by resources in a Dedicated Spare Core Group (DSCG). First, we divide the cores into three groups using traffic and crosstalk considerations. In the second step, we use the obtained core groups for resource provisioning in dynamic network scenarios. The effectiveness of our proposed technique is compared with a Link Disjoint Path Protection (LDPP) technique, and the simulation study verifies our assertions and the findings. △ Less

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.11214 [pdf]

Infrared image identification method of substation equipment fault under weak supervision

Authors: Anjali Sharma, Priya Banerjee, Nikhil Singh

Abstract: This study presents a weakly supervised method for identifying faults in infrared images of substation equipment. It utilizes the Faster RCNN model for equipment identification, enhancing detection accuracy through modifications to the model's network structure and parameters. The method is exemplified through the analysis of infrared images captured by inspection robots at substations. Performanc… ▽ More This study presents a weakly supervised method for identifying faults in infrared images of substation equipment. It utilizes the Faster RCNN model for equipment identification, enhancing detection accuracy through modifications to the model's network structure and parameters. The method is exemplified through the analysis of infrared images captured by inspection robots at substations. Performance is validated against manually marked results, demonstrating that the proposed algorithm significantly enhances the accuracy of fault identification across various equipment types. △ Less

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.08314 [pdf, other]

Convolutional Neural Networks Exploiting Attributes of Biological Neurons

Authors: Neeraj Kumar Singh, Nikhil R. Pal

Abstract: In this era of artificial intelligence, deep neural networks like Convolutional Neural Networks (CNNs) have emerged as front-runners, often surpassing human capabilities. These deep networks are often perceived as the panacea for all challenges. Unfortunately, a common downside of these networks is their ''black-box'' character, which does not necessarily mirror the operation of biological neural… ▽ More In this era of artificial intelligence, deep neural networks like Convolutional Neural Networks (CNNs) have emerged as front-runners, often surpassing human capabilities. These deep networks are often perceived as the panacea for all challenges. Unfortunately, a common downside of these networks is their ''black-box'' character, which does not necessarily mirror the operation of biological neural systems. Some even have millions/billions of learnable (tunable) parameters, and their training demands extensive data and time. Here, we integrate the principles of biological neurons in certain layer(s) of CNNs. Specifically, we explore the use of neuro-science-inspired computational models of the Lateral Geniculate Nucleus (LGN) and simple cells of the primary visual cortex. By leveraging such models, we aim to extract image features to use as input to CNNs, hoping to enhance training efficiency and achieve better accuracy. We aspire to enable shallow networks with a Push-Pull Combination of Receptive Fields (PP-CORF) model of simple cells as the foundation layer of CNNs to enhance their learning process and performance. To achieve this, we propose a two-tower CNN, one shallow tower and the other as ResNet 18. Rather than extracting the features blindly, it seeks to mimic how the brain perceives and extracts features. The proposed system exhibits a noticeable improvement in the performance (on an average of $5\%-10\%$) on CIFAR-10, CIFAR-100, and ImageNet-100 datasets compared to ResNet-18. We also check the efficiency of only the Push-Pull tower of the network. △ Less

Submitted 14 November, 2023; originally announced November 2023.

Comments: 20 pages, 6 figures

arXiv:2310.09020

Credit Blockchain for Faster Transactions in P2P Energy Trading

Authors: Amit kumar Vishwakarma, Yatindra Nath Singh

Abstract: P2P trading of energy can be a good alternative to incentivize distributed non-conventional energy production and meet the burgeoning energy demand. For efficient P2P trading, a free market for trading needs to be established while ensuring the information reliability, security, and privacy. Blockchain has been used to provide this framework, but it consumes very high energy and is slow. Further,… ▽ More P2P trading of energy can be a good alternative to incentivize distributed non-conventional energy production and meet the burgeoning energy demand. For efficient P2P trading, a free market for trading needs to be established while ensuring the information reliability, security, and privacy. Blockchain has been used to provide this framework, but it consumes very high energy and is slow. Further, until now, no blockchain model has considered the role of conventional electric utility companies in P2P trading. In this paper, we have introduced a credit blockchain that reduces energy consumption by employing a new mechanism to update transactions and increases speed by providing interest free loans to buyers. This model also integrates the electric utility companies within the P2P trading framework, thereby increasing members trading options. We have also discussed the pricing strategies for trading. All the above assertions have been verified through simulations, demonstrating that this model will promote P2P trading by providing enhanced security, speed, and greater trading options. The proposed model will also help trade energy at prices beneficial for both sellers and buyers. △ Less

Submitted 21 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: I am submitting the paper to another journal

MSC Class: NA ACM Class: F.2.2

arXiv:2308.10714 [pdf, other]

CXL Memory as Persistent Memory for Disaggregated HPC: A Practical Approach

Authors: Yehonatan Fridman, Suprasad Mutalik Desai, Navneet Singh, Thomas Willhalm, Gal Oren

Abstract: In the landscape of High-Performance Computing (HPC), the quest for efficient and scalable memory solutions remains paramount. The advent of Compute Express Link (CXL) introduces a promising avenue with its potential to function as a Persistent Memory (PMem) solution in the context of disaggregated HPC systems. This paper presents a comprehensive exploration of CXL memory's viability as a candidat… ▽ More In the landscape of High-Performance Computing (HPC), the quest for efficient and scalable memory solutions remains paramount. The advent of Compute Express Link (CXL) introduces a promising avenue with its potential to function as a Persistent Memory (PMem) solution in the context of disaggregated HPC systems. This paper presents a comprehensive exploration of CXL memory's viability as a candidate for PMem, supported by physical experiments conducted on cutting-edge multi-NUMA nodes equipped with CXL-attached memory prototypes. Our study not only benchmarks the performance of CXL memory but also illustrates the seamless transition from traditional PMem programming models to CXL, reinforcing its practicality. To substantiate our claims, we establish a tangible CXL prototype using an FPGA card embodying CXL 1.1/2.0 compliant endpoint designs (Intel FPGA CXL IP). Performance evaluations, executed through the STREAM and STREAM-PMem benchmarks, showcase CXL memory's ability to mirror PMem characteristics in App-Direct and Memory Mode while achieving impressive bandwidth metrics with Intel 4th generation Xeon (Sapphire Rapids) processors. The results elucidate the feasibility of CXL memory as a persistent memory solution, outperforming previously established benchmarks. In contrast to published DCPMM results, our CXL-DDR4 memory module offers comparable bandwidth to local DDR4 memory configurations, albeit with a moderate decrease in performance. The modified STREAM-PMem application underscores the ease of transitioning programming models from PMem to CXL, thus underscoring the practicality of adopting CXL memory. △ Less

Submitted 21 August, 2023; originally announced August 2023.

Comments: 12 pages, 9 figures

arXiv:2306.12941 [pdf, other]

Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models

Authors: Francesco Croce, Naman D Singh, Matthias Hein

Abstract: Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several problem-specific novel attacks minimizing different metrics in acc… ▽ More Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several problem-specific novel attacks minimizing different metrics in accuracy and mIoU. The ensemble of our attacks, SEA, shows that existing attacks severely overestimate the robustness of semantic segmentation models. Surprisingly, existing attempts of adversarial training for semantic segmentation models turn out to be weak or even completely non-robust. We investigate why previous adaptations of adversarial training to semantic segmentation failed and show how recently proposed robust ImageNet backbones can be used to obtain adversarially robust semantic segmentation models with up to six times less training time for PASCAL-VOC and the more challenging ADE20k. The associated code and robust models are available at https://github.com/nmndeep/robust-segmentation △ Less

Submitted 16 July, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: ECCV 2024

arXiv:2306.10898 [pdf, other]

B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers

Authors: Moritz Böhle, Navdeeppal Singh, Mario Fritz, Bernt Schiele

Abstract: We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. For this, we propose to replace the linear transformations in DNNs by our novel B-cos transformation. As we show, a sequence (network) of such transformations induces a single linear transformation that faithfully summarises the full model computations.… ▽ More We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. For this, we propose to replace the linear transformations in DNNs by our novel B-cos transformation. As we show, a sequence (network) of such transformations induces a single linear transformation that faithfully summarises the full model computations. Moreover, the B-cos transformation is designed such that the weights align with relevant signals during optimisation. As a result, those induced linear transformations become highly interpretable and highlight task-relevant features. Importantly, the B-cos transformation is designed to be compatible with existing architectures and we show that it can easily be integrated into virtually all of the latest state of the art models for computer vision - e.g. ResNets, DenseNets, ConvNext models, as well as Vision Transformers - by combining the B-cos-based explanations with normalisation and attention layers, all whilst maintaining similar accuracy on ImageNet. Finally, we show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics. △ Less

Submitted 15 January, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: Extension of B-cos Networks: Alignment is All We Need for Interpretability (Böhle et al., CVPR 2022). Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv admin note: substantial text overlap with arXiv:2205.10268

arXiv:2306.08997

Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models

Authors: Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori

Abstract: We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree. We evaluate the ability of large language models to fulfill the graduation requirements for any MIT major in Mathematics and EECS. Our results demonstrate that… ▽ More We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree. We evaluate the ability of large language models to fulfill the graduation requirements for any MIT major in Mathematics and EECS. Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images. We fine-tune an open-source large language model on this dataset. We employ GPT-4 to automatically grade model responses, providing a detailed performance breakdown by course, question, and answer type. By embedding questions in a low-dimensional space, we explore the relationships between questions, topics, and classes and discover which questions and classes are required for solving other questions and classes through few-shot learning. Our analysis offers valuable insights into course prerequisites and curriculum design, highlighting language models' potential for learning and improving Mathematics and EECS education. △ Less

Submitted 24 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: Did not receive permission to release the data or model fine-tuned on the data

arXiv:2305.17531 [pdf, other]

Probing reaction channels via reinforcement learning

Authors: Senwei Liang, Aditya N. Singh, Yuanran Zhu, David T. Limmer, Chao Yang

Abstract: We propose a reinforcement learning based method to identify important configurations that connect reactant and product states along chemical reaction paths. By shooting multiple trajectories from these configurations, we can generate an ensemble of configurations that concentrate on the transition path ensemble. This configuration ensemble can be effectively employed in a neural network-based par… ▽ More We propose a reinforcement learning based method to identify important configurations that connect reactant and product states along chemical reaction paths. By shooting multiple trajectories from these configurations, we can generate an ensemble of configurations that concentrate on the transition path ensemble. This configuration ensemble can be effectively employed in a neural network-based partial differential equation solver to obtain an approximation solution of a restricted Backward Kolmogorov equation, even when the dimension of the problem is very high. The resulting solution, known as the committor function, encodes mechanistic information for the reaction and can in turn be used to evaluate reaction rates. △ Less

Submitted 27 May, 2023; originally announced May 2023.

arXiv:2305.16440 [pdf, ps, other]

Representation Transfer Learning via Multiple Pre-trained models for Linear Regression

Authors: Navjot Singh, Suhas Diggavi

Abstract: In this paper, we consider the problem of learning a linear regression model on a data domain of interest (target) given few samples. To aid learning, we are provided with a set of pre-trained regression models that are trained on potentially different data domains (sources). Assuming a representation structure for the data generating linear models at the sources and the target domains, we propose… ▽ More In this paper, we consider the problem of learning a linear regression model on a data domain of interest (target) given few samples. To aid learning, we are provided with a set of pre-trained regression models that are trained on potentially different data domains (sources). Assuming a representation structure for the data generating linear models at the sources and the target domains, we propose a representation transfer based learning method for constructing the target model. The proposed scheme is comprised of two phases: (i) utilizing the different source representations to construct a representation that is adapted to the target data, and (ii) using the obtained model as an initialization to a fine-tuning procedure that re-trains the entire (over-parameterized) regression model on the target data. For each phase of the training method, we provide excess risk bounds for the learned model compared to the true data generating target model. The derived bounds show a gain in sample complexity for our proposed method compared to the baseline method of not leveraging source representations when achieving the same excess risk, therefore, theoretically demonstrating the effectiveness of transfer learning for linear regression. △ Less

Submitted 24 June, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 20 pages

arXiv:2305.16251 [pdf, other]

doi 10.1007/978-981-15-6401-7_10-1

A Survey of Security Concerns and Countermeasures in Modern Micro-architectures with Transient Execution

Authors: Nikhilesh Singh, Vinod Ganesan, Chester Rebeiro

Abstract: In the last two decades, the evolving cyber-threat landscape has brought to center stage the contentious tradeoffs between the security and performance of modern microprocessors. The guarantees provided by the hardware to ensure no violation of process boundaries have been shown to be breached in several real-world scenarios. While modern CPU features such as superscalar, out-of-order, simultaneou… ▽ More In the last two decades, the evolving cyber-threat landscape has brought to center stage the contentious tradeoffs between the security and performance of modern microprocessors. The guarantees provided by the hardware to ensure no violation of process boundaries have been shown to be breached in several real-world scenarios. While modern CPU features such as superscalar, out-of-order, simultaneous multi-threading, and speculative execution play a critical role in boosting system performance, they are central for a potent class of security attacks termed transient micro-architectural attacks. These attacks leverage shared hardware resources in the CPU that are used during speculative and out-of-order execution to steal sensitive information. Researchers have used these attacks to read data from the Operating Systems (OS) and Trusted Execution Environments (TEE) and to even break hardware-enforced isolation. Over the years, several variants of transient micro-architectural attacks have been developed. While each variant differs in the shared hardware resource used, the underlying attack follows a similar strategy. This paper presents a panoramic view of security concerns in modern CPUs, focusing on the mechanisms of these attacks and providing a classification of the variants. Further, we discuss state-of-the-art defense mechanisms towards mitigating these attacks. △ Less

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2304.05949 [pdf, other]

doi 10.1038/s41467-024-46645-6

CMOS + stochastic nanomagnets: heterogeneous computers for probabilistic inference and learning

Authors: Nihal Sanjay Singh, Keito Kobayashi, Qixuan Cao, Kemal Selcuk, Tianrui Hu, Shaila Niazi, Navid Anjum Aadit, Shun Kanai, Hideo Ohno, Shunsuke Fukami, Kerem Y. Camsari

Abstract: Extending Moore's law by augmenting complementary-metal-oxide semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important. One important class of problems involve sampling-based Monte Carlo algorithms used in probabilistic machine learning, optimization, and quantum simulation. Here, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic… ▽ More Extending Moore's law by augmenting complementary-metal-oxide semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important. One important class of problems involve sampling-based Monte Carlo algorithms used in probabilistic machine learning, optimization, and quantum simulation. Here, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic bits (p-bits) with Field Programmable Gate Arrays (FPGA) to create an energy-efficient CMOS + X (X = sMTJ) prototype. This setup shows how asynchronously driven CMOS circuits controlled by sMTJs can perform probabilistic inference and learning by leveraging the algorithmic update-order-invariance of Gibbs sampling. We show how the stochasticity of sMTJs can augment low-quality random number generators (RNG). Detailed transistor-level comparisons reveal that sMTJ-based p-bits can replace up to 10,000 CMOS transistors while dissipating two orders of magnitude less energy. Integrated versions of our approach can advance probabilistic computing involving deep Boltzmann machines and other energy-based learning algorithms with extremely high throughput and energy efficiency. △ Less

Submitted 23 February, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

Journal ref: Nature Communications volume 15, Article number: 2685 (2024)

arXiv:2304.05600 [pdf, other]

Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Authors: Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi Kalayeh

Abstract: Audiovisual representation learning typically relies on the correspondence between sight and sound. However, there are often multiple audio tracks that can correspond with a visual scene. Consider, for example, different conversations on the same crowded street. The effect of such counterfactual pairs on audiovisual representation learning has not been previously explored. To investigate this, we… ▽ More Audiovisual representation learning typically relies on the correspondence between sight and sound. However, there are often multiple audio tracks that can correspond with a visual scene. Consider, for example, different conversations on the same crowded street. The effect of such counterfactual pairs on audiovisual representation learning has not been previously explored. To investigate this, we use dubbed versions of movies and television shows to augment cross-modal contrastive learning. Our approach learns to represent alternate audio tracks, differing only in speech, similarly to the same video. Our results, from a comprehensive set of experiments investigating different training strategies, show this general approach improves performance on a range of downstream auditory and audiovisual tasks, without majorly affecting linguistic task performance overall. These findings highlight the importance of considering speech variation when learning scene-level audiovisual correspondences and suggest that dubbed audio can be a useful augmentation technique for training audiovisual models toward more robust performance on diverse downstream tasks. △ Less

Submitted 8 June, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

Comments: Accepted to CVPR 2024

arXiv:2303.13965 [pdf, other]

Generalized Distance Metric for Various DHT Routing Algorithms in Peer-to-Peer Networks

Authors: Rashmi Kushwaha, Shreyas Kulkarni, Yatindra Nath Singh

Abstract: We present a generalized distance metric that can be used to implement routing strategies and identify routing table entries to reach the root node for a given key, in a DHT (Distributed Hash Table) network based on either Chord, Kademlia, Tapestry, or Pastry. The generalization shows that all the above four DHT algorithms are in fact, the same algorithm but with different parameters in distance r… ▽ More We present a generalized distance metric that can be used to implement routing strategies and identify routing table entries to reach the root node for a given key, in a DHT (Distributed Hash Table) network based on either Chord, Kademlia, Tapestry, or Pastry. The generalization shows that all the above four DHT algorithms are in fact, the same algorithm but with different parameters in distance representation. We also proposes that nodes can have routing tables of varying sizes based on their memory capabilities but with the fact that each node must have at least two entries, one for the node closest from it, and the other for the node from whom it is closest in each ring components for all the algorithms. Messages will always reach the correct root nodes by following the above rule. We also further observe that in any network, if the distance metric to define the root node in the DHT is same at all the nodes, then the root node for a key will also be the same, irrespective of the size of the routing table at different nodes. △ Less

Submitted 28 February, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: 8 pages, 4 figures, 14 Tables

arXiv:2303.01870 [pdf, other]

Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models

Authors: Naman D Singh, Francesco Croce, Matthias Hein

Abstract: While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing Patc… ▽ More While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing PatchStem with ConvStem, and training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-attacks. Our modified ConvNeXt, ConvNeXt + ConvStem, yields the most robust $\ell_\infty$-models across different ranges of model parameters and FLOPs, while our ViT + ConvStem yields the best generalization to unseen threat models. △ Less

Submitted 28 October, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2301.10365 [pdf, other]

Data Consistent Deep Rigid MRI Motion Correction

Authors: Nalini M. Singh, Neel Dey, Malte Hoffmann, Bruce Fischl, Elfar Adalsteinsson, Robert Frost, Adrian V. Dalca, Polina Golland

Abstract: Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alo… ▽ More Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alone. Our network produces a reconstruction as a function of two inputs: corrupted k-space data and motion parameters. We train the network using simulated, motion-corrupted k-space data generated with known motion parameters. At test-time, we estimate unknown motion parameters by minimizing a data consistency loss between the motion parameters, the network-based image reconstruction given those parameters, and the acquired measurements. Intra-slice motion correction experiments on simulated and realistic 2D fast spin echo brain MRI achieve high reconstruction fidelity while providing the benefits of explicit data consistency optimization. Our code is publicly available at https://www.github.com/nalinimsingh/neuroMoCo. △ Less

Submitted 16 November, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: Presented at MIDL 2023. 14 pages, 6 figures. Keywords: motion correction, magnetic resonance imaging, deep learning

arXiv:2301.10035 [pdf]

Putting ChatGPT's Medical Advice to the (Turing) Test

Authors: Oded Nov, Nina Singh, Devin Mann

Abstract: Objective: Assess the feasibility of using ChatGPT or a similar AI-based chatbot for patient-provider communication. Participants: A US representative sample of 430 study participants aged 18 and above. 53.2% of respondents analyzed were women; their average age was 47.1. Exposure: Ten representative non-administrative patient-provider interactions were extracted from the EHR. Patients' questions… ▽ More Objective: Assess the feasibility of using ChatGPT or a similar AI-based chatbot for patient-provider communication. Participants: A US representative sample of 430 study participants aged 18 and above. 53.2% of respondents analyzed were women; their average age was 47.1. Exposure: Ten representative non-administrative patient-provider interactions were extracted from the EHR. Patients' questions were placed in ChatGPT with a request for the chatbot to respond using approximately the same word count as the human provider's response. In the survey, each patient's question was followed by a provider- or ChatGPT-generated response. Participants were informed that five responses were provider-generated and five were chatbot-generated. Participants were asked, and incentivized financially, to correctly identify the response source. Participants were also asked about their trust in chatbots' functions in patient-provider communication, using a Likert scale of 1-5. Results: The correct classification of responses ranged between 49.0% to 85.7% for different questions. On average, chatbot responses were correctly identified 65.5% of the time, and provider responses were correctly distinguished 65.1% of the time. On average, responses toward patients' trust in chatbots' functions were weakly positive (mean Likert score: 3.4), with lower trust as the health-related complexity of the task in questions increased. Conclusions: ChatGPT responses to patient questions were weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower risk health questions. △ Less

Submitted 24 January, 2023; originally announced January 2023.

arXiv:2301.03013 [pdf]

Semantic rule Web-based Diagnosis and Treatment of Vector-Borne Diseases using SWRL rules

Authors: Ritesh Chandra, Sadhana Tiwari, Sonali Agarwal, Navjot Singh

Abstract: Vector-borne diseases (VBDs) are a kind of infection caused through the transmission of vectors generated by the bites of infected parasites, bacteria, and viruses, such as ticks, mosquitoes, triatomine bugs, blackflies, and sandflies. If these diseases are not properly treated within a reasonable time frame, the mortality rate may rise. In this work, we propose a set of ontologies that will help… ▽ More Vector-borne diseases (VBDs) are a kind of infection caused through the transmission of vectors generated by the bites of infected parasites, bacteria, and viruses, such as ticks, mosquitoes, triatomine bugs, blackflies, and sandflies. If these diseases are not properly treated within a reasonable time frame, the mortality rate may rise. In this work, we propose a set of ontologies that will help in the diagnosis and treatment of vector-borne diseases. For developing VBD's ontology, electronic health records taken from the Indian Health Records website, text data generated from Indian government medical mobile applications, and doctors' prescribed handwritten notes of patients are used as input. This data is then converted into correct text using Optical Character Recognition (OCR) and a spelling checker after pre-processing. Natural Language Processing (NLP) is applied for entity extraction from text data for making Resource Description Framework (RDF) medical data with the help of the Patient Clinical Data (PCD) ontology. Afterwards, Basic Formal Ontology (BFO), National Vector Borne Disease Control Program (NVBDCP) guidelines, and RDF medical data are used to develop ontologies for VBDs, and Semantic Web Rule Language (SWRL) rules are applied for diagnosis and treatment. The developed ontology helps in the construction of decision support systems (DSS) for the NVBDCP to control these diseases. △ Less

Submitted 31 January, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

arXiv:2212.01022 [pdf, ps, other]

STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning

Authors: Nikhil Kumar Singh, Indranil Saha

Abstract: Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics. These systems are expected to satisfy diverse safety and liveness properties best captured using temporal logic. In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents. However, the problem of d… ▽ More Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics. These systems are expected to satisfy diverse safety and liveness properties best captured using temporal logic. In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents. However, the problem of designing the reward function for an RL agent to satisfy complex temporal logic specifications has received limited attention in the literature. To address this, we provide a systematic way of generating rewards in real-time by using the quantitative semantics of Signal Temporal Logic (STL), a widely used temporal logic to specify the behaviour of cyber-physical systems. We propose a new quantitative semantics for STL having several desirable properties, making it suitable for reward generation. We evaluate our STL-based reinforcement learning mechanism on several complex continuous control benchmarks and compare our STL semantics with those available in the literature in terms of their efficacy in synthesizing the controller agent. Experimental results establish our new semantics to be the most suitable for synthesizing feedback controllers for complex continuous dynamical systems through reinforcement learning. △ Less

Submitted 2 December, 2022; originally announced December 2022.

Comments: Full version of the paper to be published in AAAI 2023

arXiv:2211.07980 [pdf, other]

A Benchmark and Dataset for Post-OCR text correction in Sanskrit

Authors: Ayush Maheshwari, Nikhil Singh, Amrith Krishna, Ganesh Ramakrishnan

Abstract: Sanskrit is a classical language with about 30 million extant manuscripts fit for digitisation, available in written, printed or scannedimage forms. However, it is still considered to be a low-resource language when it comes to available digital resources. In this work, we release a post-OCR text correction dataset containing around 218,000 sentences, with 1.5 million words, from 30 different book… ▽ More Sanskrit is a classical language with about 30 million extant manuscripts fit for digitisation, available in written, printed or scannedimage forms. However, it is still considered to be a low-resource language when it comes to available digital resources. In this work, we release a post-OCR text correction dataset containing around 218,000 sentences, with 1.5 million words, from 30 different books. Texts in Sanskrit are known to be diverse in terms of their linguistic and stylistic usage since Sanskrit was the 'lingua franca' for discourse in the Indian subcontinent for about 3 millennia. Keeping this in mind, we release a multi-domain dataset, from areas as diverse as astronomy, medicine and mathematics, with some of them as old as 18 centuries. Further, we release multiple strong baselines as benchmarks for the task, based on pre-trained Seq2Seq language models. We find that our best-performing model, consisting of byte level tokenization in conjunction with phonetic encoding (Byt5+SLP1), yields a 23% point increase over the OCR output in terms of word and character error rates. Moreover, we perform extensive experiments in evaluating these models on their performance and analyse common causes of mispredictions both at the graphemic and lexical levels. Our code and dataset is publicly available at https://github.com/ayushbits/pe-ocr-sanskrit. △ Less

Submitted 15 November, 2022; originally announced November 2022.

Comments: Findings of EMNLP, 2022. Code and Data: https://github.com/ayushbits/pe-ocr-sanskrit

arXiv:2211.06153 [pdf, other]

SUNDEW: An Ensemble of Predictors for Case-Sensitive Detection of Malware

Authors: Sareena Karapoola, Nikhilesh Singh, Chester Rebeiro, Kamakoti V

Abstract: Malware programs are diverse, with varying objectives, functionalities, and threat levels ranging from mere pop-ups to financial losses. Consequently, their run-time footprints across the system differ, impacting the optimal data source (Network, Operating system (OS), Hardware) and features that are instrumental to malware detection. Further, the variations in threat levels of malware classes aff… ▽ More Malware programs are diverse, with varying objectives, functionalities, and threat levels ranging from mere pop-ups to financial losses. Consequently, their run-time footprints across the system differ, impacting the optimal data source (Network, Operating system (OS), Hardware) and features that are instrumental to malware detection. Further, the variations in threat levels of malware classes affect the user requirements for detection. Thus, the optimal tuple of <data-source, features, user-requirements> is different for each malware class, impacting the state-of-the-art detection solutions that are agnostic to these subtle differences. This paper presents SUNDEW, a framework to detect malware classes using their optimal tuple of <data-source, features, user-requirements>. SUNDEW uses an ensemble of specialized predictors, each trained with a particular data source (network, OS, and hardware) and tuned for features and requirements of a specific class. While the specialized ensemble with a holistic view across the system improves detection, aggregating the independent conflicting inferences from the different predictors is challenging. SUNDEW resolves such conflicts with a hierarchical aggregation considering the threat-level, noise in the data sources, and prior domain knowledge. We evaluate SUNDEW on a real-world dataset of over 10,000 malware samples from 8 classes. It achieves an F1-Score of one for most classes, with an average of 0.93 and a limited performance overhead of 1.5%. △ Less

Submitted 14 November, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

arXiv:2209.11524 [pdf, other]

Control Barrier Functions in UGVs for Kinematic Obstacle Avoidance: A Collision Cone Approach

Authors: Phani Thontepu, Bhavya Giri Goswami, Manan Tayal, Neelaksh Singh, Shyamsundar P I, Shyam Sundar M G, Suresh Sundaram, Vaibhav Katewa, Shishir Kolathaya

Abstract: In this paper, we propose a new class of Control Barrier Functions (CBFs) for Unmanned Ground Vehicles (UGVs) that help avoid collisions with kinematic (non-zero velocity) obstacles. While the current forms of CBFs have been successful in guaranteeing safety/collision avoidance with static obstacles, extensions for the dynamic case have seen limited success. Moreover, with the UGV models like the… ▽ More In this paper, we propose a new class of Control Barrier Functions (CBFs) for Unmanned Ground Vehicles (UGVs) that help avoid collisions with kinematic (non-zero velocity) obstacles. While the current forms of CBFs have been successful in guaranteeing safety/collision avoidance with static obstacles, extensions for the dynamic case have seen limited success. Moreover, with the UGV models like the unicycle or the bicycle, applications of existing CBFs have been conservative in terms of control, i.e., steering/thrust control has not been possible under certain scenarios. Drawing inspiration from the classical use of collision cones for obstacle avoidance in trajectory planning, we introduce its novel CBF formulation with theoretical guarantees on safety for both the unicycle and bicycle models. The main idea is to ensure that the velocity of the obstacle w.r.t. the vehicle is always pointing away from the vehicle. Accordingly, we construct a constraint that ensures that the velocity vector always avoids a cone of vectors pointing at the vehicle. The efficacy of this new control methodology is later verified by Pybullet simulations on TurtleBot3 and F1Tenth. △ Less

Submitted 16 October, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: 6 pages, 4 figures, For supplement video follow https://youtu.be/Dme7Wm9y6es. *The first and second authors have contributed equally

ACM Class: I.2.9; G.1.6; J.2

arXiv:2209.11282 [pdf]

Automated detection of Alzheimer disease using MRI images and deep neural networks- A review

Authors: Narotam Singh, Patteshwari. D, Neha Soni, Amita Kapoor

Abstract: Early detection of Alzheimer disease is crucial for deploying interventions and slowing the disease progression. A lot of machine learning and deep learning algorithms have been explored in the past decade with the aim of building an automated detection for Alzheimer. Advancements in data augmentation techniques and advanced deep learning architectures have opened up new frontiers in this field, a… ▽ More Early detection of Alzheimer disease is crucial for deploying interventions and slowing the disease progression. A lot of machine learning and deep learning algorithms have been explored in the past decade with the aim of building an automated detection for Alzheimer. Advancements in data augmentation techniques and advanced deep learning architectures have opened up new frontiers in this field, and research is moving at a rapid speed. Hence, the purpose of this survey is to provide an overview of recent research on deep learning models for Alzheimer disease diagnosis. In addition to categorizing the numerous data sources, neural network architectures, and commonly used assessment measures, we also classify implementation and reproducibility. Our objective is to assist interested researchers in keeping up with the newest developments and in reproducing earlier investigations as benchmarks. In addition, we also indicate future research directions for this topic. △ Less

Submitted 22 September, 2022; originally announced September 2022.

Comments: 22 Pages, 5 Figures, 7 Tables

Showing 1–50 of 179 results for author: Singh, N