-
Learning to Infer Unobserved Behaviors: Estimating User's Preference for a Site over Other Sites
Authors:
Atanu R Sinha,
Tanay Anand,
Paridhi Maheshwari,
A V Lakshmy,
Vishal Jain
Abstract:
A site's recommendation system relies on knowledge of its users' preferences to offer relevant recommendations to them. These preferences are for attributes that comprise items and content shown on the site, and are estimated from the data of users' interactions with the site. Another form of users' preferences is material too, namely, users' preferences for the site over other sites, since that shows users' base level propensities to engage with the site. Estimating users' preferences for the site, however, faces major obstacles because (a) the focal site usually has no data of its users' interactions with other sites; these interactions are users' unobserved behaviors for the focal site; and (b) the Machine Learning literature in recommendation does not offer a model of this situation. Even if (b) is resolved, the problem in (a) persists since without access to data of its users' interactions with other sites, there is no ground truth for evaluation. Moreover, it is most useful when (c) users' preferences for the site can be estimated at the individual level, since the site can then personalize recommendations to individual users. We offer a method to estimate an individual user's preference for a focal site, under this premise. In particular, we compute the focal site's share of a user's online engagements without any data from other sites. We show an evaluation framework for the model using only the focal site's data, allowing the site to test the model. We rely upon a Hierarchical Bayes Method and perform estimation in two different ways: Markov Chain Monte Carlo and Stochastic Gradient with Langevin Dynamics. Our results provide good support for the approach to computing personalized share of engagement and for its evaluation.
Submitted 15 December, 2023;
originally announced December 2023.
-
No-regret Algorithms for Fair Resource Allocation
Authors:
Abhishek Sinha,
Ativ Joshi,
Rajarshi Bhattacharjee,
Cameron Musco,
Mohammad Hajiesmaili
Abstract:
We consider a fair resource allocation problem in the no-regret setting against an unrestricted adversary. The objective is to allocate resources equitably among several agents in an online fashion so that the difference of the aggregate $α$-fair utilities of the agents between an optimal static clairvoyant allocation and that of the online policy grows sub-linearly with time. The problem is challenging due to the non-additive nature of the $α$-fairness function. Previously, it was shown that no online policy can exist for this problem with a sublinear standard regret. In this paper, we propose an efficient online resource allocation policy, called Online Proportional Fair (OPF), that achieves $c_α$-approximate sublinear regret with the approximation factor $c_α=(1-α)^{-(1-α)}\leq 1.445,$ for $0\leq α< 1$. The upper bound to the $c_α$-regret for this problem exhibits a surprising phase transition phenomenon. The regret bound changes from a power-law to a constant at the critical exponent $α=\frac{1}{2}.$ As a corollary, our result also resolves an open problem raised by Even-Dar et al. [2009] on designing an efficient no-regret policy for the online job scheduling problem in certain parameter regimes. The proof of our results introduces new algorithmic and analytical techniques, including greedy estimation of the future gradients for non-additive global reward functions and bootstrapping adaptive regret bounds, which may be of independent interest.
Submitted 11 March, 2023;
originally announced March 2023.
-
Characterizing Persistence and Disparity of Covid-19 Infection Rates with City Level Demographic and Regional Features
Authors:
Emi Aoki,
Arkajyoti Sinha,
Charles Thompson,
Kavitha Chandra
Abstract:
The design of data-driven dashboards that inform municipalities on ongoing changes in infections within their community is addressed in this research. Daily reports of Covid-19 infections published by the state of Wisconsin as the initial surge in the pandemic ensued, during the October 2020 to September 2021 time frame, are considered as a case study. Of particular interest is the identification of regions and population groups distinguished by race and ethnicity that may be experiencing a disproportional rate of infections over time. This study integrates the municipality-level daily positive cases that are disaggregated by race and ethnicity and population size data derived from the US Census Bureau. The goal is to present timely data-driven information in a manner that is accessible to the general population, is relatable to the constituents and promotes community engagement in managing and mitigating the infections. A statistical metric referred to as the rank difference and its persistence over time is used to capture the disproportional incidence of Covid-19 positive cases on particular race and ethnic groups in relation to their population size. A persistence index is derived to identify regions that continually exhibit positive rank differences on a daily time scale and indicate disparity in disease incidence. The analysis identifies several municipalities in Wisconsin, located in regions of low population and away from the denser urban centers, that continue to exhibit disparity in the infection rates for Black/African American and Hispanic/Latino population groups. Examples of a dashboard that can be utilized to capture both aggregate level and temporal patterns of Covid-19 infections are presented.
Submitted 22 November, 2022;
originally announced November 2022.
-
Multiscale Generative Models: Improving Performance of a Generative Model Using Feedback from Other Dependent Generative Models
Authors:
Changyu Chen,
Avinandan Bose,
Shih-Fen Cheng,
Arunesh Sinha
Abstract:
Realistic fine-grained multi-agent simulation of real-world complex systems is crucial for many downstream tasks such as reinforcement learning. Recent work has used generative models (GANs in particular) for providing high-fidelity simulation of real-world systems. However, such generative models are often monolithic and miss out on modeling the interaction in multi-agent systems. In this work, we take a first step towards building multiple interacting generative models (GANs) that reflect the interaction in the real world. We build and analyze a hierarchical set-up where a higher-level GAN is conditioned on the output of multiple lower-level GANs. We present a technique of using feedback from the higher-level GAN to improve the performance of lower-level GANs. We mathematically characterize the conditions under which our technique is impactful, including understanding the transfer learning nature of our set-up. We present three distinct experiments on synthetic data, time series data, and the image domain, revealing the wide applicability of our technique.
Submitted 24 February, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
An Improved Mathematical Model of Sepsis: Modeling, Bifurcation Analysis, and Optimal Control Study for Complex Nonlinear Infectious Disease System
Authors:
Yuyang Chen,
Kaiming Bi,
Chih-Hang J. Wu,
David Ben-Arieh,
Ashesh Sinha
Abstract:
Sepsis is a life-threatening medical emergency, which is a major cause of death worldwide and the second highest cause of mortality in the United States. Researching the optimal control treatment or intervention strategy on the comprehensive sepsis system is key in reducing mortality. For this purpose, first, this paper improves a complex nonlinear sepsis model proposed in our previous work. Then, bifurcation analyses are conducted for each sepsis subsystem to study the model behaviors under some system parameters. The bifurcation analysis results also further indicate the necessity of control treatment and intervention therapy. Without any control, under some parameter and initial system value settings, the sepsis system exhibits persistent inflammation outcomes over time. Therefore, we develop our complex improved nonlinear sepsis model into a sepsis optimal control model, and then use some effective biomarkers recommended in existing clinical practice as the optimization objective function to measure the development of sepsis. In addition, a Bayesian optimization algorithm combined with a recurrent neural network (the RNN-BO algorithm) is introduced to predict the optimal control strategy for the studied sepsis optimal control system. The RNN-BO algorithm differs from other optimization algorithms in that, once given any new initial system value setting (the initial value is associated with the initial conditions of patients), it is capable of quickly predicting a corresponding time-series optimal control based on the historical optimal control data for any new sepsis patient. To demonstrate the effectiveness and efficiency of the RNN-BO algorithm in solving the optimal control problem for the complex nonlinear sepsis system, numerical simulations are carried out and compared with other optimization algorithms in this paper.
Submitted 7 January, 2022;
originally announced January 2022.
-
High-dimensional Bayesian Optimization Algorithm with Recurrent Neural Network for Disease Control Models in Time Series
Authors:
Yuyang Chen,
Kaiming Bi,
Chih-Hang J. Wu,
David Ben-Arieh,
Ashesh Sinha
Abstract:
The Bayesian Optimization algorithm has become a promising approach for nonlinear global optimization problems and many machine learning applications. Over the past few years, improvements and enhancements have been brought forward and they have shown some promising results in solving complex dynamic problems, systems of ordinary differential equations where the objective functions are computationally expensive to evaluate. However, a straightforward implementation of the Bayesian Optimization algorithm performs well only for optimization problems with 10-20 dimensions. The study presented in this paper proposes a new high dimensional Bayesian Optimization algorithm combined with recurrent neural networks, which is expected to predict the optimal solution for the global optimization problems with high dimensional or time series decision models. The proposed RNN-BO algorithm can solve the optimal control problems in the lower dimension space and then use the recurrent neural network to learn from the historical optimal solution data and predict the optimal control strategy for any new initial system value setting. In addition, accurately and quickly providing the optimal control strategy is essential to effectively and efficiently control the epidemic spread while minimizing the associated financial costs. Therefore, to verify the effectiveness of the proposed algorithm, computational experiments are carried out on a deterministic SEIR epidemic model and a stochastic SIS optimal control model. Finally, we also discuss the impacts of different numbers of the RNN layers and training epochs on the trade-off between solution quality and related computational efforts.
Submitted 1 January, 2022;
originally announced January 2022.
-
Using Image Transformations to Learn Network Structure
Authors:
Brayan Ortiz,
Amitabh Sinha
Abstract:
Many learning tasks require observing a sequence of images and making a decision. In a transportation problem of designing and planning for shipping boxes between nodes, we show how to treat the network of nodes and the flows between them as images. These images have useful structural information that can be statistically summarized. Using image compression techniques, we reduce an image down to a set of numbers that contain interpretable geographic information that we call geographic signatures. Using geographic signatures, we learn network structure that can be utilized to recommend future network connectivity. We develop a Bayesian reinforcement algorithm that takes advantage of statistically summarized network information as priors and user-decisions to reinforce an agent's probabilistic decision. Additionally, we show how reinforcement learning can be used with compression directly without interpretation in simple tasks.
Submitted 8 June, 2023; v1 submitted 6 December, 2021;
originally announced December 2021.
-
High dimensional Bayesian Optimization Algorithm for Complex System in Time Series
Authors:
Yuyang Chen,
Kaiming Bi,
Chih-Hang J. Wu,
David Ben-Arieh,
Ashesh Sinha
Abstract:
At present, high-dimensional global optimization problems with time-series models have received much attention from engineering fields. Since it was proposed, Bayesian optimization has quickly become a popular and promising approach for solving global optimization problems. However, the standard Bayesian optimization algorithm is insufficient for finding the global optimal solution when the model is high-dimensional. Hence, this paper presents a novel high dimensional Bayesian optimization algorithm by considering dimension reduction and different dimension fill-in strategies. Most existing literature about Bayesian optimization algorithms does not discuss the sampling strategies to optimize the acquisition function. This study proposes a new sampling method based on both the multi-armed bandit and random search methods while optimizing the acquisition function. Besides, based on the time-dependent or dimension-dependent characteristics of the model, the proposed algorithm can reduce the dimension evenly. Then, five different dimension fill-in strategies are discussed and compared in this study. Finally, to increase the final accuracy of the optimal solution, the proposed algorithm adds a local search based on a series of Adam-based steps at the final stage. Our computational experiments demonstrate that the proposed Bayesian optimization algorithm can achieve reasonable solutions with excellent performance for high dimensional global optimization problems with a time-series optimal control model.
Submitted 4 August, 2021;
originally announced August 2021.
-
A New Bayesian Optimization Algorithm for Complex High-Dimensional Disease Epidemic Systems
Authors:
Yuyang Chen,
Kaiming Bi,
Chih-Hang J. Wu,
David Ben-Arieh,
Ashesh Sinha
Abstract:
This paper presents an Improved Bayesian Optimization (IBO) algorithm to find optimal control solutions for complex high-dimensional epidemic models. Evaluating the total objective function value for disease control models with hundreds of thousands of control time periods has a high computational cost. In this paper, we improve the conventional Bayesian Optimization (BO) approach in two ways. The existing BO methods perform the minimizer step only once during each acquisition function update process. To find a better solution for each acquisition function update, we do more local minimization steps to tune the algorithm. When the model is high-dimensional and the objective function is complicated, a limited number of update iterations of the acquisition function may not find the global optimal solution. The IBO algorithm adds a series of Adam-based steps at the final stage of the algorithm to increase the solution's accuracy. Comparative simulation experiments using different kernel functions and acquisition functions have shown that the Improved Bayesian Optimization algorithm is effective and suitable for handling large-scale and complex epidemic models under study. The IBO algorithm is then compared with four other global optimization algorithms on three well-known synthetic test functions. The effectiveness and robustness of the IBO algorithm are also demonstrated through some simulation experiments to compare with the Particle Swarm Optimization algorithm and Random Search algorithm. With its reliable convergence behaviors and straightforward implementation, the IBO algorithm has great potential to solve other complex optimal control problems with high dimensionality.
Submitted 30 July, 2021;
originally announced August 2021.
-
A fast learning algorithm for One-Class Slab Support Vector Machines
Authors:
Bagesh Kumar,
Ayush Sinha,
Sourin Chakrabarti,
Prof. O. P. Vyas
Abstract:
One Class Slab Support Vector Machines (OCSSVM) have turned out to be better in terms of accuracy in certain classes of classification problems than the traditional SVMs and One Class SVMs or even other One class classifiers. This paper proposes a fast training method for One Class Slab SVMs using an updated Sequential Minimal Optimization (SMO) which divides the multi-variable optimization problem into smaller subproblems of size two that can then be solved analytically. The results indicate that this training method scales better to large sets of training data than other Quadratic Programming (QP) solvers.
Submitted 3 May, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems
Authors:
Aman Sinha,
Matthew O'Kelly,
Russ Tedrake,
John Duchi
Abstract:
Learning-based methodologies increasingly find applications in safety-critical domains like autonomous driving and medical robotics. Due to the rare nature of dangerous events, real-world testing is prohibitively expensive and unscalable. In this work, we employ a probabilistic approach to safety evaluation in simulation, where we are concerned with computing the probability of dangerous events. We develop a novel rare-event simulation method that combines exploration, exploitation, and optimization techniques to find failure modes and estimate their rate of occurrence. We provide rigorous guarantees for the performance of our method in terms of both statistical and computational efficiency. Finally, we demonstrate the efficacy of our approach on a variety of scenarios, illustrating its usefulness as a tool for rapid sensitivity analysis and model comparison that are essential to developing and testing safety-critical autonomous systems.
Submitted 8 August, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
A Gradient-based Bilevel Optimization Approach for Tuning Hyperparameters in Machine Learning
Authors:
Ankur Sinha,
Tanmay Khandait,
Raja Mohanty
Abstract:
Hyperparameter tuning is an active area of research in machine learning, where the aim is to identify the optimal hyperparameters that provide the best performance on the validation set. Hyperparameter tuning is often achieved using naive techniques, such as random search and grid search. However, most of these methods seldom lead to an optimal set of hyperparameters and often get very expensive. In this paper, we propose a bilevel solution method for solving the hyperparameter optimization problem that does not suffer from the drawbacks of the earlier studies. The proposed method is general and can be easily applied to any class of machine learning algorithms. The idea is based on the approximation of the lower level optimal value function mapping, which is an important mapping in bilevel optimization and helps in reducing the bilevel problem to a single level constrained optimization task. The single-level constrained optimization problem is solved using the augmented Lagrangian method. We discuss the theory behind the proposed algorithm and perform an extensive computational study on two datasets that confirms the efficiency of the proposed method. We perform a comparative study against grid search, random search and Bayesian optimization techniques that shows that the proposed algorithm is multiple times faster on problems with one or two hyperparameters. The computational gain is expected to be significantly higher as the number of hyperparameters increases. Corresponding to a given hyperparameter, most of the techniques in the literature often assume a unique optimal parameter set that minimizes loss on the training set. Such an assumption is often violated by deep learning architectures and the proposed method does not require any such assumption.
Submitted 21 July, 2020;
originally announced July 2020.
-
An RNN-Survival Model to Decide Email Send Times
Authors:
Harvineet Singh,
Moumita Sinha,
Atanu R. Sinha,
Sahil Garg,
Neha Banerjee
Abstract:
Email communications are ubiquitous. Firms control send times of emails and thereby the instants at which emails reach recipients (it is assumed email is received instantaneously from the send time). However, they do not control the duration it takes for recipients to open emails, labeled as time-to-open. Importantly, among emails that are opened, most are opened within a short window from their send times. We posit that emails are likely to be opened sooner when send times are convenient for recipients, while for other send times, emails can get ignored. Thus, to compute appropriate send times it is important to predict times-to-open accurately. We propose a recurrent neural network (RNN) in a survival model framework to predict times-to-open, for each recipient. Using that we compute appropriate send times. We experiment on a data set of emails sent to a million customers over five months. The sequence of emails received by a person from a sender is a result of interactions with past emails from the sender, and hence contains useful signal that informs our model. This sequential dependence allows our proposed RNN-Survival (RNN-S) approach to outperform survival analysis approaches in predicting times-to-open. We show that best times to send emails can be computed accurately from predicted times-to-open. This approach allows a firm to tune send times of emails, which are in its control, to favorably influence open rates and engagement.
Submitted 21 April, 2020;
originally announced April 2020.
-
FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis
Authors:
Aman Sinha,
Matthew O'Kelly,
Hongrui Zheng,
Rahul Mangharam,
John Duchi,
Russ Tedrake
Abstract:
Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorithmic contributions to both challenges. First, to generate a realistic, diverse set of opponents, we develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo. Second, we propose a distributionally robust bandit optimization procedure that adaptively adjusts risk aversion relative to uncertainty in beliefs about opponents' behaviors. We rigorously quantify the tradeoffs in performance and robustness when approximating these computations in real-time motion-planning, and we demonstrate our methods experimentally on autonomous vehicles that achieve scaled speeds comparable to Formula One racecars.
Submitted 22 August, 2020; v1 submitted 8 March, 2020;
originally announced March 2020.
-
Rich-Item Recommendations for Rich-Users: Exploiting Dynamic and Static Side Information
Authors:
Amar Budhiraja,
Gaurush Hiranandani,
Darshak Chhatbar,
Aditya Sinha,
Navya Yarrabelly,
Ayush Choure,
Oluwasanmi Koyejo,
Prateek Jain
Abstract:
In this paper, we study the recommendation problem in which the users and items to be recommended are rich data structures with multiple entity types and with multiple sources of side-information in the form of graphs. We provide a general formulation for the problem that captures the complexities of modern real-world recommendations and generalizes many existing formulations. In our formulation, each user/document that requires a recommendation and each item or tag that is to be recommended are both modeled by a set of static entities and a dynamic component. The relationships between entities are captured by several weighted bipartite graphs. To effectively exploit these complex interactions and learn the recommendation model, we propose MEDRES, a multiple graph-CNN based novel deep-learning architecture. MEDRES uses AL-GCN, a novel graph convolution network block, that harnesses strong representative features from the underlying graphs. Moreover, in order to capture highly heterogeneous engagement of different users with the system and constraints on the number of items to be recommended, we propose a novel ranking metric pAp@k along with a method to optimize the metric directly. We demonstrate the effectiveness of our method on two benchmarks: a) citation data, b) Flickr data. In addition, we present two real-world case studies of our formulation and the MEDRES architecture. We show how our technique can be used to naturally model the message recommendation problem and the teams recommendation problem in the Microsoft Teams (MSTeams) product and demonstrate that it is 5-6 percentage points more accurate than the production-grade models.
Submitted 26 July, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Efficient Black-box Assessment of Autonomous Vehicle Safety
Authors:
Justin Norden,
Matthew O'Kelly,
Aman Sinha
Abstract:
While autonomous vehicle (AV) technology has shown substantial progress, we still lack tools for rigorous and scalable testing. Real-world testing, the $\textit{de-facto}$ evaluation method, is dangerous to the public. Moreover, due to the rare nature of failures, billions of miles of driving are needed to statistically validate performance claims. Thus, the industry has largely turned to simulation to evaluate AV systems. However, having a simulation stack alone is not a solution. A simulation testing framework needs to prioritize which scenarios to run, learn how the chosen scenarios provide coverage of failure modes, and rank failure scenarios in order of importance. We implement a simulation testing framework that evaluates an entire modern AV system as a black box. This framework estimates the probability of accidents under a base distribution governing standard traffic behavior. In order to accelerate rare-event probability evaluation, we efficiently learn to identify and rank failure scenarios via adaptive importance-sampling methods. Using this framework, we conduct the first independent evaluation of a full-stack commercial AV system, Comma AI's OpenPilot.
Submitted 5 June, 2020; v1 submitted 8 December, 2019;
originally announced December 2019.
-
A Method for Computing Class-wise Universal Adversarial Perturbations
Authors:
Tejus Gupta,
Abhishek Sinha,
Nupur Kumari,
Mayank Singh,
Balaji Krishnamurthy
Abstract:
We present an algorithm for computing class-specific universal adversarial perturbations for deep neural networks. Such perturbations can induce misclassification in a large fraction of images of a specific class. Unlike previous methods that use iterative optimization for computing a universal perturbation, the proposed method employs a perturbation that is a linear function of weights of the neural network and hence can be computed much faster. The method does not require any training data and has no hyper-parameters. The attack obtains 34% to 51% fooling rate on state-of-the-art deep neural networks on ImageNet and transfers across models. We also study the characteristics of the decision boundaries learned by standard and adversarially trained models to understand the universal adversarial perturbations.
Submitted 1 December, 2019;
originally announced December 2019.
-
Practical Compositional Fairness: Understanding Fairness in Multi-Component Recommender Systems
Authors:
Xuezhi Wang,
Nithum Thain,
Anu Sinha,
Flavien Prost,
Ed H. Chi,
Jilin Chen,
Alex Beutel
Abstract:
How can we build recommender systems to take into account fairness? Real-world recommender systems are often composed of multiple models, built by multiple teams. However, most research on fairness focuses on improving fairness in a single model. Further, recent research on classification fairness has shown that combining multiple "fair" classifiers can still result in an "unfair" classification system. This presents a significant challenge: how do we understand and improve fairness in recommender systems composed of multiple components?
In this paper, we study the compositionality of recommender fairness. We consider two recently proposed fairness ranking metrics: equality of exposure and pairwise ranking accuracy. While we show that fairness in recommendation is not guaranteed to compose, we provide theory for a set of conditions under which fairness of individual models does compose. We then present an analytical framework for both understanding whether a real system's signals can achieve compositional fairness, and improving which component would have the greatest impact on the fairness of the overall system. In addition to the theoretical results, we find on multiple datasets -- including a large-scale real-world recommender system -- that the overall system's end-to-end fairness is largely achievable by improving fairness in individual components.
Submitted 25 January, 2021; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Charting the Right Manifold: Manifold Mixup for Few-shot Learning
Authors:
Puneet Mangla,
Mayank Singh,
Abhishek Sinha,
Nupur Kumari,
Vineeth N Balasubramanian,
Balaji Krishnamurthy
Abstract:
Few-shot learning algorithms aim to learn model parameters capable of adapting to unseen classes with the help of only a few labeled examples. A recent regularization technique, Manifold Mixup, focuses on learning a general-purpose representation that is robust to small changes in the data distribution. Since the goal of few-shot learning is closely linked to robust representation learning, we study Manifold Mixup in this problem setting. Self-supervised learning is another technique that learns semantically meaningful features, using only the inherent structure of the data. This work investigates the role of learning a relevant feature manifold for few-shot tasks using self-supervision and regularization techniques. We observe that regularizing the feature manifold, enriched via self-supervised techniques, with Manifold Mixup significantly improves few-shot learning performance. We show that our proposed method S2M2 beats the current state-of-the-art accuracy on standard few-shot learning datasets like CIFAR-FS, CUB, mini-ImageNet and tiered-ImageNet by 3-8%. Through extensive experimentation, we show that the features learned using our approach generalize to complex few-shot evaluation tasks and cross-domain scenarios, and are robust against slight changes to the data distribution.
Submitted 18 January, 2020; v1 submitted 28 July, 2019;
originally announced July 2019.
-
Harnessing the Vulnerability of Latent Layers in Adversarially Trained Models
Authors:
Mayank Singh,
Abhishek Sinha,
Nupur Kumari,
Harshitha Machiraju,
Balaji Krishnamurthy,
Vineeth N Balasubramanian
Abstract:
Neural networks are vulnerable to adversarial attacks -- small, visually imperceptible crafted noise which, when added to the input, drastically changes the output. The most effective method of defending against these adversarial attacks is adversarial training. We analyze adversarially trained robust models to study their vulnerability to adversarial attacks at the level of the latent layers. Our analysis reveals that, in contrast to the input layer, which is robust to adversarial attack, the latent layers of these robust models are highly susceptible to adversarial perturbations of small magnitude. Leveraging this information, we introduce a new technique, Latent Adversarial Training (LAT), which consists of fine-tuning the adversarially trained models to ensure robustness at the feature layers. We also propose Latent Attack (LA), a novel algorithm for the construction of adversarial examples. LAT results in a minor improvement in test accuracy and leads to state-of-the-art adversarial accuracy against the universal first-order adversarial PGD attack, which is shown for the MNIST, CIFAR-10, and CIFAR-100 datasets.
Submitted 25 June, 2019; v1 submitted 13 May, 2019;
originally announced May 2019.
-
Dense Depth Estimation in Monocular Endoscopy with Self-supervised Learning Methods
Authors:
Xingtong Liu,
Ayushi Sinha,
Masaru Ishii,
Gregory D. Hager,
Austin Reiter,
Russell H. Taylor,
Mathias Unberath
Abstract:
We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading. Our method only requires monocular endoscopic videos and a multi-view stereo method, e.g., structure from motion, to supervise learning in a sparse manner. Consequently, our method requires neither manual labeling nor patient computed tomography (CT) scan in the training and application phases. In a cross-patient experiment using CT scans as ground truth, the proposed method achieved submillimeter mean residual error. In a comparison study to recent self-supervised depth estimation methods designed for natural video on in vivo sinus endoscopy data, we demonstrate that the proposed approach outperforms the previous methods by a large margin. The source code for this work is publicly available online at https://github.com/lppllppl920/EndoscopyDepthEstimation-Pytorch.
Submitted 29 October, 2019; v1 submitted 20 February, 2019;
originally announced February 2019.
-
In-silico Risk Analysis of Personalized Artificial Pancreas Controllers via Rare-event Simulation
Authors:
Matthew O'Kelly,
Aman Sinha,
Justin Norden,
Hongseok Namkoong
Abstract:
Modern treatments for Type 1 diabetes (T1D) use devices known as artificial pancreata (APs), which combine an insulin pump with a continuous glucose monitor (CGM) operating in a closed-loop manner to control blood glucose levels. In practice, poor performance of APs (frequent hyper- or hypoglycemic events) is common enough at a population level that many T1D patients modify the algorithms on existing AP systems with unregulated open-source software. Anecdotally, the patients in this group have shown superior outcomes compared with standard of care, yet we do not understand how safe any AP system is since adverse outcomes are rare. In this paper, we construct generative models of individual patients' physiological characteristics and eating behaviors. We then couple these models with a T1D simulator approved for pre-clinical trials by the FDA. Given the ability to simulate patient outcomes in-silico, we utilize techniques from rare-event simulation theory in order to efficiently quantify the performance of a device with respect to a particular patient. We show a 72,000$\times$ speedup in simulation speed over real-time and a 2-10 times increase in the frequency with which we are able to sample adverse conditions relative to standard Monte Carlo sampling. In practice, our toolchain enables estimates of the likelihood of hypoglycemic events with approximately an order of magnitude fewer simulations.
Submitted 1 December, 2018;
originally announced December 2018.
-
Systematic Biases in Link Prediction: comparing heuristic and graph embedding based methods
Authors:
Aakash Sinha,
Rémy Cazabet,
Rémi Vaudaine
Abstract:
Link prediction is a popular research topic in network analysis. In the last few years, new techniques based on graph embedding have emerged as a powerful alternative to heuristics. In this article, we study the problem of systematic biases in the prediction, and show that some methods based on graph embedding offer less biased results than those based on heuristics, despite reaching lower scores according to usual quality scores. We discuss the relevance of this finding in the context of the filter bubble problem and the algorithmic fairness of recommender systems.
Submitted 11 October, 2018;
originally announced November 2018.
-
Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation
Authors:
Matthew O'Kelly,
Aman Sinha,
Hongseok Namkoong,
John Duchi,
Russ Tedrake
Abstract:
While recent developments in autonomous vehicle (AV) technology highlight substantial progress, we lack tools for rigorous and scalable testing. Real-world testing, the $\textit{de facto}$ evaluation environment, places the public in danger, and, due to the rare nature of accidents, will require billions of miles in order to statistically validate performance claims. We implement a simulation framework that can test an entire modern autonomous driving system, including, in particular, systems that employ deep-learning perception and control algorithms. Using adaptive importance-sampling methods to accelerate rare-event probability evaluation, we estimate the probability of an accident under a base distribution governing standard traffic behavior. We demonstrate our framework on a highway scenario, accelerating system evaluation by $2$-$20$ times over naive Monte Carlo sampling methods and $10$-$300 \mathsf{P}$ times (where $\mathsf{P}$ is the number of processors) over real-world testing.
Submitted 12 January, 2019; v1 submitted 31 October, 2018;
originally announced November 2018.
-
Gradient Adversarial Training of Neural Networks
Authors:
Ayan Sinha,
Zhao Chen,
Vijay Badrinarayanan,
Andrew Rabinovich
Abstract:
We propose gradient adversarial training, an auxiliary deep learning framework applicable to different machine learning problems. In gradient adversarial training, we leverage a prior belief that in many contexts, simultaneous gradient updates should be statistically indistinguishable from each other. We enforce this consistency using an auxiliary network that classifies the origin of the gradient tensor, and the main network serves as an adversary to the auxiliary network in addition to performing standard task-based training. We demonstrate gradient adversarial training for three different scenarios: (1) as a defense to adversarial examples, we classify gradient tensors and tune them to be agnostic to the class of their corresponding example; (2) for knowledge distillation, we do binary classification of gradient tensors derived from the student or teacher network and tune the student gradient tensor to mimic the teacher's gradient tensor; and (3) for multi-task learning, we classify the gradient tensors derived from different task loss functions and tune them to be statistically indistinguishable. For each of the three scenarios we show the potential of the gradient adversarial training procedure. Specifically, gradient adversarial training increases the robustness of a network to adversarial attacks, is able to better distill the knowledge from a teacher network to a student network compared to soft targets, and boosts multi-task learning by aligning the gradient tensors derived from the task-specific loss functions. Overall, our experiments demonstrate that gradient tensors contain latent information about whatever tasks are being trained, and can support diverse machine learning problems when intelligently guided through adversarialization using an auxiliary network.
Submitted 20 June, 2018;
originally announced June 2018.
-
Neural Networks in Adversarial Setting and Ill-Conditioned Weight Space
Authors:
Mayank Singh,
Abhishek Sinha,
Balaji Krishnamurthy
Abstract:
Recently, neural networks have seen a huge surge in adoption due to their ability to provide high accuracy on various tasks. On the other hand, the existence of adversarial examples has raised suspicions regarding the generalization capabilities of neural networks. In this work, we focus on the weight matrix learnt by a neural network and hypothesize that an ill-conditioned weight matrix is one of the contributing factors in a neural network's susceptibility to adversarial examples. To ensure that the condition number of the learnt weight matrix remains sufficiently low, we suggest using an orthogonal regularizer. We show that this indeed helps in increasing the adversarial accuracy on the MNIST and F-MNIST datasets.
Submitted 3 January, 2018;
originally announced January 2018.
-
Intelligent Fault Analysis in Electrical Power Grids
Authors:
Biswarup Bhattacharya,
Abhishek Sinha
Abstract:
Power grids are one of the most important components of infrastructure in today's world. Every nation is dependent on the security and stability of its own power grid to provide electricity to households and industries. A malfunction of even a small part of a power grid can cause loss of productivity, revenue and, in some cases, even life. Thus, it is imperative to design a system which can detect the health of the power grid and take protective measures accordingly, even before a serious anomaly takes place. To achieve this objective, we have set out to create an artificially intelligent system which can analyze the grid information at any given time and determine the health of the grid through the usage of sophisticated formal models and novel machine learning techniques like recurrent neural networks. Our system simulates grid conditions, including stimuli like faults, generator output fluctuations and load fluctuations, using Siemens PSS/E software; various classifiers such as SVM and LSTM are then trained on this data and subsequently tested. The results are excellent, with our methods giving very high accuracy on the data. This model can easily be scaled to handle larger and more complex grid architectures.
Submitted 8 November, 2017;
originally announced November 2017.
-
Deep Fault Analysis and Subset Selection in Solar Power Grids
Authors:
Biswarup Bhattacharya,
Abhishek Sinha
Abstract:
Non-availability of reliable and sustainable electric power is a major problem in the developing world. Renewable energy sources like solar are not very lucrative at the current stage due to various uncertainties such as weather, storage and land use, among others. There also exist various other issues like mis-commitment of power, absence of intelligent fault analysis, congestion, etc. In this paper, we propose a novel deep learning-based system for predicting faults and selecting power generators optimally so as to reduce costs and ensure higher reliability in solar power systems. The results are highly encouraging, and they suggest that the approaches proposed in this paper have the potential to be applied successfully in the developing world.
Submitted 7 November, 2017;
originally announced November 2017.
-
Certifying Some Distributional Robustness with Principled Adversarial Training
Authors:
Aman Sinha,
Hongseok Namkoong,
Riccardo Volpi,
John Duchi
Abstract:
Neural networks are vulnerable to adversarial examples and researchers have proposed many heuristic attack and defense mechanisms. We address this problem through the principled lens of distributionally robust optimization, which guarantees performance under adversarial input perturbations. By considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasserstein ball, we provide a training procedure that augments model parameter updates with worst-case perturbations of training data. For smooth losses, our procedure provably achieves moderate levels of robustness with little computational or statistical cost relative to empirical risk minimization. Furthermore, our statistical guarantees allow us to efficiently certify robustness for the population loss. For imperceptible perturbations, our method matches or outperforms heuristic approaches.
Submitted 1 May, 2020; v1 submitted 29 October, 2017;
originally announced October 2017.
-
A Multi-objective Exploratory Procedure for Regression Model Selection
Authors:
Ankur Sinha,
Pekka Malo,
Timo Kuosmanen
Abstract:
Variable selection is recognized as one of the most critical steps in statistical modeling. The problems encountered in engineering and social sciences are commonly characterized by an over-abundance of explanatory variables, non-linearities and unknown interdependencies between the regressors. An added difficulty is that the analysts may have little or no prior knowledge of the relative importance of the variables. To provide a robust method for model selection, this paper introduces the Multi-objective Genetic Algorithm for Variable Selection (MOGA-VS), which provides the user with an optimal set of regression models for a given data-set. The algorithm treats the regression problem as a two-objective task and explores the Pareto-optimal (best subset) models by preferring models that have fewer regression coefficients and better goodness of fit. The model exploration can be performed based on in-sample or generalization error minimization. The model selection is proposed to be performed in two steps. First, we generate the frontier of Pareto-optimal regression models by eliminating the dominated models without any user intervention. Second, a decision-making process is executed which allows the user to choose the most preferred model using visualisations and simple metrics. The method has been evaluated on a recently published real dataset on Communities and Crime within the United States.
Submitted 13 July, 2016; v1 submitted 28 March, 2012;
originally announced March 2012.