-
Facilitating Matches on Allocation Platforms
Authors:
Yohai Trabelsi,
Abhijin Adiga,
Yonatan Aumann,
Sarit Kraus,
S. S. Ravi
Abstract:
We consider a setting where goods are allocated to agents by way of an allocation platform (e.g., a matching platform). An "allocation facilitator" aims to increase the overall utility/social-good of the allocation by encouraging (some of the) agents to relax (some of) their restrictions. At the same time, the advice must not hurt agents who would otherwise be better off. Additionally, the facilitator may be constrained by a "bound" (a.k.a. "budget"), limiting the number and/or type of restrictions it may seek to relax. We consider the facilitator's optimization problem of choosing an optimal set of restrictions to request to relax under the aforementioned constraints. Our contributions are three-fold: (i) We provide a formal definition of the problem, including the participation guarantees to which the facilitator should adhere. We define a hierarchy of participation guarantees and also consider several social-good functions. (ii) We provide polynomial algorithms for solving various versions of the associated optimization problems, including one-to-one and many-to-one allocation settings. (iii) We demonstrate the benefits of such facilitation and relaxation, and the implications of the different participation guarantees, using extensive experimentation on three real-world datasets.
Submitted 24 August, 2025;
originally announced August 2025.
-
IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation
Authors:
Oishee Bintey Hoque,
Abhijin Adiga,
Aniruddha Adiga,
Siddharth Chaudhary,
Madhav V. Marathe,
S. S. Ravi,
Kirti Rajagopalan,
Amanda Wilson,
Samarth Swarup
Abstract:
Accurate canal network mapping is essential for water management, including irrigation planning and infrastructure maintenance. State-of-the-art semantic segmentation models for infrastructure mapping, such as roads, rely on large, well-annotated remote sensing datasets. However, incomplete or inadequate ground truth can hinder these learning approaches. Many infrastructure networks have graph-level properties, such as reachability to a source (canals) or connectivity (roads), that can be leveraged to improve the existing ground truth. This paper develops a novel iterative framework, IGraSS, combining a semantic segmentation module, which incorporates RGB and additional modalities (NDWI, DEM), with a graph-based ground-truth refinement module. The segmentation module processes satellite imagery patches, while the refinement module operates on the entire dataset, viewing the infrastructure network as a graph. Experiments show that IGraSS reduces unreachable canal segments from around 18% to 3%, and training with refined ground truth significantly improves canal identification. IGraSS serves as a robust framework for both refining noisy ground truth and mapping canal networks from remote sensing imagery. We also demonstrate the effectiveness and generalizability of IGraSS using road networks as an example, applying a different graph-theoretic constraint to complete road networks.
Submitted 10 June, 2025; v1 submitted 9 June, 2025;
originally announced June 2025.
-
Knowledge-Informed Deep Learning for Irrigation Type Mapping from Remote Sensing
Authors:
Oishee Bintey Hoque,
Nibir Chandra Mandal,
Abhijin Adiga,
Samarth Swarup,
Sayjro Kossi Nouwakpo,
Amanda Wilson,
Madhav Marathe
Abstract:
Accurate mapping of irrigation methods is crucial for sustainable agricultural practices and food systems. However, existing models that rely solely on spectral features from satellite imagery are ineffective due to the complexity of agricultural landscapes and limited training data, making this a challenging problem. We present Knowledge-Informed Irrigation Mapping (KIIM), a novel Swin-Transformer-based approach that uses (i) a specialized projection matrix to encode crop-to-irrigation probability, (ii) a spatial attention map to distinguish agricultural from non-agricultural lands, (iii) bi-directional cross-attention to focus complementary information from different modalities, and (iv) a weighted ensemble for combining predictions from images and crop information. Our experimentation on five states in the US shows up to a 22.9% (IoU) improvement over the baseline, with a 71.4% (IoU) improvement for hard-to-classify drip irrigation. In addition, we propose a two-phase transfer learning approach to enhance cross-state irrigation mapping, achieving a 51% IoU boost in a state with limited labeled data. The ability to achieve baseline performance with only 40% of the training data highlights its efficiency, reducing the dependency on extensive manual labeling efforts and making large-scale, automated irrigation mapping more feasible and cost-effective.
Submitted 5 June, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
-
IrrMap: A Large-Scale Comprehensive Dataset for Irrigation Method Mapping
Authors:
Nibir Chandra Mandal,
Oishee Bintey Hoque,
Abhijin Adiga,
Samarth Swarup,
Mandy Wilson,
Lu Feng,
Yangfeng Ji,
Miaomiao Zhang,
Geoffrey Fox,
Madhav Marathe
Abstract:
We introduce IrrMap, the first large-scale dataset (1.1 million patches) for irrigation method mapping across regions. IrrMap consists of multi-resolution satellite imagery from Landsat and Sentinel, along with key auxiliary data such as crop type, land use, and vegetation indices. The dataset spans 1,687,899 farms and 14,117,330 acres across multiple western U.S. states from 2013 to 2023, providing a rich and diverse foundation for irrigation analysis and ensuring geospatial alignment and quality control. The dataset is ML-ready, with standardized 224x224 GeoTIFF patches, multiple input modalities, carefully chosen train-test-split data, and accompanying dataloaders for seamless deep learning model training and benchmarking in irrigation mapping. The dataset is also accompanied by a complete pipeline for dataset generation, enabling researchers to extend IrrMap to new regions for irrigation data collection or adapt it with minimal effort for other similar applications in agricultural and geospatial analysis. We also analyze the irrigation method distribution across crop groups, spatial irrigation patterns (using Shannon diversity indices), and irrigated area variations for both Landsat and Sentinel, providing insights into regional and resolution-based differences. To promote further exploration, we openly release IrrMap, along with the derived datasets, benchmark models, and pipeline code, through a GitHub repository (https://github.com/Nibir088/IrrMap) and a data repository (https://huggingface.co/Nibir/IrrMap), providing comprehensive documentation and implementation details.
Submitted 31 May, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
-
A Unifying Information-theoretic Perspective on Evaluating Generative Models
Authors:
Alexis Fox,
Samarth Swarup,
Abhijin Adiga
Abstract:
Considering the difficulty of interpreting generative model output, there is significant current research focused on determining meaningful evaluation metrics. Several recent approaches utilize "precision" and "recall," borrowed from the classification domain, to individually quantify the output fidelity (realism) and output diversity (representation of the real data variation), respectively. With the increase in metric proposals, there is a need for a unifying perspective, allowing for easier comparison and clearer explanation of their benefits and drawbacks. To this end, we unify a class of kth-nearest-neighbors (kNN)-based metrics under an information-theoretic lens using approaches from kNN density estimation. Additionally, we propose a tri-dimensional metric composed of Precision Cross-Entropy (PCE), Recall Cross-Entropy (RCE), and Recall Entropy (RE), which separately measure fidelity and two distinct aspects of diversity, inter- and intra-class. Our domain-agnostic metric, derived from the information-theoretic concepts of entropy and cross-entropy, can be dissected for both sample- and mode-level analysis. Our detailed experimental results demonstrate the sensitivity of our metric components to their respective qualities and reveal undesirable behaviors of other metrics.
Submitted 27 February, 2025; v1 submitted 18 December, 2024;
originally announced December 2024.
-
A High-Resolution, US-scale Digital Similar of Interacting Livestock, Wild Birds, and Human Ecosystems with Applications to Multi-host Epidemic Spread
Authors:
Abhijin Adiga,
Ayush Chopra,
Mandy L. Wilson,
S. S. Ravi,
Dawen Xie,
Samarth Swarup,
Bryan Lewis,
John Barnes,
Ramesh Raskar,
Madhav V. Marathe
Abstract:
One Health issues, such as the spread of highly pathogenic avian influenza (HPAI), present significant challenges at the human-animal-environmental interface. Recent H5N1 outbreaks underscore the need for comprehensive modeling efforts that capture the complex interactions between various entities in these interconnected ecosystems. To support such efforts, we develop a methodology to construct a synthetic spatiotemporal gridded dataset of livestock production and processing, human population, and wild birds for the contiguous United States, called a "digital similar". This representation is a result of fusing diverse datasets using statistical and optimization techniques, followed by extensive verification and validation. The livestock component includes farm-level representations of four major livestock types (cattle, poultry, swine, and sheep), with further categorization into subtypes such as dairy cows, beef cows, chickens, turkeys, and ducks. Weekly abundance data for wild bird species identified in the transmission of avian influenza are included. Gridded distributions of the human population, along with demographic and occupational features, capture the placement of agricultural workers and the general population. We demonstrate how the digital similar can be applied to evaluate spillover risk to dairy cows and poultry from wild bird populations, then validate these results using historical H5N1 incidences. The resulting subtype-specific spatiotemporal risk maps identify hotspots of high risk from H5N1-infected wild bird populations to dairy cattle and poultry operations, thus guiding surveillance efforts.
Submitted 7 March, 2025; v1 submitted 2 November, 2024;
originally announced November 2024.
-
Efficient PAC Learnability of Dynamical Systems Over Multilayer Networks
Authors:
Zirou Qiu,
Abhijin Adiga,
Madhav V. Marathe,
S. S. Ravi,
Daniel J. Rosenkrantz,
Richard E. Stearns,
Anil Vullikanti
Abstract:
Networked dynamical systems are widely used as formal models of real-world cascading phenomena, such as the spread of diseases and information. Prior research has addressed the problem of learning the behavior of an unknown dynamical system when the underlying network has a single layer. In this work, we study the learnability of dynamical systems over multilayer networks, which are more realistic and challenging. First, we present an efficient PAC learning algorithm with provable guarantees to show that the learner only requires a small number of training examples to infer an unknown system. We further provide a tight analysis of the Natarajan dimension, which measures the model complexity. Asymptotically, our bound on the Natarajan dimension is tight for almost all multilayer graphs. The techniques and insights from our work provide the theoretical foundations for future investigations of learning problems for multilayer dynamical systems.
Submitted 28 July, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery
Authors:
Oishee Bintey Hoque,
Samarth Swarup,
Abhijin Adiga,
Sayjro Kossi Nouwakpo,
Madhav Marathe
Abstract:
Irrigation mapping plays a crucial role in effective water management, essential for preserving both water quality and quantity, and is key to mitigating the global issue of water scarcity. The complexity of agricultural fields with diverse irrigation practices, especially when multiple systems coexist in close quarters, poses a unique challenge. This complexity is further compounded by the nature of Landsat's remote sensing data, where each pixel is rich with densely packed information, complicating the task of accurate irrigation mapping. In this study, we introduce an innovative approach that employs a progressive training method, which strategically increases patch sizes throughout the training process, utilizing imagery from Landsat 5 and 7 labeled with the WRLU dataset. The initial focus on small patches allows the model to capture detailed features, progressively shifting to broader, more general features as the patch size enlarges. Remarkably, our method enhances the performance of existing state-of-the-art models by approximately 20%. Furthermore, our analysis delves into the significance of incorporating various spectral bands into the model, assessing their impact on performance. The findings reveal that additional bands are instrumental in enabling the model to discern finer details more effectively. This work sets a new standard for leveraging remote sensing imagery in irrigation mapping.
Submitted 17 April, 2024;
originally announced April 2024.
-
Learning the Topology and Behavior of Discrete Dynamical Systems
Authors:
Zirou Qiu,
Abhijin Adiga,
Madhav V. Marathe,
S. S. Ravi,
Daniel J. Rosenkrantz,
Richard E. Stearns,
Anil Vullikanti
Abstract:
Discrete dynamical systems are commonly used to model the spread of contagions on real-world networks. Under the PAC framework, existing research has studied the problem of learning the behavior of a system, assuming that the underlying network is known. In this work, we focus on a more challenging setting: to learn both the behavior and the underlying topology of a black-box system. We show that, in general, this learning problem is computationally intractable. On the positive side, we present efficient learning methods under the PAC model when the underlying graph of the dynamical system belongs to certain classes. Further, we examine a relaxed setting where the topology of an unknown system is partially observed. For this case, we develop an efficient PAC learner to infer the system and establish the sample complexity. Lastly, we present a formal analysis of the expressive power of the hypothesis class of dynamical systems where both the topology and behavior are unknown, using the well-known formalism of the Natarajan dimension. Our results provide a theoretical foundation for learning both the behavior and topology of discrete dynamical systems.
Submitted 29 March, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
-
Value-based Resource Matching with Fairness Criteria: Application to Agricultural Water Trading
Authors:
Abhijin Adiga,
Yohai Trabelsi,
Tanvir Ferdousi,
Madhav Marathe,
S. S. Ravi,
Samarth Swarup,
Anil Kumar Vullikanti,
Mandy L. Wilson,
Sarit Kraus,
Reetwika Basu,
Supriya Savalkar,
Matthew Yourek,
Michael Brady,
Kirti Rajagopalan,
Jonathan Yoder
Abstract:
Optimal allocation of agricultural water in the event of droughts is an important global problem. In addressing this problem, many aspects, including the welfare of farmers, the economy, and the environment, must be considered. Against this backdrop, our work focuses on several resource-matching problems accounting for agents with multi-crop portfolios, geographic constraints, and fairness. First, we address a matching problem where the goal is to maximize a welfare function in two-sided markets where buyers' requirements and sellers' supplies are represented by value functions that assign prices (or costs) to specified volumes of water. For the setting where the value functions satisfy certain monotonicity properties, we present an efficient algorithm that maximizes a social welfare function. When there are minimum water requirement constraints, we present a randomized algorithm which ensures that the constraints are satisfied in expectation. For a single-seller, multiple-buyer setting with fairness constraints, we design an efficient algorithm that maximizes the minimum level of satisfaction of any buyer. We also present computational complexity results that highlight the limits on the generalizability of our results. We evaluate the algorithms developed in our work with experiments on both real-world and synthetic data sets with respect to drought severity, value functions, and seniority of agents.
Submitted 11 February, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Resource Sharing Through Multi-Round Matchings
Authors:
Yohai Trabelsi,
Abhijin Adiga,
Sarit Kraus,
S. S. Ravi,
Daniel J. Rosenkrantz
Abstract:
Applications such as employees sharing office spaces over a workweek can be modeled as problems where agents are matched to resources over multiple rounds. Agents' requirements limit the set of compatible resources and the rounds in which they want to be matched. Viewing such an application as a multi-round matching problem on a bipartite compatibility graph between agents and resources, we show that a solution (i.e., a set of matchings, with one matching per round) can be found efficiently if one exists. To cope with situations where a solution does not exist, we consider two extensions. In the first extension, a benefit function is defined for each agent and the objective is to find a multi-round matching to maximize the total benefit. For a general class of benefit functions satisfying certain properties (including diminishing returns), we show that this multi-round matching problem is efficiently solvable. This class includes utilitarian and Rawlsian welfare functions. For another benefit function, we show that the maximization problem is NP-hard. In the second extension, the objective is to generate advice to each agent (i.e., a subset of requirements to be relaxed) subject to a budget constraint so that the agent can be matched. We show that this budget-constrained advice generation problem is NP-hard. For this problem, we develop an integer linear programming formulation as well as a heuristic based on local search. We experimentally evaluate our algorithms on synthetic networks and apply them to two real-world situations: shared office spaces and matching courses to classrooms.
Submitted 30 November, 2022;
originally announced November 2022.
-
Resource Allocation to Agents with Restrictions: Maximizing Likelihood with Minimum Compromise
Authors:
Yohai Trabelsi,
Abhijin Adiga,
Sarit Kraus,
S. S. Ravi
Abstract:
Many scenarios where agents with restrictions compete for resources can be cast as maximum matching problems on bipartite graphs. Our focus is on resource allocation problems where agents may have restrictions that make them incompatible with some resources. We assume that a Principal chooses a maximum matching randomly so that each agent is matched to a resource with some probability. Agents would like to improve their chances of being matched by modifying their restrictions within certain limits. The Principal's goal is to advise an unsatisfied agent to relax its restrictions so that the total cost of relaxation is within a budget (chosen by the agent) and the increase in the probability of being assigned a resource is maximized. We establish hardness results for some variants of this budget-constrained maximization problem and present algorithmic results for other variants. We experimentally evaluate our methods on synthetic datasets as well as on two novel real-world datasets: a vacation activities dataset and a classrooms dataset.
Submitted 12 September, 2022;
originally announced September 2022.
-
Perturbative methods for mostly monotonic probabilistic satisfiability problems
Authors:
Stephen Eubank,
Madhurima Nath,
Yihui Ren,
Abhijin Adiga
Abstract:
The probabilistic satisfiability of a logical expression is a fundamental concept known as the partition function in statistical physics and field theory, an evaluation of a related graph's Tutte polynomial in mathematics, and the Moore-Shannon network reliability of that graph in engineering. It is the crucial element for decision-making under uncertainty. Not surprisingly, it is provably hard to compute exactly or even to approximate. Many of these applications are concerned only with a subset of problems for which the solutions are monotonic functions. Here we extend the weak- and strong-coupling methods of statistical physics to heterogeneous satisfiability problems and introduce a novel approach to constructing lower and upper bounds on the approximation error for monotonic problems. These bounds combine information from both perturbative analyses to produce bounds that are tight in the sense that they are saturated by some problem instance that is compatible with all the information contained in either approximation.
Submitted 7 June, 2022;
originally announced June 2022.
-
Cohorting to isolate asymptomatic spreaders: An agent-based simulation study on the Mumbai Suburban Railway
Authors:
Alok Talekar,
Sharad Shriram,
Nidhin Vaidhiyan,
Gaurav Aggarwal,
Jiangzhuo Chen,
Srini Venkatramanan,
Lijing Wang,
Aniruddha Adiga,
Adam Sadilek,
Ashish Tendulkar,
Madhav Marathe,
Rajesh Sundaresan,
Milind Tambe
Abstract:
The Mumbai Suburban Railways, known as locals, are a key transit infrastructure of the city and are crucial for resuming normal economic activity. To reduce disease transmission, policymakers can enforce reduced crowding and mandate wearing of masks. Cohorting, i.e., forming groups of travelers that always travel together, is an additional policy to reduce disease transmission on locals without severe restrictions. Cohorting allows us to: (i) form traveler bubbles, thereby decreasing the number of distinct interactions over time; (ii) potentially quarantine an entire cohort if a single case is detected, making contact tracing more efficient; and (iii) target cohorts for testing and early detection of symptomatic as well as asymptomatic cases. Studying the impact of cohorts using compartmental models is challenging because of the ensuing representational complexity. Agent-based models provide a natural way to represent cohorts along with the representation of the cohort members with the larger social network. This paper describes a novel multi-scale agent-based model to study the impact of cohorting strategies on COVID-19 dynamics in Mumbai. We achieve this by modeling the Mumbai urban region using a detailed agent-based model comprising 12.4 million agents. Individual cohorts and their inter-cohort interactions as they travel on locals are modeled using local mean field approximations. The resulting multi-scale model, in conjunction with a detailed disease transmission and intervention simulator, is used to assess various cohorting strategies. The results provide a quantitative trade-off between cohort size and its impact on disease dynamics and well-being. The results show that cohorts can provide significant benefit in terms of reduced transmission without significantly impacting ridership and/or economic and social activity.
Submitted 24 December, 2020; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Examining Deep Learning Models with Multiple Data Sources for COVID-19 Forecasting
Authors:
Lijing Wang,
Aniruddha Adiga,
Srinivasan Venkatramanan,
Jiangzhuo Chen,
Bryan Lewis,
Madhav Marathe
Abstract:
The COVID-19 pandemic represents the most significant public health disaster since the 1918 influenza pandemic. During pandemics such as COVID-19, timely and reliable spatio-temporal forecasting of epidemic dynamics is crucial. Deep learning-based time series models for forecasting have recently gained popularity and have been successfully used for epidemic forecasting. Here we focus on the design and analysis of deep learning-based models for COVID-19 forecasting. We implement multiple recurrent neural network-based deep learning models and combine them using the stacking ensemble technique. In order to incorporate the effects of multiple factors in COVID-19 spread, we consider multiple sources, such as COVID-19 confirmed and death case count data and testing data, for better predictions. To overcome the sparsity of training data and to address the dynamic correlation of the disease, we propose clustering-based training for high-resolution forecasting. The methods help us identify similar trends across certain groups of regions due to various spatio-temporal effects. We examine the proposed method for forecasting weekly COVID-19 new confirmed cases at the county, state, and country levels. A comprehensive comparison between different time series models in the COVID-19 context is conducted and analyzed. The results show that simple deep learning models can achieve comparable or better performance when compared with more complicated models. We are currently integrating our methods as a part of the weekly forecasts that we provide to state and federal authorities.
Submitted 23 November, 2020; v1 submitted 27 October, 2020;
originally announced October 2020.
-
Models for COVID-19 Pandemic: A Comparative Analysis
Authors:
Aniruddha Adiga,
Devdatt Dubhashi,
Bryan Lewis,
Madhav Marathe,
Srinivasan Venkatramanan,
Anil Vullikanti
Abstract:
The COVID-19 pandemic represents a global health crisis unprecedented in the last 100 years. Its economic, social and health impact continues to grow and is likely to end up as one of the worst global disasters since the 1918 pandemic and the World Wars. Mathematical models have played an important role in the ongoing crisis; they have been used to inform public policies and have been instrumental in many of the social distancing measures that were instituted worldwide.
In this article we review some of the important mathematical models used to support the ongoing planning and response efforts. These models differ in their use, their mathematical form and their scope.
Submitted 21 September, 2020;
originally announced September 2020.
-
Learning Everywhere: Pervasive Machine Learning for Effective High-Performance Computation
Authors:
Geoffrey Fox,
James A. Glazier,
JCS Kadupitiya,
Vikram Jadhao,
Minje Kim,
Judy Qiu,
James P. Sluka,
Endre Somogyi,
Madhav Marathe,
Abhijin Adiga,
Jiangzhuo Chen,
Oliver Beckstein,
Shantenu Jha
Abstract:
The convergence of HPC and data-intensive methodologies provides a promising approach to major performance improvements. This paper provides a general description of the interaction between traditional HPC and ML approaches and motivates the Learning Everywhere paradigm for HPC. We introduce the concept of effective performance that one can achieve by combining learning methodologies with simulation-based approaches, and distinguish it from traditional performance as measured by benchmark scores. To support the promise of integrating HPC and learning methods, this paper examines specific examples and opportunities across a series of domains. It concludes with a series of open computer science and cyberinfrastructure questions and challenges that the Learning Everywhere paradigm presents.
Submitted 27 February, 2019;
originally announced February 2019.
-
A Non-Convex Optimization Technique for Sparse Blind Deconvolution -- Initialization Aspects and Error Reduction Properties
Authors:
Aniruddha Adiga,
Chandra Sekhar Seelamantula
Abstract:
Sparse blind deconvolution is the problem of estimating the blur kernel and sparse excitation, both of which are unknown. Considering a linear convolution model, as opposed to the standard circular convolution model, we derive a sufficient condition for stable deconvolution. The columns of the linear convolution matrix form a Riesz basis with the tightness of the Riesz bounds determined by the autocorrelation of the blur kernel. Employing a Bayesian framework results in a non-convex, non-smooth cost function consisting of an $\ell_2$ data-fidelity term and a sparsity promoting $\ell_p$-norm ($0 \le p \le 1$) regularizer. Since the $\ell_p$-norm is not differentiable at the origin, we employ an $ε$-regularized $\ell_p$-norm as a surrogate. The data term is also non-convex in both the blur kernel and excitation. An iterative scheme termed alternating minimization (Alt. Min.) $\ell_p-\ell_2$ projections algorithm (ALPA) is developed for optimization of the $ε$-regularized cost function. Further, we demonstrate that, in every iteration, the $ε$-regularized cost function is non-increasing and more importantly, bounds the original $\ell_p$-norm-based cost. Due to non-convexity of the cost, the accuracy of estimation is largely influenced by the initialization. Considering regularized least-squares estimate as the initialization, we analyze how the initialization errors are concentrated, first in Gaussian noise, and then in bounded noise, the latter case resulting in tighter bounds. Comparisons with state-of-the-art blind deconvolution algorithms show that the deconvolution accuracy is higher in case of ALPA. In the context of natural speech signals, ALPA results in accurate deconvolution of a voiced speech segment into a sparse excitation and smooth vocal tract response.
Submitted 11 October, 2017; v1 submitted 24 August, 2017;
originally announced August 2017.
-
Sublinear Approximation Algorithms for Boxicity and Related Problems
Authors:
Abhijin Adiga,
Jasine Babu,
L. Sunil Chandran
Abstract:
Boxicity of a graph G(V, E) is the minimum integer k such that G can be represented as the intersection graph of axis parallel boxes in $\mathbb{R}^k$. Cubicity is a variant of boxicity, where the axis parallel boxes in the intersection representation are restricted to be of unit length sides. Deciding whether boxicity (resp. cubicity) of a graph is at most k is NP-hard, even for k=2 or 3. Computing these parameters is inapproximable within $O(n^{1 - ε})$-factor, for any $ε>0$ in polynomial time unless NP=ZPP, even for many simple graph classes.
In this paper, we give a polynomial time $κ(n)$ factor approximation algorithm for computing boxicity and a $κ(n)\lceil \log \log n\rceil$ factor approximation algorithm for computing the cubicity, where $κ(n) =2\left\lceil\frac{n\sqrt{\log \log n}}{\sqrt{\log n}}\right\rceil$. These o(n) factor approximation algorithms also produce the corresponding box (resp. cube) representations. As a special case, this resolves the question posed by Spinrad about polynomial time construction of o(n) dimensional box representations for boxicity 2 graphs. Other consequences of our approximation algorithm include $O(κ(n))$ factor approximation algorithms for computing the following parameters: the partial order dimension of finite posets, the interval dimension of finite posets, minimum chain cover of bipartite graphs, threshold dimension of split graphs and Ferrers dimension of digraphs. Each of these parameters is inapproximable within an $O(n^{1 - ε})$-factor, for any $ε>0$ in polynomial time unless NP=ZPP and the algorithms we derive seem to be the first o(n) factor approximation algorithms known for all these problems.
Submitted 7 June, 2015; v1 submitted 19 May, 2015;
originally announced May 2015.
-
Approximation Algorithms for Reducing the Spectral Radius to Control Epidemic Spread
Authors:
Sudip Saha,
Abhijin Adiga,
B. Aditya Prakash,
Anil Kumar S. Vullikanti
Abstract:
The largest eigenvalue of the adjacency matrix of a network (referred to as the spectral radius) is an important metric in its own right. Further, for several models of epidemic spread on networks (e.g., the `flu-like' SIS model), it has been shown that an epidemic dies out quickly if the spectral radius of the graph is below a certain threshold that depends on the model parameters. This motivates a strategy to control epidemic spread by reducing the spectral radius of the underlying network.
In this paper, we develop a suite of provable approximation algorithms for reducing the spectral radius by removing the minimum cost set of edges (modeling quarantining) or nodes (modeling vaccinations), with different time and quality tradeoffs. Our main algorithm, \textsc{GreedyWalk}, is based on the idea of hitting closed walks of a given length, and gives an $O(\log^2{n})$-approximation, where $n$ denotes the number of nodes; it also performs much better in practice compared to all prior heuristics proposed for this problem. We further present a novel sparsification method to improve its running time.
In addition, we give a new primal-dual based algorithm with an even better approximation guarantee ($O(\log n)$), albeit with a slower running time. We also give lower bounds on the worst-case performance of some of the popular heuristics. Finally, we demonstrate the applicability of our algorithms and the properties of our solutions via extensive experiments on multiple synthetic and real networks.
Submitted 26 January, 2015;
originally announced January 2015.
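The edge-removal problem described in this abstract can be illustrated with a small sketch. This is not the paper's GreedyWalk or primal-dual algorithm and carries no approximation guarantee; it is a naive brute-force greedy that assumes unit edge costs and recomputes the spectral radius after each candidate removal. The function names `spectral_radius` and `greedy_edge_removal` are hypothetical, and the power iteration is run on the shifted matrix A + I so that it also converges on bipartite graphs.

```python
import math

def spectral_radius(n, edges, iters=1000):
    """Estimate the largest adjacency eigenvalue of an undirected graph."""
    if not edges:
        return 0.0
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    x = [1.0] * n
    for _ in range(iters):
        # Multiply by (A + I): the shift avoids oscillation on bipartite graphs.
        y = [x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        norm = math.sqrt(sum(val * val for val in y))
        x = [val / norm for val in y]
    # Rayleigh quotient x^T A x for the unit vector x.
    return sum(x[i] * sum(x[j] for j in adj[i]) for i in range(n))

def greedy_edge_removal(n, edges, threshold):
    """Naive heuristic (assumption: unit costs): repeatedly delete the edge
    whose removal lowers the spectral radius the most, until the spectral
    radius falls below the threshold."""
    remaining = list(edges)
    removed = []
    while remaining and spectral_radius(n, remaining) >= threshold:
        best = min(
            remaining,
            key=lambda e: spectral_radius(n, [f for f in remaining if f != e]),
        )
        remaining.remove(best)
        removed.append(best)
    return removed, remaining

triangle = [(0, 1), (1, 2), (0, 2)]
removed, remaining = greedy_edge_removal(3, triangle, threshold=1.5)
```

On the triangle, whose spectral radius is 2, one removal leaves a path on three vertices with spectral radius $\sqrt{2}$, which is below the threshold of 1.5. Each greedy step costs one spectral-radius computation per remaining edge, so the sketch is only practical for small graphs.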
-
Parameterized and Approximation Algorithms for Boxicity
Authors:
Abhijin Adiga,
Jasine Babu,
L. Sunil Chandran
Abstract:
Boxicity of a graph $G(V, E)$, denoted by $box(G)$, is the minimum integer $k$ such that $G$ can be represented as the intersection graph of axis parallel boxes in $\mathbb{R}^k$. The problem of computing boxicity is inapproximable even for graph classes like bipartite, co-bipartite and split graphs within $O(n^{1 - ε})$-factor, for any $ε>0$ in polynomial time unless $NP=ZPP$. We give FPT approximation algorithms for computing the boxicity of graphs, where the parameter used is the vertex or edge edit distance of the given graph from families of graphs of bounded boxicity. This can be seen as a generalization of the parameterizations discussed in \cite{Adiga2}.
Extending the same idea in one of our algorithms, we also get an $O\left(\frac{n\sqrt{\log \log n}}{\sqrt{\log n}}\right)$ factor approximation algorithm for computing boxicity and an $O\left(\frac{n {(\log \log n)}^{\frac{3}{2}}}{\sqrt{\log n}}\right)$ factor approximation algorithm for computing the cubicity. These seem to be the first $o(n)$ factor approximation algorithms known for both boxicity and cubicity. As a consequence of this result, an $o(n)$ factor approximation algorithm for computing the partial order dimension of finite posets and an $o(n)$ factor approximation algorithm for computing the threshold dimension of split graphs would follow.
Submitted 5 March, 2014; v1 submitted 28 January, 2012;
originally announced January 2012.
-
Cubicity, Degeneracy, and Crossing Number
Authors:
Abhijin Adiga,
L. Sunil Chandran,
Rogers Mathew
Abstract:
A $k$-box $B=(R_1,...,R_k)$, where each $R_i$ is a closed interval on the real line, is defined to be the Cartesian product $R_1\times R_2\times ...\times R_k$. If each $R_i$ is a unit length interval, we call $B$ a $k$-cube. Boxicity of a graph $G$, denoted as $\operatorname{box}(G)$, is the minimum integer $k$ such that $G$ is an intersection graph of $k$-boxes. Similarly, the cubicity of $G$, denoted as $\operatorname{cub}(G)$, is the minimum integer $k$ such that $G$ is an intersection graph of $k$-cubes.
It was shown in [L. Sunil Chandran, Mathew C. Francis, and Naveen Sivadasan: Representing graphs as the intersection of axis-parallel cubes. MCDES-2008, IISc Centenary Conference, available at CoRR, abs/cs/0607092, 2006.] that, for a graph $G$ with maximum degree $Δ$, $\operatorname{cub}(G)\leq \lceil 4(Δ+1)\log n\rceil$. In this paper, we show that, for a $k$-degenerate graph $G$, $\operatorname{cub}(G) \leq (k+2) \lceil 2e \log n \rceil$. Since $k$ is at most $Δ$ and can be much lower, this clearly is a stronger result. This bound is tight. We also give an efficient deterministic algorithm that runs in $O(n^2k)$ time to output an $8k(\lceil 2.42 \log n\rceil + 1)$-dimensional cube representation for $G$.
An important consequence of the above result is that if the crossing number of a graph $G$ is $t$, then $\operatorname{box}(G)$ is $O(t^{1/4}{\lceil\log t\rceil}^{3/4})$. This bound is tight up to a factor of $O((\log t)^{1/4})$. We also show that, if $G$ has $n$ vertices, then $\operatorname{cub}(G)$ is $O(\log n + t^{1/4}\log t)$.
Using our bound for the cubicity of $k$-degenerate graphs, we show that the cubicity of almost all graphs in the $\mathcal{G}(n,m)$ model is $O(d_{av}\log n)$, where $d_{av}$ denotes the average degree of the graph under consideration.
Submitted 30 January, 2012; v1 submitted 26 May, 2011;
originally announced May 2011.
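The abstract's remark that the degeneracy $k$ is at most the maximum degree $Δ$ and can be much lower can be checked on small graphs. The sketch below is not taken from the paper: it computes degeneracy with the standard min-degree peeling procedure, and the function names `degeneracy` and `max_degree` are hypothetical. It assumes a simple undirected graph on vertices `0..n-1` with no self-loops.

```python
def degeneracy(n, edges):
    """Degeneracy via peeling: repeatedly delete a minimum-degree vertex;
    the answer is the largest degree seen at the moment of deletion."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    k = 0
    while alive:
        v = min(alive, key=lambda u: len(adj[u]))
        k = max(k, len(adj[v]))
        for w in adj[v]:
            adj[w].discard(v)  # v no longer counts toward w's degree
        alive.remove(v)
    return k

def max_degree(n, edges):
    """Maximum vertex degree of the graph."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return max(deg, default=0)

# A star with 5 leaves: every leaf can be peeled at degree 1.
star = [(0, leaf) for leaf in range(1, 6)]
star_k, star_delta = degeneracy(6, star), max_degree(6, star)
```

For the star, the degeneracy is 1 while the maximum degree is 5, so a bound that scales with $k$ rather than $Δ$ is smaller by a factor that grows with the number of leaves. This simple version takes quadratic time; a bucket-based implementation would be linear.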
-
A Constant Factor Approximation Algorithm for Boxicity of Circular Arc Graphs
Authors:
Abhijin Adiga,
Jasine Babu,
L. Sunil Chandran
Abstract:
Boxicity of a graph $G(V,E)$ is the minimum integer $k$ such that $G$ can be represented as the intersection graph of $k$-dimensional axis parallel rectangles in $\mathbf{R}^k$. Equivalently, it is the minimum number of interval graphs on the vertex set $V$ such that the intersection of their edge sets is $E$. It is known that boxicity cannot be approximated even for graph classes like bipartite, co-bipartite and split graphs below $O(n^{0.5 - ε})$-factor, for any $ε>0$ in polynomial time unless $NP=ZPP$. To date, there is no well known graph class of unbounded boxicity for which even an $n^ε$-factor approximation algorithm for computing boxicity is known, for any $ε<1$. In this paper, we study the boxicity problem on Circular Arc graphs - intersection graphs of arcs of a circle. We give a $(2+\frac{1}{k})$-factor polynomial time approximation algorithm for computing the boxicity of any circular arc graph along with a corresponding box representation, where $k \ge 1$ is its boxicity. For Normal Circular Arc (NCA) graphs, with an NCA model given, this can be improved to an additive 2-factor approximation algorithm. The time complexity of the algorithms to approximately compute the boxicity is $O(mn+n^2)$ in both these cases and in $O(mn+kn^2)= O(n^3)$ time we also get their corresponding box representations, where $n$ is the number of vertices of the graph and $m$ is its number of edges. The additive 2-factor algorithm directly works for any Proper Circular Arc graph, since computing an NCA model for it can be done in polynomial time.
Submitted 8 February, 2011;
originally announced February 2011.
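The two equivalent definitions of boxicity given in this abstract can be checked on a small example. The sketch below is an illustration only, not the paper's approximation algorithm: it verifies a hand-built 2-dimensional box representation of the 4-cycle, once as an intersection graph of boxes and once as the intersection of the edge sets of two interval graphs. The function names and the specific coordinates are assumptions chosen for the example.

```python
from itertools import combinations

def intervals_overlap(a, b):
    """Closed intervals [a0, a1] and [b0, b1] share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def box_intersection_graph(boxes):
    """Edges between vertices whose boxes overlap in every dimension."""
    edges = set()
    for u, v in combinations(sorted(boxes), 2):
        if all(intervals_overlap(iu, iv) for iu, iv in zip(boxes[u], boxes[v])):
            edges.add((u, v))
    return edges

def interval_graph_edges(boxes, dim):
    """Edges of the interval graph induced by one coordinate of the boxes."""
    return {
        (u, v)
        for u, v in combinations(sorted(boxes), 2)
        if intervals_overlap(boxes[u][dim], boxes[v][dim])
    }

# Hypothetical box representation of the cycle 0-1-2-3-0.
# Dimension 0 separates the non-adjacent pair (0, 2);
# dimension 1 separates the non-adjacent pair (1, 3).
c4_boxes = {
    0: [(0, 1), (0, 3)],
    1: [(0, 3), (0, 1)],
    2: [(2, 3), (0, 3)],
    3: [(0, 3), (2, 3)],
}
c4_edges = {(0, 1), (1, 2), (2, 3), (0, 3)}

from_boxes = box_intersection_graph(c4_boxes)
from_intervals = interval_graph_edges(c4_boxes, 0) & interval_graph_edges(c4_boxes, 1)
```

Both constructions recover exactly the four cycle edges, which shows that the 4-cycle has boxicity at most 2. Verifying a given representation is easy in this way; finding one of minimum dimension is the hard problem the abstract addresses.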