-
Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction
Authors:
Xiao Li,
Liangji Zhu,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Generative models have demonstrated strong performance in conditional settings and can be viewed as a form of data compression, where the condition serves as a compact representation. However, their limited controllability and reconstruction accuracy restrict their practical application to data compression. In this work, we propose an efficient latent diffusion framework that bridges this gap by c…
▽ More
Generative models have demonstrated strong performance in conditional settings and can be viewed as a form of data compression, where the condition serves as a compact representation. However, their limited controllability and reconstruction accuracy restrict their practical application to data compression. In this work, we propose an efficient latent diffusion framework that bridges this gap by combining a variational autoencoder with a conditional diffusion model. Our method compresses only a small number of keyframes into latent space and uses them as conditioning inputs to reconstruct the remaining frames via generative interpolation, eliminating the need to store latent representations for every frame. This approach enables accurate spatiotemporal reconstruction while significantly reducing storage costs. Experimental results across multiple datasets show that our method achieves up to 10 times higher compression ratios than rule-based state-of-the-art compressors such as SZ3, and up to 63 percent better performance than leading learning-based methods under the same reconstruction error.
△ Less
Submitted 2 July, 2025;
originally announced July 2025.
-
Evaluating Generative Vehicle Trajectory Models for Traffic Intersection Dynamics
Authors:
Yash Ranjan,
Rahul Sengupta,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Traffic Intersections are vital to urban road networks as they regulate the movement of people and goods. However, they are regions of conflicting trajectories and are prone to accidents. Deep Generative models of traffic dynamics at signalized intersections can greatly help traffic authorities better understand the efficiency and safety aspects. At present, models are evaluated on computational m…
▽ More
Traffic Intersections are vital to urban road networks as they regulate the movement of people and goods. However, they are regions of conflicting trajectories and are prone to accidents. Deep Generative models of traffic dynamics at signalized intersections can greatly help traffic authorities better understand the efficiency and safety aspects. At present, models are evaluated on computational metrics that primarily look at trajectory reconstruction errors. They are not evaluated online in a `live' microsimulation scenario. Further, these metrics do not adequately consider traffic engineering-specific concerns such as red-light violations, unallowed stoppage, etc. In this work, we provide a comprehensive analytics tool to train, run, and evaluate models with metrics that give better insights into model performance from a traffic engineering point of view. We train a state-of-the-art multi-vehicle trajectory forecasting model on a large dataset collected by running a calibrated scenario of a real-world urban intersection. We then evaluate the performance of the prediction models, online in a microsimulator, under unseen traffic conditions. We show that despite using ideally-behaved trajectories as input, and achieving low trajectory reconstruction errors, the generated trajectories show behaviors that break traffic rules. We introduce new metrics to evaluate such undesired behaviors and present our results.
△ Less
Submitted 10 June, 2025;
originally announced June 2025.
-
IntTrajSim: Trajectory Prediction for Simulating Multi-Vehicle driving at Signalized Intersections
Authors:
Yash Ranjan,
Rahul Sengupta,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Traffic simulators are widely used to study the operational efficiency of road infrastructure, but their rule-based approach limits their ability to mimic real-world driving behavior. Traffic intersections are critical components of the road infrastructure, both in terms of safety risk (nearly 28% of fatal crashes and 58% of nonfatal crashes happen at intersections) as well as the operational effi…
▽ More
Traffic simulators are widely used to study the operational efficiency of road infrastructure, but their rule-based approach limits their ability to mimic real-world driving behavior. Traffic intersections are critical components of the road infrastructure, both in terms of safety risk (nearly 28% of fatal crashes and 58% of nonfatal crashes happen at intersections) as well as the operational efficiency of a road corridor. This raises an important question: can we create a data-driven simulator that can mimic the macro- and micro-statistics of the driving behavior at a traffic intersection? Deep Generative Modeling-based trajectory prediction models provide a good starting point to model the complex dynamics of vehicles at an intersection. But they are not tested in a "live" micro-simulation scenario and are not evaluated on traffic engineering-related metrics. In this study, we propose traffic engineering-related metrics to evaluate generative trajectory prediction models and provide a simulation-in-the-loop pipeline to do so. We also provide a multi-headed self-attention-based trajectory prediction model that incorporates the signal information, which outperforms our previous models on the evaluation metrics.
△ Less
Submitted 10 June, 2025;
originally announced June 2025.
-
TGDT: A Temporal Graph-based Digital Twin for Urban Traffic Corridors
Authors:
Nooshin Yousefzadeh,
Rahul Sengupta,
Jeremy Dilmore,
Sanjay Ranka
Abstract:
Urban congestion at signalized intersections leads to significant delays, economic losses, and increased emissions. Existing deep learning models often lack spatial generalizability, rely on complex architectures, and struggle with real-time deployment. To address these limitations, we propose the Temporal Graph-based Digital Twin (TGDT), a scalable framework that integrates Temporal Convolutional…
▽ More
Urban congestion at signalized intersections leads to significant delays, economic losses, and increased emissions. Existing deep learning models often lack spatial generalizability, rely on complex architectures, and struggle with real-time deployment. To address these limitations, we propose the Temporal Graph-based Digital Twin (TGDT), a scalable framework that integrates Temporal Convolutional Networks and Attentional Graph Neural Networks for dynamic, direction-aware traffic modeling and assessment at urban corridors. TGDT estimates key Measures of Effectiveness (MOEs) for traffic flow optimization at both the intersection level (e.g., queue length, waiting time) and the corridor level (e.g., traffic volume, travel time). Its modular architecture and sequential optimization scheme enable easy extension to any number of intersections and MOEs. The model outperforms state-of-the-art baselines by accurately producing high-dimensional, concurrent multi-output estimates. It also demonstrates high robustness and accuracy across diverse traffic conditions, including extreme scenarios, while relying on only a minimal set of traffic features. Fully parallelized, TGDT can simulate over a thousand scenarios within a matter of seconds, offering a cost-effective, interpretable, and real-time solution for urban traffic management and optimization.
△ Less
Submitted 14 May, 2025; v1 submitted 24 April, 2025;
originally announced April 2025.
-
Guaranteed Conditional Diffusion: 3D Block-based Models for Scientific Data Compression
Authors:
Jaemoon Lee,
Xiao Li,
Liangji Zhu,
Sanjay Ranka,
Anand Rangarajan
Abstract:
This paper proposes a new compression paradigm -- Guaranteed Conditional Diffusion with Tensor Correction (GCDTC) -- for lossy scientific data compression. The framework is based on recent conditional diffusion (CD) generative models, and it consists of a conditional diffusion model, tensor correction, and error guarantee. Our diffusion model is a mixture of 3D conditioning and 2D denoising U-Net.…
▽ More
This paper proposes a new compression paradigm -- Guaranteed Conditional Diffusion with Tensor Correction (GCDTC) -- for lossy scientific data compression. The framework is based on recent conditional diffusion (CD) generative models, and it consists of a conditional diffusion model, tensor correction, and error guarantee. Our diffusion model is a mixture of 3D conditioning and 2D denoising U-Net. The approach leverages a 3D block-based compressing module to address spatiotemporal correlations in structured scientific data. Then, the reverse diffusion process for 2D spatial data is conditioned on the ``slices'' of content latent variables produced by the compressing module. After training, the denoising decoder reconstructs the data with zero noise and content latent variables, and thus it is entirely deterministic. The reconstructed outputs of the CD model are further post-processed by our tensor correction and error guarantee steps to control and ensure a maximum error distortion, which is an inevitable requirement in lossy scientific data compression. Our experiments involving two datasets generated by climate and chemical combustion simulations show that our framework outperforms standard convolutional autoencoders and yields competitive compression quality with an existing scientific data compression algorithm.
△ Less
Submitted 18 February, 2025;
originally announced February 2025.
-
A General Framework for Error-controlled Unstructured Scientific Data Compression
Authors:
Qian Gong,
Zhe Wang,
Viktor Reshniak,
Xin Liang,
Jieyang Chen,
Qing Liu,
Tushar M. Athawale,
Yi Ju,
Anand Rangarajan,
Sanjay Ranka,
Norbert Podhorszki,
Rick Archibald,
Scott Klasky
Abstract:
Data compression plays a key role in reducing storage and I/O costs. Traditional lossy methods primarily target data on rectilinear grids and cannot leverage the spatial coherence in unstructured mesh data, leading to suboptimal compression ratios. We present a multi-component, error-bounded compression framework designed to enhance the compression of floating-point unstructured mesh data, which i…
▽ More
Data compression plays a key role in reducing storage and I/O costs. Traditional lossy methods primarily target data on rectilinear grids and cannot leverage the spatial coherence in unstructured mesh data, leading to suboptimal compression ratios. We present a multi-component, error-bounded compression framework designed to enhance the compression of floating-point unstructured mesh data, which is common in scientific applications. Our approach involves interpolating mesh data onto a rectilinear grid and then separately compressing the grid interpolation and the interpolation residuals. This method is general, independent of mesh types and typologies, and can be seamlessly integrated with existing lossy compressors for improved performance. We evaluated our framework across twelve variables from two synthetic datasets and two real-world simulation datasets. The results indicate that the multi-component framework consistently outperforms state-of-the-art lossy compressors on unstructured data, achieving, on average, a $2.3-3.5\times$ improvement in compression ratios, with error bounds ranging from $\num{1e-6}$ to $\num{1e-2}$. We further investigate the impact of hyperparameters, such as grid spacing and error allocation, to deliver optimal compression ratios in diverse datasets.
△ Less
Submitted 12 January, 2025;
originally announced January 2025.
-
Foundation Model for Lossy Compression of Spatiotemporal Scientific Data
Authors:
Xiao Li,
Jaemoon Lee,
Anand Rangarajan,
Sanjay Ranka
Abstract:
We present a foundation model (FM) for lossy scientific data compression, combining a variational autoencoder (VAE) with a hyper-prior structure and a super-resolution (SR) module. The VAE framework uses hyper-priors to model latent space dependencies, enhancing compression efficiency. The SR module refines low-resolution representations into high-resolution outputs, improving reconstruction quali…
▽ More
We present a foundation model (FM) for lossy scientific data compression, combining a variational autoencoder (VAE) with a hyper-prior structure and a super-resolution (SR) module. The VAE framework uses hyper-priors to model latent space dependencies, enhancing compression efficiency. The SR module refines low-resolution representations into high-resolution outputs, improving reconstruction quality. By alternating between 2D and 3D convolutions, the model efficiently captures spatiotemporal correlations in scientific data while maintaining low computational cost. Experimental results demonstrate that the FM generalizes well to unseen domains and varying data shapes, achieving up to 4 times higher compression ratios than state-of-the-art methods after domain-specific fine-tuning. The SR module improves compression ratio by 30 percent compared to simple upsampling techniques. This approach significantly reduces storage and transmission costs for large-scale scientific simulations while preserving data integrity and fidelity.
△ Less
Submitted 22 December, 2024;
originally announced December 2024.
-
Dynamic Graph Attention Networks for Travel Time Distribution Prediction in Urban Arterial Roads
Authors:
Nooshin Yousefzadeh,
Rahul Sengupta,
Sanjay Ranka
Abstract:
Effective congestion management along signalized corridors is essential for improving productivity and reducing costs, with arterial travel time serving as a key performance metric. Traditional approaches, such as Coordinated Signal Timing and Adaptive Traffic Control Systems, often lack scalability and generalizability across diverse urban layouts. We propose Fusion-based Dynamic Graph Neural Net…
▽ More
Effective congestion management along signalized corridors is essential for improving productivity and reducing costs, with arterial travel time serving as a key performance metric. Traditional approaches, such as Coordinated Signal Timing and Adaptive Traffic Control Systems, often lack scalability and generalizability across diverse urban layouts. We propose Fusion-based Dynamic Graph Neural Networks (FDGNN), a structured framework for simultaneous modeling of travel time distributions in both directions along arterial corridors. FDGNN utilizes attentional graph convolution on dynamic, bidirectional graphs and integrates fusion techniques to capture evolving spatiotemporal traffic dynamics. The framework is trained on extensive hours of simulation data and utilizes GPU computation to ensure scalability. The results demonstrate that our framework can efficiently and accurately model travel time as a normal distribution on arterial roads leveraging a unique dynamic graph representation of corridor traffic states. This representation integrates sequential traffic signal timing plans, local driving behaviors, temporal turning movement counts, and ingress traffic volumes, even when aggregated over intervals as short as a single cycle length. The results demonstrate resilience to effective traffic variations, including cycle lengths, green time percentages, traffic density, and counterfactual routes. Results further confirm its stability under varying conditions at different intersections. This framework supports dynamic signal timing, enhances congestion management, and improves travel time reliability in real-world applications.
△ Less
Submitted 15 December, 2024;
originally announced December 2024.
-
Attention Based Machine Learning Methods for Data Reduction with Guaranteed Error Bounds
Authors:
Xiao Li,
Jaemoon Lee,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Scientific applications in fields such as high energy physics, computational fluid dynamics, and climate science generate vast amounts of data at high velocities. This exponential growth in data production is surpassing the advancements in computing power, network capabilities, and storage capacities. To address this challenge, data compression or reduction techniques are crucial. These scientific…
▽ More
Scientific applications in fields such as high energy physics, computational fluid dynamics, and climate science generate vast amounts of data at high velocities. This exponential growth in data production is surpassing the advancements in computing power, network capabilities, and storage capacities. To address this challenge, data compression or reduction techniques are crucial. These scientific datasets have underlying data structures that consist of structured and block structured multidimensional meshes where each grid point corresponds to a tensor. It is important that data reduction techniques leverage strong spatial and temporal correlations that are ubiquitous in these applications. Additionally, applications such as CFD, process tensors comprising hundred plus species and their attributes at each grid point. Reduction techniques should be able to leverage interrelationships between the elements in each tensor. In this paper, we propose an attention-based hierarchical compression method utilizing a block-wise compression setup. We introduce an attention-based hyper-block autoencoder to capture inter-block correlations, followed by a block-wise encoder to capture block-specific information. A PCA-based post-processing step is employed to guarantee error bounds for each data block. Our method effectively captures both spatiotemporal and inter-variable correlations within and between data blocks. Compared to the state-of-the-art SZ3, our method achieves up to 8 times higher compression ratio on the multi-variable S3D dataset. When evaluated on single-variable setups using the E3SM and XGC datasets, our method still achieves up to 3 times and 2 times higher compression ratio, respectively.
△ Less
Submitted 9 September, 2024;
originally announced September 2024.
-
Video-based Pedestrian and Vehicle Traffic Analysis During Football Games
Authors:
Jacques P. Fleischer,
Ryan Pallack,
Ahan Mishra,
Gustavo Riente de Andrade,
Subhadipto Poddar,
Emmanuel Posadas,
Robert Schenck,
Tania Banerjee,
Anand Rangarajan,
Sanjay Ranka
Abstract:
This paper utilizes video analytics to study pedestrian and vehicle traffic behavior, focusing on analyzing traffic patterns during football gamedays. The University of Florida (UF) hosts six to seven home football games on Saturdays during the college football season, attracting significant pedestrian activity. Through video analytics, this study provides valuable insights into the impact of thes…
▽ More
This paper utilizes video analytics to study pedestrian and vehicle traffic behavior, focusing on analyzing traffic patterns during football gamedays. The University of Florida (UF) hosts six to seven home football games on Saturdays during the college football season, attracting significant pedestrian activity. Through video analytics, this study provides valuable insights into the impact of these events on traffic volumes and safety at intersections. Comparing pedestrian and vehicle activities on gamedays versus non-gamedays reveals differing patterns. For example, pedestrian volume substantially increases during gamedays, which is positively correlated with the probability of the away team winning. This correlation is likely because fans of the home team enjoy watching difficult games. Win probabilities as an early predictor of pedestrian volumes at intersections can be a tool to help traffic professionals anticipate traffic management needs. Pedestrian-to-vehicle (P2V) conflicts notably increase on gamedays, particularly a few hours before games start. Addressing this, a "Barnes Dance" movement phase within the intersection is recommended. Law enforcement presence during high-activity gamedays can help ensure pedestrian compliance and enhance safety. In contrast, we identified that vehicle-to-vehicle (V2V) conflicts generally do not increase on gamedays and may even decrease due to heightened driver caution.
△ Less
Submitted 4 August, 2024;
originally announced August 2024.
-
MTDT: A Multi-Task Deep Learning Digital Twin
Authors:
Nooshin Yousefzadeh,
Rahul Sengupta,
Yashaswi Karnati,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Traffic congestion has significant impacts on both the economy and the environment. Measures of Effectiveness (MOEs) have long been the standard for evaluating traffic intersections' level of service and operational efficiency. However, the scarcity of traditional high-resolution loop detector data (ATSPM) presents challenges in accurately measuring MOEs or capturing the intricate spatiotemporal c…
▽ More
Traffic congestion has significant impacts on both the economy and the environment. Measures of Effectiveness (MOEs) have long been the standard for evaluating traffic intersections' level of service and operational efficiency. However, the scarcity of traditional high-resolution loop detector data (ATSPM) presents challenges in accurately measuring MOEs or capturing the intricate spatiotemporal characteristics inherent in urban intersection traffic. To address this challenge, we present a comprehensive intersection traffic flow simulation that utilizes a multi-task learning paradigm. This approach combines graph convolutions for primary estimating lane-wise exit and inflow with time series convolutions for secondary assessing multi-directional queue lengths and travel time distribution through any arbitrary urban traffic intersection. Compared to existing deep learning methodologies, the proposed Multi-Task Deep Learning Digital Twin (MTDT) distinguishes itself through its adaptability to local temporal and spatial features, such as signal timing plans, intersection topology, driving behaviors, and turning movement counts. We also show the benefit of multi-task learning in the effectiveness of individual traffic simulation tasks. Furthermore, our approach facilitates sequential computation and provides complete parallelization through GPU implementation. This not only streamlines the computational process but also enhances scalability and performance.
△ Less
Submitted 14 May, 2025; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Machine Learning Techniques for Data Reduction of Climate Applications
Authors:
Xiao Li,
Qian Gong,
Jaemoon Lee,
Scott Klasky,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Scientists conduct large-scale simulations to compute derived quantities-of-interest (QoI) from primary data. Often, QoI are linked to specific features, regions, or time intervals, such that data can be adaptively reduced without compromising the integrity of QoI. For many spatiotemporal applications, these QoI are binary in nature and represent presence or absence of a physical phenomenon. We pr…
▽ More
Scientists conduct large-scale simulations to compute derived quantities-of-interest (QoI) from primary data. Often, QoI are linked to specific features, regions, or time intervals, such that data can be adaptively reduced without compromising the integrity of QoI. For many spatiotemporal applications, these QoI are binary in nature and represent presence or absence of a physical phenomenon. We present a pipelined compression approach that first uses neural-network-based techniques to derive regions where QoI are highly likely to be present. Then, we employ a Guaranteed Autoencoder (GAE) to compress data with differential error bounds. GAE uses QoI information to apply low-error compression to only these regions. This results in overall high compression ratios while still achieving downstream goals of simulation or data collections. Experimental results are presented for climate data generated from the E3SM Simulation model for downstream quantities such as tropical cyclone and atmospheric river detection and tracking. These results show that our approach is superior to comparable methods in the literature.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Machine Learning Techniques for Data Reduction of CFD Applications
Authors:
Jaemoon Lee,
Ki Sung Jung,
Qian Gong,
Xiao Li,
Scott Klasky,
Jacqueline Chen,
Anand Rangarajan,
Sanjay Ranka
Abstract:
We present an approach called guaranteed block autoencoder that leverages Tensor Correlations (GBATC) for reducing the spatiotemporal data generated by computational fluid dynamics (CFD) and other scientific applications. It uses a multidimensional block of tensors (spanning in space and time) for both input and output, capturing the spatiotemporal and interspecies relationship within a tensor. Th…
▽ More
We present an approach called guaranteed block autoencoder that leverages Tensor Correlations (GBATC) for reducing the spatiotemporal data generated by computational fluid dynamics (CFD) and other scientific applications. It uses a multidimensional block of tensors (spanning in space and time) for both input and output, capturing the spatiotemporal and interspecies relationship within a tensor. The tensor consists of species that represent different elements in a CFD simulation. To guarantee the error bound of the reconstructed data, principal component analysis (PCA) is applied to the residual between the original and reconstructed data. This yields a basis matrix, which is then used to project the residual of each instance. The resulting coefficients are retained to enable accurate reconstruction. Experimental results demonstrate that our approach can deliver two orders of magnitude in reduction while still keeping the errors of primary data under scientifically acceptable bounds. Compared to reduction-based approaches based on SZ, our method achieves a substantially higher compression ratio for a given error bound or a better error for a given compression ratio.
△ Less
Submitted 28 April, 2024;
originally announced April 2024.
-
Graph Attention Network for Lane-Wise and Topology-Invariant Intersection Traffic Simulation
Authors:
Nooshin Yousefzadeh,
Rahul Sengupta,
Yashaswi Karnati,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Traffic congestion has significant economic, environmental, and social ramifications. Intersection traffic flow dynamics are influenced by numerous factors. While microscopic traffic simulators are valuable tools, they are computationally intensive and challenging to calibrate. Moreover, existing machine-learning approaches struggle to provide lane-specific waveforms or adapt to intersection topol…
▽ More
Traffic congestion has significant economic, environmental, and social ramifications. Intersection traffic flow dynamics are influenced by numerous factors. While microscopic traffic simulators are valuable tools, they are computationally intensive and challenging to calibrate. Moreover, existing machine-learning approaches struggle to provide lane-specific waveforms or adapt to intersection topology and traffic patterns. In this study, we propose two efficient and accurate "Digital Twin" models for intersections, leveraging Graph Attention Neural Networks (GAT). These attentional graph auto-encoder digital twins capture temporal, spatial, and contextual aspects of traffic within intersections, incorporating various influential factors such as high-resolution loop detector waveforms, signal state records, driving behaviors, and turning-movement counts. Trained on diverse counterfactual scenarios across multiple intersections, our models generalize well, enabling the estimation of detailed traffic waveforms for any intersection approach and exit lanes. Multi-scale error metrics demonstrate that our models perform comparably to microsimulations. The primary application of our study lies in traffic signal optimization, a pivotal area in transportation systems research. These lightweight digital twins can seamlessly integrate into corridor and network signal timing optimization frameworks. Furthermore, our study's applications extend to lane reconfiguration, driving behavior analysis, and facilitating informed decisions regarding intersection safety and efficiency enhancements. A promising avenue for future research involves extending this approach to urban freeway corridors and integrating it with measures of effectiveness metrics.
△ Less
Submitted 1 May, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring
Authors:
Qian Gong,
Jieyang Chen,
Ben Whitney,
Xin Liang,
Viktor Reshniak,
Tania Banerjee,
Jaemoon Lee,
Anand Rangarajan,
Lipeng Wan,
Nicolas Vidal,
Qing Liu,
Ana Gainaru,
Norbert Podhorszki,
Richard Archibald,
Sanjay Ranka,
Scott Klasky
Abstract:
We describe MGARD, a software providing MultiGrid Adaptive Reduction for floating-point scientific data on structured and unstructured grids. With exceptional data compression capability and precise error control, MGARD addresses a wide range of requirements, including storage reduction, high-performance I/O, and in-situ data analysis. It features a unified application programming interface (API)…
▽ More
We describe MGARD, a software providing MultiGrid Adaptive Reduction for floating-point scientific data on structured and unstructured grids. With exceptional data compression capability and precise error control, MGARD addresses a wide range of requirements, including storage reduction, high-performance I/O, and in-situ data analysis. It features a unified application programming interface (API) that seamlessly operates across diverse computing architectures. MGARD has been optimized with highly-tuned GPU kernels and efficient memory and device management mechanisms, ensuring scalable and rapid operations.
△ Less
Submitted 11 January, 2024;
originally announced January 2024.
-
Spatiotemporally adaptive compression for scientific dataset with feature preservation -- a case study on simulation data with extreme climate events analysis
Authors:
Qian Gong,
Chengzhu Zhang,
Xin Liang,
Viktor Reshniak,
Jieyang Chen,
Anand Rangarajan,
Sanjay Ranka,
Nicolas Vidal,
Lipeng Wan,
Paul Ullrich,
Norbert Podhorszki,
Robert Jacob,
Scott Klasky
Abstract:
Scientific discoveries are increasingly constrained by limited storage space and I/O capacities. For time-series simulations and experiments, their data often need to be decimated over timesteps to accommodate storage and I/O limitations. In this paper, we propose a technique that addresses storage costs while improving post-analysis accuracy through spatiotemporal adaptive, error-controlled lossy…
▽ More
Scientific discoveries are increasingly constrained by limited storage space and I/O capacities. For time-series simulations and experiments, their data often need to be decimated over timesteps to accommodate storage and I/O limitations. In this paper, we propose a technique that addresses storage costs while improving post-analysis accuracy through spatiotemporal adaptive, error-controlled lossy compression. We investigate the trade-off between data precision and temporal output rates, revealing that reducing data precision and increasing timestep frequency lead to more accurate analysis outcomes. Additionally, we integrate spatiotemporal feature detection with data compression and demonstrate that performing adaptive error-bounded compression in higher dimensional space enables greater compression ratios, leveraging the error propagation theory of a transformation-based compressor.
To evaluate our approach, we conduct experiments using the well-known E3SM climate simulation code and apply our method to compress variables used for cyclone tracking. Our results show a significant reduction in storage size while enhancing the quality of cyclone tracking analysis, both quantitatively and qualitatively, in comparison to the prevalent timestep decimation approach. Compared to three state-of-the-art lossy compressors lacking feature preservation capabilities, our adaptive compression framework improves perfectly matched cases in TC tracking by 26.4-51.3% at medium compression ratios and by 77.3-571.1% at large compression ratios, with a merely 5-11% computational overhead.
△ Less
Submitted 6 January, 2024;
originally announced January 2024.
-
A Spatiotemporal Correspondence Approach to Unsupervised LiDAR Segmentation with Traffic Applications
Authors:
Xiao Li,
Pan He,
Aotian Wu,
Sanjay Ranka,
Anand Rangarajan
Abstract:
We address the problem of unsupervised semantic segmentation of outdoor LiDAR point clouds in diverse traffic scenarios. The key idea is to leverage the spatiotemporal nature of a dynamic point cloud sequence and introduce drastically stronger augmentation by establishing spatiotemporal correspondences across multiple frames. We dovetail clustering and pseudo-label learning in this work. Essential…
▽ More
We address the problem of unsupervised semantic segmentation of outdoor LiDAR point clouds in diverse traffic scenarios. The key idea is to leverage the spatiotemporal nature of a dynamic point cloud sequence and introduce drastically stronger augmentation by establishing spatiotemporal correspondences across multiple frames. We dovetail clustering and pseudo-label learning in this work. Essentially, we alternate between clustering points into semantic groups and optimizing models using point-wise pseudo-spatiotemporal labels with a simple learning objective. Therefore, our method can learn discriminative features in an unsupervised learning fashion. We show promising segmentation performance on Semantic-KITTI, SemanticPOSS, and FLORIDA benchmark datasets covering scenarios in autonomous vehicle and intersection infrastructure, which is competitive when compared against many existing fully supervised learning methods. This general framework can lead to a unified representation learning approach for LiDAR point clouds incorporating domain knowledge.
△ Less
Submitted 23 August, 2023;
originally announced August 2023.
-
An Efficient Semi-Automated Scheme for Infrastructure LiDAR Annotation
Authors:
Aotian Wu,
Pan He,
Xiao Li,
Ke Chen,
Sanjay Ranka,
Anand Rangarajan
Abstract:
Most existing perception systems rely on sensory data acquired from cameras, which perform poorly in low light and adverse weather conditions. To resolve this limitation, we have witnessed advanced LiDAR sensors become popular in perception tasks in autonomous driving applications. Nevertheless, their usage in traffic monitoring systems is less ubiquitous. We identify two significant obstacles in…
▽ More
Most existing perception systems rely on sensory data acquired from cameras, which perform poorly in low light and adverse weather conditions. To resolve this limitation, we have witnessed advanced LiDAR sensors become popular in perception tasks in autonomous driving applications. Nevertheless, their usage in traffic monitoring systems is less ubiquitous. We identify two significant obstacles in cost-effectively and efficiently developing such a LiDAR-based traffic monitoring system: (i) public LiDAR datasets are insufficient for supporting perception tasks in infrastructure systems, and (ii) 3D annotations on LiDAR point clouds are time-consuming and expensive. To fill this gap, we present an efficient semi-automated annotation tool that automatically annotates LiDAR sequences with tracking algorithms while offering a fully annotated infrastructure LiDAR dataset -- FLORIDA (Florida LiDAR-based Object Recognition and Intelligent Data Annotation) -- which will be made publicly available. Our advanced annotation tool seamlessly integrates multi-object tracking (MOT), single-object tracking (SOT), and suitable trajectory post-processing techniques. Specifically, we introduce a human-in-the-loop schema in which annotators recursively fix and refine annotations imperfectly predicted by our tool and incrementally add them to the training dataset to obtain better SOT and MOT models. By repeating the process, we significantly increase the overall annotation speed by three to four times and obtain better qualitative annotations than a state-of-the-art annotation tool. The human annotation experiments verify the effectiveness of our annotation tool. In addition, we provide detailed statistics and object detection evaluation results for our dataset in serving as a benchmark for perception tasks at traffic intersections.
△ Less
Submitted 25 January, 2023;
originally announced January 2023.
-
Scalable Hybrid Learning Techniques for Scientific Data Compression
Authors:
Tania Banerjee,
Jong Choi,
Jaemoon Lee,
Qian Gong,
Jieyang Chen,
Scott Klasky,
Anand Rangarajan,
Sanjay Ranka
Abstract:
Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post process this data for scientific discovery. Unlike image and video compression algorithms that limit errors to primary data, scientists require compression techniques that accurately preserve derived quantities of interest (QoIs). This paper presents a…
▽ More
Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post process this data for scientific discovery. Unlike image and video compression algorithms that limit errors to primary data, scientists require compression techniques that accurately preserve derived quantities of interest (QoIs). This paper presents a physics-informed compression technique implemented as an end-to-end, scalable, GPU-based pipeline for data compression that addresses this requirement. Our hybrid compression technique combines machine learning techniques and standard compression methods. Specifically, we combine an autoencoder, an error-bounded lossy compressor to provide guarantees on raw data error, and a constraint satisfaction post-processing step to preserve the QoIs within a minimal error (generally less than floating point error).
The effectiveness of the data compression pipeline is demonstrated by compressing nuclear fusion simulation data generated by a large-scale fusion code, XGC, which produces hundreds of terabytes of data in a single day. Our approach works within the ADIOS framework and results in compression by a factor of more than 150 while requiring only a few percent of the computational resources necessary for generating the data, making the overall approach highly effective for practical scenarios.
△ Less
Submitted 20 December, 2022;
originally announced December 2022.
-
Expressing linear equality constraints in feedforward neural networks
Authors:
Anand Rangarajan,
Pan He,
Jaemoon Lee,
Tania Banerjee,
Sanjay Ranka
Abstract:
We seek to impose linear, equality constraints in feedforward neural networks. As top layer predictors are usually nonlinear, this is a difficult task if we seek to deploy standard convex optimization methods and strong duality. To overcome this, we introduce a new saddle-point Lagrangian with auxiliary predictor variables on which constraints are imposed. Elimination of the auxiliary variables le…
▽ More
We seek to impose linear, equality constraints in feedforward neural networks. As top layer predictors are usually nonlinear, this is a difficult task if we seek to deploy standard convex optimization methods and strong duality. To overcome this, we introduce a new saddle-point Lagrangian with auxiliary predictor variables on which constraints are imposed. Elimination of the auxiliary variables leads to a dual minimization problem on the Lagrange multipliers introduced to satisfy the linear constraints. This minimization problem is combined with the standard learning problem on the weight matrices. From this theoretical line of development, we obtain the surprising interpretation of Lagrange parameters as additional, penultimate layer hidden units with fixed weights stemming from the constraints. Consequently, standard minimization approaches can be used despite the inclusion of Lagrange parameters -- a very satisfying, albeit unexpected, discovery. Examples ranging from multi-label classification to constrained autoencoders are envisaged in the future. The code has been made available at https://github.com/anandrajan0/smartalec
△ Less
Submitted 7 January, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Learning Canonical Embeddings for Unsupervised Shape Correspondence with Locally Linear Transformations
Authors:
Pan He,
Patrick Emami,
Sanjay Ranka,
Anand Rangarajan
Abstract:
We present a new approach to unsupervised shape correspondence learning between pairs of point clouds. We make the first attempt to adapt the classical locally linear embedding algorithm (LLE) -- originally designed for nonlinear dimensionality reduction -- for shape correspondence. The key idea is to find dense correspondences between shapes by first obtaining high-dimensional neighborhood-preser…
▽ More
We present a new approach to unsupervised shape correspondence learning between pairs of point clouds. We make the first attempt to adapt the classical locally linear embedding algorithm (LLE) -- originally designed for nonlinear dimensionality reduction -- for shape correspondence. The key idea is to find dense correspondences between shapes by first obtaining high-dimensional neighborhood-preserving embeddings of low-dimensional point clouds and subsequently aligning the source and target embeddings using locally linear transformations. We demonstrate that learning the embedding using a new LLE-inspired point cloud reconstruction objective results in accurate shape correspondences. More specifically, the approach comprises an end-to-end learnable framework of extracting high-dimensional neighborhood-preserving embeddings, estimating locally linear transformations in the embedding space, and reconstructing shapes via divergence measure-based alignment of probabilistic density functions built over reconstructed and target shapes. Our approach enforces embeddings of shapes in correspondence to lie in the same universal/canonical embedding space, which eventually helps regularize the learning process and leads to a simple nearest neighbors approach between shape embeddings for finding reliable correspondences. Comprehensive experiments show that the new method makes noticeable improvements over state-of-the-art approaches on standard shape correspondence benchmark datasets covering both human and nonhuman shapes.
△ Less
Submitted 6 September, 2022; v1 submitted 5 September, 2022;
originally announced September 2022.
-
Towards Improving the Generation Quality of Autoregressive Slot VAEs
Authors:
Patrick Emami,
Pan He,
Sanjay Ranka,
Anand Rangarajan
Abstract:
Unconditional scene inference and generation are challenging to learn jointly with a single compositional model. Despite encouraging progress on models that extract object-centric representations (''slots'') from images, unconditional generation of scenes from slots has received less attention. This is primarily because learning the multi-object relations necessary to imagine coherent scenes is di…
▽ More
Unconditional scene inference and generation are challenging to learn jointly with a single compositional model. Despite encouraging progress on models that extract object-centric representations (''slots'') from images, unconditional generation of scenes from slots has received less attention. This is primarily because learning the multi-object relations necessary to imagine coherent scenes is difficult. We hypothesize that most existing slot-based models have a limited ability to learn object correlations. We propose two improvements that strengthen object correlation learning. The first is to condition the slots on a global, scene-level variable that captures higher-order correlations between slots. Second, we address the fundamental lack of a canonical order for objects in images by proposing to learn a consistent order to use for the autoregressive generation of scene objects. Specifically, we train an autoregressive slot prior to sequentially generate scene objects following a learned order. Ordered slot inference entails first estimating a randomly ordered set of slots using existing approaches for extracting slots from images, then aligning those slots to ordered slots generated autoregressively with the slot prior. Our experiments across three multi-object environments demonstrate clear gains in unconditional scene generation quality. Detailed ablation studies are also provided that validate the two proposed improvements.
△ Less
Submitted 27 November, 2023; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Self-Supervised Robust Scene Flow Estimation via the Alignment of Probability Density Functions
Authors:
Pan He,
Patrick Emami,
Sanjay Ranka,
Anand Rangarajan
Abstract:
In this paper, we present a new self-supervised scene flow estimation approach for a pair of consecutive point clouds. The key idea of our approach is to represent discrete point clouds as continuous probability density functions using Gaussian mixture models. Scene flow estimation is therefore converted into the problem of recovering motion from the alignment of probability density functions, whi…
▽ More
In this paper, we present a new self-supervised scene flow estimation approach for a pair of consecutive point clouds. The key idea of our approach is to represent discrete point clouds as continuous probability density functions using Gaussian mixture models. Scene flow estimation is therefore converted into the problem of recovering motion from the alignment of probability density functions, which we achieve using a closed-form expression of the classic Cauchy-Schwarz divergence. Unlike existing nearest-neighbor-based approaches that use hard pairwise correspondences, our proposed approach establishes soft and implicit point correspondences between point clouds and generates more robust and accurate scene flow in the presence of missing correspondences and outliers. Comprehensive experiments show that our method makes noticeable gains over the Chamfer Distance and the Earth Mover's Distance in real-world environments and achieves state-of-the-art performance among self-supervised learning methods on FlyingThings3D and KITTI, even outperforming some supervised methods with ground truth annotations.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
Learning Scene Dynamics from Point Cloud Sequences
Authors:
Pan He,
Patrick Emami,
Sanjay Ranka,
Anand Rangarajan
Abstract:
Understanding 3D scenes is a critical prerequisite for autonomous agents. Recently, LiDAR and other sensors have made large amounts of data available in the form of temporal sequences of point cloud frames. In this work, we propose a novel problem -- sequential scene flow estimation (SSFE) -- that aims to predict 3D scene flow for all pairs of point clouds in a given sequence. This is unlike the p…
▽ More
Understanding 3D scenes is a critical prerequisite for autonomous agents. Recently, LiDAR and other sensors have made large amounts of data available in the form of temporal sequences of point cloud frames. In this work, we propose a novel problem -- sequential scene flow estimation (SSFE) -- that aims to predict 3D scene flow for all pairs of point clouds in a given sequence. This is unlike the previously studied problem of scene flow estimation which focuses on two frames.
We introduce the SPCM-Net architecture, which solves this problem by computing multi-scale spatiotemporal correlations between neighboring point clouds and then aggregating the correlation across time with an order-invariant recurrent unit. Our experimental evaluation confirms that recurrent processing of point cloud sequences results in significantly better SSFE compared to using only two frames. Additionally, we demonstrate that this approach can be effectively modified for sequential point cloud forecasting (SPF), a related problem that demands forecasting future point cloud frames.
Our experimental results are evaluated using a new benchmark for both SSFE and SPF consisting of synthetic and real datasets. Previously, datasets for scene flow estimation have been limited to two frames. We provide non-trivial extensions to these datasets for multi-frame estimation and prediction. Due to the difficulty of obtaining ground truth motion for real-world datasets, we use self-supervised training and evaluation metrics. We believe that this benchmark will be pivotal to future research in this area. All code for benchmark and models will be made accessible.
△ Less
Submitted 16 November, 2021;
originally announced November 2021.
-
Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations
Authors:
Patrick Emami,
Pan He,
Sanjay Ranka,
Anand Rangarajan
Abstract:
Unsupervised multi-object representation learning depends on inductive biases to guide the discovery of object-centric representations that generalize. However, we observe that methods for learning these representations are either impractical due to long training times and large memory consumption or forego key inductive biases. In this work, we introduce EfficientMORL, an efficient framework for…
▽ More
Unsupervised multi-object representation learning depends on inductive biases to guide the discovery of object-centric representations that generalize. However, we observe that methods for learning these representations are either impractical due to long training times and large memory consumption or forego key inductive biases. In this work, we introduce EfficientMORL, an efficient framework for the unsupervised learning of object-centric representations. We show that optimization challenges caused by requiring both symmetry and disentanglement can in fact be addressed by high-cost iterative amortized inference by designing the framework to minimize its dependence on it. We take a two-stage approach to inference: first, a hierarchical variational autoencoder extracts symmetric and disentangled representations through bottom-up inference, and second, a lightweight network refines the representations with top-down feedback. The number of refinement steps taken during training is reduced following a curriculum, so that at test time with zero steps the model achieves 99.1% of the refined decomposition performance. We demonstrate strong object decomposition and disentanglement on the standard multi-object benchmark while achieving nearly an order of magnitude faster training and test time inference over the previous state-of-the-art model.
△ Less
Submitted 7 June, 2021;
originally announced June 2021.
-
Hybrid Generative Models for Two-Dimensional Datasets
Authors:
Hoda Shajari,
Jaemoon Lee,
Sanjay Ranka,
Anand Rangarajan
Abstract:
Two-dimensional array-based datasets are pervasive in a variety of domains. Current approaches for generative modeling have typically been limited to conventional image datasets and performed in the pixel domain which do not explicitly capture the correlation between pixels. Additionally, these approaches do not extend to scientific and other applications where each element value is continuous and…
▽ More
Two-dimensional array-based datasets are pervasive in a variety of domains. Current approaches for generative modeling have typically been limited to conventional image datasets and performed in the pixel domain which do not explicitly capture the correlation between pixels. Additionally, these approaches do not extend to scientific and other applications where each element value is continuous and is not limited to a fixed range. In this paper, we propose a novel approach for generating two-dimensional datasets by moving the computations to the space of representation bases and show its usefulness for two different datasets, one from imaging and another from scientific computing. The proposed approach is general and can be applied to any dataset, representation basis, or generative model. We provide a comprehensive performance comparison of various combinations of generative models and representation basis spaces. We also propose a new evaluation metric which captures the deficiency of generating images in pixel space.
△ Less
Submitted 9 July, 2021; v1 submitted 31 May, 2021;
originally announced June 2021.
-
SparsePipe: Parallel Deep Learning for 3D Point Clouds
Authors:
Keke Zhai,
Pan He,
Tania Banerjee,
Anand Rangarajan,
Sanjay Ranka
Abstract:
We propose SparsePipe, an efficient and asynchronous parallelism approach for handling 3D point clouds with multi-GPU training. SparsePipe is built to support 3D sparse data such as point clouds. It achieves this by adopting generalized convolutions with sparse tensor representation to build expressive high-dimensional convolutional neural networks. Compared to dense solutions, the new models can…
▽ More
We propose SparsePipe, an efficient and asynchronous parallelism approach for handling 3D point clouds with multi-GPU training. SparsePipe is built to support 3D sparse data such as point clouds. It achieves this by adopting generalized convolutions with sparse tensor representation to build expressive high-dimensional convolutional neural networks. Compared to dense solutions, the new models can efficiently process irregular point clouds without densely sliding over the entire space, significantly reducing the memory requirements and allowing higher resolutions of the underlying 3D volumes for better performance.
SparsePipe exploits intra-batch parallelism that partitions input data into multiple processors and further improves the training throughput with inter-batch pipelining to overlap communication and computing. Besides, it suitably partitions the model when the GPUs are heterogeneous such that the computing is load-balanced with reduced communication overhead.
Using experimental results on an eight-GPU platform, we show that SparsePipe can parallelize effectively and obtain better performance on current point cloud benchmarks for both training and inference, compared to its dense solutions.
△ Less
Submitted 26 December, 2020;
originally announced December 2020.
-
Intelligent Intersection: Two-Stream Convolutional Networks for Real-time Near Accident Detection in Traffic Video
Authors:
Xiaohui Huang,
Pan He,
Anand Rangarajan,
Sanjay Ranka
Abstract:
In Intelligent Transportation System, real-time systems that monitor and analyze road users become increasingly critical as we march toward the smart city era. Vision-based frameworks for Object Detection, Multiple Object Tracking, and Traffic Near Accident Detection are important applications of Intelligent Transportation System, particularly in video surveillance and etc. Although deep neural ne…
▽ More
In Intelligent Transportation System, real-time systems that monitor and analyze road users become increasingly critical as we march toward the smart city era. Vision-based frameworks for Object Detection, Multiple Object Tracking, and Traffic Near Accident Detection are important applications of Intelligent Transportation System, particularly in video surveillance and etc. Although deep neural networks have recently achieved great success in many computer vision tasks, a uniformed framework for all the three tasks is still challenging where the challenges multiply from demand for real-time performance, complex urban setting, highly dynamic traffic event, and many traffic movements. In this paper, we propose a two-stream Convolutional Network architecture that performs real-time detection, tracking, and near accident detection of road users in traffic video data. The two-stream model consists of a spatial stream network for Object Detection and a temporal stream network to leverage motion features for Multiple Object Tracking. We detect near accidents by incorporating appearance features and motion features from two-stream networks. Using aerial videos, we propose a Traffic Near Accident Dataset (TNAD) covering various types of traffic interactions that is suitable for vision-based traffic analysis tasks. Our experiments demonstrate the advantage of our framework with an overall competitive qualitative and quantitative performance at high frame rates on the TNAD dataset.
△ Less
Submitted 4 January, 2019;
originally announced January 2019.
-
Dynamic Load Balancing for Compressible Multiphase Turbulence
Authors:
Keke Zhai,
Tania Banerjee,
David Zwick,
Jason Hackl,
Sanjay Ranka
Abstract:
CMT-nek is a new scientific application for performing high fidelity predictive simulations of particle laden explosively dispersed turbulent flows. CMT-nek involves detailed simulations, is compute intensive and is targeted to be deployed on exascale platforms. The moving particles are the main source of load imbalance as the application is executed on parallel processors. In a demonstration prob…
▽ More
CMT-nek is a new scientific application for performing high fidelity predictive simulations of particle laden explosively dispersed turbulent flows. CMT-nek involves detailed simulations, is compute intensive and is targeted to be deployed on exascale platforms. The moving particles are the main source of load imbalance as the application is executed on parallel processors. In a demonstration problem, all the particles are initially in a closed container until a detonation occurs and the particles move apart. If all processors get an equal share of the fluid domain, then only some of the processors get sections of the domain that are initially laden with particles, leading to disparate load on the processors. In order to eliminate load imbalance in different processors and to speedup the makespan, we present different load balancing algorithms for CMT-nek on large scale multi-core platforms consisting of hundred of thousands of cores. The detailed process of the load balancing algorithms are presented. The performance of the different load balancing algorithms are compared and the associated overheads are analyzed. Evaluations on the application with and without load balancing are conducted and these show that with load balancing, simulation time becomes faster by a factor of up to $9.97$.
△ Less
Submitted 6 July, 2018;
originally announced July 2018.
-
Learning Permutations with Sinkhorn Policy Gradient
Authors:
Patrick Emami,
Sanjay Ranka
Abstract:
Many problems at the intersection of combinatorics and computer science require solving for a permutation that optimally matches, ranks, or sorts some data. These problems usually have a task-specific, often non-differentiable objective function that data-driven algorithms can use as a learning signal. In this paper, we propose the Sinkhorn Policy Gradient (SPG) algorithm for learning policies on…
▽ More
Many problems at the intersection of combinatorics and computer science require solving for a permutation that optimally matches, ranks, or sorts some data. These problems usually have a task-specific, often non-differentiable objective function that data-driven algorithms can use as a learning signal. In this paper, we propose the Sinkhorn Policy Gradient (SPG) algorithm for learning policies on permutation matrices. The actor-critic neural network architecture we introduce for SPG uniquely decouples representation learning of the state space from the highly-structured action space of permutations with a temperature-controlled Sinkhorn layer. The Sinkhorn layer produces continuous relaxations of permutation matrices so that the actor-critic architecture can be trained end-to-end. Our empirical results show that agents trained with SPG can perform competitively on sorting, the Euclidean TSP, and matching tasks. We also observe that SPG is significantly more data efficient at the matching task than the baseline methods, which indicates that SPG is conducive to learning representations that are useful for reasoning about permutations.
△ Less
Submitted 17 May, 2018;
originally announced May 2018.
-
Visual Explanations From Deep 3D Convolutional Neural Networks for Alzheimer's Disease Classification
Authors:
Chengliang Yang,
Anand Rangarajan,
Sanjay Ranka
Abstract:
We develop three efficient approaches for generating visual explanations from 3D convolutional neural networks (3D-CNNs) for Alzheimer's disease classification. One approach conducts sensitivity analysis on hierarchical 3D image segmentation, and the other two visualize network activations on a spatial map. Visual checks and a quantitative localization benchmark indicate that all approaches identi…
▽ More
We develop three efficient approaches for generating visual explanations from 3D convolutional neural networks (3D-CNNs) for Alzheimer's disease classification. One approach conducts sensitivity analysis on hierarchical 3D image segmentation, and the other two visualize network activations on a spatial map. Visual checks and a quantitative localization benchmark indicate that all approaches identify important brain parts for Alzheimer's disease diagnosis. Comparative analysis show that the sensitivity analysis based approach has difficulty handling loosely distributed cerebral cortex, and approaches based on visualization of activations are constrained by the resolution of the convolutional layer. The complementarity of these methods improves the understanding of 3D-CNNs in Alzheimer's disease classification from different perspectives.
△ Less
Submitted 5 July, 2018; v1 submitted 7 March, 2018;
originally announced March 2018.
-
Machine Learning Methods for Data Association in Multi-Object Tracking
Authors:
Patrick Emami,
Panos M. Pardalos,
Lily Elefteriadou,
Sanjay Ranka
Abstract:
Data association is a key step within the multi-object tracking pipeline that is notoriously challenging due to its combinatorial nature. A popular and general way to formulate data association is as the NP-hard multidimensional assignment problem (MDAP). Over the last few years, data-driven approaches to assignment have become increasingly prevalent as these techniques have started to mature. We…
▽ More
Data association is a key step within the multi-object tracking pipeline that is notoriously challenging due to its combinatorial nature. A popular and general way to formulate data association is as the NP-hard multidimensional assignment problem (MDAP). Over the last few years, data-driven approaches to assignment have become increasingly prevalent as these techniques have started to mature. We focus this survey solely on learning algorithms for the assignment step of multi-object tracking, and we attempt to unify various methods by highlighting their connections to linear assignment as well as to the MDAP. First, we review probabilistic and end-to-end optimization approaches to data association, followed by methods that learn association affinities from data. We then compare the performance of the methods presented in this survey, and conclude by discussing future research directions.
△ Less
Submitted 25 August, 2020; v1 submitted 19 February, 2018;
originally announced February 2018.
-
Global Model Interpretation via Recursive Partitioning
Authors:
Chengliang Yang,
Anand Rangarajan,
Sanjay Ranka
Abstract:
In this work, we propose a simple but effective method to interpret black-box machine learning models globally. That is, we use a compact binary tree, the interpretation tree, to explicitly represent the most important decision rules that are implicitly contained in the black-box machine learning models. This tree is learned from the contribution matrix which consists of the contributions of input…
▽ More
In this work, we propose a simple but effective method to interpret black-box machine learning models globally. That is, we use a compact binary tree, the interpretation tree, to explicitly represent the most important decision rules that are implicitly contained in the black-box machine learning models. This tree is learned from the contribution matrix which consists of the contributions of input variables to predicted scores for each single prediction. To generate the interpretation tree, a unified process recursively partitions the input variable space by maximizing the difference in the average contribution of the split variable between the divided spaces. We demonstrate the effectiveness of our method in diagnosing machine learning models on multiple tasks. Also, it is useful for new knowledge discovery as such insights are not easily identifiable when only looking at single predictions. In general, our work makes it easier and more efficient for human beings to understand machine learning models.
△ Less
Submitted 23 May, 2018; v1 submitted 10 February, 2018;
originally announced February 2018.
-
Spatial Scaling of Satellite Soil Moisture using Temporal Correlations and Ensemble Learning
Authors:
Subit Chakrabarti,
Jasmeet Judge,
Tara Bongiovanni,
Anand Rangarajan,
Sanjay Ranka
Abstract:
A novel algorithm is developed to downscale soil moisture (SM), obtained at satellite scales of 10-40 km by utilizing its temporal correlations to historical auxiliary data at finer scales. Including such correlations drastically reduces the size of the training set needed, accounts for time-lagged relationships, and enables downscaling even in the presence of short gaps in the auxiliary data. The…
▽ More
A novel algorithm is developed to downscale soil moisture (SM), obtained at satellite scales of 10-40 km by utilizing its temporal correlations to historical auxiliary data at finer scales. Including such correlations drastically reduces the size of the training set needed, accounts for time-lagged relationships, and enables downscaling even in the presence of short gaps in the auxiliary data. The algorithm is based upon bagged regression trees (BRT) and uses correlations between high-resolution remote sensing products and SM observations. The algorithm trains multiple regression trees and automatically chooses the trees that generate the best downscaled estimates. The algorithm was evaluated using a multi-scale synthetic dataset in north central Florida for two years, including two growing seasons of corn and one growing season of cotton per year. The time-averaged error across the region was found to be 0.01 $\mathrm{m}^3/\mathrm{m}^3$, with a standard deviation of 0.012 $\mathrm{m}^3/\mathrm{m}^3$ when 0.02% of the data were used for training in addition to temporal correlations from the past seven days, and all available data from the past year. The maximum spatially averaged errors obtained using this algorithm in downscaled SM were 0.005 $\mathrm{m}^3/\mathrm{m}^3$, for pixels with cotton land-cover. When land surface temperature~(LST) on the day of downscaling was not included in the algorithm to simulate "data gaps", the spatially averaged error increased minimally by 0.015 $\mathrm{m}^3/\mathrm{m}^3$ when LST is unavailable on the day of downscaling. The results indicate that the BRT-based algorithm provides high accuracy for downscaling SM using complex non-linear spatio-temporal correlations, under heterogeneous micro meteorological conditions.
△ Less
Submitted 21 January, 2016;
originally announced January 2016.
-
Disaggregation of SMAP L3 Brightness Temperatures to 9km using Kernel Machines
Authors:
Subit Chakrabarti,
Tara Bongiovanni,
Jasmeet Judge,
Anand Rangarajan,
Sanjay Ranka
Abstract:
In this study, a machine learning algorithm is used for disaggregation of SMAP brightness temperatures (T$_{\textrm{B}}$) from 36km to 9km. It uses image segmentation to cluster the study region based on meteorological and land cover similarity, followed by a support vector machine based regression that computes the value of the disaggregated T$_{\textrm{B}}$ at all pixels. High resolution remote…
▽ More
In this study, a machine learning algorithm is used for disaggregation of SMAP brightness temperatures (T$_{\textrm{B}}$) from 36km to 9km. It uses image segmentation to cluster the study region based on meteorological and land cover similarity, followed by a support vector machine based regression that computes the value of the disaggregated T$_{\textrm{B}}$ at all pixels. High resolution remote sensing products such as land surface temperature, normalized difference vegetation index, enhanced vegetation index, precipitation, soil texture, and land-cover were used for disaggregation. The algorithm was implemented in Iowa, United States, from April to July 2015, and compared with the SMAP L3_SM_AP T$_{\textrm{B}}$ product at 9km. It was found that the disaggregated T$_{\textrm{B}}$ were very similar to the SMAP-T$_{\textrm{B}}$ product, even for vegetated areas with a mean difference $\leq$ 5K. However, the standard deviation of the disaggregation was lower by 7K than that of the AP product. The probability density functions of the disaggregated T$_{\textrm{B}}$ were similar to the SMAP-T$_{\textrm{B}}$. The results indicate that this algorithm may be used for disaggregating T$_{\textrm{B}}$ using complex non-linear correlations on a grid.
△ Less
Submitted 10 February, 2016; v1 submitted 20 January, 2016;
originally announced January 2016.
-
Downscaling Microwave Brightness Temperatures Using Self Regularized Regressive Models
Authors:
Subit Chakrabarti,
Jasmeet Judge,
Anand Rangarajan,
Sanjay Ranka
Abstract:
A novel algorithm is proposed to downscale microwave brightness temperatures ($\mathrm{T_B}$), at scales of 10-40 km such as those from the Soil Moisture Active Passive mission to a resolution meaningful for hydrological and agricultural applications. This algorithm, called Self-Regularized Regressive Models (SRRM), uses auxiliary variables correlated to $\mathrm{T_B}$ along-with a limited set of…
▽ More
A novel algorithm is proposed to downscale microwave brightness temperatures ($\mathrm{T_B}$), at scales of 10-40 km such as those from the Soil Moisture Active Passive mission to a resolution meaningful for hydrological and agricultural applications. This algorithm, called Self-Regularized Regressive Models (SRRM), uses auxiliary variables correlated to $\mathrm{T_B}$ along-with a limited set of \textit{in-situ} SM observations, which are converted to high resolution $\mathrm{T_B}$ observations using biophysical models. It includes an information-theoretic clustering step based on all auxiliary variables to identify areas of similarity, followed by a kernel regression step that produces downscaled $\mathrm{T_B}$. This was implemented on a multi-scale synthetic data-set over NC-Florida for one year. An RMSE of 5.76~K with standard deviation of 2.8~k was achieved during the vegetated season and an RMSE of 1.2~K with a standard deviation of 0.9~K during periods of no vegetation.
△ Less
Submitted 30 January, 2015;
originally announced January 2015.
-
Disaggregation of Remotely Sensed Soil Moisture in Heterogeneous Landscapes using Holistic Structure based Models
Authors:
Subit Chakrabarti,
Jasmeet Judge,
Anand Rangarajan,
Sanjay Ranka
Abstract:
In this study, a novel machine learning algorithm is presented for disaggregation of satellite soil moisture (SM) based on self-regularized regressive models (SRRM) using high-resolution correlated information from auxiliary sources. It includes regularized clustering that assigns soft memberships to each pixel at fine-scale followed by a kernel regression that computes the value of the desired va…
▽ More
In this study, a novel machine learning algorithm is presented for disaggregation of satellite soil moisture (SM) based on self-regularized regressive models (SRRM) using high-resolution correlated information from auxiliary sources. It includes regularized clustering that assigns soft memberships to each pixel at fine-scale followed by a kernel regression that computes the value of the desired variable at all pixels. Coarse-scale remotely sensed SM were disaggregated from 10km to 1km using land cover, precipitation, land surface temperature, leaf area index, and in-situ observations of SM. This algorithm was evaluated using multi-scale synthetic observations in NC Florida for heterogeneous agricultural land covers. It was found that the root mean square error (RMSE) for 96% of the pixels was less than 0.02 $m^3/m^3$. The clusters generated represented the data well and reduced the RMSE by upto 40% during periods of high heterogeneity in land-cover and meteorological conditions. The Kullback Leibler divergence (KLD) between the true SM and the disaggregated estimates is close to 0, for both vegetated and baresoil landcovers. The disaggregated estimates were compared to those generated by the Principle of Relevant Information (PRI) method. The RMSE for the PRI disaggregated estimates is higher than the RMSE for the SRRM on each day of the season. The KLD of the disaggregated estimates generated by the SRRM is at least four orders of magnitude lower than those for the PRI disaggregated estimates, while the computational time needed was reduced by three times. The results indicate that the SRRM can be used for disaggregating SM with complex non-linear correlations on a grid with high accuracy.
△ Less
Submitted 20 January, 2016; v1 submitted 30 January, 2015;
originally announced January 2015.
-
Data-driven Co-clustering Model of Internet Usage in Large Mobile Societies
Authors:
Saeed Moghaddam,
Ahmed Helmy,
Sanjay Ranka,
Manas Somaiya
Abstract:
Design and simulation of future mobile networks will center around human interests and behavior. We propose a design paradigm for mobile networks driven by realistic models of users' on-line behavior, based on mining of billions of wireless-LAN records. We introduce a systematic method for large-scale multi-dimensional coclustering of web activity for thousands of mobile users at 79 locations. We…
▽ More
Design and simulation of future mobile networks will center around human interests and behavior. We propose a design paradigm for mobile networks driven by realistic models of users' on-line behavior, based on mining of billions of wireless-LAN records. We introduce a systematic method for large-scale multi-dimensional coclustering of web activity for thousands of mobile users at 79 locations. We find surprisingly that users can be consistently modeled using ten clusters with disjoint profiles. Access patterns from multiple locations show differential user behavior. This is the first study to obtain such detailed results for mobile Internet usage.
△ Less
Submitted 27 May, 2010;
originally announced May 2010.