-
Physics-informed deep learning for infectious disease forecasting
Authors:
Ying Qian,
Kui Zhang,
Éric Marty,
Avranil Basu,
Eamon B. O'Dea,
Xianqiao Wang,
Spencer Fox,
Pejman Rohani,
John M. Drake,
He Li
Abstract:
Accurate forecasting of contagious diseases is critical for public health policymaking and pandemic preparedness. We propose a new infectious disease forecasting model based on physics-informed neural networks (PINNs), an emerging scientific machine learning approach. By embedding a compartmental model into the loss function, our method integrates epidemiological theory with data, helping to prevent model overfitting. We further enhance the model with a sub-network that accounts for covariates such as mobility and cumulative vaccine doses, which influence the transmission rate. Using state-level COVID-19 data from California, we demonstrate that the PINN model accurately predicts cases, deaths, and hospitalizations, aligning well with existing benchmarks. Notably, the PINN model outperforms naive baseline forecasts and several sequence deep learning models, including Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRUs), and Transformers. It also achieves performance comparable to a sophisticated Gaussian infection state forecasting model that combines compartmental dynamics, a data observation model, and parameter regression. However, the PINN model features a simpler structure and is easier to implement. In summary, we systematically evaluate the PINN model's ability to forecast infectious disease dynamics, demonstrating its potential as an efficient computational tool to strengthen forecasting capabilities.
Submitted 29 April, 2025; v1 submitted 16 January, 2025;
originally announced January 2025.
-
Automating Deformable Gasket Assembly
Authors:
Simeon Adebola,
Tara Sadjadpour,
Karim El-Refai,
Will Panitch,
Zehan Ma,
Roy Lin,
Tianshuang Qiu,
Shreya Ganti,
Charlotte Le,
Jaimyn Drake,
Ken Goldberg
Abstract:
In Gasket Assembly, a deformable gasket must be aligned and pressed into a narrow channel. This task is common for sealing surfaces in the manufacturing of automobiles, appliances, electronics, and other products. Gasket Assembly is a long-horizon, high-precision task: the gasket must align with the channel and be fully pressed in to achieve a secure fit. To compare approaches, we present four methods for Gasket Assembly: one policy from deep imitation learning and three procedural algorithms. We evaluate these methods with 100 physical trials. Results suggest that the Binary+ algorithm succeeds in 10/10 trials on the straight channel, whereas the learned policy based on 250 human teleoperated demonstrations succeeds in 8/10 trials and is significantly slower. Code, CAD models, videos, and data can be found at https://berkeleyautomation.github.io/robot-gasket/
Submitted 22 August, 2024;
originally announced August 2024.
-
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Authors:
Alexander Khazatsky,
Karl Pertsch,
Suraj Nair,
Ashwin Balakrishna,
Sudeep Dasari,
Siddharth Karamcheti,
Soroush Nasiriany,
Mohan Kumar Srirama,
Lawrence Yunliang Chen,
Kirsty Ellis,
Peter David Fagan,
Joey Hejna,
Masha Itkina,
Marion Lepert,
Yecheng Jason Ma,
Patrick Tree Miller,
Jimmy Wu,
Suneel Belkhale,
Shivin Dass,
Huy Ha,
Arhan Jain,
Abraham Lee,
Youngwoon Lee,
Marius Memmel,
Sungjae Park
, et al. (76 additional authors not shown)
Abstract:
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.
Submitted 22 April, 2025; v1 submitted 19 March, 2024;
originally announced March 2024.
-
A Touch, Vision, and Language Dataset for Multimodal Alignment
Authors:
Letian Fu,
Gaurav Datta,
Huang Huang,
William Chung-Ho Panitch,
Jaimyn Drake,
Joseph Ortiz,
Mustafa Mukadam,
Mike Lambeta,
Roberto Calandra,
Ken Goldberg
Abstract:
Touch is an important sensing modality for humans, but it has not yet been incorporated into a multimodal generative language model. This is partially due to the difficulty of obtaining natural language labels for tactile data and the complexity of aligning tactile readings with both visual observations and language descriptions. As a step towards bridging that gap, this work introduces a new dataset of 44K in-the-wild vision-touch pairs, with English language labels annotated by humans (10%) and textual pseudo-labels from GPT-4V (90%). We use this dataset to train a vision-language-aligned tactile encoder for open-vocabulary classification and a touch-vision-language (TVL) model for text generation using the trained encoder. Results suggest that by incorporating touch, the TVL model improves (+29% classification accuracy) touch-vision-language alignment over existing models trained on any pair of those modalities. Although only a small fraction of the dataset is human-labeled, the TVL model demonstrates improved visual-tactile understanding over GPT-4V (+12%) and open-source vision-language models (+32%) on a new touch-vision understanding benchmark. Code and data: https://tactile-vlm.github.io.
Submitted 20 February, 2024;
originally announced February 2024.
-
FogROS2-Config: Optimizing Latency and Cost for Multi-Cloud Robot Applications
Authors:
Kaiyuan Chen,
Kush Hari,
Rohil Khare,
Charlotte Le,
Trinity Chung,
Jaimyn Drake,
Jeffrey Ichnowski,
John Kubiatowicz,
Ken Goldberg
Abstract:
Cloud service providers offer over 50,000 distinct and dynamically changing cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that trade off latency and cost. Because it is infeasible to try every hardware configuration, FogROS2-Config quickly samples and tests a small set of edge-case servers. We evaluate FogROS2-Config on three robotics application tasks: visual SLAM, grasp planning, and motion planning. FogROS2-Config can reduce the cost by up to 20x. By comparing against a Pareto frontier for cost and latency, obtained by running the application task on feasible server configurations, we evaluate cost and latency models and confirm that FogROS2-Config selects efficient hardware configurations to balance cost and latency.
Submitted 13 May, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Authors:
Open X-Embodiment Collaboration,
Abby O'Neill,
Abdul Rehman,
Abhinav Gupta,
Abhiram Maddukuri,
Abhishek Gupta,
Abhishek Padalkar,
Abraham Lee,
Acorn Pooley,
Agrim Gupta,
Ajay Mandlekar,
Ajinkya Jain,
Albert Tung,
Alex Bewley,
Alex Herzog,
Alex Irpan,
Alexander Khazatsky,
Anant Rai,
Anchit Gupta,
Andrew Wang,
Andrey Kolobov,
Anikait Singh,
Animesh Garg,
Aniruddha Kembhavi,
Annie Xie
, et al. (269 additional authors not shown)
Abstract:
Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.
Submitted 14 May, 2025; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Simulation-based Inference for Exoplanet Atmospheric Retrieval: Insights from winning the Ariel Data Challenge 2023 using Normalizing Flows
Authors:
Mayeul Aubin,
Carolina Cuesta-Lazaro,
Ethan Tregidga,
Javier Viaña,
Cecilia Garraffo,
Iouli E. Gordon,
Mercedes López-Morales,
Robert J. Hargreaves,
Vladimir Yu. Makhnev,
Jeremy J. Drake,
Douglas P. Finkbeiner,
Phillip Cargile
Abstract:
Advancements in space telescopes have opened new avenues for gathering vast amounts of data on exoplanet atmosphere spectra. However, accurately extracting chemical and physical properties from these spectra poses significant challenges due to the non-linear nature of the underlying physics.
This paper presents novel machine learning models developed by the AstroAI team for the Ariel Data Challenge 2023, where one of the models secured the top position among 293 competitors. Leveraging Normalizing Flows, our models predict the posterior probability distribution of atmospheric parameters under different atmospheric assumptions.
Moreover, we introduce an alternative model that exhibits higher performance potential than the winning model, despite scoring lower in the challenge. These findings highlight the need to reevaluate the evaluation metric and prompt further exploration of more efficient and accurate approaches for exoplanet atmosphere spectra analysis.
Finally, we present recommendations to enhance the challenge and models, providing valuable insights for future applications on real observational data. These advancements pave the way for more effective and timely analysis of exoplanet atmospheric properties, advancing our understanding of these distant worlds.
Submitted 17 September, 2023;
originally announced September 2023.
-
Electromagnets Under the Table: an Unobtrusive Magnetic Navigation System for Microsurgery
Authors:
Adam Schonewille,
Changyan He,
Cameron Forbrigger,
Nancy Wu,
James Drake,
Thomas Looi,
Eric Diller
Abstract:
Miniature magnetic tools have the potential to enable minimally invasive surgical techniques to be applied to space-restricted surgical procedures in areas such as neurosurgery. However, typical magnetic navigation systems, which create the magnetic fields to drive such tools, either cannot generate large enough fields or surround the patient in a way that obstructs the surgeon's access. This paper introduces the design of a magnetic navigation system with eight electromagnets arranged completely under the operating table. This arrangement gives the system maximal workspace accessibility and allows the patient to lie on the top surface of the system without any constraints. The optimal geometric layout found for the electromagnets maximizes the field strength and uniformity over a reasonable neurosurgical operating volume. The system can generate non-uniform magnetic fields up to 38 mT along the x and y axes and 47 mT along the z axis at a working distance of 120 mm away from the actuation system workbench, deep enough to deploy magnetic microsurgical tools in the brain. The forces that can be exerted on millimeter-scale magnets used in prototype neurosurgical tools are validated experimentally. Due to its large workspace, this system could be used to control milli-robots in a variety of surgical applications.
Submitted 23 August, 2023;
originally announced August 2023.
-
MARTA Reach: Piloting an On-Demand Multimodal Transit System in Atlanta
Authors:
Pascal Van Hentenryck,
Connor Riley,
Anthony Trasatti,
Hongzhao Guan,
Tejas Santanam,
Jorge A. Huertas,
Kevin Dalmeijer,
Kari Watkins,
Juwon Drake,
Samson Baskin
Abstract:
This paper reports on the results of the six-month pilot MARTA Reach, which aimed to demonstrate the potential value of On-Demand Multimodal Transit Systems (ODMTS) in the city of Atlanta, Georgia. ODMTS take a transit-centric view by integrating on-demand services and traditional fixed routes in order to address the first/last mile problem. ODMTS combine fixed routes and on-demand shuttle services by design (not as an afterthought) into a transit system that offers a door-to-door multimodal service with fully integrated operations and fare structure. The paper fills a knowledge gap, i.e., the understanding of the impact, benefits, and challenges of deploying ODMTS in a city as complex as Atlanta, Georgia. The pilot was deployed in four different zones with limited transit options, and used on-demand shuttles integrated with the overall transit system to address the first/last mile problem. The paper describes the design and operations of the pilot, and presents the results in terms of ridership, quality of service, trip purposes, alternative modes of transportation, multimodal nature of trips, challenges encountered, and cost estimates. The main findings of the pilot are that Reach offered a highly valued service that performed a large number of trips that would have otherwise been served by ride-hailing companies, taxis, or personal cars. Moreover, the vast majority of Reach trips were multimodal, with connections to rail being most prominent.
Submitted 23 September, 2023; v1 submitted 4 August, 2023;
originally announced August 2023.
-
Natural SQL: Making SQL Easier to Infer from Natural Language Specifications
Authors:
Yujian Gan,
Xinyun Chen,
Jinxia Xie,
Matthew Purver,
John R. Woodward,
John Drake,
Qiaofu Zhang
Abstract:
Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, which are usually hard to find counterparts for in the text descriptions; (2) removing the need for nested subqueries and set operators; and (3) making schema linking easier by reducing the required number of schema items. On Spider, a challenging text-to-SQL benchmark that contains complex and nested SQL queries, we demonstrate that NatSQL outperforms other IRs, and significantly improves the performance of several previous SOTA models. Furthermore, for existing models that do not support executable SQL generation, NatSQL easily enables them to generate executable SQL queries, and achieves the new state-of-the-art execution accuracy.
Submitted 10 September, 2021;
originally announced September 2021.
-
Blueprint: Cyberinfrastructure Center of Excellence
Authors:
Ewa Deelman,
Anirban Mandal,
Angela P. Murillo,
Jarek Nabrzyski,
Valerio Pascucci,
Robert Ricci,
Ilya Baldin,
Susan Sons,
Laura Christopherson,
Charles Vardeman,
Rafael Ferreira da Silva,
Jane Wyngaard,
Steve Petruzza,
Mats Rynge,
Karan Vahi,
Wendy R. Whitcup,
Josh Drake,
Erik Scott
Abstract:
In 2018, NSF funded an effort to pilot a Cyberinfrastructure Center of Excellence (CI CoE or Center) that would serve the cyberinfrastructure (CI) needs of the NSF Major Facilities (MFs) and large projects with advanced CI architectures. The goal of the CI CoE Pilot project (Pilot) effort was to develop a model and a blueprint for such a CoE by engaging with the MFs, understanding their CI needs, understanding the contributions the MFs are making to the CI community, and exploring opportunities for building a broader CI community. This document summarizes the results of community engagements conducted during the first two years of the project and describes the identified CI needs of the MFs. To better understand MFs' CI, the Pilot has developed and validated a model of the MF data lifecycle that follows the data generation and management within a facility and gained an understanding of how this model captures the fundamental stages that the facilities' data passes through from the scientific instruments to the principal investigators and their teams, to the broader collaborations and the public. The Pilot also aimed to understand what CI workforce development challenges the MFs face while designing, constructing, and operating their CI and what solutions they are exploring and adopting within their projects. Based on the needs of the MFs in the data lifecycle and workforce development areas, this document outlines a blueprint for a CI CoE that will learn about and share the CI solutions designed, developed, and/or adopted by the MFs, provide expertise to the largest NSF projects with advanced and complex CI architectures, and foster a community of CI practitioners and researchers.
Submitted 6 March, 2021;
originally announced March 2021.
-
Anatomical Mesh-Based Virtual Fixtures for Surgical Robots
Authors:
Zhaoshuo Li,
Alex Gordon,
Thomas Looi,
James Drake,
Christopher Forrest,
Russell H. Taylor
Abstract:
This paper presents a dynamic constraint formulation to provide protective virtual fixtures of 3D anatomical structures from polygon mesh representations. The proposed approach can anisotropically limit the tool motion of surgical robots without any assumption of the local anatomical shape close to the tool. Using a bounded search strategy and Principle Directed tree, the proposed system can run efficiently at 180 Hz for a mesh object containing 989,376 triangles and 493,460 vertices. The proposed algorithm has been validated in both simulation and skull cutting experiments. The skull cutting experiment setup uses a novel piezoelectric bone cutting tool designed for the da Vinci research kit. The result shows that the virtual fixture assisted teleoperation has statistically significant improvements in the cutting path accuracy and penetration depth control. The code has been made publicly available at https://github.com/mli0603/PolygonMeshVirtualFixture.
Submitted 28 July, 2020; v1 submitted 3 June, 2020;
originally announced June 2020.
-
Effective reinforcement learning based local search for the maximum k-plex problem
Authors:
Yan Jin,
John H. Drake,
Una Benlic,
Kun He
Abstract:
The maximum k-plex problem is a computationally complex problem, which emerged from graph-theoretic social network studies. This paper presents an effective hybrid local search for solving the maximum k-plex problem that combines the recently proposed breakout local search algorithm with a reinforcement learning strategy. The proposed approach includes distinguishing features such as: a unified neighborhood search based on the swapping operator, a distance-and-quality reward for actions and a new parameter control mechanism based on reinforcement learning. Extensive experiments for the maximum k-plex problem (k = 2, 3, 4, 5) on 80 benchmark instances from the second DIMACS Challenge demonstrate that the proposed approach can match the best-known results from the literature in all but four problem instances. In addition, the proposed algorithm is able to find 32 new best solutions.
Submitted 13 March, 2019;
originally announced March 2019.
-
Real-Time Data Mining of Massive Data Streams from Synoptic Sky Surveys
Authors:
S. G. Djorgovski,
M. J. Graham,
C. Donalek,
A. A. Mahabal,
A. J. Drake,
M. Turmon,
T. Fuchs
Abstract:
The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optimal deployment of limited assets. Modern astronomy presents a variety of such phenomena in the form of transient events in digital synoptic sky surveys, including cosmic explosions (supernovae, gamma ray bursts), relativistic phenomena (black hole formation, jets), potentially hazardous asteroids, etc. We have been developing a set of machine learning tools to detect, classify, and plan a response to transient events for astronomy applications, using the Catalina Real-time Transient Survey (CRTS) as a scientific and methodological testbed. The ability to respond rapidly to the potentially most interesting events is a key bottleneck that limits the scientific returns from the current and anticipated synoptic sky surveys. Similar challenges arise in other contexts, from environmental monitoring using sensor networks to autonomous spacecraft systems. Given the exponential growth of data rates and the time-critical response, we need a fully automated and robust approach. We describe the results obtained to date and the possible future developments.
Submitted 17 January, 2016;
originally announced January 2016.
-
Patterns of Ship-borne Species Spread: A Clustering Approach for Risk Assessment and Management of Non-indigenous Species Spread
Authors:
J Xu,
TL Wickramarathne,
EK Grey,
K Steinhaeuser,
R Keller,
J Drake,
N Chawla,
DM Lodge
Abstract:
The spread of non-indigenous species (NIS) through the global shipping network (GSN) has enormous ecological and economic costs throughout the world. Previous attempts at quantifying NIS invasions have mostly taken "bottom-up" approaches that eventually require the use of multiple simplifying assumptions due to insufficiency and/or uncertainty of available data. By modeling implicit species exchanges via a graph abstraction that we refer to as the Species Flow Network (SFN), a different approach that exploits the power of network science methods in extracting knowledge from largely incomplete data is presented. Here, coarse-grained species flow dynamics are studied via a graph clustering approach that decomposes the SFN into clusters of ports and inter-cluster connections. With this decomposition of ports in place, NIS flow among clusters can be very efficiently reduced by enforcing NIS management on a few chosen inter-cluster connections. Furthermore, efficient NIS management strategies for species exchanges within a cluster (often difficult due to the higher rate of travel and number of pathways) are then derived in conjunction with ecological and environmental aspects that govern species establishment. The benefits of the presented approach include robustness to data uncertainties, implicit incorporation of "stepping-stone" spread of invasive species, and decoupling of species spread and establishment risk estimation. Our analysis of a multi-year (1997-2006) GSN dataset using the presented approach shows the existence of a few large clusters of ports with higher intra-cluster species flow that are fairly stable over time. Furthermore, detailed investigations were carried out on vessel types, ports, and inter-cluster connections. Finally, our observations are discussed in the context of known NIS invasions, and future research directions are also presented.
Submitted 21 January, 2014;
originally announced January 2014.