-
AERO: An autonomous platform for continuous research
Authors:
Valérie Hayot-Sasson,
Abby Stevens,
Nicholson Collier,
Sudershan Sridhar,
Kyle Conroy,
J. Gregory Pauloski,
Yadu Babuji,
Maxime Gonthier,
Nathaniel Hudson,
Dante D. Sanchez-Gallegos,
Ian Foster,
Jonathan Ozik,
Kyle Chard
Abstract:
The COVID-19 pandemic highlighted the need for new data infrastructure, as epidemiologists and public health workers raced to harness rapidly evolving data, analytics, and infrastructure in support of cross-sector investigations. To meet this need, we developed AERO, an automated research and data sharing platform for continuous, distributed, and multi-disciplinary collaboration. In this paper, we describe the AERO design and how it supports the automatic ingestion, validation, and transformation of monitored data into a form suitable for analysis; the automated execution of analyses on this data; and the sharing of data among different entities. We also describe how our AERO implementation leverages capabilities provided by the Globus platform and GitHub for automation, distributed execution, data sharing, and authentication. We present results obtained with an instance of AERO running two public health surveillance applications and demonstrate benchmarking results with a synthetic application, all of which are publicly available for testing.
Submitted 23 May, 2025;
originally announced May 2025.
-
Advancing calibration for stochastic agent-based models in epidemiology with Stein variational inference and Gaussian process surrogates
Authors:
Connor Robertson,
Cosmin Safta,
Nicholson Collier,
Jonathan Ozik,
Jaideep Ray
Abstract:
Accurate calibration of stochastic agent-based models (ABMs) in epidemiology is crucial to make them useful in public health policy decisions and interventions. Traditional calibration methods, e.g., Markov Chain Monte Carlo (MCMC), that yield a probability density function for the parameters being calibrated, are often computationally expensive. When applied to ABMs which are highly parametrized, the calibration process becomes computationally infeasible. This paper investigates the utility of Stein Variational Inference (SVI) as an alternative calibration technique for stochastic epidemiological ABMs approximated by Gaussian process (GP) surrogates. SVI leverages gradient information to iteratively update a set of particles in the space of parameters being calibrated, offering potential advantages in scalability and efficiency for high-dimensional ABMs. The ensemble of particles yields a joint probability density function for the parameters and serves as the calibration. We compare the performance of SVI and MCMC in calibrating CityCOVID, a stochastic epidemiological ABM, focusing on predictive accuracy and calibration effectiveness. Our results demonstrate that SVI maintains predictive accuracy and calibration effectiveness comparable to MCMC, making it a viable alternative for complex epidemiological models. We also present the practical challenges of using a gradient-based calibration such as SVI which include careful tuning of hyperparameters and monitoring of the particle dynamics.
Submitted 26 February, 2025;
originally announced February 2025.
-
Bayesian calibration of stochastic agent based model via random forest
Authors:
Connor Robertson,
Cosmin Safta,
Nicholson Collier,
Jonathan Ozik,
Jaideep Ray
Abstract:
Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high dimensional calibration can be computationally prohibitive. This paper presents a random forest based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results and their predictive performance is analyzed showing improved performance with a reduction in computation.
Submitted 27 June, 2024;
originally announced June 2024.
-
Towards Improved Uncertainty Quantification of Stochastic Epidemic Models Using Sequential Monte Carlo
Authors:
Arindam Fadikar,
Abby Stevens,
Nicholson Collier,
Kok Ben Toh,
Olga Morozova,
Anna Hotton,
Jared Clark,
David Higdon,
Jonathan Ozik
Abstract:
Sequential Monte Carlo (SMC) algorithms represent a suite of robust computational methodologies utilized for state estimation and parameter inference within dynamical systems, particularly in real-time or online environments where data arrives sequentially over time. In this research endeavor, we propose an integrated framework that combines a stochastic epidemic simulator with a sequential importance sampling (SIS) scheme to dynamically infer model parameters, which evolve due to social as well as biological processes throughout the progression of an epidemic outbreak and are also influenced by evolving data measurement bias. Through iterative updates of a set of weighted simulated trajectories based on observed data, this framework enables the estimation of posterior distributions for these parameters, thereby capturing their temporal variability and associated uncertainties. Through simulation studies, we showcase the efficacy of SMC in accurately tracking the evolving dynamics of epidemics while appropriately accounting for uncertainties. Moreover, we delve into practical considerations and challenges inherent in implementing SMC for parameter estimation within dynamic epidemiological settings, areas where the substantial computational capabilities of high-performance computing resources can be usefully brought to bear.
Submitted 6 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
NSF RESUME HPC Workshop: High-Performance Computing and Large-Scale Data Management in Service of Epidemiological Modeling
Authors:
Abby Stevens,
Jonathan Ozik,
Kyle Chard,
Jaline Gerardin,
Justin M. Wozniak
Abstract:
The NSF-funded Robust Epidemic Surveillance and Modeling (RESUME) project successfully convened a workshop entitled "High-performance computing and large-scale data management in service of epidemiological modeling" at the University of Chicago on May 1-2, 2023. This was part of a series of workshops designed to foster sustainable and interdisciplinary co-design for predictive intelligence and pandemic prevention. The event brought together 31 experts in epidemiological modeling, high-performance computing (HPC), HPC workflows, and large-scale data management to develop a shared vision for capabilities needed for computational epidemiology to better support pandemic prevention. Through the workshop, participants identified key areas in which HPC capabilities could be used to improve epidemiological modeling, particularly in supporting public health decision-making, with an emphasis on HPC workflows, data integration, and HPC access. The workshop explored nascent HPC workflow and large-scale data management approaches currently in use for epidemiological modeling and sought to draw from approaches used in other domains to determine which practices could be best adapted for use in epidemiological modeling. This report documents the key findings and takeaways from the workshop.
Submitted 8 August, 2023;
originally announced August 2023.
-
PSI/J: A Portable Interface for Submitting, Monitoring, and Managing Jobs
Authors:
Mihael Hategan-Marandiuc,
Andre Merzky,
Nicholson Collier,
Ketan Maheshwari,
Jonathan Ozik,
Matteo Turilli,
Andreas Wilke,
Justin M. Wozniak,
Kyle Chard,
Ian Foster,
Rafael Ferreira da Silva,
Shantenu Jha,
Daniel Laney
Abstract:
It is generally desirable for high-performance computing (HPC) applications to be portable between HPC systems, for example to make use of more performant hardware, make effective use of allocations, and to co-locate compute jobs with large datasets. Unfortunately, moving scientific applications between HPC systems is challenging for various reasons, most notably that HPC systems have different HPC schedulers. We introduce PSI/J, a job management abstraction API intended to simplify the construction of software components and applications that are portable over various HPC scheduler implementations. We argue both that such a system is necessary and that no viable alternative currently exists. We analyze similar notable APIs and attempt to determine the factors that influenced their evolution and adoption by the HPC community. We base the design of PSI/J on that analysis. We describe how PSI/J has been integrated in three workflow systems and one application, and also show via experiments that PSI/J imposes minimal overhead.
Submitted 20 September, 2023; v1 submitted 15 July, 2023;
originally announced July 2023.
-
Trajectory-oriented optimization of stochastic epidemiological models
Authors:
Arindam Fadikar,
Mickael Binois,
Nicholson Collier,
Abby Stevens,
Kok Ben Toh,
Jonathan Ozik
Abstract:
Epidemiological models must be calibrated to ground truth for downstream tasks such as producing forward projections or running what-if scenarios. The meaning of calibration changes in case of a stochastic model since output from such a model is generally described via an ensemble or a distribution. Each member of the ensemble is usually mapped to a random number seed (explicitly or implicitly). With the goal of finding not only the input parameter settings but also the random seeds that are consistent with the ground truth, we propose a class of Gaussian process (GP) surrogates along with an optimization strategy based on Thompson sampling. This Trajectory Oriented Optimization (TOO) approach produces actual trajectories close to the empirical observations instead of a set of parameter settings where only the mean simulation behavior matches with the ground truth.
Submitted 13 September, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Developing Distributed High-performance Computing Capabilities of an Open Science Platform for Robust Epidemic Analysis
Authors:
Nicholson Collier,
Justin M. Wozniak,
Abby Stevens,
Yadu Babuji,
Mickaël Binois,
Arindam Fadikar,
Alexandra Würth,
Kyle Chard,
Jonathan Ozik
Abstract:
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community have forged new relationships among domain experts, mathematical modelers, and scientific computing specialists. Computationally, however, it also revealed critical gaps in the ability of researchers to exploit advanced computing systems. These challenging areas include gaining access to scalable computing systems, porting models and workflows to new systems, sharing data of varying sizes, and producing results that can be reproduced and validated by others. Informed by our team's work in supporting public health decision makers during the COVID-19 pandemic and by the identified capability gaps in applying high-performance computing (HPC) to the modeling of complex social systems, we present the goals, requirements, and initial implementation of OSPREY, an open science platform for robust epidemic analysis. The prototype implementation demonstrates an integrated, algorithm-driven HPC workflow architecture, coordinating tasks across federated HPC resources, with robust, secure and automated access to each of the resources. We demonstrate scalable and fault-tolerant task execution, an asynchronous API to support fast time-to-solution algorithms, an inclusive, multi-language approach, and efficient wide-area data management. The example OSPREY code is made available on a public repository.
Submitted 10 May, 2023; v1 submitted 27 April, 2023;
originally announced April 2023.
-
A portfolio approach to massively parallel Bayesian optimization
Authors:
Mickael Binois,
Nicholson Collier,
Jonathan Ozik
Abstract:
One way to reduce the time of conducting optimization studies is to evaluate designs in parallel rather than just one-at-a-time. For expensive-to-evaluate black-boxes, batch versions of Bayesian optimization have been proposed. They work by building a surrogate model of the black-box to simultaneously select multiple designs via an infill criterion. Still, despite the increased availability of computing resources that enable large-scale parallelism, the strategies that work for selecting a few tens of parallel designs for evaluations become limiting due to the complexity of selecting more designs. It is even more crucial when the black-box is noisy, necessitating more evaluations as well as repeating experiments. Here we propose a scalable strategy that can keep up with massive batching natively, focused on the exploration/exploitation trade-off and a portfolio allocation. We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance.
Submitted 3 April, 2023; v1 submitted 18 October, 2021;
originally announced October 2021.
-
A Community Roadmap for Scientific Workflows Research and Development
Authors:
Rafael Ferreira da Silva,
Henri Casanova,
Kyle Chard,
Ilkay Altintas,
Rosa M Badia,
Bartosz Balis,
Tainã Coleman,
Frederik Coppens,
Frank Di Natale,
Bjoern Enders,
Thomas Fahringer,
Rosa Filgueira,
Grigori Fursin,
Daniel Garijo,
Carole Goble,
Dorran Howell,
Shantenu Jha,
Daniel S. Katz,
Daniel Laney,
Ulf Leser,
Maciej Malawski,
Kshitij Mehta,
Loïc Pottier,
Jonathan Ozik,
J. Luc Peterson
, et al. (4 additional authors not shown)
Abstract:
The landscape of workflow systems for scientific applications is notoriously convoluted with hundreds of seemingly equivalent workflow systems, many isolated research claims, and a steep learning curve. To address some of these challenges and lay the groundwork for transforming workflows research and development, the WorkflowsRI and ExaWorks projects partnered to bring the international workflows community together. This paper reports on discussions and findings from two virtual "Workflows Community Summits" (January and April, 2021). The overarching goals of these workshops were to develop a view of the state of the art, identify crucial research challenges in the workflows community, articulate a vision for potential community efforts, and discuss technical approaches for realizing this vision. To this end, participants identified six broad themes: FAIR computational workflows; AI workflows; exascale challenges; APIs, interoperability, reuse, and standards; training and education; and building a workflows community. We summarize discussions and recommendations for each of these themes.
Submitted 8 October, 2021; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development
Authors:
Rafael Ferreira da Silva,
Henri Casanova,
Kyle Chard,
Tainã Coleman,
Dan Laney,
Dong Ahn,
Shantenu Jha,
Dorran Howell,
Stian Soiland-Reys,
Ilkay Altintas,
Douglas Thain,
Rosa Filgueira,
Yadu Babuji,
Rosa M. Badia,
Bartosz Balis,
Silvina Caino-Lores,
Scott Callaghan,
Frederik Coppens,
Michael R. Crusoe,
Kaushik De,
Frank Di Natale,
Tu M. A. Do,
Bjoern Enders,
Thomas Fahringer,
Anne Fouilloux
, et al. (33 additional authors not shown)
Abstract:
Scientific workflows are a cornerstone of modern scientific computing, and they have underpinned some of the most significant discoveries of the last decade. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale HPC platforms. Workflows will play a crucial role in the data-oriented and post-Moore's computing landscape as they democratize the application of cutting-edge research techniques, computationally intensive methods, and use of new computing platforms. As workflows continue to be adopted by scientific projects and user communities, they are becoming more complex. Workflows are increasingly composed of tasks that perform computations such as short machine learning inference, multi-node simulations, long-running machine learning model training, amongst others, and thus increasingly rely on heterogeneous architectures that include CPUs but also GPUs and accelerators. The workflow management system (WMS) technology landscape is currently segmented and presents significant barriers to entry due to the hundreds of seemingly comparable, yet incompatible, systems that exist. Another fundamental problem is that there are conflicting theoretical bases and abstractions for a WMS. Systems that use the same underlying abstractions can likely be translated between, which is not the case for systems that use different abstractions. More information: https://workflowsri.org/summits/technical
Submitted 9 June, 2021;
originally announced June 2021.
-
Workflows Community Summit: Bringing the Scientific Workflows Community Together
Authors:
Rafael Ferreira da Silva,
Henri Casanova,
Kyle Chard,
Dan Laney,
Dong Ahn,
Shantenu Jha,
Carole Goble,
Lavanya Ramakrishnan,
Luc Peterson,
Bjoern Enders,
Douglas Thain,
Ilkay Altintas,
Yadu Babuji,
Rosa M. Badia,
Vivien Bonazzi,
Taina Coleman,
Michael Crusoe,
Ewa Deelman,
Frank Di Natale,
Paolo Di Tommaso,
Thomas Fahringer,
Rosa Filgueira,
Grigori Fursin,
Alex Ganose,
Bjorn Gruning
, et al. (20 additional authors not shown)
Abstract:
Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using some software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs) have been developed to provide abstractions for creating and executing workflows conveniently, efficiently, and portably. While these efforts are all worthwhile, there are now hundreds of independent WMSs, many of which are moribund. As a result, the WMS landscape is segmented and presents significant barriers to entry due to the hundreds of seemingly comparable, yet incompatible, systems that exist. As a result, many teams, small and large, still elect to build their own custom workflow solution rather than adopt, or build upon, existing WMSs. This current state of the WMS landscape negatively impacts workflow users, developers, and researchers. The "Workflows Community Summit" was held online on January 13, 2021. The overarching goal of the summit was to develop a view of the state of the art and identify crucial research challenges in the workflow community. Prior to the summit, a survey sent to stakeholders in the workflow community (including both developers of WMSs and users of workflows) helped to identify key challenges in this community that were translated into 6 broad themes for the summit, each of them being the object of a focused discussion led by a volunteer member of the community. This report documents and organizes the wealth of information provided by the participants before, during, and after the summit.
Submitted 16 March, 2021;
originally announced March 2021.
-
Characterization and valuation of uncertainty of calibrated parameters in stochastic decision models
Authors:
Fernando Alarid-Escudero,
Amy B. Knudsen,
Jonathan Ozik,
Nicholson Collier,
Karen M. Kuntz
Abstract:
We evaluated the implications of different approaches to characterize uncertainty of calibrated parameters of stochastic decision models (DMs) in the quantified value of such uncertainty in decision making. We used a microsimulation DM of colorectal cancer (CRC) screening to conduct a cost-effectiveness analysis (CEA) of a 10-year colonoscopy screening. We calibrated the natural history model of CRC to epidemiological data with different degrees of uncertainty and obtained the joint posterior distribution of the parameters using a Bayesian approach. We conducted a probabilistic sensitivity analysis (PSA) on all the model parameters with different characterizations of uncertainty of the calibrated parameters and estimated the value of uncertainty of the different characterizations with a value of information analysis. All analyses were conducted using high performance computing resources running the Extreme-scale Model Exploration with Swift (EMEWS) framework. The posterior distribution had high correlation among some parameters. The parameters of the Weibull hazard function for the age of onset of adenomas had the highest posterior correlation of -0.958. Considering full posterior distributions and the maximum-a-posteriori estimate of the calibrated parameters, there is little difference in the spread of the distribution of the CEA outcomes with a similar expected value of perfect information (EVPI) of \$653 and \$685, respectively, at a WTP of \$66,000/QALY. Ignoring correlation on the posterior distribution of the calibrated parameters produced the widest distribution of CEA outcomes and the highest EVPI of \$809 at the same WTP. Different characterizations of uncertainty of calibrated parameters have implications on the expected value of reducing uncertainty on the CEA. Ignoring inherent correlation among calibrated parameters on a PSA overestimates the value of uncertainty.
Submitted 11 June, 2019;
originally announced June 2019.
-
Microsimulation Model Calibration using Incremental Mixture Approximate Bayesian Computation
Authors:
Carolyn Rutter,
Jonathan Ozik,
Maria DeYoreo,
Nicholson Collier
Abstract:
Microsimulation models (MSMs) are used to predict population-level effects of health care policies by simulating individual-level outcomes. Simulated outcomes are governed by unknown parameters that are chosen so that the model accurately predicts specific targets, a process referred to as model calibration. Calibration targets can come from randomized controlled trials, observational studies, and expert opinion, and are typically summary statistics. A well calibrated model can reproduce a wide range of targets. MSM calibration generally involves searching a high dimensional parameter space and predicting many targets through model simulation. This requires efficient methods for exploring the parameter space and sufficient computational resources. We develop Incremental Mixture Approximate Bayesian Computation (IMABC) as a method for MSM calibration and implement it via a high-performance computing workflow, which provides the necessary computational scale. IMABC begins with a rejection-based approximate Bayesian computation (ABC) step, drawing a sample of parameters from the prior distribution and simulating calibration targets. Next, the sample is iteratively updated by drawing additional points from a mixture of multivariate normal distributions, centered at the points that yield simulated targets that are near observed targets. Posterior estimates are obtained by weighting sampled parameter vectors to account for the adaptive sampling scheme. We demonstrate IMABC by calibrating a MSM for the natural history of colorectal cancer to obtain simulated draws from the joint posterior distribution of model parameters.
Submitted 13 August, 2018; v1 submitted 5 April, 2018;
originally announced April 2018.
-
Streaming supercomputing needs workflow-enabled programming-in-the-large
Authors:
Justin M Wozniak,
Jonathan Ozik,
Daniel S. Katz,
Michael Wilde
Abstract:
This is a position paper, submitted to the Future Online Analysis Platform Workshop (https://press3.mcs.anl.gov/futureplatform/), which argues that simple data analysis applications are common today, but future online supercomputing workloads will need to couple multiple advanced technologies (streams, caches, analysis, and simulations) to rapidly deliver scientific results. Each of these technologies is an active research area when integrated with high-performance computing. These components will interact in complex ways; therefore, coupling them needs to be programmed. Programming in the large, on top of existing applications, enables us to build much more capable applications and to productively manage this complexity.
Submitted 23 February, 2017;
originally announced February 2017.
-
Formation of Multifractal Population Patterns from Reproductive Growth and Local Resettlement
Authors:
Jonathan Ozik,
Brian R. Hunt,
Edward Ott
Abstract:
We consider the general character of the spatial distribution of a population that grows through reproduction and subsequent local resettlement of new population members. We present several simple one- and two-dimensional point placement models to illustrate possible generic behavior of these distributions. We show, numerically and analytically, that these models all lead to multifractal spatial distributions of population. Additionally, we make qualitative links between our models and the example of the Earth at Night image, showing the Earth's nighttime man-made light as seen from space. The Earth at Night data suffer from saturation of the sensing photodetectors at high brightness ('clipping'), and we account for how this influences the determined dimension spectrum of the light intensity distribution.
Submitted 25 July, 2005; v1 submitted 3 February, 2005;
originally announced February 2005.
-
Modeling the X-ray - UV Correlations in NGC 7469
Authors:
Andrew J. Berkley,
Demosthenes Kazanas,
Jonathan Ozik
Abstract:
We model the correlated X-ray - UV observations of NGC 7469, for which well sampled data in both these bands have been obtained recently in a multiwavelength monitoring campaign. To this end we derive the transfer function in wavelength $\lambda$ and time lag $\tau$, for reprocessing hard (X-ray) photons from a point source to softer ones (UV-optical) by an infinite plane (representing a cool, thin accretion disk) located at a given distance below the X-ray source, under the assumption that the X-ray flux is absorbed and emitted locally by the disk as a black body of temperature appropriate to the incident flux. Using the observed X-ray light curve as input we have computed the expected continuum UV emission as a function of time at several wavelengths ($\lambda\lambda$1315 Å, $\lambda\lambda$6962 Å, $\lambda\lambda$15000 Å, $\lambda\lambda$30000 Å) assuming that the X-ray source is located one Schwarzschild radius above the disk plane, with the mass of the black hole $M$ and the latitude angle $θ$ of the observer relative to the disk plane as free parameters. We have searched the parameter space of black hole masses and observer azimuthal angles but we were unable to reproduce UV light curves which would resemble, even remotely, those observed. We also explored whether particular combinations of the values of these parameters could lead to light curves whose statistical properties (i.e. the autocorrelation and cross correlation functions) would match those corresponding to the observed UV light curve at $\lambda\lambda$1315 Å. Even though we considered black hole masses as large as $10^9$ M$_{\odot}$ no such match was possible. Our results indicate that some of the fundamental assumptions of this model will have to be modified to obtain even approximate agreement between the observed and model X-ray - UV light curves.
Submitted 13 January, 2000;
originally announced January 2000.