-
Deciphering Urban Morphogenesis: A Morphospace Approach
Authors:
Vini Netto,
Caio Cacholas,
Dries Daems,
Fabiano Ribeiro,
Howard Davis,
Daniel Lenz
Abstract:
Cities emerged independently across different world regions and historical periods, raising fundamental questions: How did the first urban settlements develop? What social and spatial conditions enabled their emergence? Are these processes universal or context-dependent? Moreover, what distinguishes cities from other human settlements? This paper investigates the drivers behind the creation of cities through a hybrid approach that integrates urban theory, the biological concept of morphospace (the space of all possible configurations), and archaeological evidence. It explores the transition from sedentary hunter-gatherer communities to urban societies, highlighting fundamental forces converging to produce increasingly complex divisions of labour as a central driver of urbanization. Morphogenesis is conceptualized as a trajectory through morphospace, governed by structure-seeking selection processes that balance density, permeability, and information as critical dimensions. The study highlights the non-ergodic nature of urban morphogenesis, where configurations are progressively selected based on their fitness to support the diversifying interactions between mutually dependent agents. The morphospace framework effectively distinguishes between theoretical spatial configurations, non-urban and proto-urban settlements, and contemporary cities. This analysis supports the proposition that cities emerge and evolve as solutions balancing density, permeability, and informational organization, enabling them to support increasingly complex societal functions.
Submitted 25 November, 2024; v1 submitted 20 November, 2024;
originally announced November 2024.
-
Progress Towards Decoding Visual Imagery via fNIRS
Authors:
Michel Adamic,
Wellington Avelino,
Anna Brandenberger,
Bryan Chiang,
Hunter Davis,
Stephen Fay,
Andrew Gregory,
Aayush Gupta,
Raphael Hotter,
Grace Jiang,
Fiona Leng,
Stephen Polcyn,
Thomas Ribeiro,
Paul Scotti,
Michelle Wang,
Marley Xiong,
Jonathan Xu
Abstract:
We demonstrate the possibility of reconstructing images from fNIRS brain activity and start building a prototype to match the required specs. By training an image reconstruction model on downsampled fMRI data, we discovered that cm-scale spatial resolution is sufficient for image generation. We obtained 71% retrieval accuracy with 1-cm resolution, compared to 93% on the full-resolution fMRI, and 20% with 2-cm resolution. With simulations and high-density tomography, we found that time-domain fNIRS can achieve 1-cm resolution, compared to 2-cm resolution for continuous-wave fNIRS. Lastly, we share designs for a prototype time-domain fNIRS device, consisting of a laser driver, a single photon detector, and a time-to-digital converter system.
Submitted 22 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Taking GPU Programming Models to Task for Performance Portability
Authors:
Joshua H. Davis,
Pranav Sivaraman,
Joy Kitson,
Konstantinos Parasyris,
Harshitha Menon,
Isaac Minn,
Giorgis Georgakoudis,
Abhinav Bhatele
Abstract:
Portability is critical to ensuring high productivity in developing and maintaining scientific software as the diversity in on-node hardware architectures increases. While several programming models provide portability for diverse GPU platforms, they don't make any guarantees about performance portability. In this work, we explore several programming models (CUDA, HIP, Kokkos, RAJA, OpenMP, OpenACC, and SYCL) to study whether the performance of these models is consistently good across NVIDIA and AMD GPUs. We use five proxy applications from different scientific domains, create implementations where missing, and use them to present a comprehensive comparative evaluation of the programming models. We provide a Spack scripting-based methodology to ensure reproducibility of experiments conducted in this work. Finally, we attempt to answer the question: to what extent does each programming model provide performance portability for heterogeneous systems in real-world usage?
Submitted 21 May, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Can Large Language Models Write Parallel Code?
Authors:
Daniel Nichols,
Joshua H. Davis,
Zhaojun Xie,
Arjun Rajaram,
Abhinav Bhatele
Abstract:
Large language models are increasingly becoming a popular tool for software development. Their ability to model and generate source code has been demonstrated in a variety of contexts, including code completion, summarization, translation, and lookup. However, they often struggle to generate code for complex programs. In this paper, we study the capabilities of state-of-the-art language models to generate parallel code. In order to evaluate language models, we create a benchmark, ParEval, consisting of prompts that represent 420 different coding tasks related to scientific and parallel computing. We use ParEval to evaluate the effectiveness of several state-of-the-art open- and closed-source language models on these tasks. We introduce novel metrics for evaluating the performance of generated code, and use them to explore how well each large language model performs for 12 different computational problem types and six different parallel programming models.
Submitted 14 May, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP
Authors:
Thomas Huber,
Swaroop Pophale,
Nolan Baker,
Michael Carr,
Nikhil Rao,
Jaydon Reap,
Kristina Holsapple,
Joshua Hoke Davis,
Tobias Burnus,
Seyong Lee,
David E. Bernholdt,
Sunita Chandrasekaran
Abstract:
The OpenMP language continues to evolve with every new specification release, as does the need to validate and verify the new features that have been introduced. With the release of OpenMP 5.0 and OpenMP 5.1, plenty of new target offload and host-based features have been introduced to the programming model. While OpenMP continues to grow in maturity, there is an observable growth in the number of compiler and hardware vendors that support OpenMP. In this manuscript, we focus on evaluating the conformity and implementation progress of various compiler vendors such as Cray, IBM, GNU, Clang/LLVM, NVIDIA, Intel and AMD. We specifically address the 4.5, 5.0, and 5.1 versions of the specification.
Submitted 14 November, 2022; v1 submitted 28 August, 2022;
originally announced August 2022.
-
Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs
Authors:
Joshua Hoke Davis,
Christopher Daley,
Swaroop Pophale,
Thomas Huber,
Sunita Chandrasekaran,
Nicholas J. Wright
Abstract:
Heterogeneous systems are becoming increasingly prevalent. In order to exploit the rich compute resources of such systems, robust programming models are needed for application developers to seamlessly migrate legacy code from today's systems to tomorrow's. Over the past decade and more, directives have been established as one of the promising paths to tackle programmatic challenges on emerging systems. This work focuses on applying and demonstrating OpenMP offloading directives on five proxy applications. We observe that the performance varies widely from one compiler to the other; a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers. While some issues can be worked around by the developer, there are other issues that must be reported to the compiler vendors. By restructuring OpenMP offloading directives, we gain an 18x speedup for the su3 proxy application on NERSC's Cori system when using the Clang compiler, and a 15.7x speedup by switching max reductions to add reductions in the laplace mini-app when using the Cray-llvm compiler on Cori.
Submitted 2 December, 2020; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Broad Area Search and Detection of Surface-to-Air Missile Sites Using Spatial Fusion of Component Object Detections from Deep Neural Networks
Authors:
Alan B. Cannaday II,
Curt H. Davis,
Grant J. Scott,
Blake Ruprecht,
Derek T. Anderson
Abstract:
Here we demonstrate how Deep Neural Network (DNN) detections of multiple constitutive or component objects that are part of a larger, more complex, and encompassing feature can be spatially fused to improve the search, detection, and retrieval (ranking) of the larger complex feature. First, scores computed from a spatial clustering algorithm are normalized to a reference space so that they are independent of image resolution and DNN input chip size. Then, multi-scale DNN detections from various component objects are fused to improve the detection and retrieval of DNN detections of a larger complex feature. We demonstrate the utility of this approach for broad area search and detection of Surface-to-Air Missile (SAM) sites that have a very low occurrence rate (only 16 sites) over a ~90,000 km^2 study area in SE China. The results demonstrate that spatial fusion of multi-scale component-object DNN detections can reduce the detection error rate of SAM sites by >85% while still maintaining a 100% recall. The novel spatial fusion approach demonstrated here can be easily extended to a wide variety of other challenging object search and detection problems in large-scale remote sensing image datasets.
Submitted 20 July, 2020; v1 submitted 23 March, 2020;
originally announced March 2020.
-
Reconstructing continuous distributions of 3D protein structure from cryo-EM images
Authors:
Ellen D. Zhong,
Tristan Bepler,
Joseph H. Davis,
Bonnie Berger
Abstract:
Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structure of proteins and other macromolecular complexes at near-atomic resolution. In single particle cryo-EM, the central problem is to reconstruct the three-dimensional structure of a macromolecule from $10^{4-7}$ noisy and randomly oriented two-dimensional projections. However, the imaged protein complexes may exhibit structural variability, which complicates reconstruction and is typically addressed using discrete clustering approaches that fail to capture the full range of protein dynamics. Here, we introduce a novel method for cryo-EM reconstruction that extends naturally to modeling continuous generative factors of structural heterogeneity. This method encodes structures in Fourier space using coordinate-based deep neural networks, and trains these networks from unlabeled 2D cryo-EM images by combining exact inference over image orientation with variational inference for structural heterogeneity. We demonstrate that the proposed method, termed cryoDRGN, can perform ab initio reconstruction of 3D protein complexes from simulated and real 2D cryo-EM image data. To our knowledge, cryoDRGN is the first neural network-based approach for cryo-EM reconstruction and the first end-to-end method for directly reconstructing continuous ensembles of protein structures from cryo-EM images.
Submitted 14 February, 2020; v1 submitted 11 September, 2019;
originally announced September 2019.
-
Studying the Impact of Power Capping on MapReduce-based, Data-intensive Mini-applications on Intel KNL and KNM Architectures
Authors:
Joshua Hoke Davis,
Tao Gao,
Sunita Chandrasekaran,
Michela Taufer
Abstract:
In this poster, we quantitatively measure the impacts of data movement on performance in MapReduce-based applications when executed on HPC systems. We leverage the PAPI 'powercap' component to identify ideal conditions for execution of our applications in terms of (1) dataset characteristics (i.e., unique words); (2) HPC system (i.e., KNL and KNM); and (3) implementation of the MapReduce programming model (i.e., with or without combiner optimizations). Results confirm the high energy and runtime costs of data movement, and the benefits of the combiner optimization on these costs.
Submitted 15 February, 2019;
originally announced March 2019.
-
Generating Music from Literature
Authors:
Hannah Davis,
Saif M. Mohammad
Abstract:
We present a system, TransProse, that automatically generates musical pieces from text. TransProse uses known relations between elements of music such as tempo and scale, and the emotions they evoke. Further, it uses a novel mechanism to determine sequences of notes that capture the emotional activity in the text. The work has applications in information visualization, in creating audio-visual e-books, and in developing music apps.
Submitted 9 March, 2014;
originally announced March 2014.
-
Informed Traders
Authors:
Dorje C. Brody,
Mark H. A. Davis,
Robyn L. Friedman,
Lane P. Hughston
Abstract:
An asymmetric information model is introduced for the situation in which there is a small agent who is more susceptible to the flow of information in the market than the general market participant, and who tries to implement strategies based on the additional information. In this model market participants have access to a stream of noisy information concerning the future return of an asset, whereas the informed trader has access to a further information source which is obscured by an additional noise that may be correlated with the market noise. The informed trader uses the extraneous information source to seek statistical arbitrage opportunities, while at the same time accommodating the additional risk. The amount of information available to the general market participant concerning the asset return is measured by the mutual information of the asset price and the associated cash flow. The worth of the additional information source is then measured in terms of the difference of mutual information between the general market participant and the informed trader. This difference is shown to be nonnegative when the signal-to-noise ratio of the information flow is known in advance. Explicit trading strategies leading to statistical arbitrage opportunities, taking advantage of the additional information, are constructed, illustrating how excess information can be translated into profit.
Submitted 17 November, 2008; v1 submitted 8 July, 2008;
originally announced July 2008.