-
The Uniformly Rotated Mondrian Kernel
Authors:
Calvin Osborne,
Eliza O'Reilly
Abstract:
Random feature maps are used to decrease the computational cost of kernel machines in large-scale problems. The Mondrian kernel is one such example of a fast random feature approximation of the Laplace kernel, generated by a computationally efficient hierarchical random partition of the input space known as the Mondrian process. In this work, we study a variation of this random feature map by applying a uniform random rotation to the input space before running the Mondrian process to approximate a kernel that is invariant under rotations. We obtain a closed-form expression for the isotropic kernel that is approximated, as well as a uniform convergence rate of the uniformly rotated Mondrian kernel to this limit. To this end, we utilize techniques from the theory of stationary random tessellations in stochastic geometry and prove a new result on the geometry of the typical cell of the superposition of uniformly rotated Mondrian tessellations. Finally, we test the empirical performance of this random feature map on both synthetic and real-world datasets, demonstrating its improved performance over the Mondrian kernel on a dataset that is debiased from the standard coordinate axes.
Submitted 11 March, 2025; v1 submitted 6 February, 2025;
originally announced February 2025.
-
Observational signatures of mixing-induced cooling in the Kelvin-Helmholtz instability
Authors:
Ben Snow,
Chris Osborne,
Andrew Hillier
Abstract:
Cool ($\approx 10^4$K), dense material permeates the hot ($\approx 10^6$K), tenuous solar corona in the form of coronal condensations, for example prominences and coronal rain. As the solar atmosphere evolves, turbulence can drive mixing between the condensations and the surrounding corona, with the mixing layer exhibiting an enhancement in emission from intermediate temperature ($\approx10^5$K) spectral lines, which is often attributed to turbulent heating within the mixing layer. However, radiative cooling is highly efficient at intermediate temperatures and numerical simulations have shown that radiative cooling can far exceed turbulent heating in prominence-corona mixing scenarios. As such, the mixing layer can have a net loss of thermal energy, i.e., the mixing layer is cooling rather than heating. Here, we investigate the observational signatures of cooling processes in Kelvin-Helmholtz mixing between a prominence thread and the surrounding solar corona through 2D numerical simulations. Optically thin emission is synthesised for Si IV, along with optically thick emission for H$α$, Ca II K and Mg II h using Lightweaver. The Mg II h line probes the turbulent mixing layer, whereas H$α$ and Ca II K form within the thread and along its boundary respectively. As the mixing evolves, intermediate temperatures form, leading to an increase in Si IV emission, which coincides with increased radiative losses. The simulation is dominated by cooling in the mixing layer, rather than turbulent heating, and yet enhanced emission in warm lines is produced. As such, decreased emission in cooler lines together with increased emission in hotter lines may be a signature of mixing, rather than an implication of heating.
Submitted 20 January, 2025;
originally announced January 2025.
-
A Report on Financial Regulations Challenge at COLING 2025
Authors:
Keyi Wang,
Jaisal Patel,
Charlie Shen,
Daniel Kim,
Andy Zhu,
Alex Lin,
Luca Borella,
Cailean Osborne,
Matt White,
Steve Yang,
Kairong Xiao,
Xiao-Yang Liu Yanglet
Abstract:
Financial large language models (FinLLMs) have been applied to various tasks in business, finance, accounting, and auditing. Complex financial regulations and standards are critical to financial services, which LLMs must comply with. However, FinLLMs' performance in understanding and interpreting financial regulations has rarely been studied. Therefore, we organize the Regulations Challenge, a shared task at COLING 2025. It encourages the academic community to explore the strengths and limitations of popular LLMs. We create 9 novel tasks and corresponding question sets. In this paper, we provide an overview of these tasks and summarize participants' approaches and results. We aim to raise awareness of FinLLMs' professional capability in financial regulations.
Submitted 12 January, 2025; v1 submitted 15 December, 2024;
originally announced December 2024.
-
Orthogonal Polynomials on Bubble-Diamond Fractals
Authors:
Elena Axinn,
Calvin Osborne,
Kasso A. Okoudjou,
Olivia Rigatti,
Helen Shi
Abstract:
We develop a theory of polynomials and, in particular, an analog of the theory of Legendre orthogonal polynomials on the bubble-diamond fractals, a class of fractal sets that can be viewed as the completion of a limit of a sequence of finite graph approximations. In this setting, a polynomial of degree $j$ can be viewed as a multiharmonic function, a solution of the equation $Δ^{j+1}u=0$. We prove that the sequence of orthogonal polynomials we construct obeys a three-term recursion formula. Finally, we present some numerical results about the asymptotics of the coefficients appearing in this three-term recursion formula.
Submitted 25 November, 2024;
originally announced November 2024.
-
A Toolkit for Measuring the Impacts of Public Funding on Open Source Software Development
Authors:
Cailean Osborne,
Paul Sharratt,
Dawn Foster,
Mirko Boehm
Abstract:
Governments are increasingly employing funding for open source software (OSS) development as a policy lever to support the security of software supply chains, digital sovereignty, economic growth, and national competitiveness in science and innovation, among others. However, the impacts of public funding on OSS development remain poorly understood, with a lack of consensus on how to meaningfully measure them. This gap hampers assessments of the return on public investment and impedes the optimisation of public-interest funding strategies. We address this gap with a toolkit of methodological considerations that may inform such measurements, drawing on prior work on OSS valuations and community health metrics by the Community Health Analytics Open Source Software (CHAOSS) project as well as our first-hand learnings as practitioners tasked with evaluating funding programmes by the Next Generation Internet initiative and the Sovereign Tech Agency. We discuss salient considerations, including the importance of accounting for funding objectives, project life stage and social structure, and regional and organisational cost factors. Next, we present a taxonomy of potential social, economic, and technological impacts that can be both positive and negative, direct and indirect, internal (i.e. within a project) and external (i.e. among a project's ecosystem of dependents and users), and manifest over various time horizons. Furthermore, we discuss the merits and limitations of qualitative, quantitative, and mixed-methods approaches, as well as options for and hazards of estimating multiplier effects. With this toolkit, we contribute to the multi-stakeholder conversation about the value and impacts of funding on OSS developers and society at large.
Submitted 8 November, 2024;
originally announced November 2024.
-
Measuring Software Innovation with Open Source Software Development Data
Authors:
Eva Maxfield Brown,
Cailean Osborne,
Peter Cihon,
Moritz Böhmecke-Schwafert,
Kevin Xu,
Mirko Boehm,
Knut Blind
Abstract:
This paper introduces a novel measure of software innovation based on open source software (OSS) development activity on GitHub. We examine the dependency growth and release complexity among $\sim$200,000 unique releases from 28,000 unique packages across the JavaScript, Python, and Ruby ecosystems over two years post-release. We find that major versions show differential, strong prediction of one-year lagged log change in dependencies. In addition, semantic versioning of OSS releases is correlated with their complexity and predicts downstream adoption. We conclude that major releases of OSS packages count as a unit of innovation complementary to scientific publications, patents, and standards, offering applications for policymakers, managers, and researchers.
Submitted 7 November, 2024;
originally announced November 2024.
-
Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers
Authors:
Cailean Osborne,
Farbod Daneshyan,
Runzhi He,
Hengzhi Ye,
Yuxia Zhang,
Minghui Zhou
Abstract:
Companies, including market rivals, have long collaborated on the development of open source software (OSS), resulting in a tangle of co-operation and competition known as "open source co-opetition". While prior work investigates open source co-opetition in OSS projects that are hosted by vendor-neutral foundations, we have a limited understanding thereof in OSS projects that are hosted and governed by one company. Given their prevalence, it is timely to investigate open source co-opetition in such contexts. Towards this end, we conduct a mixed-methods analysis of three company-hosted OSS projects in the artificial intelligence (AI) industry: Meta's PyTorch (prior to its donation to the Linux Foundation), Google's TensorFlow, and Hugging Face's Transformers. We contribute three key findings. First, while the projects exhibit similar code authorship patterns between host and external companies (80%/20% of commits), collaborations are structured differently (e.g., decentralised vs. hub-and-spoke networks). Second, host and external companies engage in strategic, non-strategic, and contractual collaborations, with varying incentives and collaboration practices. Some of the observed collaborations are specific to the AI industry (e.g., hardware-software optimizations or AI model integrations), while others are typical of the broader software industry (e.g., bug fixing or task outsourcing). Third, single-vendor governance creates a power imbalance that influences open source co-opetition practices and possibilities, from the host company's singular decision-making power (e.g., the risk of license change) to their community involvement strategy (e.g., from over-control to over-delegation). We conclude with recommendations for future research.
Submitted 23 October, 2024;
originally announced October 2024.
-
Spectral Characteristics of a Rotating Solar Prominence in Multiple Wavelengths
Authors:
A. G. M. Pietrow,
V. Liakh,
C. M. J. Osborne,
J. Jenkins,
R. Keppens
Abstract:
We present synthetic spectra corresponding to a 2.5D magnetohydrodynamical simulation of a rotating prominence in the Ca II 8542 Å, H$α$, Ca II K, Mg II k, Ly $α$, and Ly $β$ lines. The prominence rotation resulted from angular momentum conservation within a flux rope where asymmetric heating imposed a net rotation prior to the thermal-instability driven condensation phase. The spectra were created using a library built on the Lightweaver framework called Promweaver, which provides boundary conditions for incorporating the limb-darkened irradiation of the solar disk on isolated structures such as prominences. Our spectra show distinctive rotational signatures for the Mg II k, Ly $α$, and Ly $β$ lines, even in the presence of complex, turbulent solar atmospheric conditions. However, these signals are hardly detectable for the Ca II 8542 Å, H$α$, and Ca II K spectral lines. Most notably, we find only a very faint rotational signal in the H$α$ line, thus reigniting the discussion on the existence of sustained rotation in prominences.
Submitted 4 October, 2024;
originally announced October 2024.
-
Why Companies "Democratise" Artificial Intelligence: The Case of Open Source Software Donations
Authors:
Cailean Osborne
Abstract:
Companies claim to "democratise" artificial intelligence (AI) when they donate AI open source software (OSS) to non-profit foundations or release AI models, among others, but what does this term mean and why do they do it? As the impact of AI on society and the economy grows, understanding the commercial incentives behind AI democratisation efforts is crucial for ensuring these efforts serve broader interests beyond commercial agendas. Towards this end, this study employs a mixed-methods approach to investigate commercial incentives for 43 AI OSS donations to the Linux Foundation. It makes contributions to both research and practice. It contributes a taxonomy of both individual and organisational social, economic, and technological incentives for AI democratisation. In particular, it highlights the role of democratising the governance and control rights of an OSS project (i.e., from one company to open governance) as a structural enabler for downstream goals, such as attracting external contributors, reducing development costs, and influencing industry standards, among others. Furthermore, OSS donations are often championed by individual developers within companies, highlighting the importance of the bottom-up incentives for AI democratisation. The taxonomy provides a framework and toolkit for discerning incentives for other AI democratisation efforts, such as the release of AI models. The paper concludes with a discussion of future research directions.
Submitted 26 September, 2024;
originally announced September 2024.
-
The Future of Open Human Feedback
Authors:
Shachar Don-Yehiya,
Ben Burtenshaw,
Ramon Fernandez Astudillo,
Cailean Osborne,
Mimansa Jaiswal,
Tzu-Sheng Kuo,
Wenting Zhao,
Idan Shenfeld,
Andi Peng,
Mikhail Yurochkin,
Atoosa Kasirzadeh,
Yangsibo Huang,
Tatsunori Hashimoto,
Yacine Jernite,
Daniel Vila-Suero,
Omri Abend,
Jennifer Ding,
Sara Hooker,
Hannah Rose Kirk,
Leshem Choshen
Abstract:
Human feedback on conversations with large language models (LLMs) is central to how these systems learn about the world, improve their capabilities, and are steered toward desirable and safe behaviors. However, this feedback is mostly collected by frontier AI labs and kept behind closed doors. In this work, we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for AI. We first look for successful practices in peer production, open source, and citizen science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. At the center of this ecosystem are mutually beneficial feedback loops between users and specialized models, incentivizing a diverse stakeholder community of model trainers and feedback providers to support a general open feedback pool.
Submitted 4 September, 2024; v1 submitted 15 August, 2024;
originally announced August 2024.
-
Radiance Cascades: A Novel High-Resolution Formal Solution for Multidimensional Non-LTE Radiative Transfer
Authors:
Christopher M. J. Osborne,
Alexander Sannikov
Abstract:
Non-LTE radiative transfer is a key tool for modern astrophysics: it is the means by which many key synthetic observables are produced, thus connecting simulations and observations. Radiative transfer models also inform our understanding of the primary formation layers and parameters of different spectral lines, and serve as the basis of inversion tools used to infer the structure of the solar atmosphere from observations. The default approach for computing the radiation field in multidimensional solar radiative transfer models has long remained the same: a short characteristics, discrete ordinates method, formal solver. In situations with complex atmospheric structure and multiple transitions between optically-thick and -thin regimes these solvers require prohibitively high angular resolution to correctly resolve the radiation field. Here, we present the theory of radiance cascades, a technique designed to exploit structure inherent to the radiation field, allowing for efficient reuse of calculated samples, thus providing a very high-resolution result at a fraction of the computational cost of existing methods. We additionally describe our implementation of this method in the DexRT code, and present initial results of the synthesis of a snapshot of a magnetohydrodynamic model of a solar prominence formed via levitation-condensation. The approach presented here provides a credible route for routinely performing multidimensional radiative transfer calculations free from so-called ray effects, and scaling high-quality non-LTE models to next-generation high-performance computing systems with GPU accelerators.
Submitted 26 August, 2024;
originally announced August 2024.
-
An impulsive geomagnetic effect from an early-impulsive flare
Authors:
Hugh S. Hudson,
Edward W. Cliver,
Lyndsay Fletcher,
Declan A. Diver,
Peter T. Gallagher,
Ying Li,
Christopher M. J. Osborne,
Craig Stark,
Yang Su
Abstract:
The geomagnetic "solar flare effect" (SFE) results from excess ionization in the Earth's ionosphere, famously first detected at the time of the Carrington flare in 1859. This indirect detection of a flare constituted one of the first cases of "multimessenger astronomy," whereby solar ionizing radiation stimulates ionospheric currents. Well-observed SFEs have few-minute time scales and perturbations of >10 nT, with the greatest events reaching above 100 nT. In previously reported cases the SFE time profiles tend to resemble those of solar soft X-ray emission, which ionizes the D-region; there is also a less-well-studied contribution from Lyman-alpha. We report here a specific case, from flare SOL2024-03-10 (M7.4), in which an impulsive SFE deviated from this pattern. This flare contained an "early impulsive" component of exceptionally hard radiation, extending up to gamma-ray energies above 1 MeV, distinctly before the bulk of the flare soft X-ray emission. We can characterize the spectral distribution of this early-impulsive component in detail, thanks to the modern extensive wavelength coverage. A more typical gradual SFE occurred during the flare's main phase. We suggest that events of this type warrant exploration of the solar physics in the "impulse response" limit of very short time scales.
Submitted 12 July, 2024;
originally announced July 2024.
-
Doppler Dimming and Brightening Effects in Solar Prominences
Authors:
Aaron W. Peat,
Christopher M. J. Osborne,
Petr Heinzel
Abstract:
We explored the impact that Doppler dimming and brightening effects from bulk motions of solar prominences have on the formation of Lya, Ha, and MgII h line profiles. We compared two schemes in which these effects manifest: when the prominence is moving radially away from the solar surface (radial case), and when the prominence is moving parallel to the solar surface (horizontal case). To do this, we analysed 13,332 model profiles generated through the use of the 1D NLTE (i.e. departures from local thermodynamic equilibrium) radiative transfer (RT) code Promweaver, built on the Lightweaver NLTE RT framework to mimic the behaviour and output of the 1D NLTE RT code PROM. We found that horizontal velocities are just as important as, or more important than, radial velocities. This demonstrates that horizontal velocities need to be accounted for when attempting to do any sort of forward modelling.
Submitted 19 June, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
Systematic Literature Review of Commercial Participation in Open Source Software
Authors:
Xuetao Li,
Yuxia Zhang,
Cailean Osborne,
Minghui Zhou,
Zhi Jin,
Hui Liu
Abstract:
Open source software (OSS) has been playing a fundamental role in not only information technology but also our social lives. Attracted by the various advantages of OSS, an increasing number of commercial companies participate extensively in open source development and have had a broad impact. This paper provides a comprehensive systematic literature review (SLR) of existing research on company participation in OSS. We collected 92 papers and organized them based on their research topics, which cover three main directions, i.e., participation motivation, contribution model, and impact on OSS development. We found the explored motivations of companies are mainly from economic, technological, and social aspects. Existing studies categorize companies' contribution models in OSS projects mainly through their objectives and how they shape OSS communities. Researchers also explored how commercial participation affects OSS development. We conclude with research challenges and promising research directions on commercial participation in OSS. This study contributes to a comprehensive understanding of commercial participation in OSS development.
Submitted 27 May, 2024;
originally announced May 2024.
-
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub
Authors:
Cailean Osborne,
Jennifer Ding,
Hannah Rose Kirk
Abstract:
Open model developers have emerged as key actors in the political economy of artificial intelligence (AI), but we still have a limited understanding of collaborative practices in the open AI ecosystem. This paper responds to this gap with a three-part quantitative analysis of development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models. First, various types of activity across 348,181 model, 65,761 dataset, and 156,642 space repositories exhibit right-skewed distributions. Activity is extremely imbalanced between repositories; for example, over 70% of models have 0 downloads, while 1% account for 99% of downloads. Furthermore, licenses matter: there are statistically significant differences in collaboration patterns in model repositories with permissive, restrictive, and no licenses. Second, we analyse a snapshot of the social network structure of collaboration in model repositories, finding that the community has a core-periphery structure, with a core of prolific developers and a majority of isolate developers (89%). Upon removing the isolate developers from the network, collaboration is characterised by high reciprocity regardless of developers' network positions. Third, we examine model adoption through the lens of model usage in spaces, finding that a minority of models, developed by a handful of companies, are widely used on the HF Hub. Overall, activity on the HF Hub is characterised by Pareto distributions, congruent with OSS development patterns on platforms like GitHub. We conclude with recommendations for researchers, companies, and policymakers to advance our understanding of open AI development.
Submitted 5 June, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Assessment of SDSS-Derived Galaxy Morphologies Using HST Imaging
Authors:
Chandler Osborne,
Samir Salim
Abstract:
The Sloan Digital Sky Survey (SDSS) was foundational to the study of galaxy evolution, having revealed the bimodality of galaxies and the relationship between their structure and star-forming activity. However, ground-based optical surveys like SDSS are limited in resolution and depth, which may lead to biases or poor quality in the derived morphological properties, potentially impacting our understanding of how and why galaxies cease their star formation (quench). We use archival HST imaging of ~2,000 SDSS objects to assess the reliability of SDSS-derived morphologies, taking advantage of both SDSS statistical samples and of HST's superior resolution and sensitivity. Single Sersic fitting and bulge-disk decomposition are performed on HST images for direct comparison with SDSS results. Of the three catalogs of SDSS-derived morphologies considered, none are significantly more accurate than the others. For disk-dominated galaxies (n<2.5), global Sersic indices (n) from Meert et al. 2015 (M15) are preferred. For bulge-dominated galaxies (n>2.5), Simard et al. 2011 (S11) and M15 overestimate n by ~20%, and NYU-derived global n are preferred. Global R_eff from S11 are preferred, but overestimate R_eff for the largest galaxies by 0.1 dex. SDSS-derived single-component parameters are generally significantly more robust than SDSS-derived two-component parameters. The bulge Sersic index (n_bulge) cannot be reliably constrained from SDSS imaging. The bulge-to-total (B/T) ratio can be reliably inferred from SDSS for galaxies with SDSS B/T<0.6 provided that n_bulge=4 is enforced. The difference in global n between HST and SDSS depends strongly on B/T; an empirical correction based only on it accounts for most of the systematics in global n.
Submitted 17 April, 2024;
originally announced April 2024.
-
Public-private funding models in open source software development: A case study on scikit-learn
Authors:
Cailean Osborne
Abstract:
Governments are increasingly funding open source software (OSS) development to support software security, digital sovereignty, and national competitiveness in science and innovation, amongst others. However, little is known about how OSS developers evaluate the relative benefits and drawbacks of governmental funding for OSS. This study explores this question through a case study on scikit-learn, a Python library for machine learning, funded by public research grants, commercial sponsorship, micro-donations, and a 32 million euro grant announced in France's artificial intelligence strategy. Through 25 interviews with scikit-learn's maintainers and funders, this study makes two key contributions. First, it contributes empirical findings about the benefits and drawbacks of public and private funding in an impactful OSS project, and the governance protocols employed by the maintainers to balance the diverse interests of their community and funders. Second, it offers practical lessons on funding for OSS developers, governments, and companies based on the experience of scikit-learn. The paper concludes with key recommendations for practitioners and future research directions.
Submitted 3 May, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence
Authors:
Matt White,
Ibrahim Haddad,
Cailean Osborne,
Xiao-Yang Yanglet Liu,
Ahmed Abdelmonsef,
Sachin Varghese,
Arnaud Le Hors
Abstract:
Generative artificial intelligence (AI) offers numerous opportunities for research and innovation, but its commercialization has raised concerns about the transparency and safety of frontier AI models. Most models lack the necessary components for full understanding, auditing, and reproducibility, and some model producers use restrictive licenses whilst claiming that their models are "open source". To address these concerns, we introduce the Model Openness Framework (MOF), a three-tiered ranked classification system that rates machine learning models based on their completeness and openness, following open science principles. For each MOF class, we specify code, data, and documentation components of the model development lifecycle that must be released and under which open licenses. In addition, the Model Openness Tool (MOT) provides a user-friendly reference implementation to evaluate the openness and completeness of models against the MOF classification system. Together, the MOF and MOT provide timely practical guidance for (i) model producers to enhance the openness and completeness of their publicly-released models, and (ii) model consumers to identify open models and their constituent components that can be permissively used, studied, modified, and redistributed. Through the MOF, we seek to establish completeness and openness as core tenets of responsible AI research and development, and to promote best practices in the burgeoning open AI ecosystem.
Submitted 18 October, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
The Bright Rim Prominences according to 2.5D Radiative Transfer
Authors:
Jack M. Jenkins,
Christopher M. J. Osborne,
Ye Qiu,
Rony Keppens,
Chuan Li
Abstract:
Solar prominences observed close to the limb commonly include a bright feature that, from the perspective of the observer, runs along the interface between itself and the underlying chromosphere. Despite several idealised models being proposed to explain the underlying physics, a more general approach remains outstanding. In this manuscript we demonstrate as a proof-of-concept the first steps in applying the Lightweaver radiative transfer framework's 2.5D extension to a `toy' model prominence + VAL3C chromosphere, inspired by recent 1.5D experiments that demonstrated a significant radiative chromosphere--prominence interaction. We find the radiative connection to be significant enough to enhance both the electron number density within the chromosphere, as well as its emergent intensity across a range of spectral lines in the vicinity of the filament absorption signature. Inclining the viewing angle from the vertical, we find these enhancements to become increasingly asymmetric and merge with a larger secondary enhancement sourced directly from the prominence underside. In wavelength, the enhancements are then found to be the largest in both magnitude and horizontal extent for the spectral line cores, decreasing into the line wings. Similar behaviour is found within new Chinese H$α$ Solar Explorer (CHASE)/H$α$ Imaging Spectrograph (HIS) observations, opening the door for subsequent statistical confirmations of the theoretical basis we develop here.
Submitted 14 March, 2024;
originally announced March 2024.
-
Strategies for obtaining robust SED fitting parameters for galaxies at z~1 and z~2 in the absence of IR data
Authors:
Chandler Osborne,
Samir Salim
Abstract:
Robust estimation of star formation rates (SFRs) at higher redshifts (z>1) using UV-optical-NIR photometry is contingent on the ability of spectral energy distribution (SED) fitting to simultaneously constrain the dust attenuation, stellar metallicity, and star formation history (SFH). IR-derived dust luminosities can help break the degeneracy between these parameters, but IR data is often not available. Here, we explore strategies for SED fitting at z>1 in the absence of IR data using a sample of log M*>10.2 star-forming galaxies from the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) for which 24 μm data are available. We adopt the total IR luminosity (L_TIR) obtained from 24 μm as the 'ground truth' that allows us to assess how well it can be recovered (as L_dust) from UV-optical-NIR SED fitting. We test a variety of dust attenuation models, stellar population synthesis models, metallicity assumptions, and SFHs separately to identify which assumptions maximize the agreement (correlation and linearity) between L_TIR and L_dust. We find that a flexible dust attenuation law performs best. For stellar populations, we find that BC03 models are favored over those of BPASS. Fixing the stellar metallicity at solar value is preferred to other fixed values or leaving it as a free parameter. For SFHs, we find that minimizing the variability in the recent (<100 Myr) SFH improves the agreement with L_TIR. Finally, we provide a catalog of galaxy parameters (including M* and SFR) for CANDELS galaxies with log M*>8 and 0.7<z<1.3 obtained using the models we found to be the most robust.
Submitted 12 January, 2024;
originally announced January 2024.
-
Convergence rates of non-stationary and deep Gaussian process regression
Authors:
Conor Osborne,
Aretha L. Teckentrup
Abstract:
The focus of this work is the convergence of non-stationary and deep Gaussian process regression. More precisely, we follow a Bayesian approach to regression or interpolation, where the prior placed on the unknown function $f$ is a non-stationary or deep Gaussian process, and we derive convergence rates of the posterior mean to the true function $f$ in terms of the number of observed training points. In some cases, we also show convergence of the posterior variance to zero. The only assumption imposed on the function $f$ is that it is an element of a certain reproducing kernel Hilbert space, which we in particular cases show to be norm-equivalent to a Sobolev space. Our analysis includes the case of estimated hyper-parameters in the covariance kernels employed, both in an empirical Bayes' setting and the particular hierarchical setting constructed through deep Gaussian processes. We consider the settings of noise-free or noisy observations on deterministic or random training points. We establish general assumptions sufficient for the convergence of deep Gaussian process regression, along with explicit examples demonstrating the fulfilment of these assumptions. Specifically, our examples require that the Hölder or Sobolev norms of the penultimate layer are bounded almost surely.
Submitted 18 March, 2025; v1 submitted 12 December, 2023;
originally announced December 2023.
-
$Σ_{\mathrm{SFR}}$-M* Diagram: A Valuable Galaxy Evolution Diagnostic to Complement (s)SFR-M* Diagrams
Authors:
Samir Salim,
Sandro Tacchella,
Chandler Osborne,
S. M. Faber,
Janice C. Lee,
Sara L. Ellison
Abstract:
The specific star formation rate (sSFR) is commonly used to describe the level of galaxy star formation (SF) and to select quenched galaxies. However, because sSFR is a relative measure of the young-to-old population, its interpretation can be ambiguous: a small sSFR can arise either from a substantial previous mass build-up or from low SF. We show, using large samples spanning 0 < z < 2, that the normalization of SFR by the physical extent over which SF is taking place (i.e., SFR surface density, $Σ_{\mathrm{SFR}}$) overcomes this ambiguity. $Σ_{\mathrm{SFR}}$ has a strong physical basis, being tied to the molecular gas density and the effectiveness of stellar feedback, so we propose $Σ_{\mathrm{SFR}}$-M* as an important galaxy evolution diagram to complement (s)SFR-M* diagrams. Using the $Σ_{\mathrm{SFR}}$-M* diagram we confirm the Schiminovich et al. (2007) result that the level of SF along the main sequence today is only weakly mass dependent: high-mass galaxies, despite their redder colors, are as active as blue, low-mass ones. At higher redshift, the slope of the "$Σ_{\mathrm{SFR}}$ main sequence" steepens, signaling the epoch of bulge build-up in massive galaxies. We also find that $Σ_{\mathrm{SFR}}$ based on the optical isophotal radius more cleanly selects both the starbursting and the spheroid-dominated (early-type) galaxies than sSFR. One implication of our analysis is that the assessment of the inside-out vs. outside-in quenching scenarios should consider both sSFR and $Σ_{\mathrm{SFR}}$ radial profiles, because ample SF may be present in bulges with low sSFR (red color).
Submitted 17 October, 2023;
originally announced October 2023.
-
Improved GALEX UV Photometry for 700,000 SDSS Galaxies
Authors:
Chandler Osborne,
Samir Salim,
Mederic Boquien,
Mark Dickinson,
Stephane Arnouts
Abstract:
The Galaxy Evolution Explorer (GALEX) satellite performed the first and only large-area UV survey, which in tandem with the Sloan Digital Sky Survey (SDSS) has facilitated modeling of the spectral energy distributions (SEDs) of low-redshift galaxies and the determination of various galaxy properties, in particular the star formation rate. However, the relatively crude angular resolution of GALEX (5") made its images susceptible to blending of sources, resulting in potentially biased far-UV (FUV) and near-UV (NUV) pipeline photometry. To remedy this issue and take advantage of model-fit photometry, we use the EMphot software to obtain forced GALEX photometry for ~700,000 SDSS galaxies at z<0.3. Positional priors of target galaxies and potentially contaminating neighbors were taken from SDSS. New photometry is based on the best-fitting of three model profiles: optical-like, exponential and flat. New photometry mitigates blending present in the original pipeline catalogs, which affected 16% of galaxies at a level of >0.2 mag and 2% at a level of >1 mag. Pipeline NUV magnitudes are severely affected (>1 mag) when the neighbor is brighter than the target galaxy and within 10", or when the neighbor is fainter and within ~3" of the target. New photometry fixes edge-of-detector bias, which affected pipeline photometry by up to 0.1 mag in NUV. We present catalogs with new photometry for GALEX observations of different depths, corresponding to the all-sky imaging survey (AIS), medium imaging survey (MIS) and deep imaging survey (DIS). Catalogs feature combined magnitudes for multiple detections of the same galaxy in a survey.
Submitted 25 July, 2023;
originally announced July 2023.
-
1.5D NLTE spectral synthesis of a 3D filament/prominence simulation
Authors:
J. M. Jenkins,
C. M. J. Osborne,
R. Keppens
Abstract:
Aims. We here demonstrate how the recently developed Lightweaver framework makes non-LTE (NLTE) spectral synthesis feasible on a new 3D ab-initio magnetohydrodynamic (MHD) filament/prominence simulation, in a post-processing step. Methods. We clarify the need to introduce filament/prominence-specific Lightweaver boundary conditions that accurately model incident chromospheric radiation, and include a self-consistent and smoothly varying limb darkening function. Results. Progressing from isothermal/isobaric models to the self-consistently generated stratifications within a fully 3D MHD filament/prominence simulation, we find excellent agreement between our 1.5D non-local thermodynamic equilibrium Lightweaver synthesis and a popular hydrogen Hα proxy. We compute additional lines including Ca~\textsc{ii} 8542 alongside the more optically-thick Ca~\textsc{ii} H&K & Mg~\textsc{ii} h&k lines, for which no comparable proxy exists, and explore their formation properties within filament/prominence atmospheres. Conclusions. The versatility of the Lightweaver framework is demonstrated with this extension to 1.5D filament/prominence models, where each vertical column of the instantaneous 3D MHD state is spectrally analysed separately, without accounting for (important) multi-dimensional radiative effects. The general agreement found in the line core contrast of both observations and the Lightweaver-synthesised simulation further validates the current generation of solar filament/prominence models constructed numerically with MPI-AMRVAC.
Submitted 27 November, 2022;
originally announced November 2022.
-
Flare Kernels May be Smaller than You Think: Modelling the Radiative Response of Chromospheric Plasma Adjacent to a Solar Flare
Authors:
Christopher M. J. Osborne,
Lyndsay Fletcher
Abstract:
Numerical models of solar flares typically focus on the behaviour of directly-heated flare models, adopting magnetic field-aligned, plane-parallel methodologies. With high spatial- and spectral-resolution ground-based optical observations of flares, it is essential also to understand the response of the plasma surrounding these strongly heated volumes. We investigate the effects of the extreme radiation field produced by a heated column of flare plasma on an adjacent slab of chromospheric plasma, using a two-dimensional radiative transfer model and considering the time-dependent solution to the atomic level populations and electron density throughout this model. The outgoing spectra of H$α$ and Ca II 854.2 nm synthesised from our slab show significant spatial-, time-, and wavelength-dependent variations (both enhancements and reductions) in the line cores, extending on the order of 1 Mm into the non-flaring slab due to the incident transverse radiation field from the flaring boundary. This may lead to significant overestimates of the sizes of directly-heated flare kernels, if line-core observations are used. However, the radiation field alone is insufficient to drive any significant changes in continuum intensity, due to the typical photospheric depths at which it forms, so continuum sources will not have an apparent increase in size. We show that the line formation regions near the flaring boundary can be driven upwards in altitude by over 1 Mm despite the primary thermodynamic parameters (other than electron density) being held horizontally uniform. This work shows that in simple models these effects are significant and should be considered further in future flare modelling and interpretation.
Submitted 7 September, 2022;
originally announced September 2022.
-
On the Importance of Ca II Photoionisation by the Hydrogen Lyman Transitions in Solar Flare Models
Authors:
Christopher M. J. Osborne,
Petr Heinzel,
Jana Kašparová,
Lyndsay Fletcher
Abstract:
The forward fitting of solar flare observations with radiation-hydrodynamic simulations is a common technique for learning about energy deposition and atmospheric evolution during these explosive events. A frequent spectral line choice for this process is Ca II 854.2 nm due to its formation in the chromosphere and substantial variability. It is important to ensure that this line is accurately modeled to obtain the correct interpretation of observations. Here we investigate the importance of photoionisation of Ca II to Ca III by the hydrogen Lyman transitions; whilst the Lyman continuum is typically considered in this context in simulations, the associated bound-bound transitions are not. This investigation uses two RADYN flare simulations and reprocesses the radiative transfer using the Lightweaver framework, which accounts for the overlapping of all active transitions. The Ca II 854.2 nm line profiles are found to vary significantly due to photoionisation by the Lyman lines, showing notably different shapes and even reversed asymmetries. Finally, we investigate to what extent these effects modify the energy balance of the simulation and the implications for future radiation-hydrodynamic simulations. We find a 10-15% change in detailed optically thick radiative losses from considering these photoionisation effects on the calcium lines in the two simulations presented, demonstrating the importance of considering these effects in a self-consistent way.
Submitted 23 July, 2021;
originally announced July 2021.
-
The Lightweaver Framework for NLTE Radiative Transfer in Python
Authors:
Christopher M J Osborne,
Ivan Milić
Abstract:
Tools for computing detailed optically thick spectral line profiles out of local thermodynamic equilibrium have always been focused on speed, due to the large computational effort involved. With the Lightweaver framework, we have produced a more flexible, modular toolkit for building custom tools in a high-level language, Python, without sacrificing speed against the current state of the art. The goal of providing a more flexible method for constructing these complex simulations is to decrease the barrier to entry and allow more rapid exploration of the field.
In this paper we present an overview of the theory of optically thick NLTE radiative transfer, the numerical methods implemented in Lightweaver including the problems of time-dependent populations and charge-conservation, as well as an overview of the components most users will interact with, to demonstrate their flexibility.
Submitted 30 June, 2021;
originally announced July 2021.
-
CANDELS Meets GSWLC: Evolution of the Relationship Between Morphology and Star Formation Since z = 2
Authors:
Chandler Osborne,
Samir Salim,
Ivana Damjanov,
S. M. Faber,
Marc Huertas-Company,
David C. Koo,
Kameswara Bharadwaj Mantha,
Daniel H. McIntosh,
Joel R. Primack,
Sandro Tacchella
Abstract:
Galaxy morphology and its evolution over the cosmic epoch hold important clues for understanding the regulation of star formation (SF). However, studying the relationship between morphology and SF has been hindered by the availability of consistent data at different redshifts. Our sample, combining CANDELS (0.8 < z < 2.5) and the GALEX-SDSS-WISE Legacy Catalog (GSWLC; z ~ 0), has physical parameters derived using consistent SED fitting with flexible dust attenuation laws. We adopt visual classifications from Kartaltepe et al. 2015 and expand them to z ~ 0 using SDSS images matching the physical resolution of CANDELS rest-frame optical images and deep FUV GALEX images matching the physical resolution of the CANDELS rest-frame FUV images. Our main finding is that disks with SF clumps at z ~ 0 make a similar fraction (~15%) of star-forming galaxies as at z ~ 2. The clumpy disk contribution to the SF budget peaks at z ~ 1, rather than z ~ 2, suggesting that the principal epoch of disk assembly continues to lower redshifts. Star-forming spheroids ("blue nuggets"), though less centrally concentrated than quenched spheroids, contribute significantly (~15%) to the SF budget at z ~ 1-2, suggesting that compaction precedes quenching. Among green valley and quiescent galaxies, the pure spheroid fraction drops since z ~ 1, whereas spheroids with disks (S0-like) become dominant. Mergers at or nearing coalescence are enhanced in SFR relative to the main sequence at all redshifts by a factor of ~2, but contribute $\lesssim$5% to the SF budget, with their contribution remaining small above the main sequence.
Submitted 2 September, 2020;
originally announced September 2020.
-
Spectral deconvolution with deep learning: removing the effects of spectral PSF broadening
Authors:
Momchil Molnar,
Kevin Reardon,
Christopher Osborne,
Ivan Milić
Abstract:
We explore novel methods of recovering the original spectral line profiles from data obtained by instruments that sample those profiles with an extended or multipeaked spectral transmission profile. The techniques are tested on data obtained at high spatial resolution from the Fast Imaging Solar Spectrograph (FISS) grating spectrograph at the Big Bear Solar Observatory and from the Interferometric Bidimensional Spectrometer (IBIS) instrument at the Dunn Solar Telescope. The method robustly deconvolves wide spectral transmission profiles for fields of view sampling a variety of solar structures (granulation, plage and pores) with a photometrical precision of less than 1%. The results and fidelity of the method are tested on data from IBIS obtained using several different spectral resolution modes.
The method, based on convolutional neural networks (CNN), is extremely fast, performing about $10^5$ deconvolutions per second on a CPU and $10^6$ deconvolutions per second on an NVIDIA TITAN RTX GPU for a spectrum with 40 wavelength samples. This approach is applicable for deconvolving large amounts of data from instruments with wide spectral transmission profiles, such as the Visible Tunable Filter (VTF) on the Daniel K. Inouye Solar Telescope (DKIST). We also investigate its application to future instruments by recovering spectral line profiles obtained with a theoretical multi-peaked spectral transmission profile.
We further discuss the limitations of this deconvolutional approach through the analysis of the dimensionality of the original and multiplexed data.
Submitted 11 May, 2020;
originally announced May 2020.
-
Decomposing the classifying diagram in terms of classifying spaces of groups
Authors:
Christina Osborne
Abstract:
The classifying diagram was defined by Rezk and is a generalization of the nerve of a category; in contrast to the nerve, the classifying diagrams of two categories are equivalent if and only if the categories are equivalent. In this paper we prove that the classifying diagram of any category is characterized in terms of classifying spaces of stabilizers of groups. We also prove explicit decompositions of the classifying diagrams for the categories of finite ordered sets, finite dimensional vector spaces, and finite sets in terms of classifying spaces of groups.
Submitted 25 November, 2019;
originally announced November 2019.
-
MLPerf Inference Benchmark
Authors:
Vijay Janapa Reddi,
Christine Cheng,
David Kanter,
Peter Mattson,
Guenther Schmuelling,
Carole-Jean Wu,
Brian Anderson,
Maximilien Breughe,
Mark Charlebois,
William Chou,
Ramesh Chukka,
Cody Coleman,
Sam Davis,
Pan Deng,
Greg Diamos,
Jared Duke,
Dave Fick,
J. Scott Gardner,
Itay Hubara,
Sachin Idgunji,
Thomas B. Jablin,
Jeff Jiao,
Tom St. John,
Pankaj Kanwar,
David Lee,
et al. (22 additional authors not shown)
Abstract:
Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.
Submitted 9 May, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Thyr: A Volumetric Ray-Marching Tool for Simulating Microwave Emission
Authors:
Christopher M. J. Osborne,
Paulo J. A. Simões
Abstract:
Gyrosynchrotron radiation is produced by solar flares, and can be used to infer properties of the accelerated electrons and magnetic field of the flaring region. This microwave emission is highly dependent on many local plasma parameters, and the viewing angle. To correctly interpret observations, detailed simulations of the emission are required. Additionally, gyrosynchrotron emission from the chromosphere has been largely ignored in modelling efforts, and recent studies have shown the importance of thermal emission at millimetric wavelengths. Thyr is a new tool for modelling microwave emission from three-dimensional flaring loops with spatially varying atmosphere and increased resolution in the lower corona and chromosphere. Thyr is modular and open-source, consisting of separate components to compute the thermal and non-thermal microwave emission coefficients and perform three-dimensional radiative transfer (in local thermodynamic equilibrium). The radiative transfer integral is computed by a novel ray-marching technique to efficiently compute the contribution of many volume elements. This technique can also be employed on a variety of astrophysics problems. Herein we present a review of the theory of gyrosynchrotron radiation, and two simulations of identical flare loops in low- and high-resolution performed with Thyr, with a spectral imaging analysis of differing regions. The high-resolution simulation presents a spectral hardening at higher frequencies. This hardening originates around the top of the chromosphere due to the strong convergence of the magnetic field, and is not present in previous models due to insufficient resolution. This hardening could be observed with a coordinated flare observation from active radio observatories.
Submitted 11 March, 2019;
originally announced March 2019.
-
RADYNVERSION: Learning to Invert a Solar Flare Atmosphere with Invertible Neural Networks
Authors:
Christopher M. J. Osborne,
John A. Armstrong,
Lyndsay Fletcher
Abstract:
During a solar flare, it is believed that reconnection takes place in the corona, followed by fast energy transport to the chromosphere. The resulting intense heating strongly disturbs the chromospheric structure, and induces complex radiation hydrodynamic effects. Interpreting the physics of the flaring solar atmosphere is one of the most challenging tasks in solar physics. Here we present a novel deep learning approach, an invertible neural network, to understanding the chromospheric physics of a flaring solar atmosphere via the inversion of observed solar line profiles in Hα and Ca II λ8542. Our network is trained using flare simulations from the 1D radiation hydrodynamics code RADYN as the expected atmosphere and line profile. This model is then applied to single pixels from an observation of an M1.1 solar flare taken with the SST/CRISP instrument just after the flare onset. The inverted atmospheres obtained from observations provide physical information on the electron number density, temperature and bulk velocity flow of the plasma throughout the solar atmosphere ranging from 0 to 10 Mm in height. The density and temperature profiles appear consistent with the expected atmospheric response, and the bulk plasma velocity provides the gradients needed to produce the broad spectral lines whilst also predicting the expected chromospheric evaporation from flare heating. We conclude that we have taught our novel algorithm the physics of a solar flare according to RADYN and that this can be confidently used for the analysis of flare data taken in these two wavelengths. This algorithm can also be adapted for a menagerie of inverse problems providing extremely fast ($\sim$10 μs) inversion samples.
Submitted 29 April, 2019; v1 submitted 24 January, 2019;
originally announced January 2019.
-
State Dependence of Stimulus-Induced Variability Tuning in Macaque MT
Authors:
Joseph A. Lombardo,
Matthew V. Macellaio,
Bing Liu,
Stephanie E. Palmer,
Leslie C. Osborne
Abstract:
Behavioral states marked by varying levels of arousal and attention modulate some properties of cortical responses (e.g. average firing rates or pairwise correlations), yet it is not fully understood what drives these response changes and how they might affect downstream stimulus decoding. Here we show that changes in state modulate the tuning of response variance-to-mean ratios (Fano factors) in a fashion that is predicted neither by a Poisson spiking model nor by changes in the mean firing rate, with a substantial effect on stimulus discriminability. We recorded motion-sensitive neurons in middle temporal cortex (MT) in two states: alert fixation and light, opioid anesthesia. Anesthesia tended to lower average spike counts, without decreasing trial-to-trial variability compared to the alert state. Under anesthesia, within-trial fluctuations in excitability were correlated over longer time scales compared to the alert state, creating supra-Poisson Fano factors. In contrast, alert-state MT neurons have higher mean firing rates and largely sub-Poisson variability that is stimulus-dependent and cannot be explained by firing rate differences alone. The absence of such stimulus-induced variability tuning in the anesthetized state suggests different sources of variability between states. A simple model explains state-dependent shifts in the distribution of observed Fano factors via a suppression in the variance of gain fluctuations in the alert state. A population model with stimulus-induced variability tuning and behaviorally constrained information-limiting correlations explores the potential enhancement in stimulus discriminability by the cortical population in the alert state.
Submitted 3 October, 2018; v1 submitted 28 October, 2017;
originally announced October 2017.
-
A first step toward higher order chain rules in abelian functor calculus
Authors:
Christina Osborne,
Amelia Tebbe
Abstract:
One of the fundamental tools of undergraduate calculus is the chain rule. The notion of higher order directional derivatives was developed by Huang, Marcantognini, and Young, along with a corresponding higher order chain rule. When Johnson and McCarthy established abelian functor calculus, they proved a chain rule for functors that is analogous to the directional derivative chain rule when $n = 1$. In joint work with Bauer, Johnson, and Riehl, we defined an analogue of the iterated directional derivative and provided an inductive proof of the analogue to the chain rule of Huang et al.
This paper consists of the initial investigation of the chain rule found in Bauer et al., which involves a concrete computation of the case when $n=2$. We describe how to obtain the second higher order directional derivative chain rule for abelian functors. This proof is fundamentally different in spirit from the proof given in Bauer et al. as it relies only on properties of cross effects and the linearization of functors.
Submitted 15 July, 2017;
originally announced July 2017.
-
Directional derivatives and higher order chain rules for abelian functor calculus
Authors:
Kristine Bauer,
Brenda Johnson,
Christina Osborne,
Emily Riehl,
Amelia Tebbe
Abstract:
In this paper, we consider abelian functor calculus, the calculus of functors of abelian categories established by the second author and McCarthy. We carefully construct a category of abelian categories and suitably homotopically defined functors, and show that this category, equipped with the directional derivative, is a cartesian differential category in the sense of Blute, Cockett, and Seely. This provides an abstract framework that makes certain analogies between classical and functor calculus explicit. Inspired by Huang, Marcantognini, and Young's chain rule for higher order directional derivatives of functions, we define a higher order directional derivative for functors of abelian categories. We show that our higher order directional derivative is related to the iterated partial directional derivatives of the second author and McCarthy by a Faà di Bruno style formula. We obtain a higher order chain rule for our directional derivatives using a feature of the cartesian differential category structure, and with this provide a formulation for the $n$th layers of the Taylor tower of a composition of functors $F\circ G$ in terms of the derivatives and directional derivatives of $F$ and $G$, reminiscent of similar formulations for functors of spaces or spectra by Arone and Ching. Throughout, we provide explicit chain homotopy equivalences that tighten previously established quasi-isomorphisms for properties of abelian functor calculus.
Submitted 30 May, 2017; v1 submitted 6 October, 2016;
originally announced October 2016.
-
Searching for simplicity: Approaches to the analysis of neurons and behavior
Authors:
Greg J. Stephens,
Leslie C. Osborne,
William Bialek
Abstract:
What fascinates us about animal behavior is its richness and complexity, but understanding behavior and its neural basis requires a simpler description. Traditionally, simplification has been imposed by training animals to engage in a limited set of behaviors, by hand scoring behaviors into discrete classes, or by limiting the sensory experience of the organism. An alternative is to ask whether we can search through the dynamics of natural behaviors to find explicit evidence that these behaviors are simpler than they might have been. We review two mathematical approaches to simplification, dimensionality reduction and the maximum entropy method, and we draw on examples from different levels of biological organization, from the crawling behavior of C. elegans to the control of smooth pursuit eye movements in primates, and from the coding of natural scenes by networks of neurons in the retina to the rules of English spelling. In each case, we argue that the explicit search for simplicity uncovers new and unexpected features of the biological system, and that the evidence for simplification gives us a language with which to phrase new questions for the next generation of experiments. The fact that similar mathematical structures succeed in taming the complexity of very different biological systems hints that there is something more general to be discovered.
Submitted 17 December, 2010;
originally announced December 2010.
-
Combinatorial coding in neural populations
Authors:
L. C. Osborne,
S. E. Palmer,
S. G. Lisberger,
W. Bialek
Abstract:
To evaluate the nature of the neural code in the cerebral cortex, we have used a combination of theory and experiment to assess how information is represented in a realistic cortical population response. We have shown how a sensory stimulus could be estimated on a biologically realistic time scale, given brief individual responses from a population of neurons with similar response properties. For neurons in extrastriate motion area MT, a combinatorial code, one that keeps track of the cell identity of action potentials and silences in individual neurons across the population, carries twice as much information about visual motion as does spike count averaged over the same group of cells. The combinatorial code is more informative because of the diverse firing rate dynamics of MT neurons in response to constant motion stimuli, and is robust to neuron-neuron correlations. We provide a theoretical motivation for these observations that challenges commonly held ideas about the nature of cortical coding at the level of single neurons and neural populations.
Submitted 26 March, 2008;
originally announced March 2008.