
TORAX: A Fast and Differentiable Tokamak Transport Simulator in JAX

Authors: Jonathan Citrin, Ian Goodfellow, Akhil Raju, Jeremy Chen, Jonas Degrave, Craig Donner, Federico Felici, Philippe Hamel, Andrea Huber, Dmitry Nikulin, David Pfau, Brendan Tracey, Martin Riedmiller, Pushmeet Kohli

Abstract: We present TORAX, a new, open-source, differentiable tokamak core transport simulator implemented in Python using the JAX framework. TORAX solves the coupled equations for ion heat transport, electron heat transport, particle transport, and current diffusion, incorporating modular physics-based and ML models. JAX's just-in-time compilation ensures fast runtimes, while its automatic differentiation capability enables gradient-based optimization workflows and simplifies the use of Jacobian-based PDE solvers. Coupling to ML-surrogates of physics models is greatly facilitated by JAX's intrinsic support for neural network development and inference. TORAX is verified against the established RAPTOR code, demonstrating agreement in simulated plasma profiles. TORAX provides a powerful and versatile tool for accelerating research in tokamak scenario modeling, pulse design, and control.

Submitted 7 December, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: 16 pages, 7 figures

arXiv:2312.09187 [pdf, other]

Vision-Language Models as a Source of Rewards

Authors: Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Dmitry Nikulin, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, et al. (2 additional authors not shown)

Abstract: Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning. A key limiting factor for building generalist agents with RL has been the need for a large number of reward functions for achieving different goals. We investigate the feasibility of using off-the-shelf vision-language models, or VLMs, as sources of rewards for reinforcement learning agents. We show how rewards for visual achievement of a variety of language goals can be derived from the CLIP family of models, and used to train RL agents that can achieve a variety of language goals. We showcase this approach in two distinct visual domains and present a scaling trend showing how larger VLMs lead to more accurate rewards for visual goal achievement, which in turn produces more capable RL agents.

Submitted 12 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 10 pages, 5 figures

arXiv:2107.11124 [pdf, other]

COVID-19 and the gig economy in Poland

Authors: Maciej Beręsewicz, Dagmara Nikulin

Abstract: We use a dataset covering nearly the entire target population based on passively collected data from smartphones to measure the impact of the first COVID-19 wave on the gig economy in Poland. In particular, we focus on transportation (Uber, Bolt) and delivery (Wolt, Takeaway, Glover, DeliGoo) apps, which make it possible to distinguish between the demand and supply part of this market. Based on Bayesian structural time-series models, we estimate the causal impact of the first COVID-19 wave on the number of active drivers and couriers. We show a significant relative increase for Wolt and Glover (15% and 24%) and a slight relative decrease for Uber and Bolt (-3% and -7%) in comparison to a counterfactual control. The change for Uber and Bolt can be partially explained by the prospect of a new law (the so-called Uber Lex), which was already announced in 2019 and is intended to regulate the work of platform drivers.

Submitted 23 July, 2021; originally announced July 2021.

arXiv:2106.12827 [pdf, other]

The gig economy in Poland: evidence based on mobile big data

Authors: Maciej Beręsewicz, Dagmara Nikulin, Marcin Szymkowiak, Kamil Wilak

Abstract: In this article we address the question of how to measure the size and characteristics of the platform economy. We propose an approach that differs from sample surveys and is based on smartphone data, which are passively collected through programmatic systems as part of online marketing. In particular, in our study we focus on two types of services: food delivery (Bolt Courier, Takeaway, Glover, Wolt) and transport services (Bolt Driver, Free Now, iTaxi and Uber). Our results show that the platform economy in Poland is growing. In particular, with respect to food delivery and transportation services performed by means of applications, we observed a growing trend between January 2018 and December 2020. Taking into account the demographic structure of app users, our results confirm findings from past studies: the majority of platform workers are young men, but the age structure of app users is different for each of the two categories of services. Another surprising finding is that foreigners do not account for the majority of gig workers in Poland. When the number of platform workers is compared with corresponding working populations, the estimated share of active app users accounts for about 0.5-2% of working populations in the 9 largest Polish cities.

Submitted 24 June, 2021; originally announced June 2021.

Comments: 44 pages, 20 figures

arXiv:2105.01957 [pdf, other]

Perceptual Gradient Networks

Authors: Dmitry Nikulin, Roman Suvorov, Aleksei Ivakhnenko, Victor Lempitsky

Abstract: Many applications of deep learning for image generation use perceptual losses for either training or fine-tuning of the generator networks. The use of perceptual loss however incurs repeated forward-backward passes in a large image classification network as well as a considerable memory overhead required to store the activations of this network. It is therefore desirable or sometimes even critical to get rid of these overheads. In this work, we propose a way to train generator networks using approximations of perceptual loss that are computed without forward-backward passes. Instead, we use a simpler perceptual gradient network that directly synthesizes the gradient field of a perceptual loss. We introduce the concept of proxy targets, which stabilize the predicted gradient, meaning that learning with it does not lead to divergence or oscillations. In addition, our method allows interpretation of the predicted gradient, providing insight into the internals of perceptual loss and suggesting potential ways to improve it in future work.

Submitted 5 May, 2021; originally announced May 2021.

Comments: 28 pages, 15 figures, 8 tables

arXiv:1908.02511 [pdf, other]

Free-Lunch Saliency via Attention in Atari Agents

Authors: Dmitry Nikulin, Anastasia Ianina, Vladimir Aliev, Sergey Nikolenko

Abstract: We propose a new approach to visualize saliency maps for deep neural network models and apply it to deep reinforcement learning agents trained on Atari environments. Our method adds an attention module that we call FLS (Free Lunch Saliency) to the feature extractor from an established baseline (Mnih et al., 2015). This addition results in a trainable model that can produce saliency maps, i.e., visualizations of the importance of different parts of the input for the agent's current decision making. We show experimentally that a network with an FLS module exhibits performance similar to the baseline (i.e., it is "free", with no performance cost) and can be used as a drop-in replacement for reinforcement learning agents. We also design another feature extractor that scores slightly lower but provides higher-fidelity visualizations. In addition to attained scores, we report saliency metrics evaluated on the Atari-HEAD dataset of human gameplay.

Submitted 30 October, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

Comments: 2019 ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models. 15 pages, 14 figures, 5 tables

arXiv:1906.10957 [pdf, other]

doi 10.1111/rssc.12481

Estimation of the size of informal employment based on administrative records with non-ignorable selection mechanism

Authors: Maciej Beręsewicz, Dagmara Nikulin

Abstract: In this study we used company level administrative data from the National Labour Inspectorate and The Polish Social Insurance Institution in order to estimate the prevalence of informal employment in Poland. Since the selection mechanism is non-ignorable we employed a generalization of Heckman's sample selection model assuming non-Gaussian correlation of errors and clustering by incorporation of random effects. We found that 5.7% (4.6%, 7.1%; 95% CI) of registered enterprises in Poland, to some extent, take advantage of the informal labour force. Our study exemplifies a new approach to measuring informal employment, which can be implemented in other countries. It also contributes to the existing literature by providing, to the best of our knowledge, the first estimates of informal employment at the level of companies based solely on administrative data.

Submitted 26 June, 2019; originally announced June 2019.

Journal ref: 2021

Showing 1–7 of 7 results for author: Nikulin, D