Search | arXiv e-print repository

arXiv:2502.20543 [pdf]

doi 10.22152/programming-journal.org/2026/10/9

Meta-compilation of Baseline JIT Compilers with Druid

Authors: Nahuel Palumbo, Guillermo Polito, Stéphane Ducasse, Pablo Tesone

Abstract: Virtual Machines (VMs) combine interpreters and just-in-time (JIT) compiled code to achieve good performance. However, implementing different execution engines increases the cost of developing and maintaining such solutions. JIT compilers based on meta-compilation cope with these issues by automatically generating optimizing JIT compilers. This leaves open the question of how meta-compilation appl… ▽ More Virtual Machines (VMs) combine interpreters and just-in-time (JIT) compiled code to achieve good performance. However, implementing different execution engines increases the cost of developing and maintaining such solutions. JIT compilers based on meta-compilation cope with these issues by automatically generating optimizing JIT compilers. This leaves open the question of how meta-compilation applies to baseline JIT compilers, which improve warmup times by trading off optimizations. In this paper, we present Druid, an ahead-of-time automatic approach to generate baseline JIT compiler frontends from interpreters. Language developers guide meta-compilation by annotating interpreter code and using Druid's intrinsics. Druid targets the meta-compilation to an existing JIT compiler infrastructure to achieve good warm-up performance. We applied Druid in the context of the Pharo programming language and evaluated it by comparing an autogenerated JIT compiler frontend against the one in production for more than 10 years. Our generated JIT compiler frontend is 2x faster on average than the interpreter and achieves on average 0.7x the performance of the handwritten JIT compiler. Our experiment only required changes in 60 call sites in the interpreter, showing that our solution makes language VMs **easier to maintain and evolve in the long run**. △ Less

Submitted 27 February, 2025; originally announced February 2025.

Journal ref: The Art, Science, and Engineering of Programming, 2025, Vol. 10, Issue 1, Article 9

arXiv:2306.12410 [pdf]

doi 10.22152/programming-journal.org/2024/8/2

A VM-Agnostic and Backwards Compatible Protected Modifier for Dynamically-Typed Languages

Authors: Iona Thomas, Vincent Aranega, Stéphane Ducasse, Guillermo Polito, Pablo Tesone

Abstract: In object-oriented languages, method visibility modifiers hold a key role in separating internal methods from the public API. Protected visibility modifiers offer a way to hide methods from external objects while authorizing internal use and overriding in subclasses. While present in main statically-typed languages, visibility modifiers are not as common or mature in dynamically-typed languages. I… ▽ More In object-oriented languages, method visibility modifiers hold a key role in separating internal methods from the public API. Protected visibility modifiers offer a way to hide methods from external objects while authorizing internal use and overriding in subclasses. While present in main statically-typed languages, visibility modifiers are not as common or mature in dynamically-typed languages. In this article, we present ProtDyn, a self-send-based visibility model calculated at compile time for dynamically-typed languages relying on name-mangling and syntactic differentiation of self vs non self sends. We present #Pharo, a ProtDyn implementation of this model that is backwards compatible with existing programs, and its port to Python. Using these implementations we study the performance impact of ProtDyn on the method lookup, in the presence of global lookup caches and polymorphic inline caches. We show that our name mangling and double method registration technique has a very low impact on performance and keeps the benefits from the global lookup cache and polymorphic inline cache. We also show that the memory overhead on a real use case is between 2% and 13% in the worst-case scenario. Protected modifier semantics enforces encapsulation such as private but allow developers to still extend the class in subclasses. ProtDyn offers a VM-agnostic and backwards-compatible design to introduce protected semantics in dynamically-typed languages. △ Less

Submitted 21 June, 2023; originally announced June 2023.

Journal ref: The Art, Science, and Engineering of Programming, 2024, Vol. 8, Issue 1, Article 2

arXiv:1909.03658 [pdf, other]

Sindarin: A Versatile Scripting API for the Pharo Debugger

Authors: Thomas Dupriez, Guillermo Polito, Steven Costiou, Vincent Aranega, Stéphane Ducasse

Abstract: Debugging is one of the most important and time consuming activities in software maintenance, yet mainstream debuggers are not well-adapted to several debugging scenarios. This has led to the research of new techniques covering specific families of complex bugs. Notably, recent research proposes to empower developers with scripting DSLs, plugin-based and moldable debuggers. However, these solution… ▽ More Debugging is one of the most important and time consuming activities in software maintenance, yet mainstream debuggers are not well-adapted to several debugging scenarios. This has led to the research of new techniques covering specific families of complex bugs. Notably, recent research proposes to empower developers with scripting DSLs, plugin-based and moldable debuggers. However, these solutions are tailored to specific use-cases, or too costly for one-time-use scenarios. In this paper we argue that exposing a debugging scripting interface in mainstream debuggers helps in solving many challenging debugging scenarios. For this purpose, we present Sindarin, a scripting API that eases the expression and automation of different strategies developers pursue during their debugging sessions. Sindarin provides a GDB-like API, augmented with AST-bytecode-source code mappings and object-centric capabilities. To demonstrate the versatility of Sindarin, we reproduce several advanced breakpoints and non-trivial debugging mechanisms from the literature. △ Less

Submitted 9 September, 2019; originally announced September 2019.

arXiv:1811.02034 [pdf]

doi 10.22152/programming-journal.org/2019/3/3

Out-Of-Place debugging: a debugging architecture to reduce debugging interference

Authors: Matteo Marra, Guillermo Polito, Elisa Gonzalez Boix

Abstract: Context. Recent studies show that developers spend most of their programming time testing, verifying and debugging software. As applications become more and more complex, developers demand more advanced debugging support to ease the software development process. Inquiry. Since the 70's many debugging solutions were introduced. Amongst them, online debuggers provide a good insight on the conditio… ▽ More Context. Recent studies show that developers spend most of their programming time testing, verifying and debugging software. As applications become more and more complex, developers demand more advanced debugging support to ease the software development process. Inquiry. Since the 70's many debugging solutions were introduced. Amongst them, online debuggers provide a good insight on the conditions that led to a bug, allowing inspection and interaction with the variables of the program. However, most of the online debugging solutions introduce \textit{debugging interference} to the execution of the program, i.e. pauses, latency, and evaluation of code containing side-effects. Approach. This paper investigates a novel debugging technique called \outofplace debugging. The goal is to minimize the debugging interference characteristic of online debugging while allowing online remote capabilities. An \outofplace debugger transfers the program execution and application state from the debugged application to the debugger application, both running in different processes. Knowledge. On the one hand, \outofplace debugging allows developers to debug applications remotely, overcoming the need of physical access to the machine where the debugged application is running. On the other hand, debugging happens locally on the remote machine avoiding latency. That makes it suitable to be deployed on a distributed system and handle the debugging of several processes running in parallel. Grounding. We implemented a concrete out-of-place debugger for the Pharo Smalltalk programming language. We show that our approach is practical by performing several benchmarks, comparing our approach with a classic remote online debugger. We show that our prototype debugger outperforms by a 1000 times a traditional remote debugger in several scenarios. Moreover, we show that the presence of our debugger does not impact the overall performance of an application. Importance. This work combines remote debugging with the debugging experience of a local online debugger. Out-of-place debugging is the first online debugging technique that can minimize debugging interference while debugging a remote application. Yet, it still keeps the benefits of online debugging ( e.g. step-by-step execution). This makes the technique suitable for modern applications which are increasingly parallel, distributed and reactive to streams of data from various sources like sensors, UI, network, etc. △ Less

Submitted 5 November, 2018; originally announced November 2018.

Journal ref: The Art, Science, and Engineering of Programming, 2019, Vol. 3, Issue 2, Article 3

arXiv:1809.07258 [pdf, other]

DPPy: Sampling DPPs with Python

Authors: Guillaume Gautier, Guillermo Polito, Rémi Bardenet, Michal Valko

Abstract: Determinantal point processes (DPPs) are specific probability distributions over clouds of points that are used as models and computational tools across physics, probability, statistics, and more recently machine learning. Sampling from DPPs is a challenge and therefore we present DPPy, a Python toolbox that gathers known exact and approximate sampling algorithms for both finite and continuous DPP… ▽ More Determinantal point processes (DPPs) are specific probability distributions over clouds of points that are used as models and computational tools across physics, probability, statistics, and more recently machine learning. Sampling from DPPs is a challenge and therefore we present DPPy, a Python toolbox that gathers known exact and approximate sampling algorithms for both finite and continuous DPPs. The project is hosted on GitHub and equipped with an extensive documentation. △ Less

Submitted 12 August, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

Comments: Code at http://github.com/guilgautier/DPPy/ Documentation at http://dppy.readthedocs.io/

Journal ref: Journal of Machine Learning Research 20 (2019) 1-7

arXiv:1708.01679 [pdf, other]

doi 10.22152/programming-journal.org/2018/2/1

Scoped Extension Methods in Dynamically-Typed Languages

Authors: Guillermo Polito, Camille Teruel, Stéphane Ducasse, Luc Fabresse

Abstract: Context. An extension method is a method declared in a package other than the package of its host class. Thanks to extension methods, developers can adapt to their needs classes they do not own: adding methods to core classes is a typical use case. This is particularly useful for adapting software and therefore to increase reusability. Inquiry. In most dynamically-typed languages, extension meth… ▽ More Context. An extension method is a method declared in a package other than the package of its host class. Thanks to extension methods, developers can adapt to their needs classes they do not own: adding methods to core classes is a typical use case. This is particularly useful for adapting software and therefore to increase reusability. Inquiry. In most dynamically-typed languages, extension methods are globally visible. Because any developer can define extension methods for any class, naming conflicts occur: if two developers define an extension method with the same signature in the same class, only one extension method is visible and overwrites the other. Similarly, if two developers each define an extension method with the same name in a class hierarchy, one overrides the other. To avoid such "accidental overrides", some dynamically-typed languages limit the visibility of an extension method to a particular scope. However, their semantics have not been fully described and compared. In addition, these solutions typically rely on a dedicated and slow method lookup algorithm to resolve conflicts at runtime. Approach. In this article, we present a formalization of the underlying models of Ruby refinements, Groovy categories, Classboxes, and Method Shelters that are scoping extension method solutions in dynamically-typed languages. Knowledge. Our formal framework allows us to objectively compare and analyze the shortcomings of the studied solutions and other different approaches such as MultiJava. In addition, language designers can use our formal framework to determine which mechanism has less risk of "accidental overrides". Grounding. Our comparison and analysis of existing solutions is grounded because it is based on denotational semantics formalizations. Importance. Extension methods are widely used in programming languages that support them, especially dynamically-typed languages such as Pharo, Ruby or Python. However, without a carefully designed mechanism, this feature can cause insidious hidden bugs or can be voluntarily used to gain access to protected operations, violate encapsulation or break fundamental invariants. △ Less

Submitted 4 August, 2017; originally announced August 2017.

Journal ref: The Art, Science, and Engineering of Programming, 2018, Vol. 2, Issue 1, Article 1

arXiv:1509.04207 [pdf, other]

doi 10.1145/2811237.2811299

DeltaImpactFinder: Assessing Semantic Merge Conflicts with Dependency Analysis

Authors: Martín Dias, Guillermo Polito, Damien Cassou, Stéphane Ducasse

Abstract: In software development, version control systems (VCS) provide branching and merging support tools. Such tools are popular among developers to concurrently change a code-base in separate lines and reconcile their changes automatically afterwards. However, two changes that are correct independently can introduce bugs when merged together. We call semantic merge conflicts this kind of bugs. Change i… ▽ More In software development, version control systems (VCS) provide branching and merging support tools. Such tools are popular among developers to concurrently change a code-base in separate lines and reconcile their changes automatically afterwards. However, two changes that are correct independently can introduce bugs when merged together. We call semantic merge conflicts this kind of bugs. Change impact analysis (CIA) aims at estimating the effects of a change in a codebase. In this paper, we propose to detect semantic merge conflicts using CIA. On a merge, DELTAIMPACTFINDER analyzes and compares the impact of a change in its origin and destination branches. We call the difference between these two impacts the delta-impact. If the delta-impact is empty, then there is no indicator of a semantic merge conflict and the merge can continue automatically. Otherwise, the delta-impact contains what are the sources of possible conflicts. △ Less

Submitted 14 September, 2015; originally announced September 2015.

Comments: International Workshop on Smalltalk Technologies 2015, Jul 2015, Brescia, Italy

arXiv:1212.2341 [pdf, other]

Semantics and Security Issues in JavaScript

Authors: Stéphane Ducasse, Nicolas Petton, Guillermo Polito, Damien Cassou

Abstract: There is a plethora of research articles describing the deep semantics of JavaScript. Nevertheless, such articles are often difficult to grasp for readers not familiar with formal semantics. In this report, we propose a digest of the semantics of JavaScript centered around security concerns. This document proposes an overview of the JavaScript language and the misleading semantic points in its des… ▽ More There is a plethora of research articles describing the deep semantics of JavaScript. Nevertheless, such articles are often difficult to grasp for readers not familiar with formal semantics. In this report, we propose a digest of the semantics of JavaScript centered around security concerns. This document proposes an overview of the JavaScript language and the misleading semantic points in its design. The first part of the document describes the main characteristics of the language itself. The second part presents how those characteristics can lead to problems. It finishes by showing some coding patterns to avoid certain traps and presents some ECMAScript 5 new features. △ Less

Submitted 11 December, 2012; originally announced December 2012.

Comments: Deliverable Resilience FUI 12: 7.3.2.1 Failles de sécurité en JavaScript / JavaScript security issues

Showing 1–8 of 8 results for author: Polito, G