Search | arXiv e-print repository

Quo Vadis, Code Review? Exploring the Future of Code Review

Authors: Michael Dorner, Andreas Bauer, Darja Šmite, Lukas Thode, Daniel Mendez, Ricardo Britto, Stephan Lukasczyk, Ehsan Zabardast, Michael Kormann

Abstract: Code review has long been a core practice in collaborative software engineering. In this research, we explore how practitioners reflect on code review today and what changes they anticipate in the near future. We then discuss the potential long-term risks of these anticipated changes for the evolution of code review and its role in collaborative software engineering.

Submitted 20 August, 2025; v1 submitted 9 August, 2025; originally announced August 2025.

arXiv:2505.13985 [pdf, ps, other]

The Capability of Code Review as a Communication Network

Authors: Michael Dorner, Daniel Mendez

Abstract: Background: Code review, a core practice in software engineering, has been widely studied as a collaborative process, with prior work suggesting it functions as a communication network. However, this theory remains untested, limiting its practical and theoretical significance. Objective: This study aims to (1) formalize the theory of code review as a communication network and (2) empirically test its validity by quantifying how widely and how quickly information can spread in code review. Method: We replicate an in-silico experiment simulating information diffusion -- the spread of information among participants -- under best-case conditions across three open-source (Android, Visual Studio Code, React) and three closed-source code review systems (Microsoft, Spotify, Trivago), each modeled as a communication network. By measuring the number of reachable participants and the minimal topological and temporal distances, we quantify how widely and how quickly information can spread through code review. Results: We demonstrate that code review can enable both wide and fast information diffusion, even at a large scale. However, this capacity varies: open-source code review spreads information faster, while closed-source review reaches more participants. Conclusion: Our findings reinforce and refine the theory, highlighting implications for measuring collaboration, generalizing open-source studies, and the role of AI in shaping future code review.

Submitted 20 May, 2025; originally announced May 2025.

Comments: arXiv admin note: text overlap with arXiv:2306.08980

arXiv:2411.09593 [pdf, other]

SMILE-UHURA Challenge -- Small Vessel Segmentation at Mesoscopic Scale from Ultra-High Resolution 7T Magnetic Resonance Angiograms

Authors: Soumick Chatterjee, Hendrik Mattern, Marc Dörner, Alessandro Sciarra, Florian Dubost, Hannes Schnurre, Rupali Khatun, Chun-Chih Yu, Tsung-Lin Hsieh, Yi-Shan Tsai, Yi-Zeng Fang, Yung-Ching Yang, Juinn-Dar Huang, Marshall Xu, Siyu Liu, Fernanda L. Ribeiro, Saskia Bollmann, Karthikesh Varma Chintalapati, Chethan Mysuru Radhakrishna, Sri Chandana Hudukula Ram Kumara, Raviteja Sutrave, Abdul Qayyum, Moona Mazher, Imran Razzak, Cristobal Rodero, et al. (23 additional authors not shown)

Abstract: The human brain receives nutrients and oxygen through an intricate network of blood vessels. Pathology affecting small vessels, at the mesoscopic scale, represents a critical vulnerability within the cerebral blood supply and can lead to severe conditions, such as Cerebral Small Vessel Diseases. The advent of 7 Tesla MRI systems has enabled the acquisition of higher spatial resolution images, making it possible to visualise such vessels in the brain. However, the lack of publicly available annotated datasets has impeded the development of robust, machine learning-driven segmentation algorithms. To address this, the SMILE-UHURA challenge was organised. This challenge, held in conjunction with the ISBI 2023, in Cartagena de Indias, Colombia, aimed to provide a platform for researchers working on related topics. The SMILE-UHURA challenge addresses the gap in publicly available annotated datasets by providing an annotated dataset of Time-of-Flight angiography acquired with 7T MRI. This dataset was created through a combination of automated pre-segmentation and extensive manual refinement. In this manuscript, sixteen submitted methods and two baseline methods are compared both quantitatively and qualitatively on two different datasets: held-out test MRAs from the same dataset as the training data (with labels kept secret) and a separate 7T ToF MRA dataset where both input volumes and labels are kept secret.
The results demonstrate that most of the submitted deep learning methods, trained on the provided training dataset, achieved reliable segmentation performance. Dice scores reached up to 0.838 $\pm$ 0.066 and 0.716 $\pm$ 0.125 on the respective datasets, with an average performance of up to 0.804 $\pm$ 0.15.

Submitted 14 November, 2024; originally announced November 2024.

arXiv:2406.12553 [pdf, other]

Measuring Information Diffusion in Code Review at Spotify

Authors: Michael Dorner, Daniel Mendez, Ehsan Zabardast, Nicole Valdez, Marcin Floryan

Abstract: Background: As a core practice in software engineering, the nature of code review has been frequently subject to research. Prior exploratory studies found that code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although popular in software engineering, there is no confirmatory research corroborating this theory and the actual extent of information diffusion in code review is not well understood. Objective: In this registered report, we propose an observational study to measure information diffusion in code review to test the theory of code review as a communication network. Method: We approximate the information diffusion in code review through the frequency and the similarity between (1) human participants, (2) affected components, and (3) involved teams of linked code reviews. The measurements approximating the information diffusion in code review serve as a foundation for falsifying the theory of code review as a communication network.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2405.11965 [pdf, ps, other]

No Free Lunch: Research Software Testing in Teaching

Authors: Michael Dorner, Andreas Bauer, Florian Angermeir

Abstract: Software is at the core of most scientific discoveries today. Therefore, the quality of research results highly depends on the quality of the research software. Rigorous testing, as we know it from software engineering in the industry, could ensure the quality of the research software but it also requires a substantial effort that is often not rewarded in academia. Therefore, this research explores the effects of research software testing integrated into teaching on research software. In an in-vivo experiment, we integrated the engineering of a test suite for a large-scale network simulation as group projects into a course on software testing at the Blekinge Institute of Technology, Sweden, and qualitatively measured the effects of this integration on the research software. We found that the research software benefited from the integration through substantially improved documentation and fewer hardware and software dependencies. However, this integration was effortful and although the student teams developed elegant and thoughtful test suites, no code by students went directly into the research software since we were not able to make the integration back into the research software obligatory or even remunerative.
Although we strongly believe that integrating research software engineering such as testing into teaching is not only valuable for the research software itself but also for students, the researchers of the next generation, as they get in touch with research software engineering and bleeding-edge research in their field as part of their education, the uncertainty about the intellectual properties of students' code substantially limits the potential of integrating research software testing into teaching.

Submitted 19 December, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: Submitted to the Journal of Open Research Software

arXiv:2312.00925 [pdf, ps, other]

Describing Globally Distributed Software Architectures for Tax Compliance

Authors: Michael Dorner, Oliver Treidler, Tom-Eric Kunz, Ehsan Zabardast, Daniel Mendez, Darja Šmite, Maximilian Capraro, Krzysztof Wnuk

Abstract: Background: The company-internal reuse of software components owned by organizational units in different countries constitutes an implicit licensing across borders, which is taxable. This makes tax authorities a less known stakeholder in software architectures. Objective: Therefore, we investigate how software companies can describe the implicit license structure of their globally distributed software architectures to tax authorities. Method: We develop a viewpoint that frames the concerns of tax authorities, use this viewpoint to construct a view of a large-scale microservice architecture of a multinational enterprise, and evaluate the resulting software architecture description with a panel of four tax experts. Results: The panel found our proposed architectural viewpoint properly and sufficiently frames the concerns of taxation stakeholders. However, unclear jurisdictions of owners and potentially insufficient definitions of code ownership and software component introduce significant noise to the view that limits the usefulness and explanatory power of our software architecture description. Conclusion: While our software architecture description provides a solid foundation, we believe it only represents the tip of the iceberg. Future research is necessary to pave the way for advancements in tax compliance within software engineering.

Submitted 9 July, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: submitted to EMSE

arXiv:2306.08980 [pdf, other]

doi 10.1007/s10664-024-10442-y

The Upper Bound of Information Diffusion in Code Review

Authors: Michael Dorner, Daniel Mendez, Krzysztof Wnuk, Ehsan Zabardast, Jacek Czerwonka

Abstract: Background: Code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although reported by qualitative studies, our understanding of the capability of code review as a communication network is still limited. Objective: In this article, we report on a first step towards evaluating the capability of code review as a communication network by quantifying how fast and how far information can spread through code review: the upper bound of information diffusion in code review. Method: In an in-silico experiment, we simulate an artificial information diffusion within large (Microsoft), mid-sized (Spotify), and small code review systems (Trivago) modelled as communication networks. We then measure the minimal topological and temporal distances between the participants to quantify how far and how fast information can spread in code review. Results: An average code review participant in the small and mid-sized code review systems can spread information to between 72% and 85% of all code review participants within four weeks independently of network size and tooling; for the large code review systems, we found an absolute boundary of about 11000 reachable participants. On average (median), information can spread between two participants in code review in less than five hops and less than five days.
Conclusion: We found evidence that the communication network emerging from code review scales well and spreads information fast and broadly, corroborating the findings of prior qualitative work. The study lays the foundation for understanding and improving code review as a communication network.

Submitted 11 July, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: To appear in Empirical Software Engineering Journal

ACM Class: D.2.7; D.2.9

Journal ref: Empirical Software Engineering 30, 2 (2025)

arXiv:2304.06539 [pdf, ps, other]

doi 10.1109/MS.2023.3346646

Taxing Collaborative Software Engineering

Authors: Michael Dorner, Maximilian Capraro, Oliver Treidler, Tom-Eric Kunz, Darja Šmite, Ehsan Zabardast, Daniel Mendez, Krzysztof Wnuk

Abstract: The engineering of complex software systems is often the result of a highly collaborative effort. However, collaboration within a multinational enterprise has an overlooked legal implication when developers collaborate across national borders: It is taxable. In this article, we discuss the unsolved problem of taxing collaborative software engineering across borders. We (1) introduce the reader to the basic principle of international taxation, (2) identify three main challenges for taxing collaborative software engineering making it a software engineering problem, and (3) estimate the industrial significance of cross-border collaboration in modern software engineering by measuring cross-border code reviews at a multinational software company.

Submitted 21 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: 8 pages; 3 figures; submitted to IEEE Software

arXiv:2110.07291 [pdf, other]

doi 10.1145/3544902.3546254

Only Time Will Tell: Modelling Information Diffusion in Code Review with Time-Varying Hypergraphs

Authors: Michael Dorner, Darja Šmite, Daniel Mendez, Krzysztof Wnuk, Jacek Czerwonka

Abstract: Background: Modern code review is expected to facilitate knowledge sharing: All relevant information, the collective expertise, and meta-information around the code change and its context become evident, transparent, and explicit in the corresponding code review discussion. The discussion participants can leverage this information in the following code reviews; the information diffuses through the communication network that emerges from code review. Traditional time-aggregated graphs fall short in rendering information diffusion as those models ignore the temporal order of the information exchange: Information can only be passed on if it is available in the first place. Aim: This manuscript presents a novel model based on time-varying hypergraphs for rendering information diffusion that overcomes the inherent limitations of traditional, time-aggregated graph-based models. Method: In an in-silico experiment, we simulate an information diffusion within the internal code review at Microsoft and show the empirical impact of time on a key characteristic of information diffusion: the number of reachable participants. Results: Time-aggregation significantly overestimates the paths of information diffusion available in communication networks and, thus, is neither precise nor accurate for modelling and measuring the spread of information within communication networks that emerge from code review.
Conclusion: Our model overcomes the inherent limitations of traditional, static or time-aggregated, graph-based communication models and sheds the first light on information diffusion through code review. We believe that our model can serve as a foundation for understanding, measuring, managing, and improving knowledge sharing in code review in particular and information diffusion in software engineering in general.

Submitted 1 September, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: 10 pages, 6 figures
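
Illustration (not part of the arXiv listing): the abstract above contrasts time-aggregated models with time-varying hypergraphs when counting reachable participants. The following is a minimal sketch of that contrast, not the authors' implementation. It assumes a deliberately simplified, hypothetical data format in which each code review is a single timestamp plus a set of participants; the names `Review`, `reachable_time_respecting`, and `reachable_time_aggregated` are invented for this example.

```python
from typing import Iterable

# Hypothetical simplified format: one timestamp and one participant set per review.
Review = tuple[int, frozenset[str]]


def reachable_time_respecting(reviews: Iterable[Review], source: str) -> set[str]:
    """Participants reachable from `source` when reviews are replayed in time order."""
    informed = {source}
    for _, participants in sorted(reviews, key=lambda review: review[0]):
        # Information can only be passed on if someone in the review already has it.
        if informed & participants:
            informed |= participants
    return informed - {source}


def reachable_time_aggregated(reviews: Iterable[Review], source: str) -> set[str]:
    """Participants reachable from `source` when the temporal order is ignored."""
    hyperedges = [participants for _, participants in reviews]
    informed = {source}
    changed = True
    while changed:  # repeat until no review adds a new participant
        changed = False
        for participants in hyperedges:
            if informed & participants and not participants <= informed:
                informed |= participants
                changed = True
    return informed - {source}


# Bob and Carol discuss a change first; Alice only talks to Bob afterwards.
reviews = [
    (1, frozenset({"bob", "carol"})),
    (2, frozenset({"alice", "bob"})),
]
print(sorted(reachable_time_respecting(reviews, "alice")))  # ['bob']
print(sorted(reachable_time_aggregated(reviews, "alice")))  # ['bob', 'carol']
```

In this toy example the time-aggregated view reports Carol as reachable from Alice even though the only review involving Carol happened before Alice shared anything, which is the kind of overestimation the abstract describes. Reviews with identical timestamps are processed in input order here, which is a further simplification.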

arXiv:2008.07753 [pdf, other]

A Replication Study on Measuring the Growth of Open Source

Authors: Michael Dorner, Maximilian Capraro, Ann Barcomb, Krzysztof Wnuk

Abstract: Context: Over the last decades, open-source software has pervaded the software industry and has become one of the key pillars in software engineering. The incomparable growth of open source reflected that pervasion: Prior work described open source as a whole to be growing linearly, polynomially, or even exponentially. Objective: In this study, we explore the long-term growth of open source and corroborate previous findings by replicating previous studies on measuring the growth of open source projects. Method: We replicate four existing measurements on the growth of open source on a sample of 172,833 open-source projects using Open Hub as the measurement system: We analyzed lines of code, commits, new projects, and the number of open-source contributors over the last 30 years in the known open-source universe. Results: We found growth of open source to be exhausted: After an initial exponential growth, all measurements show a monotonic downwards trend since its peak in 2013. None of the existing growth models could stand the test of time. Conclusion: Our results raise more questions on the growth of open source and the representativeness of Open Hub as a proxy for describing open source. We discuss multiple interpretations for our observations and encourage further research using alternative data sets.

Submitted 20 January, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: 25 pages, 5 figures, submitted to IST

ACM Class: D.2.0; D.2.7; D.2.8; D.2.13; D.2.9; K.2

arXiv:2005.11214 [pdf]

PISCES-RF: a liquid-cooled high-power steady-state helicon plasma device

Authors: Saikat Chakraborty Thakur, Michael J. Simmonds, Juan F. Caneses, Fengjen Chang, Eric M. Hollmann, Russell P. Doerner, Richard Goulding, Arnold Lumsdaine, Juergen Rapp, George R. Tynan

Abstract: Radio-frequency (RF) driven helicon plasma sources can produce relatively high-density plasmas (n > 10^19 m^-3) at relatively moderate powers (< 2 kW) in argon. However, to produce similar high-density plasmas for fusion relevant gases such as hydrogen, deuterium and helium, much higher RF powers are needed. For very high RF powers, thermal issues of the RF-transparent dielectric window, used in the RF source design, limit the plasma operation timescales. To mitigate this constraint, we have designed, built and tested a novel liquid-cooled RF window which allows steady state operations at high power (up to 20 kW). De-ionized (DI) water, flowing between two concentric dielectric RF windows, is used as the coolant. We show that a full azimuthal blanket of DI water does not degrade plasma production. We obtain steady-state, high-density plasmas (n > 10^19 m^-3, T_e ~ 5 eV) using both argon and hydrogen. From calorimetry on the DI water, we measure the net heat that is being removed by the coolant at steady state conditions. Using infra-red (IR) imaging, we calculate the constant plasma heat deposition and measure the final steady state temperature distribution patterns on the inner surface of the ceramic layer. We find that the heat deposition pattern follows the helical shape of the antenna. We also show the consistency between the heat absorbed by the DI water, as measured by calorimetry, and the total heat due to the combined effect of the plasma heating and the absorbed RF.
These results are being used to answer critical engineering questions for the 200 kW RF device (MPEX: Materials Plasma Exposure eXperiment) being designed at the Oak Ridge National Laboratory (ORNL) as a next generation plasma material interaction (PMI) device.

Submitted 29 December, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

Comments: 13 pages, 22 figures

arXiv:1611.10035 [pdf, ps, other]

doi 10.1007/s40879-017-0158-0

Finsler geodesics, periodic Reeb orbits, and open books

Authors: Max Dörner, Hansjörg Geiges, Kai Zehmisch

Abstract: We survey some results on the existence (and non-existence) of periodic Reeb orbits on contact manifolds, both in the open and closed case. We place these statements in the context of Finsler geometry by including a proof of the folklore theorem that the Finsler geodesic flow can be interpreted as a Reeb flow. As a mild extension of previous results we present existence statements on periodic Reeb orbits on contact manifolds with suitable supporting open books.

Submitted 8 May, 2017; v1 submitted 30 November, 2016; originally announced November 2016.

Comments: 14 pages; v2: Section 3 added

MSC Class: 37J45; 37C27; 53B40; 53D25; 53D35

Journal ref: Eur. J. Math. 3 (2017), 1058-1075

arXiv:1210.7947 [pdf, ps, other]

doi 10.1093/qmath/hat055

Open books and the Weinstein conjecture

Authors: Max Dörner, Hansjörg Geiges, Kai Zehmisch

Abstract: We show the existence of a contractible periodic Reeb orbit for any contact structure supported by an open book whose binding can be realised as a hypersurface of restricted contact type in a subcritical Stein manifold. A key ingredient in the proof is a higher-dimensional version of Eliashberg's theorem about symplectic cobordisms from a contact manifold to a symplectic fibration.

Submitted 30 October, 2012; originally announced October 2012.

Comments: 16 pages, 2 figures

MSC Class: 53D35; 37C27; 37J45; 57R17

Journal ref: Q. J. Math. 65 (2014), 869-885

Showing 1–13 of 13 results for author: Dörner, M