-
Snaps: Bloated and Outdated?
Authors:
Jukka Ruohonen,
Qusai Ramadan
Abstract:
Snap is an alternative software packaging system developed by Canonical and provided by default in the Ubuntu Linux distribution. Given the heterogeneity of various Linux distributions and their various releases, Snap allows an interoperable delivery of software directly to users. However, concerns and criticism have also been frequently expressed. Regarding this criticism, the paper shows that currently distributed snap packages are indeed on average bloated in terms of their sizes and outdated in terms of their updating frequencies. With these empirical observations, this short paper contributes to the research domain of software packaging, software packages, and package managers.
Submitted 1 July, 2025;
originally announced July 2025.
-
An Alignment Between the CRA's Essential Requirements and the ATT&CK's Mitigations
Authors:
Jukka Ruohonen,
Eun-Young Kang,
Qusai Ramadan
Abstract:
The paper presents an alignment evaluation between the mitigations present in MITRE's ATT&CK framework and the essential cyber security requirements of the recently introduced Cyber Resilience Act (CRA) in the European Union. Overall, the two align well with each other. With respect to the CRA, there are notable gaps only in terms of data minimization, data erasure, and vulnerability coordination. In terms of the ATT&CK framework, gaps are present only in terms of threat intelligence, training, out-of-band communication channels, and residual risks. The evaluation presented contributes to narrowing a common disparity between law and technical frameworks.
Submitted 19 May, 2025;
originally announced May 2025.
-
Tracing Vulnerability Propagation Across Open Source Software Ecosystems
Authors:
Jukka Ruohonen,
Qusai Ramadan
Abstract:
The paper presents a traceability analysis of how over 84 thousand vulnerabilities have propagated across 28 open source software ecosystems. According to the results, the propagation sequences have been complex in general, although GitHub, Debian, and Ubuntu stand out. Furthermore, the associated propagation delays have been lengthy, and these do not correlate well with the number of ecosystems involved in the associated sequences. Nor does the presence or absence of particular ecosystems in the sequences yield clear, interpretable patterns. With these results, the paper contributes to the overlapping knowledge bases about software ecosystems, traceability, and vulnerabilities.
Submitted 7 May, 2025;
originally announced May 2025.
-
A Time Series Analysis of Malware Uploads to Programming Language Ecosystems
Authors:
Jukka Ruohonen,
Mubashrah Saddiqa
Abstract:
Software ecosystems built around programming languages have greatly facilitated software development. At the same time, their security has increasingly been acknowledged as a problem. To this end, the paper examines the previously overlooked longitudinal aspects of software ecosystem security, focusing on malware uploaded to six popular programming language ecosystems. The dataset examined is based on the new Open Source Vulnerabilities (OSV) database. According to the results, records about detected malware uploads in the database have recently surpassed those addressing vulnerabilities in packages distributed in the ecosystems. In early 2025, even up to 80% of all entries in the OSV have been about malware. Regarding time series analysis of malware frequencies and their shares of all database entries, good predictions are available already with relatively simple autoregressive models using the numbers of ecosystems, security advisories, and media and other articles as predictors. With these results and the accompanying discussion, the paper improves and advances the understanding of the thus far overlooked longitudinal aspects of ecosystems and malware.
Submitted 22 April, 2025;
originally announced April 2025.
-
From Cyber Security Incident Management to Cyber Security Crisis Management in the European Union
Authors:
Jukka Ruohonen,
Kalle Rindell,
Simone Busetti
Abstract:
Incident management is a classical topic in cyber security. Recently, the European Union (EU) has started to consider also the relation between cyber security incidents and cyber security crises. These considerations and preparations, including those specified in the EU's new cyber security laws, constitute the paper's topic. According to an analysis of the laws and associated policy documents, (i) cyber security crises are equated in the EU to large-scale cyber security incidents that either exceed a handling capacity of a single member state or affect at least two member states. For this and other purposes, (ii) the new laws substantially increase mandatory reporting about cyber security incidents, including but not limited to the large-scale incidents. Despite the laws and new governance bodies established by them, however, (iii) the working of actual cyber security crisis management remains unclear particularly at the EU-level. With these policy research results, the paper advances the domain of cyber security incident management research by elaborating how European law perceives cyber security crises and their relation to cyber security incidents, paving the way for many relevant further research topics with practical relevance, whether theoretical, conceptual, or empirical.
Submitted 19 April, 2025;
originally announced April 2025.
-
A Scenario Analysis of Ethical Issues in Dark Patterns and Their Research
Authors:
Jukka Ruohonen,
Jani Koskinen,
Søren Harnow Klausen,
Anne Gerdes
Abstract:
Context: Dark patterns are user interface or other software designs that deceive or manipulate users to do things they would not otherwise do. Even though dark patterns have been under active research for a long time, including particularly in computer science but recently also in other fields such as law, systematic applied ethical assessments have generally received only a little attention. Objective: The present work evaluates ethical concerns in dark patterns and their research in software engineering and closely associated disciplines. The evaluation is extended to cover not only dark patterns themselves but also the research ethics and applied ethics involved in studying, developing, and deploying them. Method: A scenario analysis is used to evaluate six theoretical dark pattern scenarios. The ethical evaluation is carried out by focusing on the three main branches of normative ethics: utilitarianism, deontology, and virtue ethics. In terms of deontology, the evaluation is framed and restricted to the laws enacted in the European Union. Results: The evaluation results indicate that dark patterns are not universally morally bad. That said, numerous ethical issues with practical relevance are demonstrated and elaborated. Some of these may have societal consequences. Conclusion: Dark patterns are ethically problematic but not always. Therefore, ethical assessments are necessary. The two main theoretical concepts behind dark patterns, deception and manipulation, lead to various issues also in research ethics. It can be recommended that dark patterns should be evaluated on a case-by-case basis, considering all of the three main branches of normative ethics in an evaluation. Analogous points apply to legal evaluations, especially when considering that the real or perceived harms caused by dark patterns cover both material and non-material harms to natural persons.
Submitted 3 March, 2025;
originally announced March 2025.
-
A Mapping Analysis of Requirements Between the CRA and the GDPR
Authors:
Jukka Ruohonen,
Kalle Hjerppe,
Eun-Young Kang
Abstract:
A new Cyber Resilience Act (CRA) was recently agreed upon in the European Union (EU). The paper examines and elaborates what new requirements the CRA entails by contrasting it with the older General Data Protection Regulation (GDPR). According to the results, there are overlaps in terms of confidentiality, integrity, and availability guarantees, data minimization, traceability, data erasure, and security testing. The CRA's seven new essential requirements originate from obligations to (1) ship products without known exploitable vulnerabilities and (2) with secure defaults, to (3) provide security patches typically for a minimum of five years, to (4) minimize attack surfaces, to (5) develop and enable exploitation mitigation techniques, to (6) establish a software bill of materials (SBOM), and to (7) improve vulnerability coordination, including a mandate to establish a coordinated vulnerability disclosure policy. With these results and an accompanying discussion, the paper contributes to requirements engineering research specializing in legal requirements, demonstrating how new laws may affect existing requirements.
Submitted 3 March, 2025;
originally announced March 2025.
-
The Popularity Hypothesis in Software Security: A Large-Scale Replication with PHP Packages
Authors:
Jukka Ruohonen,
Qusai Ramadan
Abstract:
There has been a long-standing hypothesis that a software's popularity is related to its security or insecurity in both research and popular discourse. There are also a few empirical studies that have examined the hypothesis, either explicitly or implicitly. The present work continues with and contributes to this research with a replication-motivated large-scale analysis of software written in the PHP programming language. The dataset examined contains nearly four hundred thousand open source software packages written in PHP. According to the results based on reported security vulnerabilities, the hypothesis does hold; packages having been affected by vulnerabilities over their release histories are generally more popular than packages without having been affected by a single vulnerability. With these replication results, the paper contributes to the efforts to strengthen the empirical knowledge base in cyber and software security.
Submitted 11 June, 2025; v1 submitted 23 February, 2025;
originally announced February 2025.
-
Early Perspectives on the Digital Europe Programme
Authors:
Jukka Ruohonen,
Paul Timmers
Abstract:
A new Digital Europe Programme (DEP), a funding instrument for development and innovation, was established in the European Union (EU) in 2021. The paper makes an empirical inquiry into the projects funded through the DEP. According to the results, the projects align well with the DEP's strategic focus on cyber security, artificial intelligence, high-performance computing, innovation hubs, small- and medium-sized enterprises, and education. Most of the projects have received an equal amount of national and EU funding. Although national origins of participating organizations do not explain the amounts of funding granted, there is a rather strong tendency for national organizations to primarily collaborate with other national organizations. Finally, information about the technological domains addressed and the economic sectors involved provides decent explanatory power for statistically explaining the funding amounts granted. With these results and the accompanying discussion, the paper contributes to the timely debate about innovation, technology development, and industrial policy in Europe.
Submitted 6 January, 2025;
originally announced January 2025.
-
A Time Series Analysis of Assertions in the Linux Kernel
Authors:
Jukka Ruohonen
Abstract:
Assertions are a classical and typical software development technique. These are extensively used also in operating systems and their kernels, including the Linux kernel. The paper patches a gap in existing knowledge by empirically examining the longitudinal evolution of assertion use in the Linux kernel. According to the results, the use of assertions that cause a kernel panic has slightly but not substantially decreased from the kernel's third to the sixth release series. At the same time, however, the use of softer assertion variants has increased; these do not cause a panic by default but instead produce warnings. With these time series results, the paper contributes to the existing but limited empirical knowledge base about operating system kernels and their long-term evolution.
Submitted 27 December, 2024;
originally announced December 2024.
-
SoK: The Design Paradigm of Safe and Secure Defaults
Authors:
Jukka Ruohonen
Abstract:
In security engineering, including software security engineering, there is a well-known design paradigm that advises preferring safe and secure defaults. The paper presents a systematization of knowledge (SoK) of this paradigm by means of a systematic mapping study and a scoping review of relevant literature. According to the mapping and review, the paradigm has been extensively discussed, used, and developed further since the late 1990s. Partially driven by the insecurity of the Internet of things, the volume of publications has accelerated from around the mid-2010s onward. The publications reviewed indicate that the paradigm has been adopted in numerous different contexts. It has also been expanded with security design principles not originally considered when the paradigm was initiated in the mid-1970s. Among the newer principles are an "off by default" principle, various overriding and fallback principles, as well as those related to the zero trust model. The review also indicates problems developers and others have faced with the paradigm.
Submitted 26 February, 2025; v1 submitted 23 December, 2024;
originally announced December 2024.
-
A Systematic Literature Review on the NIS2 Directive
Authors:
Jukka Ruohonen
Abstract:
A directive known as NIS2 was enacted in the European Union (EU) in late 2022. It deals particularly with European critical infrastructures, enlarging their scope substantially from an older directive that only considered the energy and transport sectors as critical. The directive's focus is on cyber security of critical infrastructures, although together with other new EU laws it expands to other security domains as well. Given the importance of the directive and most of all the importance of critical infrastructures, the paper presents a systematic literature review on academic research addressing the NIS2 directive either explicitly or implicitly. According to the review, existing research has often framed and discussed the directive with the EU's other cyber security laws. In addition, existing research has often operated in numerous contextual areas, including industrial control systems, telecommunications, the energy and water sectors, and infrastructures for information sharing and situational awareness. Despite the large scope of existing research, the review reveals noteworthy research gaps and worthwhile topics to examine in further research.
Submitted 10 December, 2024;
originally announced December 2024.
-
Vulnerability Coordination Under the Cyber Resilience Act
Authors:
Jukka Ruohonen,
Paul Timmers
Abstract:
A new Cyber Resilience Act (CRA) was recently agreed upon in the European Union (EU). It imposes many new cyber security requirements on practically all information technology products, whether hardware or software. The paper examines and elaborates the CRA's new requirements for vulnerability coordination, including vulnerability disclosure. Although these requirements are only a part of the CRA's obligations for vendors, some new vulnerability coordination mandates are also present, particularly with respect to so-called actively exploited vulnerabilities. The CRA further alters the coordination practices on the side of public administrations. With the examination, elaboration, and associated discussion, the paper contributes to the study of cyber security regulations, providing also a few practical takeaways.
Submitted 8 March, 2025; v1 submitted 9 December, 2024;
originally announced December 2024.
-
An Overview of Cyber Security Funding for Open Source Software
Authors:
Jukka Ruohonen,
Gaurav Choudhary,
Adam Alami
Abstract:
Context: Many open source software (OSS) projects need more human resources for maintenance, improvements, and sometimes even their survival. This need allegedly applies even to vital OSS projects that can be seen as being a part of the world's critical infrastructures. To address this resourcing problem, new funding instruments for OSS projects have been established in recent years. Objectives: The paper examines two such funding bodies for OSS and the projects they have funded. The focus of both funding bodies is on software security and cyber security in general. Methods: The methodology is based on qualitative thematic analysis. Results: Particularly OSS supply chains, network and cryptography libraries, programming languages, and operating systems and their low-level components have been funded and thus seen as critical in terms of cyber security by the two funding bodies. Conclusions: In addition to the qualitative results presented, the paper makes a contribution by connecting the research branches of critical infrastructure and sustainability of OSS projects. A further contribution is made by connecting the topic examined to recent cyber security regulations. Finally, an important argument is raised that neither cyber security nor sustainability alone can entirely explain the rationales behind the funding decisions made by the two bodies.
Submitted 29 April, 2025; v1 submitted 8 December, 2024;
originally announced December 2024.
-
On Algorithmic Fairness and the EU Regulations
Authors:
Jukka Ruohonen
Abstract:
The short paper discusses algorithmic fairness by focusing on non-discrimination and a few important laws in the European Union (EU). In addition to the EU laws addressing discrimination explicitly, the discussion is based on the EU's recently enacted regulation for artificial intelligence (AI) and the older General Data Protection Regulation (GDPR). Through a theoretical scenario analysis, on one hand, the paper demonstrates that correcting discriminatory biases in AI systems can be legally done under the EU regulations. On the other hand, the scenarios also illustrate some practical scenarios from which legal non-compliance may follow. With these scenarios and the accompanying discussion, the paper contributes to the algorithmic fairness research with a few legal insights, enlarging and strengthening also the growing research domain of compliance in AI engineering.
Submitted 21 December, 2024; v1 submitted 13 November, 2024;
originally announced November 2024.
-
Fast Fixes and Faulty Drivers: An Empirical Analysis of Regression Bug Fixing Times in the Linux Kernel
Authors:
Jukka Ruohonen,
Adam Alami
Abstract:
Regression bugs refer to situations in which something that worked previously no longer works currently. Such bugs have been pronounced in the Linux kernel. The paper focuses on regression bug tracking in the kernel by considering the time required to fix regression bugs. The dataset examined is based on the regzbot automation framework for tracking regressions in the Linux kernel. According to the results, (i) regression bug fixing times have been faster than previously reported; between 2021 and 2024, on average, it has taken less than a month to fix regression bugs. It is further evident that (ii) device drivers constitute the most prone subsystem for regression bugs, and also the fixing times vary across the kernel's subsystems. Although (iii) most commits fixing regression bugs have been reviewed, tested, or both, the kernel's code reviewing and manual testing practices do not explain the fixing times. Likewise, (iv) there is only a weak signal that code churn might contribute to explaining the fixing times statistically. Finally, (v) some subsystems exhibit strong effects for explaining the bug fixing times statistically, although overall statistical performance is modest but not atypical to the research domain. With these empirical results, the paper contributes to the efforts to better understand software regressions and their tracking in the Linux kernel.
Submitted 4 November, 2024;
originally announced November 2024.
-
An Empirical Study of Vulnerability Handling Times in CPython
Authors:
Jukka Ruohonen
Abstract:
The paper examines the handling times of software vulnerabilities in CPython, the reference implementation and interpreter for what is likely today's most popular programming language, Python. The background comes from the so-called vulnerability life cycle analysis, the literature on bug fixing times, and the recent research on security of Python software. Based on regression analysis, the associated vulnerability fixing times can be explained very well merely by knowing who has reported the vulnerabilities. Severity, proof-of-concept code, commits made to a version control system, comments posted on a bug tracker, and references to other sources do not explain the vulnerability fixing times. With these results, the paper contributes to the recent effort to better understand security of the Python ecosystem.
Submitted 25 May, 2025; v1 submitted 1 November, 2024;
originally announced November 2024.
-
The Potential of Citizen Platforms for Requirements Engineering of Large Socio-Technical Software Systems
Authors:
Jukka Ruohonen,
Kalle Hjerppe
Abstract:
Participatory citizen platforms are innovative solutions to digitally better engage citizens in policy-making and deliberative democracy in general. Although these platforms have been used also in an engineering context, thus far, there is no existing work for connecting the platforms to requirements engineering. The present paper fills this notable gap. In addition to discussing the platforms in conjunction with requirements engineering, the paper elaborates potential advantages and disadvantages, thus paving the way for a future pilot study in a software engineering context. With these engineering tenets, the paper also contributes to the research of large socio-technical software systems in a public sector context, including their implementation and governance.
Submitted 3 April, 2025; v1 submitted 4 October, 2024;
originally announced October 2024.
-
A Static Analysis of Popular C Packages in Linux
Authors:
Jukka Ruohonen,
Mubashrah Saddiqa,
Krzysztof Sierszecki
Abstract:
Static analysis is a classical technique for improving software security and software quality in general. Fairly recently, a new static analyzer was implemented in the GNU Compiler Collection (GCC). The present paper uses the GCC's analyzer to empirically examine popular Linux packages. The dataset used is based on those packages in the Gentoo Linux distribution that are either written in C or contain C code. In total, 3,538 such packages are covered. According to the results, uninitialized variables and NULL pointer dereference issues are the most common problems according to the analyzer. Classical memory management issues are relatively rare. The warnings also follow a long-tailed probability distribution across the packages; a few packages are highly warning-prone, whereas no warnings are present for as many as 89% of the packages. Furthermore, the warnings do not vary across different application domains. With these results, the paper contributes to the domain of large-scale empirical research on software quality and security. In addition, a discussion is presented about practical implications of the results.
Submitted 27 September, 2024;
originally announced September 2024.
-
What Do We Know About the Psychology of Insider Threats?
Authors:
Jukka Ruohonen,
Mubashrah Saddiqa
Abstract:
Insider threats refer to threats originating from people inside organizations. Although such threats are a classical research topic, the systematization of existing knowledge is still limited particularly with respect to non-technical research approaches. To this end, this paper presents a systematic literature review on the psychology of insider threats. According to the review results, the literature has operated with multiple distinct theories but there is still a lack of robust theorization with respect to psychology. The literature has also considered characteristics of a person, his or her personal situation, and other more or less objective facts about the person. These are seen to correlate with psychological concepts such as personality traits and psychological states of a person. In addition, the review discusses gaps and limitations in the existing research, thus opening the door for further psychology research.
Submitted 24 May, 2025; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Crisis Communication in the Face of Data Breaches
Authors:
Jukka Ruohonen,
Kalle Hjerppe,
Katleena Kortesuo
Abstract:
Data breaches refer to unauthorized accesses to data. Typically but not always, data breaches are about cyber crime. An organization facing such a crime is often also in a crisis situation. Therefore, organizations should prepare also for data breaches in their crisis management procedures. These procedures should include also crisis communication plans. To this end, this paper examines data breach crisis communication strategies and their practical executions. The background comes from the vibrant crisis communication research domain. According to a few qualitative case studies from Finland, the conventional wisdom holds well; the successful cases indicate communicating early, taking responsibility, offering an apology, and notifying public authorities. The unsuccessful cases show varying degrees of the reverse, including shifting of blame, positioning of an organization as a victim, and failing to notify public authorities. With these qualitative insights, the paper contributes to the research domain by focusing specifically on data breach crises, their peculiarities, and their management, including with respect to European regulations that have been neglected in existing crisis communication research.
Submitted 3 October, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
The Incoherency Risk in the EU's New Cyber Security Policies
Authors:
Jukka Ruohonen
Abstract:
The European Union (EU) has been pursuing new cyber security policies in recent years. This paper presents a short reflection of four such policies. The focus is on potential incoherency, meaning a lack of integration, divergence between the member states, institutional dysfunction, and other related problems that should be at least partially avoidable by sound policy-making. According to the results, the four policies have substantially increased the complexity of the EU's cyber security framework. In addition, there are potential problems with trust, divergence between industry sectors and different technologies, bureaucratic conflicts, and technical issues, among other things. With these insights, the paper not only contributes to the study of EU policies but also advances the understanding of cyber security policies in general.
Submitted 27 September, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
An Exploratory Case Study on Data Breach Journalism
Authors:
Jukka Ruohonen,
Kalle Hjerppe,
Maximilian von Zastrow
Abstract:
This paper explores the novel topic of data breach journalism and data breach news through the case of databreaches.net, a news outlet dedicated to data breaches and related cyber crime. Motivated by the issues in traditional crime news and crime journalism, the case is explored by the means of text mining. According to the results, the outlet has kept a steady publishing pace, mainly focusing on plain and short reporting but with generally high-quality source material for the news articles. Despite these characteristics, the news articles exhibit fairly strong sentiments, which is partially expected due to the presence of emotionally laden crime and the long history of sensationalism in crime news. The news site has also covered the full scope of data breaches, although many of these are fairly traditional, exposing personal identifiers and financial details of the victims. Also hospitals and the healthcare sector stand out. With these results, the paper advances the study of data breaches by considering these from the perspective of media and journalism.
Submitted 27 July, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
A Note on the Proposed Law for Improving the Transparency of Political Advertising in the European Union
Authors:
Jukka Ruohonen
Abstract:
There is an increasing supply and demand for political advertising throughout the world. At the same time, societal threats, such as election interference by foreign governments and other bad actors, continue to be a pressing concern in many democracies. Furthermore, manipulation of electoral outcomes, whether by foreign or domestic forces, continues to be a concern of many citizens who are also worried about their fundamental rights. To these ends, the European Union (EU) has launched several initiatives for tackling the issues. A new regulation was proposed in 2020 also for improving the transparency of political advertising in the union. This short commentary reviews the regulation proposed and raises a few points about its limitations and potential impacts.
Submitted 1 November, 2023; v1 submitted 5 March, 2023;
originally announced March 2023.
-
Reflections on the Data Governance Act
Authors:
Jukka Ruohonen,
Sini Mickelsson
Abstract:
The European Union (EU) has been pursuing a new strategy under the umbrella label of digital sovereignty. Data is an important element in this strategy. To this end, a specific Data Governance Act was enacted in 2022. This new regulation builds upon two ideas: reuse of data held by public sector bodies and voluntary sharing of data under the label of data altruism. This short commentary reviews the main content of the new regulation. Based on the review, a few points are also raised about potential challenges.
Submitted 29 March, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Recent Trends in Cross-Border Data Access by Law Enforcement Agencies
Authors:
Jukka Ruohonen
Abstract:
Access to online data has long been important for law enforcement agencies in their collection of electronic evidence and investigation of crimes. These activities have also long involved cross-border investigations and international cooperation between agencies and jurisdictions. However, technological advances such as cloud computing have complicated the investigations and cooperation arrangements. Therefore, several new laws have been passed and proposed both in the United States and the European Union for facilitating cross-border crime investigations in the context of cloud computing. These new laws and proposals have also brought many new legal challenges and controversies regarding extraterritoriality, data protection, privacy, and surveillance. With these challenges in mind and with a focus on Europe, this paper reviews the recent trends and policy initiatives for cross-border data access by law enforcement agencies.
Submitted 20 September, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
A Text Mining Analysis of Data Protection Politics: The Case of Plenary Sessions of the European Parliament
Authors:
Jukka Ruohonen
Abstract:
Data protection laws and policies have been studied extensively in recent years, but little is known about the parliamentary politics of data protection. This limitation applies even to the European Union (EU) that has taken the global lead in data protection and privacy regulation. For patching this notable gap in existing research, this paper explores the data protection questions raised by the Members of the European Parliament (MEPs) in the Parliament's plenary sessions and the answers given to these by the European Commission. Over a thousand such questions and answers are covered in a period from 1995 to early 2023. Given computational analysis based on text mining, the results indicate that (a) data protection has been actively debated in the Parliament during the past twenty years. No noticeable longitudinal trends are present; the debates have been relatively constant. As could be expected, (b) the specific data protection laws in the EU have frequently been referenced in these debates, which (c) do not seem to align along conventional political dimensions such as the left-right axis. Furthermore, (d) numerous distinct data protection topics have been debated by the parliamentarians, indicating that data protection politics in the EU go well beyond the recently enacted regulations.
Submitted 20 February, 2023;
originally announced February 2023.
-
Mysterious and Manipulative Black Boxes: A Qualitative Analysis of Perceptions on Recommender Systems
Authors:
Jukka Ruohonen
Abstract:
Recommender systems are used to provide relevant suggestions on various matters. Although these systems are a classical research topic, knowledge is still limited regarding the public opinion about these systems. Public opinion is also important because the systems are known to cause various problems. To this end, this paper presents a qualitative analysis of the perceptions of ordinary citizens, civil society groups, businesses, and others on recommender systems in Europe. The dataset examined is based on the answers submitted to a consultation about the Digital Services Act (DSA) recently enacted in the European Union (EU). Therefore, not only does the paper contribute to the pressing question about regulating new technologies and online platforms, but it also reveals insights about the policy-making of the DSA. According to the qualitative results, Europeans have generally negative opinions about recommender systems and the quality of their recommendations. The systems are widely seen to violate privacy and other fundamental rights. According to many Europeans, these also cause various societal problems, including even threats to democracy. Furthermore, existing regulations in the EU are commonly seen to have failed due to a lack of proper enforcement. Numerous suggestions were made by the respondents to the consultation for improving the situation, but only a few of these ended up in the DSA.
Submitted 13 June, 2024; v1 submitted 20 February, 2023;
originally announced February 2023.
-
A Large-Scale Security-Oriented Static Analysis of Python Packages in PyPI
Authors:
Jukka Ruohonen,
Kalle Hjerppe,
Kalle Rindell
Abstract:
Different security issues are a common problem for open source packages archived to and delivered through software ecosystems. These often manifest themselves as software weaknesses that may lead to concrete software vulnerabilities. This paper examines various security issues in Python packages with static analysis. The dataset is based on a snapshot of all packages stored in the Python Package Index (PyPI). In total, over 197 thousand packages and over 749 thousand security issues are covered. Even under the constraints imposed by static analysis, (a) the results indicate prevalence of security issues; at least one issue is present for about 46% of the Python packages. In terms of the issue types, (b) exception handling and different code injections have been the most common issues. The subprocess module stands out in this regard. Reflecting the generally small size of the packages, (c) software size metrics do not predict well the number of issues revealed through static analysis. With these results and the accompanying discussion, the paper contributes to the field of large-scale empirical studies for better understanding security problems in software ecosystems.
Submitted 26 December, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.
-
Digital Divides and Online Media
Authors:
Jukka Ruohonen,
Anne-Marie Tuikka
Abstract:
Digital divide has been a common concern during the past two or three decades; traditionally, it refers to a gap between developed and developing countries in the adoption and use of digital technologies. Given the importance of the topic, digital divide has been also extensively studied, although, hitherto, there is no previous research that would have linked the concept to online media. Given this gap in the literature, this paper evaluates the "maturity" of online media in 134 countries between 2007 and 2016. Maturity is defined according to the levels of national online media consumption, diversity of political perspectives presented in national online media, and consensus in reporting major political events in national online media. These aspects are explained by considering explanatory factors related to economy, infrastructure, politics, and administration. According to the empirical results based on a dynamic panel data methodology, all aspects except administration are also associated with the maturity of national online media.
Submitted 26 December, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
-
Crossing Cross-Domain Paths in the Current Web
Authors:
Jukka Ruohonen,
Joonas Salovaara,
Ville Leppänen
Abstract:
The loading of resources from third parties has evoked new security and privacy concerns about the current world wide web. Building on the concepts of forced and implicit trust, this paper examines cross-domain transmission control protocol (TCP) connections that are initiated to domains other than the domain queried with a web browser. The dataset covers nearly ten thousand domains and over three hundred thousand TCP connections initiated by querying popular Finnish websites and globally popular sites. According to the results, (i) cross-domain connections are extremely common in the current Web. (ii) Most of these transmit encrypted content, although mixed content delivery is relatively common; many of the cross-domain connections deliver unencrypted content at the same time. (iii) Many of the cross-domain connections are initiated to known web advertisement domains, but a much larger share traces to social media platforms and cloud infrastructures. Finally, (iv) the results differ slightly between the Finnish websites sampled and the globally popular sites. With these results, the paper contributes to the ongoing work for better understanding cross-domain connections and dependencies in the world wide web.
Submitted 25 June, 2021;
originally announced June 2021.
-
A Comparative Study of Online Disinformation and Offline Protests
Authors:
Jukka Ruohonen
Abstract:
In early 2021 the United States Capitol in Washington was stormed during a riot and violent attack. A similar storming occurred in Brazil in 2023. Although both attacks were instances in longer sequences of events, these have provided a testimony for many observers who had claimed that online actions, including the propagation of disinformation, have offline consequences. Soon after, a number of papers were published about the relation between online disinformation and offline violence, among other related relations. Hitherto, the effects upon political protests have been unexplored. This paper thus evaluates such effects with a time series cross-sectional sample of 125 countries in a period between 2000 and 2019. The results are mixed. Based on Bayesian multi-level regression modeling, (i) there indeed is an effect between online disinformation and offline protests, but the effect is partially mediated by political polarization. The results are clearer in a sample of countries belonging to the European Economic Area. With this sample, (ii) offline protest counts increase from online disinformation disseminated by domestic governments, political parties, and politicians as well as by foreign governments. Furthermore, (iii) Internet shutdowns tend to decrease the counts, although, paradoxically, the absence of governmental online monitoring of social media tends to also decrease these. With these results, the paper contributes to the blossoming disinformation research by modeling the impact of disinformation upon offline phenomena. The contribution is important due to the various policy measures planned or already enacted.
Submitted 7 December, 2024; v1 submitted 21 June, 2021;
originally announced June 2021.
-
Reassessing Measures for Press Freedom
Authors:
Jukka Ruohonen
Abstract:
There has been an increasing interest in press freedom in the face of various global scandals, transformation of media, technological change, obstacles to deliberative democracy, and other factors. Press freedom is frequently used also as an explanatory factor in comparative empirical research. However, validations of existing measurement instruments on press freedom have been few and far between. Given these points, this paper evaluates eight cross-country instruments on press freedom in 146 countries between 2001 and 2020, replicating an earlier study with a comparable research setup. The methodology is based on principal component analysis and multi-level regression modeling. According to the results, the construct (convergence) validity of the instruments is good; they all measure the same underlying semi-narrow definition for press freedom elaborated in the paper. In addition, the indices seem suitable to be used interchangeably in empirical research. Limitations and future research directions are further discussed.
Submitted 19 September, 2023; v1 submitted 19 June, 2021;
originally announced June 2021.
-
A Few Observations About State-Centric Online Propaganda
Authors:
Jukka Ruohonen
Abstract:
This paper presents a few observations about pro-Kremlin propaganda between 2015 and early 2021 with a dataset from the East Stratcom Task Force (ESTF), which is affiliated with the European Union (EU) but working independently from it. Instead of focusing on misinformation and disinformation, the observations are motivated by classical propaganda research and the ongoing transformation of media systems. According to the tentative results, (i) the propaganda can be assumed to target both domestic and foreign audiences. Of the countries and regions discussed, (ii) Russia, Ukraine, the United States, and within Europe, Germany, Poland, and the EU have been the most frequently discussed. Other conflict regions such as Syria have also often appeared in the propaganda. In terms of longitudinal trends, however, (iii) most of these discussions have decreased in volume after the digital tsunami in 2016, although the conflict in Ukraine seems to have again increased the intensity of pro-Kremlin propaganda. Finally, (iv) the themes discussed align with state-centric war propaganda and conflict zones, although post-truth themes also frequently appear; from conspiracy theories via COVID-19 to fascism -- anything goes, as is typical of propaganda.
Submitted 9 April, 2021;
originally announced April 2021.
-
Assessing the Readability of Policy Documents on the Digital Single Market of the European Union
Authors:
Jukka Ruohonen
Abstract:
Today, literacy skills are necessary. Engineering and other technical professions are not an exception to this requirement. Traditionally, technical reading and writing have been framed with a limited scope, containing documentation, specifications, standards, and related text types. Nowadays, however, the scope covers also other text types, including legal, policy, and related documents. Given this motivation, this paper evaluates the readability of 201 legislations and related policy documents in the European Union (EU). The digital single market (DSM) provides the context. Five classical readability indices provide the methods; these are quantitative measures of a text's readability. The empirical results indicate that (i) generally a Ph.D. level education is required to comprehend the DSM laws and policy documents. Although (ii) the results vary across the five indices used, (iii) readability has slightly improved over time.
Submitted 15 September, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
A Review of Product Safety Regulations in the European Union
Authors:
Jukka Ruohonen
Abstract:
Product safety has been a concern in Europe ever since the early 1960s. Despite the long and relatively stable historical lineage of product safety regulations, new technologies, changes in the world economy, and other major transformations have in recent years brought product safety again to the forefront of policy debates. As reforms are also underway, there is a motivation to review the complex safety policy framework in the European Union (EU). Thus, building on deliberative policy analysis and interpretative literature review, this paper reviews the safety policy for non-food consumer products in the EU. The review covers the historical background and the main laws, administration and enforcement, standardization and harmonization, laws enacted for specific products, notifications delivered by national safety authorities, recalls of dangerous products, and the liability of these. Based on the review and analysis of these themes and the associated literature, some current policy challenges are further discussed.
Submitted 19 June, 2022; v1 submitted 6 February, 2021;
originally announced February 2021.
-
The Treachery of Images in the Digital Sovereignty Debate
Authors:
Jukka Ruohonen
Abstract:
This short theoretical and argumentative essay contributes to the ongoing deliberation about the so-called digital sovereignty, as pursued particularly in the European Union (EU). Drawing from classical political science literature, the essay approaches the debate through paradoxes that arise from applying classical notions of sovereignty to the digital domain. With these paradoxes and a focus on the Peace of Westphalia in 1648, the essay develops a viewpoint distinct from the conventional territorial notion of sovereignty. Accordingly, the lesson from Westphalia has more to do with the capacity of a state to govern. It is also this capacity that is argued to enable the sovereignty of individuals within the digital realm. With this viewpoint, the essay further advances another, broader, and more pressing debate on politics and democracy in the digital era.
Submitted 27 July, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Do Cyber Capabilities and Cyber Power Incentivize International Cooperation?
Authors:
Jukka Ruohonen
Abstract:
This paper explores a research question about whether defensive and offensive cyber security power and the capabilities to exercise the power influence the incentives of nation-states to participate in bilateral and multilateral cooperation (BMC) through formal and informal agreements, alliances, and norms. Drawing from international relations in general and structural realism in particular, three hypotheses are presented for assessing the research question empirically: (i) increasing cyber capability lessens the incentives for BMC; (ii) actively demonstrating and exerting cyber power decreases the willingness for BMC; and (iii) small states prefer BMC for cyber security and politics thereto. According to a cross-country dataset of 29 countries, all three hypotheses are rejected. Although presenting a "negative result" with respect to the research question, the accompanying discussion contributes to the state-centric cyber security research in international relations and political science.
Submitted 13 November, 2020;
originally announced November 2020.
-
The GDPR Enforcement Fines at Glance
Authors:
Jukka Ruohonen,
Kalle Hjerppe
Abstract:
The General Data Protection Regulation (GDPR) came into force in 2018. After this enforcement, many fines have already been imposed by national data protection authorities in Europe. This paper examines the individual GDPR articles referenced in the enforcement decisions, as well as predicts the amount of enforcement fines with available meta-data and text mining features extracted from the enforc…
▽ More
The General Data Protection Regulation (GDPR) came into force in 2018. After this enforcement, many fines have already been imposed by national data protection authorities in Europe. This paper examines the individual GDPR articles referenced in the enforcement decisions, as well as predicts the amount of enforcement fines with available meta-data and text mining features extracted from the enforcement decision documents. According to the results, three articles related to the general principles, lawfulness, and information security have been the most frequently referenced ones. Although the amount of fines imposed vary across the articles referenced, these three particular articles do not stand out. Furthermore, a better statistical evidence is available with other meta-data features, including information about the particular European countries in which the enforcements were made. Accurate predictions are attainable even with simple machine learning techniques for regression analysis. Basic text mining features outperform the meta-data features in this regard. In addition to these results, the paper reflects the GDPR's enforcement against public administration obstacles in the European Union (EU), as well as discusses the use of automatic decision-making systems in judiciary.
Submitted 1 September, 2021; v1 submitted 2 November, 2020;
originally announced November 2020.
-
A Critical Correspondence on Humpty Dumpty's Funding for European Journalism
Authors:
Jukka Ruohonen
Abstract:
This short critical correspondence discusses the Digital News Innovation (DNI) fund orchestrated by Humpty Dumpty -- a.k.a. Google -- for helping European journalism to innovate and renew itself. Based on topic modeling and critical discourse analysis, the results indicate that the innovative projects mostly mimic the old business model of Humpty Dumpty. With these results and the accompanying critical discussion, this correspondence contributes to the ongoing battle between platforms and media.
Submitted 14 June, 2021; v1 submitted 2 November, 2020;
originally announced November 2020.
-
A Case Study on Software Vulnerability Coordination
Authors:
Jukka Ruohonen,
Sampsa Rauti,
Sami Hyrynsalmi,
Ville Leppänen
Abstract:
Context: Coordination is a fundamental tenet of software engineering. Coordination is required also for identifying discovered and disclosed software vulnerabilities with Common Vulnerabilities and Exposures (CVEs). Motivated by recent practical challenges, this paper examines the coordination of CVEs for open source projects through a public mailing list. Objective: The paper observes the historical time delays between the assignment of CVEs on a mailing list and the later appearance of these in the National Vulnerability Database (NVD). Drawing from research on software engineering coordination, software vulnerabilities, and bug tracking, the delays are modeled through three dimensions: social networks and communication practices, tracking infrastructures, and the technical characteristics of the CVEs coordinated. Method: Given a period between 2008 and 2016, a sample of over five thousand CVEs is used to model the delays with nearly fifty explanatory metrics. Regression analysis is used for the modeling. Results: The results show that the CVE coordination delays are affected by different abstractions for noise and prerequisite constraints. These abstractions convey effects from the social network and infrastructure dimensions. Particularly strong effect sizes are observed for annual and monthly control metrics, a control metric for weekends, the degrees of the nodes in the CVE coordination networks, and the number of references given in NVD for the CVEs archived. Smaller but visible effects are present for metrics measuring the entropy of the emails exchanged, traces to bug tracking systems, and other related aspects. The empirical signals are weaker for the technical characteristics. Conclusion: [...]
Submitted 24 July, 2020;
originally announced July 2020.
-
Extracting Layered Privacy Language Purposes from Web Services
Authors:
Kalle Hjerppe,
Jukka Ruohonen,
Ville Leppänen
Abstract:
Web services are important in the processing of personal data on the World Wide Web. In light of recent data protection regulations, this processing raises a question about consent or another legal basis for processing. While consent must be informed, many web services fail to provide enough information for users to make informed decisions. Privacy policies and privacy languages are one way of addressing this problem; the former document how personal data is processed, while the latter describe this processing formally. In this paper, the so-called Layered Privacy Language (LPL) is coupled with web services in order to express personal data processing with a formal analysis method that seeks to generate the processing purposes for privacy policies. To this end, the paper reviews the background theory as well as proposes a method and a concrete tool. The results are demonstrated with a small case study.
Submitted 30 April, 2020;
originally announced April 2020.
-
Annotation-Based Static Analysis for Personal Data Protection
Authors:
Kalle Hjerppe,
Jukka Ruohonen,
Ville Leppänen
Abstract:
This paper elaborates on the use of static source code analysis in the context of data protection. The topic is important for software engineering in order for software developers to improve the protection of personal data during software development. To this end, the paper proposes a design for annotating classes and functions that process personal data. The design serves two primary purposes: on the one hand, it provides means for software developers to document their intent; on the other hand, it furnishes tools for automatic detection of potential violations. This dual rationale facilitates compliance with the General Data Protection Regulation (GDPR) and other emerging data protection and privacy regulations. In addition to a brief review of the state of the art of static analysis in the data protection context and the design of the proposed analysis method, a concrete tool is presented to demonstrate a practical implementation for the Java programming language.
Submitted 22 March, 2020;
originally announced March 2020.
-
Predicting the Amount of GDPR Fines
Authors:
Jukka Ruohonen,
Kalle Hjerppe
Abstract:
The General Data Protection Regulation (GDPR) was enforced in 2018. After this enforcement, many fines have already been imposed by national data protection authorities in the European Union (EU). This paper examines the individual GDPR articles referenced in the enforcement decisions, as well as predicts the amount of enforcement fines with available meta-data and text mining features extracted from the enforcement decision documents. According to the results, articles related to the general principles, lawfulness, and information security have been the most frequently referenced ones. Although the amounts of the fines imposed vary across the articles referenced, these three particular articles do not stand out. Furthermore, good predictions are attainable even with simple machine learning techniques for regression analysis. Basic meta-data (such as the articles referenced and the country of origin) yields slightly better performance compared to the text mining features.
Submitted 2 November, 2020; v1 submitted 11 March, 2020;
originally announced March 2020.
-
Measuring Basic Load-Balancing and Fail-Over Setups for Email Delivery via DNS MX Records
Authors:
Jukka Ruohonen
Abstract:
The domain name system (DNS) has long provided means to assure basic load-balancing and fail-over (BLBFO) for email delivery. A traditional method uses multiple mail exchanger (MX) records to distribute the load across multiple email servers. Round-robin DNS is the common alternative to this MX-based balancing. Despite the classical nature of these two solutions, neither one has received particular attention in Internet measurement research. To patch this gap, this paper examines BLBFO configurations with an active measurement study covering over 2.7 million domains, of which about 2.1 million have MX records. Of these MX-enabled domains, about 60% are observed to use BLBFO, and MX-based balancing seems more common than round-robin DNS. Email hosting services offer one explanation for this adoption rate. Many domains also seem to prefer fine-tuned configurations instead of relying on randomization assumptions. Furthermore, about 27% of the domains have at least one exchanger with a valid IPv6 address. Finally, some misconfigurations and related oddities are visible.
Submitted 24 July, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
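The two balancing approaches named in the abstract above can be illustrated with a minimal sketch. It is not the paper's measurement code: the function names `delivery_order` and `classify`, the input shapes, and the three category labels are hypothetical, and the classification rule (several distinct exchangers means MX-based, one exchanger with several addresses means round-robin) is a simplifying assumption. The ordering rule follows common SMTP sender behaviour: lower preference values are tried first, and exchangers with equal preference are tried in random order.

```python
import random
from collections import defaultdict


def delivery_order(mx_records, rng=None):
    """Order exchanger hostnames the way a sending server would try them.

    mx_records: iterable of (preference, hostname) pairs (assumed input shape).
    Lower preference is tried first; ties are shuffled to spread the load.
    """
    rng = rng or random.Random()
    by_preference = defaultdict(list)
    for preference, host in mx_records:
        by_preference[preference].append(host)

    order = []
    for preference in sorted(by_preference):
        hosts = list(by_preference[preference])
        rng.shuffle(hosts)  # equal-preference exchangers share the load
        order.extend(hosts)
    return order


def classify(mx_records, addresses):
    """Label a domain's setup under the simplifying assumptions stated above.

    addresses: dict mapping an exchanger hostname to its list of IP addresses.
    """
    hosts = {host for _, host in mx_records}
    if len(hosts) > 1:
        return "mx-based"  # several exchangers listed in MX records
    if len(hosts) == 1:
        (host,) = hosts
        if len(addresses.get(host, [])) > 1:
            return "round-robin"  # one exchanger resolving to many addresses
    return "none"
```

With distinct preference values the order is deterministic and expresses fail-over, whereas equal values leave the choice to the sender's randomization, which is the kind of assumption the abstract contrasts with fine-tuned configurations.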
-
A Dip Into a Deep Well: Online Political Advertisements, Valence, and European Electoral Campaigning
Authors:
Jukka Ruohonen
Abstract:
Online political advertisements have become an important element in electoral campaigning throughout the world. At the same time, concepts such as disinformation and manipulation have emerged as a global concern. Although these concepts are distinct from online political ads and data-driven electoral campaigning, they tend to share a similar trait related to valence, the intrinsic attractiveness or averseness of a message. Given this background, the paper examines online political ads by using a dataset collected from Google's transparency reports. The examination is framed to the mid-2019 situation in Europe, including the European Parliament elections in particular. According to the results based on sentiment analysis of the textual ads displayed via Google's advertisement machinery, (i) most of the political ads have expressed positive sentiments, although these vary greatly between (ii) European countries as well as across (iii) European political parties. In addition to these results, the paper contributes to the timely discussion about data-driven electoral campaigning and its relation to politics and democracy.
Submitted 2 November, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Empirical Notes on the Interaction Between Continuous Kernel Fuzzing and Development
Authors:
Jukka Ruohonen,
Kalle Rindell
Abstract:
Fuzzing has been studied and applied ever since the 1990s. Automated and continuous fuzzing has recently been applied also to open source software projects, including the Linux and BSD kernels. This paper concentrates on the practical aspects of continuous kernel fuzzing in four open source kernels. According to the results, there are over 800 unresolved crashes reported for the four kernels by the syzkaller/syzbot framework. Many of these were reported a relatively long time ago. Interestingly, fuzzing-induced bugs have been resolved in the BSD kernels more rapidly. Furthermore, assertions and debug checks, use-after-frees, and general protection faults account for the majority of bug types in the Linux kernel. About 23% of the fixed bugs in the Linux kernel have gone through either code review or additional testing. Finally, only code churn provides a weak statistical signal for explaining the associated bug fixing times in the Linux kernel.
Submitted 5 September, 2019;
originally announced September 2019.
-
The General Data Protection Regulation: Requirements, Architectures, and Constraints
Authors:
Kalle Hjerppe,
Jukka Ruohonen,
Ville Leppänen
Abstract:
The General Data Protection Regulation (GDPR) in the European Union is the most famous recently enacted privacy regulation. Despite the regulation's legal, political, and technological ramifications, relatively little research has been carried out for better understanding the GDPR's practical implications for requirements engineering and software architectures. Building on a grounded theory approach with close ties to the Finnish software industry, this paper contributes to the sealing of this gap in previous research. Three questions are asked and answered in the context of software development organizations. First, the paper elaborates nine practical constraints under which many small and medium-sized enterprises (SMEs) often operate when implementing solutions that address the new regulatory demands. Second, the paper elicits nine regulatory requirements from the GDPR for software architectures. Third, the paper presents an implementation for a software architecture that complies with both the requirements elicited and the constraints elaborated.
Submitted 17 July, 2019;
originally announced July 2019.
-
Updating the Wassenaar Debate Once Again: Surveillance, Intrusion Software, and Ambiguity
Authors:
Jukka Ruohonen,
Kai Kimppa
Abstract:
This paper analyzes a recent debate on regulating cyber weapons through multilateral export controls. The background relates to the amending of the international Wassenaar Arrangement with offensive cyber security technologies known as intrusion software. Implicitly, such software is related to previously unregulated software vulnerabilities and exploits, which also make the ongoing debate particularly relevant. By placing the debate into a historical context, the paper reveals interesting historical parallels, elaborates the political background, and underlines many ambiguity problems related to rigorous definitions for cyber weapons. Many difficult problems remaining for framing offensive security tools with multilateral export controls are also pointed out.
Submitted 5 June, 2019;
originally announced June 2019.
-
David and Goliath: Privacy Lobbying in the European Union
Authors:
Jukka Ruohonen
Abstract:
The paper examines the question of how many more resources organized business interests have when compared to the resources of civil society groups in the context of privacy lobbying in the European Union (EU). To answer the question, the paper draws from classical literature on power resources and pluralism. The empirical material comes from a lobbying register maintained by the EU. According to the results, (a) there is only a small difference in terms of the average financial and human resources, but a vast difference when absolute amounts are used. Furthermore, (b) organized business interests are better affiliated with each other and other organizations. Finally, (c) many organized business interests maintain their offices in the United States, whereas the non-governmental organizations observed are mostly European. With these results and the accompanying discussion, the paper contributes to the underresearched but inflammatory topic of privacy politics.
Submitted 5 June, 2019;
originally announced June 2019.