Exploring User Privacy Awareness on GitHub: An Empirical Study
Authors:
Costanza Alfieri,
Juri Di Rocco,
Paola Inverardi,
Phuong T. Nguyen
Abstract:
GitHub provides developers with a practical way to distribute source code and collaboratively work on common projects. To enhance account security and privacy, GitHub allows its users to manage access permissions, review audit logs, and enable two-factor authentication. However, despite the endless effort, the platform still faces various issues related to the privacy of its users. This paper pres…
▽ More
GitHub provides developers with a practical way to distribute source code and collaboratively work on common projects. To enhance account security and privacy, GitHub allows its users to manage access permissions, review audit logs, and enable two-factor authentication. However, despite the endless effort, the platform still faces various issues related to the privacy of its users. This paper presents an empirical study delving into the GitHub ecosystem. Our focus is on investigating the utilization of privacy settings on the platform and identifying various types of sensitive information disclosed by users. Leveraging a dataset comprising 6,132 developers, we report and analyze their activities by means of comments on pull requests. Our findings indicate an active engagement by users with the available privacy settings on GitHub. Notably, we observe the disclosure of different forms of private information within pull request comments. This observation has prompted our exploration into sensitivity detection using a large language model and BERT, to pave the way for a personalized privacy assistant. Our work provides insights into the utilization of existing privacy protection tools, such as privacy settings, along with their inherent limitations. Essentially, we aim to advance research in this field by providing both the motivation for creating such privacy protection tools and a proposed methodology for personalizing them.
△ Less
Submitted 10 September, 2024; v1 submitted 6 September, 2024;
originally announced September 2024.
Exosoul: ethical profiling in the digital world
Authors:
Costanza Alfieri,
Paola Inverardi,
Patrizio Migliarini,
Massimiliano Palmiero
Abstract:
The development and the spread of increasingly autonomous digital technologies in our society pose new ethical challenges beyond data protection and privacy violation. Users are unprotected in their interactions with digital technologies and at the same time autonomous systems are free to occupy the space of decisions that is prerogative of each human being. In this context the multidisciplinary p…
▽ More
The development and the spread of increasingly autonomous digital technologies in our society pose new ethical challenges beyond data protection and privacy violation. Users are unprotected in their interactions with digital technologies and at the same time autonomous systems are free to occupy the space of decisions that is prerogative of each human being. In this context the multidisciplinary project Exosoul aims at developing a personalized software exoskeleton which mediates actions in the digital world according to the moral preferences of the user. The exoskeleton relies on the ethical profiling of a user, similar in purpose to the privacy profiling proposed in the literature, but aiming at reflecting and predicting general moral preferences. Our approach is hybrid, first based on the identification of profiles in a top-down manner, and then on the refinement of profiles by a personalized data-driven approach. In this work we report our initial experiment on building such top-down profiles. We consider the correlations between ethics positions (idealism and relativism) personality traits (honesty/humility, conscientiousness, Machiavellianism and narcissism) and worldview (normativism), and then we use a clustering approach to create ethical profiles predictive of user's digital behaviors concerning privacy violation, copy-right infringements, caution and protection. Data were collected by administering a questionnaire to 317 young individuals. In the paper we discuss two clustering solutions, one data-driven and one model-driven, in terms of validity and predictive power of digital behavior.
△ Less
Submitted 30 March, 2022;
originally announced April 2022.