Search | arXiv e-print repository

Leveraging XP and CRISP-DM for Agile Data Science Projects

Authors: Andre Massahiro Shimaoka, Renato Cordeiro Ferreira, Alfredo Goldman

Abstract: This study explores the integration of eXtreme Programming (XP) and the Cross-Industry Standard Process for Data Mining (CRISP-DM) in agile Data Science projects. We conducted a case study at the e-commerce company Elo7 to answer the research question: How can the agility of the XP method be integrated with CRISP-DM in Data Science projects? Data was collected through interviews and questionnaires with a Data Science team consisting of data scientists, ML engineers, and data product managers. The results show that 86% of the team frequently or always applies CRISP-DM, while 71% adopt XP practices in their projects. Furthermore, the study demonstrates that it is possible to combine CRISP-DM with XP in Data Science projects, providing a structured and collaborative approach. Finally, the study generated improvement recommendations for the company.

Submitted 27 May, 2025; originally announced May 2025.

arXiv:2106.12081 [pdf, other]

doi 10.1007/978-3-030-70569-5_6

Forecasting Health and Wellbeing for Shift Workers Using Job-role Based Deep Neural Network

Authors: Han Yu, Asami Itoh, Ryota Sakamoto, Motomu Shimaoka, Akane Sano

Abstract: Shift workers who are essential contributors to our society, face high risks of poor health and wellbeing. To help with their problems, we collected and analyzed physiological and behavioral wearable sensor data from shift working nurses and doctors, as well as their behavioral questionnaire data and their self-reported daily health and wellbeing labels, including alertness, happiness, energy, health, and stress. We found the similarities and differences between the responses of nurses and doctors. According to the differences in self-reported health and wellbeing labels between nurses and doctors, and the correlations among their labels, we proposed a job-role based multitask and multilabel deep learning model, where we modeled physiological and behavioral data for nurses and doctors simultaneously to predict participants' next day's multidimensional self-reported health and wellbeing status. Our model showed significantly better performances than baseline models and previous state-of-the-art models in the evaluations of binary/3-class classification and regression prediction tasks. We also found features related to heart rate, sleep, and work shift contributed to shift workers' health and wellbeing.

Submitted 22 June, 2021; originally announced June 2021.

Comments: In: Wireless Mobile Communication and Healthcare. MobiHealth 2020

arXiv:2011.13925 [pdf]

Investigation on Research Ethics and Building a Benchmark

Authors: Shun Inagaki, Robert Ramirez, Masaki Shimaoka, Kenichi Magata

Abstract: When dealing with leading edge cyber security research, especially when operating from the perspective of an attacker or a red team, it becomes necessary for one to at times consider how ethics comes into play. There are currently no cyber security-specific ethics standards, which in particular is one reason more adversarial cyber security research lags behind in Japan. In this research, using machine learning and manual methods we extracted best practices for research ethics from past top conference papers. Using this knowledge we constructed an ethics knowledge base for cyber security research. Such a knowledge base can be used to properly distinguish grey-area research so that it is not wrongly forbidden. Using a decision tree-style user interface that we created for our knowledge base, researchers may be able to efficiently identify which aspects of their research require ethical consideration. In this work, as a preliminary step we focused on only a portion of the areas of research covered by cyber security conferences, but our results are applicable to any area of research.

Submitted 26 November, 2020; originally announced November 2020.

Comments: 8 pages, in Japanese (abstract is a translation). The Institute of Electronics, Information and Communication Engineers, Niigata, Japan (Jan 2018)

MSC Class: 68T50; 68M25 ACM Class: I.7.5; K.4.1; H.3.1; H.3.7

arXiv:2011.02661 [pdf]

Knowledge-Base Practicality for Cybersecurity Research Ethics Evaluation

Authors: Robert B. Ramirez, Tomohiko Yano, Masaki Shimaoka, Kenichi Magata

Abstract: Research ethics in Information and Communications Technology has seen a resurgence in popularity in recent years. Although a number of general ethics standards have been issued, cyber security specifically has yet to see one. Furthermore, such standards are often abstract, lacking in guidance on specific practices. In this paper we compare peer-reviewed ethical analyses of condemned research papers to analyses derived from a knowledge base (KB) of concrete cyber security research ethics best practices. The KB we employ was compiled in prior work from a large random survey of research papers. We demonstrate preliminary evidence that such a KB can be used to yield comparable or more extensive ethical analyses of published cyber security research than expert application of standards like the Menlo Report. We extend the ethical analyses of the reviewed manuscripts, and calculate measures of the efficiency with which the expert versus KB methods yield ethical insights.

Submitted 4 November, 2020; originally announced November 2020.

Comments: 9 pages, To appear in Computer Security Symposium 2020 (CSS 2020). Readers viewing this document after 10/19/2020 please refer to the most recent version on arXiv.org. Proceedings to be made available at: https://www.iwsec.org/css/2020/proceedings.html

MSC Class: 68M25 ACM Class: K.7.4

Showing 1–4 of 4 results for author: Shimaoka, M