Search | arXiv e-print repository

Where Assessment Validation and Responsible AI Meet

Authors: Jill Burstein, Geoffrey T. LaFlair

Abstract: Validity, reliability, and fairness are core ethical principles embedded in classical argument-based assessment validation theory. These principles are also central to the Standards for Educational and Psychological Testing (2014) which recommended best practices for early applications of artificial intelligence (AI) in high-stakes assessments for automated scoring of written and spoken responses. Responsible AI (RAI) principles and practices set forth by the AI ethics community are critical to ensure the ethical use of AI across various industry domains. Advances in generative AI have led to new policies as well as guidance about the implementation of RAI principles for assessments using AI. Building on Chapelle's foundational validity argument work to address the application of assessment validation theory for technology-based assessment, we propose a unified assessment framework that considers classical test validation theory and assessment-specific and domain-agnostic RAI principles and practice. The framework addresses responsible AI use for assessment that supports validity arguments, alignment with AI ethics to maintain human values and oversight, and broader social responsibility associated with AI use.

Submitted 4 November, 2024; originally announced November 2024.

arXiv:2409.07476 [pdf]

Responsible AI for Test Equity and Quality: The Duolingo English Test as a Case Study

Authors: Jill Burstein, Geoffrey T. LaFlair, Kevin Yancey, Alina A. von Davier, Ravit Dotan

Abstract: Artificial intelligence (AI) creates opportunities for assessments, such as efficiencies for item generation and scoring of spoken and written responses. At the same time, it poses risks (such as bias in AI-generated item content). Responsible AI (RAI) practices aim to mitigate risks associated with AI. This chapter addresses the critical role of RAI practices in achieving test quality (appropriateness of test score inferences), and test equity (fairness to all test takers). To illustrate, the chapter presents a case study using the Duolingo English Test (DET), an AI-powered, high-stakes English language assessment. The chapter discusses the DET RAI standards, their development and their relationship to domain-agnostic RAI principles. Further, it provides examples of specific RAI practices, showing how these practices meaningfully address the ethical principles of validity and reliability, fairness, privacy and security, and transparency and accountability standards to ensure test equity and quality.

Submitted 28 August, 2024; originally announced September 2024.

arXiv:1908.02827 [pdf, other]

Riverine Coverage with an Autonomous Surface Vehicle over Known Environments

Authors: Nare Karapetyan, Adam Braude, Jason Moulton, Joshua A. Burstein, Scott White, Jason M. O'Kane, Ioannis Rekleitis

Abstract: Environmental monitoring and surveying operations on rivers currently are performed primarily with manually operated boats. In this domain, autonomous coverage of areas is of vital importance for improving both the quality and the efficiency of coverage. This paper leverages human expertise in river exploration and data collection strategies to automate and optimize these processes using autonomous surface vehicles (ASVs). In particular, three deterministic algorithms for both partial and complete coverage of a river segment are proposed, providing varying path length, coverage density, and turning patterns. These strategies resulted in increases in accuracy and efficiency compared to human performance. The proposed methods were extensively tested in simulation using maps of real rivers of different shapes and sizes. In addition, to verify their performance in real-world operations, the algorithms were deployed successfully on several parts of the Congaree River in South Carolina, USA, resulting in a total of more than 35 km of coverage trajectories in the field.

Submitted 7 August, 2019; originally announced August 2019.

Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems, Accepted July 2019

Showing 1–3 of 3 results for author: Burstein, J