-
Evaluating the long-term viability of eye-tracking for continuous authentication in virtual reality
Authors:
Sai Ganesh Grandhi,
Saeed Samet
Abstract:
Traditional authentication methods, such as passwords and biometrics, verify a user's identity only at the start of a session, leaving systems vulnerable to session hijacking. Continuous authentication, however, ensures ongoing verification by monitoring user behavior. This study investigates the long-term feasibility of eye-tracking as a behavioral biometric for continuous authentication in virtual reality (VR) environments, using data from the GazebaseVR dataset. Our approach evaluates three architectures, Transformer Encoder, DenseNet, and XGBoost, on short and long-term data to determine their efficacy in user identification tasks. Initial results indicate that both Transformer Encoder and DenseNet models achieve high accuracy rates of up to 97% in short-term settings, effectively capturing unique gaze patterns. However, when tested on data collected 26 months later, model accuracy declined significantly, with rates as low as 1.78% for some tasks. To address this, we propose periodic model updates incorporating recent data, restoring accuracy to over 95%. These findings highlight the adaptability required for gaze-based continuous authentication systems and underscore the need for model retraining to manage evolving user behavior. Our study provides insights into the efficacy and limitations of eye-tracking as a biometric for VR authentication, paving the way for adaptive, secure VR user experiences.
Submitted 27 February, 2025;
originally announced February 2025.
-
Medical Imaging Complexity and its Effects on GAN Performance
Authors:
William Cagas,
Chan Ko,
Blake Hsiao,
Shryuk Grandhi,
Rishi Bhattacharya,
Kevin Zhu,
Michael Lam
Abstract:
The proliferation of machine learning models in diverse clinical applications has led to a growing need for high-fidelity medical image training data. Such data is often scarce due to cost constraints and privacy concerns. Alleviating this burden, medical image synthesis via generative adversarial networks (GANs) has emerged as a powerful method for synthetically generating photo-realistic images based on existing sets of real medical images. However, the exact image set size required to efficiently train such a GAN is unclear. In this work, we experimentally establish benchmarks that measure the relationship between a sample dataset size and the fidelity of the generated images, given the dataset's distribution of image complexities. We analyze statistical metrics based on delentropy, an image complexity measure rooted in Shannon's entropy in information theory. For our pipeline, we conduct experiments with two state-of-the-art GANs, StyleGAN 3 and SPADE-GAN, trained on multiple medical imaging datasets with variable sample sizes. Across both GANs, general performance improved with increasing training set size but suffered with increasing complexity.
Submitted 23 October, 2024;
originally announced October 2024.
-
SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads
Authors:
Alind Khare,
Dhruv Garg,
Sukrit Kalra,
Snigdha Grandhi,
Ion Stoica,
Alexey Tumanov
Abstract:
The increasing deployment of ML models on the critical path of production applications, both in the datacenter and at the edge, requires ML inference serving systems to serve these models under unpredictable and bursty request arrival rates. Serving models under such conditions requires these systems to strike a careful balance between the latency and accuracy requirements of the application and the overall efficiency of utilization of scarce resources. State-of-the-art systems resolve this tension either by choosing a static point in the latency-accuracy tradeoff space to serve all requests or by loading specific models on the critical path of request serving. In this work, we instead resolve this tension by simultaneously serving the entire range of models spanning the latency-accuracy tradeoff space. Our novel mechanism, SubNetAct, achieves this by carefully inserting specialized operators in weight-shared SuperNetworks. These operators enable SubNetAct to dynamically route requests through the network to meet a latency and accuracy target. SubNetAct requires up to 2.6x lower memory to serve a vastly higher number of models than the prior state of the art. In addition, SubNetAct's near-instantaneous actuation of models unlocks the design space of fine-grained, reactive scheduling policies. We explore the design of one such extremely effective policy, SlackFit, and instantiate both SubNetAct and SlackFit in a real system, SuperServe. SuperServe achieves 4.67% higher accuracy for the same SLO attainment and 2.85x higher SLO attainment for the same accuracy on a trace derived from the real-world Microsoft Azure Functions workload, and automatically yields the best trade-offs on a wide range of extremely bursty synthetic traces.
Submitted 27 December, 2023;
originally announced December 2023.
-
An Empirical Analysis of UI-based Flaky Tests
Authors:
Alan Romano,
Zihe Song,
Sampath Grandhi,
Wei Yang,
Weihang Wang
Abstract:
Flaky tests have gained attention from the research community in recent years, and with good reason. These tests lead to wasted time and resources, and they reduce the reliability of the test suites and build systems they affect. However, most of the existing work on flaky tests focuses exclusively on traditional unit tests. This work ignores UI tests, which have larger input spaces and more diverse running conditions than traditional unit tests. In addition, UI tests tend to be more complex and resource-heavy, making them unsuited for detection techniques that involve rerunning test suites multiple times.
In this paper, we perform a study on flaky UI tests. We analyze 235 flaky UI test samples found in 62 projects from both web and Android environments. We identify the common underlying root causes of flakiness in the UI tests, the strategies used to manifest the flaky behavior, and the fixing strategies used to remedy flaky UI tests. The findings made in this work can provide a foundation for the development of detection and prevention techniques for flakiness arising in UI tests.
Submitted 3 March, 2021;
originally announced March 2021.
-
Real or Fake? User Behavior and Attitudes Related to Determining the Veracity of Social Media Posts
Authors:
Linda Plotnick,
Starr Hiltz,
Sukeshini Grandhi,
Julie Dugdale
Abstract:
Citizens and emergency managers need to be able to distinguish "fake" (untrue) news posts from real news posts on social media during disasters. This paper is based on an online survey conducted in 2018 that produced 341 responses from invitations distributed via email and through Facebook. It explores to what extent and how citizens generally assess whether postings are "true" or "fake," and describes the indicators of content trustworthiness that users would like to have. The mean response on a semantic differential scale measuring how frequently users attempt to verify the trustworthiness of news (from 1 = never to 5 = always) was 3.37. The message characteristics citizens use most frequently are grammar and the trustworthiness of the sender. Most respondents would find an indicator of trustworthiness helpful, with the most popular choice being a colored graphic. Limitations and implications for assessments of trustworthiness during disasters are discussed.
Submitted 8 April, 2019;
originally announced April 2019.