-
Identifying Sparsely Active Circuits Through Local Loss Landscape Decomposition
Authors:
Brianna Chrisman,
Lucius Bushnaq,
Lee Sharkey
Abstract:
Much of mechanistic interpretability has focused on understanding the activation spaces of large neural networks. However, activation space-based approaches reveal little about the underlying circuitry used to compute features. To better understand the circuits employed by models, we introduce a new decomposition method called Local Loss Landscape Decomposition (L3D). L3D identifies a set of low-r…
▽ More
Much of mechanistic interpretability has focused on understanding the activation spaces of large neural networks. However, activation space-based approaches reveal little about the underlying circuitry used to compute features. To better understand the circuits employed by models, we introduce a new decomposition method called Local Loss Landscape Decomposition (L3D). L3D identifies a set of low-rank subnetworks: directions in parameter space of which a subset can reconstruct the gradient of the loss between any sample's output and a reference output vector. We design a series of progressively more challenging toy models with well-defined subnetworks and show that L3D can nearly perfectly recover the associated subnetworks. Additionally, we investigate the extent to which perturbing the model in the direction of a given subnetwork affects only the relevant subset of samples. Finally, we apply L3D to a real-world transformer model and a convolutional neural network, demonstrating its potential to identify interpretable and relevant circuits in parameter space.
△ Less
Submitted 31 March, 2025;
originally announced April 2025.
-
An Exploration of Active Learning for Affective Digital Phenotyping
Authors:
Peter Washington,
Cezmi Mutlu,
Aaron Kline,
Cathy Hou,
Kaitlyn Dunlap,
Jack Kent,
Arman Husic,
Nate Stockham,
Brianna Chrisman,
Kelley Paskov,
Jae-Yoon Jung,
Dennis P. Wall
Abstract:
Some of the most severe bottlenecks preventing widespread development of machine learning models for human behavior include a dearth of labeled training data and difficulty of acquiring high quality labels. Active learning is a paradigm for using algorithms to computationally select a useful subset of data points to label using metrics for model uncertainty and data similarity. We explore active l…
▽ More
Some of the most severe bottlenecks preventing widespread development of machine learning models for human behavior include a dearth of labeled training data and difficulty of acquiring high quality labels. Active learning is a paradigm for using algorithms to computationally select a useful subset of data points to label using metrics for model uncertainty and data similarity. We explore active learning for naturalistic computer vision emotion data, a particularly heterogeneous and complex data space due to inherently subjective labels. Using frames collected from gameplay acquired from a therapeutic smartphone game for children with autism, we run a simulation of active learning using gameplay prompts as metadata to aid in the active learning process. We find that active learning using information generated during gameplay slightly outperforms random selection of the same number of labeled frames. We next investigate a method to conduct active learning with subjective data, such as in affective computing, and where multiple crowdsourced labels can be acquired for each image. Using the Child Affective Facial Expression (CAFE) dataset, we simulate an active learning process for crowdsourcing many labels and find that prioritizing frames using the entropy of the crowdsourced label distribution results in lower categorical cross-entropy loss compared to random frame selection. Collectively, these results demonstrate pilot evaluations of two novel active learning approaches for subjective affective data collected in noisy settings.
△ Less
Submitted 6 April, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images
Authors:
Peter Washington,
Cezmi Onur Mutlu,
Aaron Kline,
Kelley Paskov,
Nate Tyler Stockham,
Brianna Chrisman,
Nick Deveau,
Mourya Surhabi,
Nick Haber,
Dennis P. Wall
Abstract:
Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, a…
▽ More
Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, and training a convolutional neural network, there are several pain points which arise when attempting this process for behavioral phenotyping. Here, we discuss the challenges and corresponding opportunities in this space, including handling heterogeneous data, avoiding biased models, labeling massive and repetitive data sets, working with ambiguous or compound class labels, managing privacy concerns, creating appropriate representations, and personalizing models. We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning. We highlight at least some of the machine learning advancements needed for imaging classifiers to detect human social cues successfully and reliably.
△ Less
Submitted 26 January, 2022;
originally announced January 2022.
-
Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study
Authors:
Anish Lakkapragada,
Aaron Kline,
Onur Cezmi Mutlu,
Kelley Paskov,
Brianna Chrisman,
Nate Stockham,
Peter Washington,
Dennis Wall
Abstract:
A formal autism diagnosis can be an inefficient and lengthy process. Families may wait months or longer before receiving a diagnosis for their child despite evidence that earlier intervention leads to better treatment outcomes. Digital technologies which detect the presence of behaviors related to autism can scale access to pediatric diagnoses. This work aims to demonstrate the feasibility of deep…
▽ More
A formal autism diagnosis can be an inefficient and lengthy process. Families may wait months or longer before receiving a diagnosis for their child despite evidence that earlier intervention leads to better treatment outcomes. Digital technologies which detect the presence of behaviors related to autism can scale access to pediatric diagnoses. This work aims to demonstrate the feasibility of deep learning technologies for detecting hand flapping from unstructured home videos as a first step towards validating whether models and digital technologies can be leveraged to aid with autism diagnoses. We used the Self-Stimulatory Behavior Dataset (SSBD), which contains 75 videos of hand flapping, head banging, and spinning exhibited by children. From all the hand flapping videos, we extracted 100 positive and control videos of hand flapping, each between 2 to 5 seconds in duration. Utilizing both landmark-driven-approaches and MobileNet V2's pretrained convolutional layers, our highest performing model achieved a testing F1 score of 84% (90% precision and 80% recall) when evaluating with 5-fold cross validation 100 times. This work provides the first step towards developing precise deep learning methods for activity detection of autism-related behaviors.
△ Less
Submitted 6 June, 2022; v1 submitted 17 August, 2021;
originally announced August 2021.
-
Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging
Authors:
Peter Washington,
Aaron Kline,
Onur Cezmi Mutlu,
Emilie Leblanc,
Cathy Hou,
Nate Stockham,
Kelley Paskov,
Brianna Chrisman,
Dennis P. Wall
Abstract:
Activity recognition computer vision algorithms can be used to detect the presence of autism-related behaviors, including what are termed "restricted and repetitive behaviors", or stimming, by diagnostic instruments. The limited data that exist in this domain are usually recorded with a handheld camera which can be shaky or even moving, posing a challenge for traditional feature representation app…
▽ More
Activity recognition computer vision algorithms can be used to detect the presence of autism-related behaviors, including what are termed "restricted and repetitive behaviors", or stimming, by diagnostic instruments. The limited data that exist in this domain are usually recorded with a handheld camera which can be shaky or even moving, posing a challenge for traditional feature representation approaches for activity detection which mistakenly capture the camera's motion as a feature. To address these issues, we first document the advantages and limitations of current feature representation techniques for activity recognition when applied to head banging detection. We then propose a feature representation consisting exclusively of head pose keypoints. We create a computer vision classifier for detecting head banging in home videos using a time-distributed convolutional neural network (CNN) in which a single CNN extracts features from each frame in the input sequence, and these extracted features are fed as input to a long short-term memory (LSTM) network. On the binary task of predicting head banging and no head banging within videos from the Self Stimulatory Behaviour Dataset (SSBD), we reach a mean F1-score of 90.77% using 3-fold cross validation (with individual fold F1-scores of 83.3%, 89.0%, and 100.0%) when ensuring that no child who appeared in the train set was in the test set for all folds. This work documents a successful technique for training a computer vision classifier which can detect human motion with few training examples and even when the camera recording the source clips is unstable. The general methods described here can be applied by designers and developers of interactive systems towards other human motion and pose classification problems used in mobile and ubiquitous interactive systems.
△ Less
Submitted 10 January, 2021;
originally announced January 2021.
-
Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels
Authors:
Peter Washington,
Onur Cezmi Mutlu,
Emilie Leblanc,
Aaron Kline,
Cathy Hou,
Brianna Chrisman,
Nate Stockham,
Kelley Paskov,
Catalin Voss,
Nick Haber,
Dennis Wall
Abstract:
Emotion classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle subjective labels. We explore the use of crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold stan…
▽ More
Emotion classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle subjective labels. We explore the use of crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold standard collection of images depicting pediatric facial expressions along with 100 human labels per image. To test the feasibility of crowdsourcing to generate these labels, we used Microworkers to acquire labels for 207 CAFE images. We evaluate both unfiltered workers as well as workers selected through a short crowd filtration process. We then train two versions of a classifiers on soft-target CAFE labels using the original 100 annotations provided with the dataset: (1) a classifier trained with traditional one-hot encoded labels, and (2) a classifier trained with vector labels representing the distribution of CAFE annotator responses. We compare the resulting softmax output distributions of the two classifiers with a 2-sample independent t-test of L1 distances between the classifier's output probability distribution and the distribution of human labels. While agreement with CAFE is weak for unfiltered crowd workers, the filtered crowd agree with the CAFE labels 100% of the time for many emotions. While the F1-score for a one-hot encoded classifier is much higher (94.33% vs. 78.68%) with respect to the ground truth CAFE labels, the output probability vector of the crowd-trained classifier more closely resembles the distribution of human labels (t=3.2827, p=0.0014). Reporting an emotion probability distribution that accounts for the subjectivity of human interpretation. Crowdsourcing, including a sufficient filtering mechanism, is a feasible solution for acquiring soft-target labels.
△ Less
Submitted 22 September, 2021; v1 submitted 10 January, 2021;
originally announced January 2021.
-
Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study
Authors:
Peter Washington,
Haik Kalantarian,
John Kent,
Arman Husic,
Aaron Kline,
Emilie Leblanc,
Cathy Hou,
Onur Cezmi Mutlu,
Kaitlyn Dunlap,
Yordan Penev,
Maya Varma,
Nate Tyler Stockham,
Brianna Chrisman,
Kelley Paskov,
Min Woo Sun,
Jae-Yoon Jung,
Catalin Voss,
Nick Haber,
Dennis Paul Wall
Abstract:
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emot…
▽ More
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. Methods: We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface to gamify the human labeling effort, called HollywoodSquares, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgust, and neutral expressions evoked by children. Results: The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotions labels. This performance is at least 10% higher than all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining "anger" and "disgust" into a single class.
△ Less
Submitted 3 June, 2024; v1 submitted 15 December, 2020;
originally announced December 2020.