-
Generating Planning Feedback for Open-Ended Programming Exercises with LLMs
Authors:
Mehmet Arif Demirtaş,
Claire Zheng,
Max Fowler,
Kathryn Cunningham
Abstract:
To complete an open-ended programming exercise, students need to both plan a high-level solution and implement it using the appropriate syntax. However, these problems are often autograded on the correctness of the final submission through test cases, and students cannot get feedback on their planning process. Large language models (LLM) may be able to generate this feedback by detecting the overa…
▽ More
To complete an open-ended programming exercise, students need to both plan a high-level solution and implement it using the appropriate syntax. However, these problems are often autograded on the correctness of the final submission through test cases, and students cannot get feedback on their planning process. Large language models (LLM) may be able to generate this feedback by detecting the overall code structure even for submissions with syntax errors. To this end, we propose an approach that detects which high-level goals and patterns (i.e. programming plans) exist in a student program with LLMs. We show that both the full GPT-4o model and a small variant (GPT-4o-mini) can detect these plans with remarkable accuracy, outperforming baselines inspired by conventional approaches to code analysis. We further show that the smaller, cost-effective variant (GPT-4o-mini) achieves results on par with state-of-the-art (GPT-4o) after fine-tuning, creating promising implications for smaller models for real-time grading. These smaller models can be incorporated into autograders for open-ended code-writing exercises to provide feedback for students' implicit planning skills, even when their program is syntactically incorrect. Furthermore, LLMs may be useful in providing feedback for problems in other domains where students start with a set of high-level solution steps and iteratively compute the output, such as math and physics problems.
△ Less
Submitted 11 April, 2025;
originally announced April 2025.
-
PLAID: Supporting Computing Instructors to Identify Domain-Specific Programming Plans at Scale
Authors:
Yoshee Jain,
Mehmet Arif Demirtaş,
Kathryn Cunningham
Abstract:
Pedagogical approaches focusing on stereotypical code solutions, known as programming plans, can increase problem-solving ability and motivate diverse learners. However, plan-focused pedagogies are rarely used beyond introductory programming. Our formative study (N=10 educators) showed that identifying plans is a tedious process. To advance plan-focused pedagogies in application-focused domains, w…
▽ More
Pedagogical approaches focusing on stereotypical code solutions, known as programming plans, can increase problem-solving ability and motivate diverse learners. However, plan-focused pedagogies are rarely used beyond introductory programming. Our formative study (N=10 educators) showed that identifying plans is a tedious process. To advance plan-focused pedagogies in application-focused domains, we created an LLM-powered pipeline that automates the effortful parts of educators' plan identification process by providing use-case-driven program examples and candidate plans. In design workshops (N=7 educators), we identified design goals to maximize instructors' efficiency in plan identification by optimizing interaction with this LLM-generated content. Our resulting tool, PLAID, enables instructors to access a corpus of relevant programs to inspire plan identification, compare code snippets to assist plan refinement, and facilitates them in structuring code snippets into plans. We evaluated PLAID in a within-subjects user study (N=12 educators) and found that PLAID led to lower cognitive demand and increased productivity compared to the state-of-the-art. Educators found PLAID beneficial for generating instructional material. Thus, our findings suggest that human-in-the-loop approaches hold promise for supporting plan-focused pedagogies at scale.
△ Less
Submitted 14 February, 2025;
originally announced February 2025.
-
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos
Authors:
Alexander Waibel,
Moritz Behr,
Fevziye Irem Eyiokur,
Dogucan Yaman,
Tuan-Nam Nguyen,
Carlos Mullov,
Mehmet Arif Demirtas,
Alperen Kantarcı,
Stefan Constantin,
Hazım Kemal Ekenel
Abstract:
In this paper, we propose a neural end-to-end system for voice preserving, lip-synchronous translation of videos. The system is designed to combine multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains emphases in speech, voice characteristics, face video of the original speaker. The pipe…
▽ More
In this paper, we propose a neural end-to-end system for voice preserving, lip-synchronous translation of videos. The system is designed to combine multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains emphases in speech, voice characteristics, face video of the original speaker. The pipeline starts with automatic speech recognition including emphasis detection, followed by a translation model. The translated text is then synthesized by a Text-to-Speech model that recreates the original emphases mapped from the original sentence. The resulting synthetic voice is then mapped back to the original speakers' voice using a voice conversion model. Finally, to synchronize the lips of the speaker with the translated audio, a conditional generative adversarial network-based model generates frames of adapted lip movements with respect to the input face image as well as the output of the voice conversion model. In the end, the system combines the generated video with the converted audio to produce the final output. The result is a video of a speaker speaking in another language without actually knowing it. To evaluate our design, we present a user study of the complete system as well as separate evaluations of the single components. Since there is no available dataset to evaluate our whole system, we collect a test set and evaluate our system on this test set. The results indicate that our system is able to generate convincing videos of the original speaker speaking the target language while preserving the original speaker's characteristics. The collected dataset will be shared.
△ Less
Submitted 9 June, 2022;
originally announced June 2022.
-
Semantic Parsing of Interpage Relations
Authors:
Mehmet Arif Demirtaş,
Berke Oral,
Mehmet Yasin Akpınar,
Onur Deniz
Abstract:
Page-level analysis of documents has been a topic of interest in digitization efforts, and multimodal approaches have been applied to both classification and page stream segmentation. In this work, we focus on capturing finer semantic relations between pages of a multi-page document. To this end, we formalize the task as semantic parsing of interpage relations and we propose an end-to-end approach…
▽ More
Page-level analysis of documents has been a topic of interest in digitization efforts, and multimodal approaches have been applied to both classification and page stream segmentation. In this work, we focus on capturing finer semantic relations between pages of a multi-page document. To this end, we formalize the task as semantic parsing of interpage relations and we propose an end-to-end approach for interpage dependency extraction, inspired by the dependency parsing literature. We further design a multi-task training approach to jointly optimize for page embeddings to be used in segmentation, classification, and parsing of the page dependencies using textual and visual features extracted from the pages. Moreover, we also combine the features from two modalities to obtain multimodal page embeddings. To the best of our knowledge, this is the first study to extract rich semantic interpage relations from multi-page documents. Our experimental results show that the proposed method increased LAS by 41 percentage points for semantic parsing, increased accuracy by 33 percentage points for page stream segmentation, and 45 percentage points for page classification over a naive baseline.
△ Less
Submitted 26 May, 2022;
originally announced May 2022.
-
Predicting cognitive scores with graph neural networks through sample selection learning
Authors:
Martin Hanik,
Mehmet Arif Demirtaş,
Mohammed Amine Gharsallaoui,
Islem Rekik
Abstract:
Analyzing the relation between intelligence and neural activity is of the utmost importance in understanding the working principles of the human brain in health and disease. In existing literature, functional brain connectomes have been used successfully to predict cognitive measures such as intelligence quotient (IQ) scores in both healthy and disordered cohorts using machine learning models. How…
▽ More
Analyzing the relation between intelligence and neural activity is of the utmost importance in understanding the working principles of the human brain in health and disease. In existing literature, functional brain connectomes have been used successfully to predict cognitive measures such as intelligence quotient (IQ) scores in both healthy and disordered cohorts using machine learning models. However, existing methods resort to flattening the brain connectome (i.e., graph) through vectorization which overlooks its topological properties. To address this limitation and inspired from the emerging graph neural networks (GNNs), we design a novel regression GNN model (namely RegGNN) for predicting IQ scores from brain connectivity. On top of that, we introduce a novel, fully modular sample selection method to select the best samples to learn from for our target prediction task. However, since such deep learning architectures are computationally expensive to train, we further propose a \emph{learning-based sample selection} method that learns how to choose the training samples with the highest expected predictive power on unseen samples. For this, we capitalize on the fact that connectomes (i.e., their adjacency matrices) lie in the symmetric positive definite (SPD) matrix cone. Our results on full-scale and verbal IQ prediction outperforms comparison methods in autism spectrum disorder cohorts and achieves a competitive performance for neurotypical subjects using 3-fold cross-validation. Furthermore, we show that our sample selection approach generalizes to other learning-based methods, which shows its usefulness beyond our GNN architecture.
△ Less
Submitted 15 February, 2022; v1 submitted 17 June, 2021;
originally announced June 2021.