Search | arXiv e-print repository

Automatic occlusion removal from 3D maps for maritime situational awareness

Authors: Felix Sattler, Borja Carrillo Perez, Maurice Stephan, Sarah Barnes

Abstract: We introduce a novel method for updating 3D geospatial models, specifically targeting occlusion removal in large-scale maritime environments. Traditional 3D reconstruction techniques often face problems with dynamic objects, like cars or vessels, that obscure the true environment, leading to inaccurate models or requiring extensive manual editing. Our approach leverages deep learning techniques, including instance segmentation and generative inpainting, to directly modify both the texture and geometry of 3D meshes without the need for costly reprocessing. By selectively targeting occluding objects and preserving static elements, the method enhances both geometric and visual accuracy. This approach not only preserves structural and textural details of map data but also maintains compatibility with current geospatial standards, ensuring robust performance across diverse datasets. The results demonstrate significant improvements in 3D model fidelity, making this method highly applicable for maritime situational awareness and the dynamic display of auxiliary information.

Submitted 5 September, 2024; originally announced September 2024.

Comments: Preprint of SPIE Sensor + Imaging 2024 conference paper

arXiv:2403.13643 [pdf]

Vibration Sensitivity of one-port and two-port MEMS microphones

Authors: Francis Doyon-D'Amour, Carly Stalder, Timothy Hodges, Michel Stephan, Lixiue Wu, Triantafillos Koukoulas, Stephane Leahy, Raphael St-Gelais

Abstract: Micro-electro-mechanical system (MEMS) microphones (mics) with two acoustic ports are currently receiving considerable interest, with the promise of achieving higher directional sensitivity compared to traditional one-port architectures. However, measuring pressure differences in two-port microphones typically commands sensing elements that are softer than in one-port mics, and are therefore presumably more prone to interference from external vibration. Here we derive a universal expression for microphone sensitivity to vibration and we experimentally demonstrate its validity for several emerging two-port microphone technologies. We also perform vibration measurements on a one-port mic, thus providing a one-stop direct comparison between one-port and two-port sensing approaches. We find that the acoustically-referred vibration sensitivity of two-port MEMS mics, in units of measured acoustic pressure per external acceleration (i.e., Pascals per g), does not depend on the sensing element stiffness nor on its natural frequency. We also show that this vibration sensitivity in two-port mics is inversely proportional to frequency as opposed to the frequency independent behavior observed in one-port mics. This is confirmed experimentally for several types of microphone packages.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 8 pages, 14 figures

arXiv:2402.10893 [pdf, other]

RLVF: Learning from Verbal Feedback without Overgeneralization

Authors: Moritz Stephan, Alexander Khazatsky, Eric Mitchell, Annie S Chen, Sheryl Hsu, Archit Sharma, Chelsea Finn

Abstract: The diversity of contexts in which large language models (LLMs) are deployed requires the ability to modify or customize default model behaviors to incorporate nuanced requirements and preferences. A convenient interface to specify such model adjustments is high-level verbal feedback, such as "Don't use emojis when drafting emails to my boss." However, while writing high-level feedback is far simpler than collecting annotations for reinforcement learning from human feedback (RLHF), we find that simply prompting a model with such feedback leads to overgeneralization of the feedback to contexts where it is not relevant. We study the problem of incorporating verbal feedback without such overgeneralization, inspiring a new method Contextualized Critiques with Constrained Preference Optimization (C3PO). C3PO uses a piece of high-level feedback to generate a small synthetic preference dataset specifying how the feedback should (and should not) be applied. It then fine-tunes the model in accordance with the synthetic preference data while minimizing the divergence from the original model for prompts where the feedback does not apply. Our experimental results indicate that our approach effectively applies verbal feedback to relevant scenarios while preserving existing behaviors for other contexts. For both human- and GPT-4-generated high-level feedback, C3PO effectively adheres to the given feedback comparably to in-context baselines while reducing overgeneralization by 30%.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 9 pages, 9 figures

arXiv:2311.10728 [pdf]

Improving Feedback from Automated Reviews of Student Spreadsheets

Authors: Sören Aguirre Reid, Frank Kammer, Jonas-Ian Kuche, Pia-Doreen Ritzke, Markus Siepermann, Max Stephan, Armin Wagenknecht

Abstract: Spreadsheets are one of the most widely used tools for end users. As a result, spreadsheets such as Excel are now included in many curricula. However, digital solutions for assessing spreadsheet assignments are still scarce in the teaching context. Therefore, we have developed an Intelligent Tutoring System (ITS) to review students' Excel submissions and provide individualized feedback automatically. Although the lecturer only needs to provide one reference solution, the students' submissions are analyzed automatically in several ways: value matching, detailed analysis of the formulas, and quality assessment of the solution. To take the students' learning level into account, we have developed feedback levels for an ITS that provide gradually more information about the error by using one of the different analyses. Feedback at a higher level has been shown to lead to a higher percentage of correct submissions and was also perceived as well understandable and helpful by the students.

Submitted 14 October, 2023; originally announced November 2023.

ACM Class: D.2.0

arXiv:2211.08802 [pdf, other]

Giving Feedback on Interactive Student Programs with Meta-Exploration

Authors: Evan Zheran Liu, Moritz Stephan, Allen Nie, Chris Piech, Emma Brunskill, Chelsea Finn

Abstract: Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science. However, teaching and giving feedback on such software is time-consuming -- standard approaches require instructors to manually grade student-implemented interactive programs. As a result, online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs, which critically hinders students' ability to learn. One approach toward automatic grading is to learn an agent that interacts with a student's program and explores states indicative of errors via reinforcement learning. However, existing work on this approach only provides binary feedback of whether a program is correct or not, while students require finer-grained feedback on the specific errors in their programs to understand their mistakes. In this work, we show that exploring to discover errors can be cast as a meta-exploration problem. This enables us to construct a principled objective for discovering errors and an algorithm for optimizing this objective, which provides fine-grained feedback. We evaluate our approach on a set of over 700K real anonymized student programs from a Code.org interactive assignment. Our approach provides feedback with 94.3% accuracy, improving over existing approaches by 17.7% and coming within 1.5% of human-level accuracy. Project web page: https://ezliu.github.io/dreamgrader.

Submitted 16 November, 2022; originally announced November 2022.

Comments: Advances in Neural Information Processing Systems (NeurIPS 2022). Selected as Oral

arXiv:2210.13545 [pdf, other]

MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling

Authors: Julius Ott, Lorenzo Servadei, Jose Arjona-Medina, Enrico Rinaldi, Gianfranco Mauro, Daniela Sánchez Lopera, Michael Stephan, Thomas Stadelmayer, Avik Santra, Robert Wille

Abstract: Data selection is essential for any data-based optimization technique, such as Reinforcement Learning. State-of-the-art sampling strategies for the experience replay buffer improve the performance of the Reinforcement Learning agent. However, they do not incorporate uncertainty in the Q-Value estimation. Consequently, they cannot adapt the sampling strategies, including exploration and exploitation of transitions, to the complexity of the task. To address this, this paper proposes a new sampling strategy that leverages the exploration-exploitation trade-off. This is enabled by the uncertainty estimation of the Q-Value function, which guides the sampling to explore more significant transitions and, thus, learn a more efficient policy. Experiments on classical control environments demonstrate stable results across various environments. They show that the proposed method outperforms state-of-the-art sampling strategies for dense rewards w.r.t. convergence and peak performance by 26% on average.

Submitted 17 April, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

Comments: Accepted at ICASSP 2023

Report number: RIKEN-iTHEMS-Report-23

arXiv:2210.04686 [pdf, other]

Utilizing Explainable AI for improving the Performance of Neural Networks

Authors: Huawei Sun, Lorenzo Servadei, Hao Feng, Michael Stephan, Robert Wille, Avik Santra

Abstract: Nowadays, deep neural networks are widely used in a variety of fields that have a direct impact on society. Although those models typically show outstanding performance, they have been used for a long time as black boxes. To address this, Explainable Artificial Intelligence (XAI) has been developing as a field that aims to improve the transparency of the models and increase their trustworthiness. We propose a retraining pipeline that consistently improves the model predictions starting from XAI and utilizing state-of-the-art techniques. To do that, we use the XAI results, namely SHapley Additive exPlanations (SHAP) values, to give specific training weights to the data samples. This leads to an improved training of the model and, consequently, better performance. In order to benchmark our method, we evaluate it on both real-life and public datasets. First, we apply the method to a radar-based people counting scenario. Afterward, we test it on CIFAR-10, a public Computer Vision dataset. Experiments using the SHAP-based retraining approach achieve 4% higher accuracy w.r.t. the standard equal weight retraining for people counting tasks. Moreover, on CIFAR-10, our SHAP-based weighting strategy ends up with a 3% higher accuracy rate than the training procedure with equally weighted samples.

Submitted 7 October, 2022; originally announced October 2022.

Comments: accepted at ICMLA 2022

arXiv:2207.06379 [pdf, other]

doi 10.1109/ICPR48806.2021.9412858

Radar Image Reconstruction from Raw ADC Data using Parametric Variational Autoencoder with Domain Adaptation

Authors: Michael Stephan, Thomas Stadelmayer, Avik Santra, Georg Fischer, Robert Weigel, Fabian Lurz

Abstract: This paper presents a parametric variational autoencoder-based human target detection and localization framework working directly with the raw analog-to-digital converter data from the frequency modulated continuous wave radar. We propose a parametrically constrained variational autoencoder, with residual and skip connections, capable of generating the clustered and localized target detections on the range-angle image. Furthermore, to circumvent the problem of training the proposed neural network on all possible scenarios using real radar data, we propose domain adaptation strategies whereby we first train the neural network using ray tracing based model data and then adapt the network to work on real sensor data. This strategy ensures better generalization and scalability of the proposed neural network even though it is trained with limited radar data. We demonstrate the superior detection and localization performance of our proposed solution compared to the conventional signal processing pipeline and earlier state-of-the-art deep U-Net architecture with range-Doppler images as inputs.

Submitted 30 May, 2022; originally announced July 2022.

MSC Class: 68T07

Journal ref: 25th International Conference on Pattern Recognition (ICPR), 2020, 9529-9536

arXiv:2203.17066 [pdf, other]

Cross-modal Learning of Graph Representations using Radar Point Cloud for Long-Range Gesture Recognition

Authors: Souvik Hazra, Hao Feng, Gamze Naz Kiprit, Michael Stephan, Lorenzo Servadei, Robert Wille, Robert Weigel, Avik Santra

Abstract: Gesture recognition is one of the most intuitive ways of interaction and has gathered particular attention for human-computer interaction. Radar sensors possess multiple intrinsic properties, such as their ability to work in low illumination, harsh weather conditions, and being low-cost and compact, making them highly preferable for a gesture recognition solution. However, most literature work focuses on solutions with a limited range that is lower than a meter. We propose a novel architecture for a long-range (1m - 2m) gesture recognition solution that leverages a point cloud-based cross-learning approach from camera point cloud to 60-GHz FMCW radar point cloud, which allows learning better representations while suppressing noise. We use a variant of Dynamic Graph CNN (DGCNN) for the cross-learning, enabling us to model relationships between the points at a local and global level; to model the temporal dynamics, a Bi-LSTM network is employed. In the experimental results section, we demonstrate our model's overall accuracy of 98.4% for five gestures and its generalization capability.

Submitted 19 May, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM 2022)

arXiv:2110.05876 [pdf, other]

doi 10.1109/ICASSP43922.2022.9747621

Label-Aware Ranked Loss for robust People Counting using Automotive in-cabin Radar

Authors: Lorenzo Servadei, Huawei Sun, Julius Ott, Michael Stephan, Souvik Hazra, Thomas Stadelmayer, Daniela Sanchez Lopera, Robert Wille, Avik Santra

Abstract: In this paper, we introduce the Label-Aware Ranked loss, a novel metric loss function. Compared to the state-of-the-art Deep Metric Learning losses, this function takes advantage of the ranked ordering of the labels in regression problems. To this end, we first show that the loss is minimised when datapoints of different labels are ranked and laid at uniform angles between each other in the embedding space. Then, to measure its performance, we apply the proposed loss on a regression task of people counting with a short-range radar in a challenging scenario, namely a vehicle cabin. The introduced approach improves the accuracy as well as the neighboring labels accuracy up to 83.0% and 99.9%: an increase of 6.7% and 2.1% on state-of-the-art methods, respectively.

Submitted 3 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

Comments: accepted at ICASSP 2022

MSC Class: 68T07

arXiv:2108.07729 [pdf, other]

Monitor++?: Multiple versus Single Laboratory Monitors in Early Programming Education

Authors: Matthew Stephan

Abstract: CONTRIBUTION: This paper presents an empirical study of an introductory-level programming course with students using multiple monitors and compares their performance and self-reported experiences versus students using a single monitor. BACKGROUND: Professional-level programming in many technological fields often employs multiple-monitor stations; however, some education laboratories employ single-monitor stations. This is unrepresentative of what students will encounter in practice and experiential learning. RESEARCH QUESTIONS: This study aims to answer three research questions. The questions include discovering the experiential observations of the students, contrasting the performance of the students using one monitor versus those using two monitors, and an investigation of the ways in which multiple monitors were employed by the students. METHODOLOGY: Half of the students in the study had access to multiple monitors. This was the only difference between the two study groups. This study contrasts grade medians and conducts median-test evaluation. Additionally, an experience survey facilitated Likert-scale values and open-ended feedback questions facilitated textual analysis. Limitations of the study include the small sample size (86 students) and lack of control of participant composition. FINDINGS: Students reacted very favorably in rating their experience using the intervention. Overall, the multiple-monitor group had a slight performance improvement. Most improvement was in software-design and graphics assignments. Performance increased statistically significantly on the interfaces-and-hierarchies labs. Students used multiple monitors in different ways including reference guides, assignment specifications, and more.

Submitted 13 August, 2021; originally announced August 2021.

Comments: Funding support via Miami University's Student Tech Fee - https://miamioh.edu/it-services/initiatives-and-projects/student-tech-grant/proposals-fy17/cec-multiple-monitors/index.html

arXiv:1902.03430 [pdf, other]

HNLB: Utilizing Hardware Matching Capabilities of NICs for Offloading Stateful Load Balancers

Authors: Raphael Durner, Amir Varasteh, Max Stephan, Carmen Mas Machuca, Wolfgang Kellerer

Abstract: In order to scale web or other services, the load on single instances of the respective service has to be balanced. Many services are stateful such that packets belonging to the same connection must be delivered to the same instance. This requires stateful load balancers which are mostly implemented in software. On the one hand, modern packet processing frameworks supporting software load balancers, such as the Data Plane Development Kit (DPDK), deliver high performance compared to older approaches. On the other hand, common Network Interface Cards (NICs) provide additional matching capabilities that can be utilized for increasing the performance even further and in turn reduce the necessary server resources. In fact, offloading the packet matching to hardware can free up CPU cycles of the servers. Therefore, in this work, we propose the Hybrid NIC-offloading Load Balancer (HNLB), a high performance hybrid hardware-software load balancer, utilizing the NIC-offloading hardware matching capabilities. The results of our performance evaluations show that the throughput using NIC offloading can be increased by up to 50%, compared to a high performance software-only implementation. Furthermore, we investigated the limitations of our proposed approach, e.g., the limited number of possible concurrent connections.

Submitted 9 February, 2019; originally announced February 2019.

Comments: IEEE International Conference on Communications 2019 (ICC'19)

Showing 1–12 of 12 results for author: Stephan, M