-
Using Machine Learning for Lunar Mineralogy-I: Hyperspectral Imaging of Volcanic Samples
Authors:
Fatemeh Fazel Hesar,
Mojtaba Raouf,
Peyman Soltani,
Bernard Foing,
Michiel J. A. de Dood,
Fons J. Verbeek,
Esther Cheng,
Chenming Zhou
Abstract:
This study examines the mineral composition of volcanic samples similar to lunar materials, focusing on olivine and pyroxene. Using hyperspectral imaging from 400 to 1000 nm, we created data cubes to analyze the reflectance characteristics of samples from Vulcano, a volcanically active island in the Aeolian Archipelago, north of Sicily, Italy, categorizing them into nine regions of interest and analyzing spectral data for each. We applied various unsupervised clustering algorithms, including K-Means, Hierarchical Clustering, GMM, and Spectral Clustering, to classify the spectral profiles. Principal Component Analysis revealed distinct spectral signatures associated with specific minerals, facilitating precise identification. Clustering performance varied by region, with K-Means achieving the highest silhouette score of 0.47, whereas GMM performed poorly with a score of only 0.25. Non-negative Matrix Factorization aided in identifying similarities among clusters across different methods and reference spectra for olivine and pyroxene. Hierarchical clustering emerged as the most reliable technique, achieving a 94% similarity with the olivine spectrum in one sample, whereas GMM exhibited notable variability. Overall, the analysis indicated that both Hierarchical and K-Means methods yielded lower errors in total measurements, with K-Means demonstrating superior performance in estimated dispersion and clustering. Additionally, GMM showed a higher root mean square error compared to the other models. The RMSE analysis confirmed K-Means as the most consistent algorithm across all samples, suggesting a predominance of olivine in the Vulcano region relative to pyroxene. This predominance is likely linked to historical formation conditions similar to volcanic processes on the Moon, where olivine-rich compositions are common in ancient lava flows and impact melt rocks.
Submitted 7 April, 2025; v1 submitted 28 March, 2025;
originally announced March 2025.
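The abstract above ranks clustering algorithms by silhouette score. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. The function name `silhouette_score`, the Euclidean distance, and the four toy 2-D points are assumptions made for this example; they are not the paper's spectra or code.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the members of a cluster (excluding i).
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = mean_dist(i, own)  # cohesion: distance to own cluster
        b = min(mean_dist(i, m) for l, m in clusters.items() if l != lab)  # separation
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (hypothetical): two well-separated groups of two points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(toy_points, [0, 1, 0, 1])   # labels cut across the groups
```

Scores near 1 indicate compact, well-separated clusters, and scores near or below 0 indicate overlapping or misassigned clusters. In this toy case the labeling that follows the two groups scores much higher than the one that cuts across them.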
-
Modeling the Dynamics of Sub-Millisecond Electroadhesive Engagement and Release Times
Authors:
Ahad M. Rauf,
Sean Follmer
Abstract:
Electroadhesion is an electrically controllable switchable adhesive commonly used in soft robots and haptic user interfaces. It can form strong bonds to a wide variety of surfaces at low power consumption. However, electroadhesive clutches in the literature engage to and release from substrates several orders of magnitude slower than a traditional electrostatic model would predict, limiting their usefulness in high-bandwidth applications. We develop a novel electromechanical model for electroadhesion, factoring in polarization dynamics and contact mechanics between the dielectric and substrate. We show in simulation and experimentally how different design parameters affect the engagement and release times of electroadhesive clutches to metallic substrates. In particular, we find that higher drive frequencies and narrower substrate aspect ratios enable significantly faster dynamics. We demonstrate designs with engagement times under 15 µs and release times as low as 875 µs, which are 10x and 17.1x faster, respectively, than the best times found in prior literature.
Submitted 21 December, 2024;
originally announced December 2024.
-
Advancing Machine Learning for Stellar Activity and Exoplanet Period Rotation
Authors:
Fatemeh Fazel Hesar,
Bernard Foing,
Ana M. Heras,
Mojtaba Raouf,
Victoria Foing,
Shima Javanmardi,
Fons J. Verbeek
Abstract:
This study applied machine learning models to estimate stellar rotation periods from corrected light curve data obtained by the NASA Kepler mission. Traditional methods often struggle to estimate rotation periods accurately due to noise and variability in the light curve data. The workflow involved using initial period estimates from the LS-Periodogram and Transit Least Squares techniques, followed by splitting the data into training, validation, and testing sets. We employed several machine learning algorithms, including Decision Tree, Random Forest, K-Nearest Neighbors, and Gradient Boosting, and also utilized a Voting Ensemble approach to improve prediction accuracy and robustness.
The analysis included data from multiple Kepler IDs, providing detailed metrics on orbital periods and planet radii. Performance evaluation showed that the Voting Ensemble model yielded the most accurate results, with an RMSE approximately 50% lower than the Decision Tree model and 17% better than the K-Nearest Neighbors model. The Random Forest model performed comparably to the Voting Ensemble, indicating high accuracy. In contrast, the Gradient Boosting model exhibited a worse RMSE compared to the other approaches. Comparisons of the predicted rotation periods to the photometric reference periods showed close alignment, suggesting the machine learning models achieved high prediction accuracy. The results indicate that machine learning, particularly ensemble methods, can effectively solve the problem of accurately estimating stellar rotation periods, with significant implications for advancing the study of exoplanets and stellar astrophysics.
Submitted 9 September, 2024;
originally announced September 2024.
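The abstract above compares a Voting Ensemble against individual regressors by RMSE. The sketch below shows one common form of voting for regression, averaging the predictions of the base models, together with the RMSE calculation. Treating "voting" as a plain average is an assumption of this sketch, and the period values and the names `model_a` and `model_b` are made up for illustration; none of this is the paper's data or pipeline.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def voting_average(*prediction_lists):
    """Average the per-sample predictions of several models (unweighted voting)."""
    return [sum(preds) / len(preds) for preds in zip(*prediction_lists)]


# Hypothetical rotation periods in days and two hypothetical base models.
reference_periods = [10.0, 20.0, 30.0]
model_a = [12.0, 18.0, 33.0]
model_b = [9.0, 23.0, 28.0]

ensemble = voting_average(model_a, model_b)
rmse_a = rmse(reference_periods, model_a)
rmse_b = rmse(reference_periods, model_b)
rmse_ensemble = rmse(reference_periods, ensemble)
```

Averaging helps here because the two toy models err in opposite directions on each sample, so their errors partly cancel. When base models share the same bias, an average would not be expected to improve on the best of them.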
-
Electroadhesive Auxetics as Programmable Layer Jamming Skins for Formable Crust Shape Displays
Authors:
Ahad M. Rauf,
Jack S. Bernardo,
Sean Follmer
Abstract:
Shape displays are a class of haptic devices that enable whole-hand haptic exploration of 3D surfaces. However, their scalability is limited by the mechanical complexity and high cost of traditional actuator arrays. In this paper, we propose using electroadhesive auxetic skins as a strain-limiting layer to create programmable shape change in a continuous ("formable crust") shape display. Auxetic skins are manufactured as flexible printed circuit boards with dielectric-laminated electrodes on each auxetic unit cell (AUC), using monolithic fabrication to lower cost and assembly time. By layering multiple sheets and applying a voltage between electrodes on subsequent layers, electroadhesion locks individual AUCs, achieving a maximum in-plane stiffness variation of 7.6x with a power consumption of 50 µW/AUC. We first characterize an individual AUC and compare results to a kinematic model. We then validate the ability of a 5x5 AUC array to actively modify its own axial and transverse stiffness. Finally, we demonstrate this array in a continuous shape display as a strain-limiting skin to programmatically modulate the shape output of an inflatable LDPE pouch. Integrating electroadhesion with auxetics enables new capabilities for scalable, low-profile, and low-power control of flexible robotic systems.
Submitted 11 March, 2023; v1 submitted 10 November, 2022;
originally announced November 2022.
-
Meta Learning for Code Summarization
Authors:
Moiz Rauf,
Sebastian Padó,
Michael Pradel
Abstract:
Source code summarization is the task of generating a high-level natural language description for a segment of programming language code. Current neural models for the task differ in their architecture and the aspects of code they consider. In this paper, we show that three SOTA models for code summarization work well on largely disjoint subsets of a large code-base. This complementarity motivates model combination: We propose three meta-models that select the best candidate summary for a given code segment. The two neural models improve significantly over the performance of the best individual model, obtaining an improvement of 2.1 BLEU points on a dataset of code segments where at least one of the individual models obtains a non-zero BLEU.
Submitted 20 January, 2022;
originally announced January 2022.
-
Learning Interclass Relations for Image Classification
Authors:
Muhamedrahimov Raouf,
Bar Amir,
Akselrod-Ballin Ayelet
Abstract:
In standard classification, we typically treat class categories as independent of one another. In many problems, however, this neglects the natural relations that exist between categories, which are often dictated by an underlying biological or physical process. In this work, we propose novel formulations of the classification problem, based on a realization that the assumption of class independence is a limiting factor that leads to the requirement of more training data. First, we propose manual ways to reduce our data needs by reintroducing knowledge about problem-specific interclass relations into the training process. Second, we propose a general approach to jointly learn categorical label representations that can implicitly encode natural interclass relations, alleviating the need for strong prior assumptions, which are not always available. We demonstrate this in the domain of medical images, where access to large amounts of labelled data is not trivial. Specifically, our experiments show the advantages of this approach in the classification of intravenous contrast enhancement phases in CT images, which encapsulate multiple interesting interclass relations.
Submitted 24 June, 2020;
originally announced June 2020.
-
IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
Authors:
Yaza Wainakh,
Moiz Rauf,
Michael Pradel
Abstract:
Identifier names convey useful information about the intended semantics of code. Name-based program analyses use this information, e.g., to detect bugs, to predict types, and to improve the readability of code. At the core of name-based analyses are semantic representations of identifiers, e.g., in the form of learned embeddings. The high-level goal of such a representation is to encode whether two identifiers, e.g., len and size, are semantically similar. Unfortunately, it is currently unclear to what extent semantic representations match the semantic relatedness and similarity perceived by developers. This paper presents IdBench, the first benchmark for evaluating semantic representations against a ground truth created from thousands of ratings by 500 software developers. We use IdBench to study state-of-the-art embedding techniques proposed for natural language, an embedding technique specifically designed for source code, and lexical string distance functions. Our results show that the effectiveness of semantic representations varies significantly and that the best available embeddings successfully represent semantic relatedness. On the downside, no existing technique provides a satisfactory representation of semantic similarities, among other reasons because identifiers with opposing meanings are incorrectly considered to be similar, which may lead to fatal mistakes, e.g., in a refactoring tool. Studying the strengths and weaknesses of the different techniques shows that they complement each other. As a first step toward exploiting this complementarity, we present an ensemble model that combines existing techniques and that clearly outperforms the best available semantic representation.
Submitted 14 January, 2021; v1 submitted 11 October, 2019;
originally announced October 2019.
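The IdBench abstract above lists lexical string distance functions among the techniques it evaluates, and uses len and size as an example of semantically similar identifiers. The sketch below implements one standard lexical measure, Levenshtein edit distance, to show why a purely lexical measure can miss that similarity. The choice of Levenshtein and the identifier `length` are assumptions of this example; the abstract does not say which distance functions the benchmark uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


# The abstract's pair of similar identifiers is lexically far apart ...
dist_len_size = levenshtein("len", "size")
# ... while a hypothetical abbreviation pair is lexically close.
dist_len_length = levenshtein("len", "length")
```

The pair len and size needs as many edits as the longer name has characters, so an edit-distance measure rates them as maximally dissimilar even though developers would consider them closely related.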