-
Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate
Authors:
Liangwei Nathan Zheng,
Wei Emma Zhang,
Mingyu Guo,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Effectively managing missing modalities is a fundamental challenge in real-world multimodal learning scenarios, where data incompleteness often results from systematic collection errors or sensor failures. Sparse Mixture-of-Experts (SMoE) architectures have the potential to naturally handle multimodal data, with individual experts specializing in different modalities. However, existing SMoE approa…
▽ More
Effectively managing missing modalities is a fundamental challenge in real-world multimodal learning scenarios, where data incompleteness often results from systematic collection errors or sensor failures. Sparse Mixture-of-Experts (SMoE) architectures have the potential to naturally handle multimodal data, with individual experts specializing in different modalities. However, existing SMoE approach often lacks proper ability to handle missing modality, leading to performance degradation and poor generalization in real-world applications. We propose Conf-SMoE to introduce a two-stage imputation module to handle the missing modality problem for the SMoE architecture and reveal the insight of expert collapse from theoretical analysis with strong empirical evidence. Inspired by our theoretical analysis, Conf-SMoE propose a novel expert gating mechanism by detaching the softmax routing score to task confidence score w.r.t ground truth. This naturally relieves expert collapse without introducing additional load balance loss function. We show that the insights of expert collapse aligns with other gating mechanism such as Gaussian and Laplacian gate. We also evaluate the proposed method on four different real world dataset with three different experiment settings to conduct comprehensive the analysis of Conf-SMoE on modality fusion and resistance to missing modality.
△ Less
Submitted 26 May, 2025;
originally announced May 2025.
-
Slice+Slice Baby: Generating Last-Level Cache Eviction Sets in the Blink of an Eye
Authors:
Bradley Morgan,
Gal Horowitz,
Sioli O'Connell,
Stephan van Schaik,
Chitchanok Chuengsatiansup,
Daniel Genkin,
Olaf Maennel,
Paul Montague,
Eyal Ronen,
Yuval Yarom
Abstract:
An essential step for mounting cache attacks is finding eviction sets, collections of memory locations that contend on cache space. On Intel processors, one of the main challenges for identifying contending addresses is the sliced cache design, where the processor hashes the physical address to determine where in the cache a memory location is stored. While past works have demonstrated that the ha…
▽ More
An essential step for mounting cache attacks is finding eviction sets, collections of memory locations that contend on cache space. On Intel processors, one of the main challenges for identifying contending addresses is the sliced cache design, where the processor hashes the physical address to determine where in the cache a memory location is stored. While past works have demonstrated that the hash function can be reversed, they also showed that it depends on physical address bits that the adversary does not know.
In this work, we make three main contributions to the art of finding eviction sets. We first exploit microarchitectural races to compare memory access times and identify the cache slice to which an address maps. We then use the known hash function to both reduce the error rate in our slice identification method and to reduce the work by extrapolating slice mappings to untested memory addresses. Finally, we show how to propagate information on eviction sets across different page offsets for the hitherto unexplored case of non-linear hash functions.
Our contributions allow for entire LLC eviction set generation in 0.7 seconds on the Intel i7-9850H and 1.6 seconds on the i9-10900K, both using non-linear functions. This represents a significant improvement compared to state-of-the-art techniques taking 9x and 10x longer, respectively.
△ Less
Submitted 20 April, 2025; v1 submitted 15 April, 2025;
originally announced April 2025.
-
FuzzSense: Towards A Modular Fuzzing Framework for Autonomous Driving Software
Authors:
Andrew Roberts,
Lorenz Teply,
Mert D. Pese,
Olaf Maennel,
Mohammad Hamad,
Sebastian Steinhorst
Abstract:
Fuzz testing to find semantic control vulnerabilities is an essential activity to evaluate the robustness of autonomous driving (AD) software. Whilst there is a preponderance of disparate fuzzing tools that target different parts of the test environment, such as the scenario, sensors, and vehicle dynamics, there is a lack of fuzzing strategies that ensemble these fuzzers to enable concurrent fuzzi…
▽ More
Fuzz testing to find semantic control vulnerabilities is an essential activity to evaluate the robustness of autonomous driving (AD) software. Whilst there is a preponderance of disparate fuzzing tools that target different parts of the test environment, such as the scenario, sensors, and vehicle dynamics, there is a lack of fuzzing strategies that ensemble these fuzzers to enable concurrent fuzzing, utilizing diverse techniques and targets. This research proposes FuzzSense, a modular, black-box, mutation-based fuzzing framework that is architected to ensemble diverse AD fuzzing tools. To validate the utility of FuzzSense, a LiDAR sensor fuzzer was developed as a plug-in, and the fuzzer was implemented in the new AD simulation platform AWSIM and Autoware.Universe AD software platform. The results demonstrated that FuzzSense was able to find vulnerabilities in the new Autoware.Universe software. We contribute to FuzzSense open-source with the aim of initiating a conversation in the community on the design of AD-specific fuzzers and the establishment of a community fuzzing framework to better target the diverse technology base of autonomous vehicles.
△ Less
Submitted 14 April, 2025;
originally announced April 2025.
-
We Care Each Pixel: Calibrating on Medical Segmentation Model
Authors:
Wenhao Liang,
Wei Zhang,
Lin Yue,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Medical image segmentation is fundamental for computer-aided diagnostics, providing accurate delineation of anatomical structures and pathological regions. While common metrics such as Accuracy, DSC, IoU, and HD primarily quantify spatial agreement between predictions and ground-truth labels, they do not assess the calibration quality of segmentation models, which is crucial for clinical reliabili…
▽ More
Medical image segmentation is fundamental for computer-aided diagnostics, providing accurate delineation of anatomical structures and pathological regions. While common metrics such as Accuracy, DSC, IoU, and HD primarily quantify spatial agreement between predictions and ground-truth labels, they do not assess the calibration quality of segmentation models, which is crucial for clinical reliability. To address this limitation, we propose pixel-wise Expected Calibration Error (pECE), a novel metric that explicitly measures miscalibration at the pixel level, thereby ensuring both spatial precision and confidence reliability. We further introduce a morphological adaptation strategy that applies morphological operations to ground-truth masks before computing calibration losses, particularly benefiting margin-based losses such as Margin SVLS and NACL. Additionally, we present the Signed Distance Calibration Loss (SDC), which aligns boundary geometry with calibration objectives by penalizing discrepancies between predicted and ground-truth signed distance functions (SDFs). Extensive experiments demonstrate that our method not only enhances segmentation performance but also improves calibration quality, yielding more trustworthy confidence estimates. Code is available at: https://github.com/EagleAdelaide/SDC-Loss.
△ Less
Submitted 13 June, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
-
PostHoc FREE Calibrating on Kolmogorov Arnold Networks
Authors:
Wenhao Liang,
Wei Emma Zhang,
Lin Yue,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Kolmogorov Arnold Networks (KANs) are neural architectures inspired by the Kolmogorov Arnold representation theorem that leverage B Spline parameterizations for flexible, locally adaptive function approximation. Although KANs can capture complex nonlinearities beyond those modeled by standard MultiLayer Perceptrons (MLPs), they frequently exhibit miscalibrated confidence estimates manifesting as o…
▽ More
Kolmogorov Arnold Networks (KANs) are neural architectures inspired by the Kolmogorov Arnold representation theorem that leverage B Spline parameterizations for flexible, locally adaptive function approximation. Although KANs can capture complex nonlinearities beyond those modeled by standard MultiLayer Perceptrons (MLPs), they frequently exhibit miscalibrated confidence estimates manifesting as overconfidence in dense data regions and underconfidence in sparse areas. In this work, we systematically examine the impact of four critical hyperparameters including Layer Width, Grid Order, Shortcut Function, and Grid Range on the calibration of KANs. Furthermore, we introduce a novel TemperatureScaled Loss (TSL) that integrates a temperature parameter directly into the training objective, dynamically adjusting the predictive distribution during learning. Both theoretical analysis and extensive empirical evaluations on standard benchmarks demonstrate that TSL significantly reduces calibration errors, thereby improving the reliability of probabilistic predictions. Overall, our study provides actionable insights into the design of spline based neural networks and establishes TSL as a robust loss solution for enhancing calibration.
△ Less
Submitted 3 March, 2025;
originally announced March 2025.
-
Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability
Authors:
Liangwewi Nathan Zheng,
Wei Emma Zhang,
Lin Yue,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and heavy trainable parameter. Furthermore, there is limited understanding of the behavior of the learned activation functions derived from B-splines. In this work, we analyze the behavior of KANs through the lens of…
▽ More
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and heavy trainable parameter. Furthermore, there is limited understanding of the behavior of the learned activation functions derived from B-splines. In this work, we analyze the behavior of KANs through the lens of spline knots and derive the lower and upper bound for the number of knots in B-spline-based KANs. To address existing limitations, we propose a novel Free Knots KAN that enhances the performance of the original KAN while reducing the number of trainable parameters to match the trainable parameter scale of standard Multi-Layer Perceptrons (MLPs). Additionally, we introduce new a training strategy to ensure $C^2$ continuity of the learnable spline, resulting in smoother activation compared to the original KAN and improve the training stability by range expansion. The proposed method is comprehensively evaluated on 8 datasets spanning various domains, including image, text, time series, multimodal, and function approximation tasks. The promising results demonstrates the feasibility of KAN-based network and the effectiveness of proposed method.
△ Less
Submitted 15 January, 2025;
originally announced January 2025.
-
Understanding Why Large Language Models Can Be Ineffective in Time Series Analysis: The Impact of Modality Alignment
Authors:
Liangwei Nathan Zheng,
Chang George Dong,
Wei Emma Zhang,
Lin Yue,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Large Language Models (LLMs) have demonstrated impressive performance in time series analysis and seems to understand the time temporal relationship well than traditional transformer-based approaches. However, since LLMs are not designed for time series tasks, simpler models like linear regressions can often achieve comparable performance with far less complexity. In this study, we perform extensi…
▽ More
Large Language Models (LLMs) have demonstrated impressive performance in time series analysis and seems to understand the time temporal relationship well than traditional transformer-based approaches. However, since LLMs are not designed for time series tasks, simpler models like linear regressions can often achieve comparable performance with far less complexity. In this study, we perform extensive experiments to assess the effectiveness of applying LLMs to key time series tasks, including forecasting, classification, imputation, and anomaly detection. We compare the performance of LLMs against simpler baseline models, such as single layer linear models and randomly initialized LLMs. Our results reveal that LLMs offer minimal advantages for these core time series tasks and may even distort the temporal structure of the data. In contrast, simpler models consistently outperform LLMs while requiring far fewer parameters. Furthermore, we analyze existing reprogramming techniques and show, through data manifold analysis, that these methods fail to effectively align time series data with language and display "pseudo-alignment" behavior in embedding space. Our findings suggest that the performance of LLM based methods in time series tasks arises from the intrinsic characteristics and structure of time series data, rather than any meaningful alignment with the language model architecture.
△ Less
Submitted 26 May, 2025; v1 submitted 16 October, 2024;
originally announced October 2024.
-
Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics
Authors:
Liangwei Nathan Zheng,
Zhengyang Li,
Chang George Dong,
Wei Emma Zhang,
Lin Yue,
Miao Xu,
Olaf Maennel,
Weitong Chen
Abstract:
Irregular Time Series Data (IRTS) has shown increasing prevalence in real-world applications. We observed that IRTS can be divided into two specialized types: Natural Irregular Time Series (NIRTS) and Accidental Irregular Time Series (AIRTS). Various existing methods either ignore the impacts of irregular patterns or statically learn the irregular dynamics of NIRTS and AIRTS data and suffer from l…
▽ More
Irregular Time Series Data (IRTS) has shown increasing prevalence in real-world applications. We observed that IRTS can be divided into two specialized types: Natural Irregular Time Series (NIRTS) and Accidental Irregular Time Series (AIRTS). Various existing methods either ignore the impacts of irregular patterns or statically learn the irregular dynamics of NIRTS and AIRTS data and suffer from limited data availability due to the sparsity of IRTS. We proposed a novel transformer-based framework for general irregular time series data that treats IRTS from four views: Locality, Time, Spatio and Irregularity to motivate the data usage to the highest potential. Moreover, we design a sophisticated irregularity-gate mechanism to adaptively select task-relevant information from irregularity, which improves the generalization ability to various IRTS data. We implement extensive experiments to demonstrate the resistance of our work to three highly missing ratio datasets (88.4\%, 94.9\%, 60\% missing value) and investigate the significance of the irregularity information for both NIRTS and AIRTS by additional ablation study. We release our implementation in https://github.com/IcurasLW/MTSFormer-Irregular_Time_Series.git
△ Less
Submitted 16 October, 2024;
originally announced October 2024.
-
REACT: Autonomous Intrusion Response System for Intelligent Vehicles
Authors:
Mohammad Hamad,
Andreas Finkenzeller,
Michael Kühr,
Andrew Roberts,
Olaf Maennel,
Vassilis Prevelakis,
Sebastian Steinhorst
Abstract:
Autonomous and connected vehicles are rapidly evolving, integrating numerous technologies and software. This progress, however, has made them appealing targets for cybersecurity attacks. As the risk of cyber threats escalates with this advancement, the focus is shifting from solely preventing these attacks to also mitigating their impact. Current solutions rely on vehicle security operation center…
▽ More
Autonomous and connected vehicles are rapidly evolving, integrating numerous technologies and software. This progress, however, has made them appealing targets for cybersecurity attacks. As the risk of cyber threats escalates with this advancement, the focus is shifting from solely preventing these attacks to also mitigating their impact. Current solutions rely on vehicle security operation centers, where attack information is analyzed before deciding on a response strategy. However, this process can be time-consuming and faces scalability challenges, along with other issues stemming from vehicle connectivity. This paper proposes a dynamic intrusion response system integrated within the vehicle. This system enables the vehicle to respond to a variety of incidents almost instantly, thereby reducing the need for interaction with the vehicle security operation center. The system offers a comprehensive list of potential responses, a methodology for response evaluation, and various response selection methods. The proposed solution was implemented on an embedded platform. Two distinct cyberattack use cases served as the basis for evaluating the system. The evaluation highlights the system's adaptability, its ability to respond swiftly, its minimal memory footprint, and its capacity for dynamic system parameter adjustments. The proposed solution underscores the necessity and feasibility of incorporating dynamic response mechanisms in smart vehicles. This is a crucial factor in ensuring the safety and resilience of future smart mobility.
△ Less
Submitted 16 January, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
RiPKI: The Tragic Story of RPKI Deployment in the Web Ecosystem
Authors:
Matthias Wählisch,
Robert Schmidt,
Thomas C. Schmidt,
Olaf Maennel,
Steve Uhlig,
Gareth Tyson
Abstract:
Web content delivery is one of the most important services on the Internet. Access to websites is typically secured via TLS. However, this security model does not account for prefix hijacking on the network layer, which may lead to traffic blackholing or transparent interception. Thus, to achieve comprehensive security and service availability, additional protective mechanisms are necessary such a…
▽ More
Web content delivery is one of the most important services on the Internet. Access to websites is typically secured via TLS. However, this security model does not account for prefix hijacking on the network layer, which may lead to traffic blackholing or transparent interception. Thus, to achieve comprehensive security and service availability, additional protective mechanisms are necessary such as the RPKI, a recently deployed Resource Public Key Infrastructure to prevent hijacking of traffic by networks. This paper argues two positions. First, that modern web hosting practices make route protection challenging due to the propensity to spread servers across many different networks, often with unpredictable client redirection strategies, and, second, that we need a better understanding why protection mechanisms are not deployed. To initiate this, we empirically explore the relationship between web hosting infrastructure and RPKI deployment. Perversely, we find that less popular websites are more likely to be secured than the prominent sites. Worryingly, we find many large-scale CDNs do not support RPKI, thus making their customers vulnerable. This leads us to explore business reasons why operators are hesitant to deploy RPKI, which may help to guide future research on improving Internet security.
△ Less
Submitted 2 November, 2015; v1 submitted 2 August, 2014;
originally announced August 2014.
-
Beyond Node Degree: Evaluating AS Topology Models
Authors:
Hamed Haddadi,
Damien Fay,
Almerima Jamakovic,
Olaf Maennel,
Andrew W. Moore,
Richard Mortier,
Miguel Rio,
Steve Uhlig
Abstract:
Many models have been proposed to generate Internet Autonomous System (AS) topologies, most of which make structural assumptions about the AS graph. In this paper we compare AS topology generation models with several observed AS topologies. In contrast to most previous works, we avoid making assumptions about which topological properties are important to characterize the AS topology. Our analysi…
▽ More
Many models have been proposed to generate Internet Autonomous System (AS) topologies, most of which make structural assumptions about the AS graph. In this paper we compare AS topology generation models with several observed AS topologies. In contrast to most previous works, we avoid making assumptions about which topological properties are important to characterize the AS topology. Our analysis shows that, although matching degree-based properties, the existing AS topology generation models fail to capture the complexity of the local interconnection structure between ASs. Furthermore, we use BGP data from multiple vantage points to show that additional measurement locations significantly affect local structure properties, such as clustering and node centrality. Degree-based properties, however, are not notably affected by additional measurements locations. These observations are particularly valid in the core. The shortcomings of AS topology generation models stems from an underestimation of the complexity of the connectivity in the core caused by inappropriate use of BGP data.
△ Less
Submitted 13 July, 2008;
originally announced July 2008.