-
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
Authors:
Maoyang Xiang,
Ramesh Fernando,
Bo Wang
Abstract:
Transformer-based Large Language Models (LLMs) have significantly advanced AI capabilities but pose considerable challenges for deployment on edge devices due to high computational demands, memory bandwidth constraints, and energy consumption. This paper addresses these challenges by presenting an efficient framework for deploying the Qwen2.5-0.5B model on the Xilinx Kria KV260 edge platform, a heterogeneous system integrating an ARM Cortex-A53 CPU with reconfigurable FPGA logic. Leveraging Activation-aware Weight Quantization (AWQ) with FPGA-accelerated execution pipelines, the proposed approach enhances both model compression rate and system throughput. Additionally, we propose a hybrid execution strategy that intelligently offloads compute-intensive operations to the FPGA while utilizing the CPU for lighter tasks, effectively balancing the computational workload and maximizing overall performance. Our framework achieves a model compression rate of 55.08% compared to the original model and produces output at a rate of 5.1 tokens per second, outperforming the baseline performance of 2.8 tokens per second.
Submitted 24 April, 2025;
originally announced April 2025.
-
Saturated de Rham-Witt complexes with unit-root coefficients
Authors:
Ravi Fernando
Abstract:
The saturated de Rham-Witt complex, introduced by Bhatt-Lurie-Mathew, is a variant of the classical de Rham-Witt complex which provides a conceptual simplification of the construction and which is expected to produce better results for non-smooth varieties. In this paper, we introduce a generalization of the saturated de Rham-Witt complex which allows coefficients in a unit-root $F$-crystal. We define our complex by a universal property in a category of so-called de Rham-Witt modules. We prove a number of results about it, including existence, quasicoherence, and comparisons to the de Rham-Witt complex of Bhatt-Lurie-Mathew and (in the smooth case) to crystalline cohomology and the classical de Rham-Witt complex with coefficients.
Submitted 22 November, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Hybrid Y-Net Architecture for Singing Voice Separation
Authors:
Rashen Fernando,
Pamudu Ranasinghe,
Udula Ranasinghe,
Janaka Wijayakulasooriya,
Pantaleon Perera
Abstract:
This research paper presents a novel deep learning-based neural network architecture, named Y-Net, for achieving music source separation. The proposed architecture performs end-to-end hybrid source separation by extracting features from both spectrogram and waveform domains. Inspired by the U-Net architecture, Y-Net predicts a spectrogram mask to separate vocal sources from a mixture signal. Our results demonstrate the effectiveness of the proposed architecture for music source separation with fewer parameters. Overall, our work presents a promising approach for improving the accuracy and efficiency of music source separation.
Submitted 5 March, 2023;
originally announced March 2023.
-
Planetary Terrestrial Analogues Library Project: 3. Characterization of Samples with MicrOmega
Authors:
Loizeau Damien,
Pilorget Cédric,
Poulet François,
Lantz Cateline,
Bibring Jean-Pierre,
Hamm Vincent,
Royer Clément,
Dypvik Henning,
Krzesińska Agata M.,
Rull Fernando,
Werner Stephanie C
Abstract:
The PTAL (Planetary Terrestrial Analogues Library) project aims at building and exploiting a database involving several analytical techniques, to help characterize the mineralogical evolution of terrestrial bodies, starting with Mars. Around 100 natural Earth rock samples have been collected from selected locations to gather a variety of analogues for Martian geology, from volcanic to sedimentary origin with different levels of alteration. All samples are to be characterized within the PTAL project with different mineralogical and elemental analysis techniques, including techniques carried on current and future instruments at the surface of Mars (Near InfraRed spectroscopy, Raman spectroscopy and Laser Induced Breakdown Spectroscopy). This paper presents the NIR measurements and interpretations acquired with the ExoMars MicrOmega spare instrument. MicrOmega is a NIR hyperspectral microscope, mounted in the analytical laboratory of the ExoMars rover Rosalind Franklin. All PTAL samples have been observed at least once with MicrOmega using a dedicated setup. Data description and interpretation are presented for all PTAL samples. For some chosen examples, RGB images and spectra are presented as well. A comparison with characterizations by NIR and Raman spectrometry is discussed for some of the samples. In particular, the spectral imaging capacity of MicrOmega allows detections of mineral components and potential organic molecules that were not possible with other one-spot techniques. Additionally, it enables the estimation of heterogeneities in the spatial distribution of various mineral species. The MicrOmega/PTAL data shall support the future observations and analyses performed by the MicrOmega/Rosalind Franklin instrument.
Submitted 3 January, 2022;
originally announced January 2022.
-
Deep and Statistical Learning in Biomedical Imaging: State of the Art in 3D MRI Brain Tumor Segmentation
Authors:
K. Ruwani M. Fernando,
Chris P. Tsokos
Abstract:
Clinical diagnostic and treatment decisions rely upon the integration of patient-specific data with clinical reasoning. Cancer presents a unique context that influences treatment decisions, given its diverse forms of disease evolution. Biomedical imaging allows noninvasive assessment of disease based on visual evaluations leading to better clinical outcome prediction and therapeutic planning. Early methods of brain cancer characterization predominantly relied upon statistical modeling of neuroimaging data. Driven by the breakthroughs in computer vision, deep learning became the de facto standard in the domain of medical imaging. Integrated statistical and deep learning methods have recently emerged as a new direction in the automation of medical practice, unifying multi-disciplinary knowledge in medicine, statistics, and artificial intelligence. In this study, we critically review major statistical and deep learning models and their applications in brain imaging research with a focus on MRI-based brain tumor segmentation. The results highlight that model-driven classical statistics and data-driven deep learning are a potent combination for developing automated systems in clinical oncology.
Submitted 16 December, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Restrictions on Weil polynomials of Jacobians of hyperelliptic curves
Authors:
Edgar Costa,
Ravi Donepudi,
Ravi Fernando,
Valentijn Karemaker,
Caleb Springer,
Mckenzie West
Abstract:
Inspired by experimental data, this paper investigates which isogeny classes of abelian varieties defined over a finite field of odd characteristic contain the Jacobian of a hyperelliptic curve. We provide a necessary condition by demonstrating that the Weil polynomial of a hyperelliptic Jacobian must have a particular form modulo 2. For fixed ${g\geq1}$, the proportion of isogeny classes of $g$-dimensional abelian varieties defined over $\mathbb{F}_q$ which fail this condition is $1 - Q(2g + 2)/2^g$ as $q\to\infty$ ranges over odd prime powers, where $Q(n)$ denotes the number of partitions of $n$ into odd parts.
Submitted 25 November, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.
-
Infinitely Many Carmichael Numbers for a Modified Miller-Rabin Prime Test
Authors:
Eric Bach,
Rex Fernando
Abstract:
We define a variant of the Miller-Rabin primality test, which is in between Miller-Rabin and Fermat in terms of strength. We show that this test has infinitely many "Carmichael" numbers. We show that the test can also be thought of as a variant of the Solovay-Strassen test. We explore the growth of the test's "Carmichael" numbers, giving some empirical results and a discussion of one particularly strong pattern which appears in the results.
Submitted 2 December, 2015; v1 submitted 1 December, 2015;
originally announced December 2015.
-
On an Inequality of Dimension-like Invariants for Finite Groups
Authors:
Ravi Fernando
Abstract:
In this paper, we introduce several notions of "dimension" of a finite group, involving sizes of generating sets and certain configurations of maximal subgroups. We focus on the inequality $m(G) \leq \mathrm{MaxDim}(G)$, giving a family of examples where the inequality is strict, and showing that equality holds if $G$ is supersolvable.
Submitted 1 February, 2015;
originally announced February 2015.
-
Improving the throughput of the AES algorithm with multicore processors
Authors:
A. Barnes,
R. Fernando,
K. Mettananda,
R. G. Ragel
Abstract:
AES, the Advanced Encryption Standard, can be considered the most widely used modern symmetric key encryption standard. To encrypt/decrypt a file using the AES algorithm, the file must undergo a set of complex computational steps. Therefore a software implementation of the AES algorithm would be slow and consume a large amount of time to complete. The immense increase of both stored and transferred data in recent years has made this problem even more daunting when the need to encrypt/decrypt such data arises. As a solution to this problem, in this paper, we present an extensive study of enhancing the throughput of the AES encryption algorithm by utilizing state-of-the-art multicore architectures. We take a sequential program that implements the AES algorithm and convert it to run on multicore architectures with minimum effort. We implement two different parallel programs, one with the fork system call in Linux and the other with pthreads, the POSIX standard for threads. We then ran both versions of the parallel programs on different multicore architectures and compared and analysed the throughputs between the implementations and among different architectures. The pthreads implementation performed better in all the experiments we conducted, and the best throughput obtained is around 7 Gbps on a 32-core processor (the largest number of cores we had) with the pthreads implementation.
Submitted 28 March, 2014;
originally announced March 2014.
-
A Note On "Solitary Wave Solutions of the Compound Burgers-Korteweg-de Vries Equation"
Authors:
Claire David,
Rasika Fernando,
Zhaosheng Feng
Abstract:
The goal of this note is to construct a class of traveling solitary wave solutions for the compound Burgers-Korteweg-de Vries equation by means of a hyperbolic ansatz.
Submitted 26 July, 2006;
originally announced July 2006.