-
QueEn: A Large Language Model for Quechua-English Translation
Authors:
Junhao Chen,
Peng Shu,
Yiwei Li,
Huaqin Zhao,
Hanqi Jiang,
Yi Pan,
Yifan Zhou,
Zhengliang Liu,
Lewis C Howe,
Tianming Liu
Abstract:
Recent studies show that large language models (LLMs) are powerful tools for working with natural language, bringing advances in many areas of computational linguistics. However, these models face challenges when applied to low-resource languages due to limited training data and difficulty in understanding cultural nuances. In this paper, we propose QueEn, a novel approach for Quechua-English translation that combines Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning techniques. Our method leverages external linguistic resources through RAG and uses Low-Rank Adaptation (LoRA) for efficient model adaptation. Experimental results show that our approach substantially exceeds baseline models, with a BLEU score of 17.6 compared to 1.5 for standard GPT models. The integration of RAG with fine-tuning allows our system to address the challenges of low-resource language translation while maintaining computational efficiency. This work contributes to the broader goal of preserving endangered languages through advanced language technologies.
Submitted 6 December, 2024;
originally announced December 2024.
-
OracleSage: Towards Unified Visual-Linguistic Understanding of Oracle Bone Scripts through Cross-Modal Knowledge Fusion
Authors:
Hanqi Jiang,
Yi Pan,
Junhao Chen,
Zhengliang Liu,
Yifan Zhou,
Peng Shu,
Yiwei Li,
Huaqin Zhao,
Stephen Mihm,
Lewis C Howe,
Tianming Liu
Abstract:
Oracle bone script (OBS), China's earliest mature writing system, presents significant challenges for automatic recognition due to its complex pictographic structures and divergence from modern Chinese characters. We introduce OracleSage, a novel cross-modal framework that integrates hierarchical visual understanding with graph-based semantic reasoning. Specifically, we propose (1) a Hierarchical Visual-Semantic Understanding module that enables multi-granularity feature extraction through progressive fine-tuning of LLaVA's visual backbone, (2) a Graph-based Semantic Reasoning Framework that captures relationships between visual components and semantic concepts through dynamic message passing, and (3) OracleSem, a semantically enriched OBS dataset with comprehensive pictographic and semantic annotations. Experimental results demonstrate that OracleSage significantly outperforms state-of-the-art vision-language models. This research establishes a new paradigm for ancient text interpretation while providing valuable technical support for archaeological studies.
Submitted 26 November, 2024;
originally announced November 2024.
-
Towards Scalable Quantum Networks
Authors:
Connor Howe,
Mohsin Aziz,
Ali Anwar
Abstract:
This paper presents a comprehensive study on the scalability challenges and opportunities in quantum communication networks, with the goal of determining parameters that impact networks most as well as the trends that appear when scaling networks. We design simulations of quantum networks comprised of router nodes made up of trapped-ion qubits, separated by quantum repeaters in the form of Bell State Measurement (BSM) nodes. Such networks hold the promise of securely sharing quantum information and enabling high-power distributed quantum computing. Despite the promises, quantum networks encounter scalability issues due to noise and operational errors. Through a modular approach, our research aims to surmount these challenges, focusing on effects from scaling node counts and separation distances while monitoring low-quality communication arising from decoherence effects. We aim to pinpoint the critical features within networks essential for advancing scalable, large-scale quantum computing systems. Our findings underscore the impact of several network parameters on scalability, highlighting a critical insight into the trade-offs between the number of repeaters and the quality of entanglement generated. This paper lays the groundwork for future explorations into optimized quantum network designs and protocols.
Submitted 12 September, 2024;
originally announced September 2024.
-
Light-Field Microscopy for optical imaging of neuronal activity: when model-based methods meet data-driven approaches
Authors:
Pingfan Song,
Herman Verinaz Jadan,
Carmel L. Howe,
Amanda J. Foust,
Pier Luigi Dragotti
Abstract:
Understanding how networks of neurons process information is one of the key challenges in modern neuroscience. A necessary step toward this goal is to observe the dynamics of large populations of neurons over a large area of the brain. Light-field microscopy (LFM), a type of scanless microscopy, is a particularly attractive candidate for high-speed three-dimensional (3D) imaging. It captures volumetric information in a single snapshot, allowing volumetric imaging at video frame rates. Specific features of imaging neuronal activity using LFM call for the development of novel machine learning approaches that fully exploit priors embedded in physics and optics models. Signal processing theory and wave-optics theory could play a key role in filling this gap, and contribute to novel computational methods with enhanced interpretability and generalization by integrating model-driven and data-driven approaches. This paper presents a comprehensive survey of state-of-the-art computational methods for LFM, with a focus on model-based and data-driven approaches.
Submitted 24 October, 2021;
originally announced October 2021.
-
Model-inspired Deep Learning for Light-Field Microscopy with Application to Neuron Localization
Authors:
Pingfan Song,
Herman Verinaz Jadan,
Carmel L. Howe,
Peter Quicke,
Amanda J. Foust,
Pier Luigi Dragotti
Abstract:
Light-field microscopes capture both spatial and angular information of incident light rays. This allows the 3D locations of neurons to be reconstructed from a single snapshot. In this work, we propose a model-inspired deep learning approach to perform fast and robust 3D localization of sources using light-field microscopy images. This is achieved by developing a deep network that efficiently solves a convolutional sparse coding (CSC) problem to map Epipolar Plane Images (EPI) to corresponding sparse codes. The network architecture is designed systematically by unrolling the convolutional Iterative Shrinkage and Thresholding Algorithm (ISTA), while the network parameters are learned from a training dataset. Such principled design enables the deep network to leverage both the domain knowledge implied in the model and new parameters learned from the data, thereby combining the advantages of model-based and learning-based methods. Practical experiments on localization of mammalian neurons from light-fields show that the proposed approach simultaneously provides enhanced performance, interpretability and efficiency.
Submitted 10 March, 2021;
originally announced March 2021.
-
Deep Learning for Lung Cancer Detection: Tackling the Kaggle Data Science Bowl 2017 Challenge
Authors:
Kingsley Kuan,
Mathieu Ravaut,
Gaurav Manek,
Huiling Chen,
Jie Lin,
Babar Nazir,
Cen Chen,
Tse Chiang Howe,
Zeng Zeng,
Vijay Chandrasekhar
Abstract:
We present a deep learning framework for computer-aided lung cancer diagnosis. Our multi-stage framework detects nodules in 3D lung CAT scans, determines if each nodule is malignant, and finally assigns a cancer probability based on these results. We discuss the challenges and advantages of our framework. In the Kaggle Data Science Bowl 2017, our framework ranked 41st out of 1972 teams.
Submitted 26 May, 2017;
originally announced May 2017.