-
MEDEA: A Design-Time Multi-Objective Manager for Energy-Efficient DNN Inference on Heterogeneous Ultra-Low Power Platforms
Authors:
Hossein Taji,
José Miranda,
Miguel Peón-Quirós,
David Atienza
Abstract:
The growing demand for on-device AI necessitates energy-efficient execution of DNN-based applications on resource-constrained ultra-low power (ULP) platforms. Heterogeneous architectures, combining specialized processing elements (PEs), have emerged as a key solution for achieving the required performance and energy efficiency. However, optimizing energy while executing applications on these platforms requires efficiently managing platform resources like PEs, power features, and memory footprint, all while adhering to critical application deadlines. This paper presents MEDEA, a novel design-time multi-objective manager for energy-efficient DNN inference on Heterogeneous ULP (HULP) platforms. MEDEA uniquely integrates: kernel-level dynamic voltage and frequency scaling (DVFS) for dynamic energy adaptation; kernel-level granularity scheduling, suitable for specialized accelerators; and memory-aware adaptive tiling to navigate severe memory constraints; all within a timing-constraint-based optimization strategy, which minimizes energy subject to the application deadline. To showcase practical viability, we evaluate MEDEA on HEEPtimize, a heterogeneous ULP platform (22 nm, FPGA-prototyped) featuring a RISC-V processor alongside Near-Memory Computing (NMC) and Coarse-Grained Reconfigurable Array (CGRA) accelerators. Experimental results, using a biomedical seizure detection case study, demonstrate that MEDEA achieves overall energy reductions of up to 38% compared to representative state-of-the-art methods, while consistently meeting all timing and memory requirements. This effectiveness is attributed to its integrated features, with our analysis showing that kernel-level DVFS alone can be responsible for over 31% of the energy savings in specific scenarios.
Submitted 23 June, 2025;
originally announced June 2025.
-
Querying Large Automotive Software Models: Agentic vs. Direct LLM Approaches
Authors:
Lukasz Mazur,
Nenad Petrovic,
James Pontes Miranda,
Ansgar Radermacher,
Robert Rasche,
Alois Knoll
Abstract:
Large language models (LLMs) offer new opportunities for interacting with complex software artifacts, such as software models, through natural language. They present especially promising benefits for large software models that are difficult to grasp in their entirety, making traditional interaction and analysis approaches challenging. This paper investigates two approaches for leveraging LLMs to answer questions over software models: direct prompting, where the whole software model is provided in the context, and an agentic approach combining LLM-based agents with general-purpose file access tools. We evaluate these approaches using an Ecore metamodel designed for timing analysis and software optimization in automotive and embedded domains. Our findings show that while the agentic approach achieves accuracy comparable to direct prompting, it is significantly more efficient in terms of token usage. This efficiency makes the agentic approach particularly suitable for the automotive industry, where the large size of software models makes direct prompting infeasible, establishing LLM agents as not just a practical alternative but the only viable solution. Notably, the evaluation was conducted using small LLMs, which are more feasible to run locally, an essential advantage for meeting strict requirements around privacy, intellectual property protection, and regulatory compliance. Future work will investigate software models in diverse formats, explore more complex agent architectures, and extend agentic workflows to support not only querying but also modification of software models.
Submitted 16 June, 2025;
originally announced June 2025.
-
Assessing a Safety Case: Bottom-up Guidance for Claims and Evidence Evaluation
Authors:
Scott Schnelle,
Francesca Favaro,
Laura Fraade-Blanar,
David Wichner,
Holland Broce,
Justin Miranda
Abstract:
As Automated Driving Systems (ADS) technology advances, ensuring safety and public trust requires robust assurance frameworks, with safety cases emerging as a critical tool toward such a goal. This paper explores an approach to assess how a safety case is supported by its claims and evidence, toward establishing credibility for the overall case. Starting from a description of the building blocks of a safety case (claims, evidence, and optional format-dependent entries), this paper delves into the assessment of support of each claim through the provided evidence. Two domains of assessment are outlined for each claim: procedural support (formalizing process specification) and implementation support (demonstrating process application). Additionally, evidence status is assessed independently of claim support. Scoring strategies and evaluation guidelines are provided, including detailed scoring tables for claim support and evidence status assessment. The paper further discusses governance, continual improvement, and timing considerations for safety case assessments. Reporting of results and findings is contextualized within its primary use for internal decision-making on continual improvement efforts. The presented approach builds on state-of-the-art auditing practices, but specifically tackles the question of judging the credibility of a safety case. While not conclusive on its own, it provides a starting point toward a comprehensive "Case Credibility Assessment" (CCA), starting from the evaluation of the support for each claim (individually and in aggregate), as well as every piece of evidence provided. By delving into the technical intricacies of ADS safety cases, this work contributes to the ongoing discourse on safety assurance and aims to facilitate the responsible integration of ADS technology into society.
Submitted 11 June, 2025;
originally announced June 2025.
-
The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project
Authors:
Angelina A. Aquino,
Lester James V. Miranda,
Elsie Marie T. Or
Abstract:
This paper presents UD-NewsCrawl, the largest Tagalog treebank to date, containing 15.6k trees manually annotated according to the Universal Dependencies framework. We detail our treebank development process, including data collection, pre-processing, manual annotation, and quality assurance procedures. We provide baseline evaluations using multiple transformer-based models to assess the performance of state-of-the-art dependency parsers on Tagalog. We also highlight challenges in the syntactic analysis of Tagalog given its distinctive grammatical properties, and discuss its implications for the annotation of this treebank. We anticipate that UD-NewsCrawl and our baseline model implementations will serve as valuable resources for advancing computational linguistics research in underrepresented languages like Tagalog.
Submitted 26 May, 2025;
originally announced May 2025.
-
R3: Robust Rubric-Agnostic Reward Models
Authors:
David Anugraha,
Zilu Tang,
Lester James V. Miranda,
Hanyang Zhao,
Mohammad Rifqi Farhansyah,
Garry Kuwanto,
Derry Wijaya,
Genta Indra Winata
Abstract:
Reward models are essential for aligning language model outputs with human preferences, yet existing approaches often lack both controllability and interpretability. These models are typically optimized for narrow objectives, limiting their generalizability to broader downstream tasks. Moreover, their scalar outputs are difficult to interpret without contextual reasoning. To address these limitations, we introduce R3, a novel reward modeling framework that is rubric-agnostic, generalizable across evaluation dimensions, and provides interpretable, reasoned score assignments. R3 enables more transparent and flexible evaluation of language models, supporting robust alignment with diverse human values and use cases. Our models, data, and code are available as open source at https://github.com/rubricreward/r3
Submitted 26 May, 2025; v1 submitted 19 May, 2025;
originally announced May 2025.
-
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
Authors:
Samuel Cahyawijaya,
Holy Lovenia,
Joel Ruben Antony Moniz,
Tack Hwa Wong,
Mohammad Rifqi Farhansyah,
Thant Thiri Maung,
Frederikus Hudi,
David Anugraha,
Muhammad Ravi Shulthan Habibi,
Muhammad Reza Qorib,
Amit Agarwal,
Joseph Marvin Imperial,
Hitesh Laxmichand Patel,
Vicky Feliren,
Bahrul Ilmi Nasution,
Manuel Antonio Rufino,
Genta Indra Winata,
Rian Adam Rajagede,
Carlos Rafael Catalan,
Mohamed Fazli Imam,
Priyaranjan Pattnayak,
Salsabila Zahirah Pranida,
Kevin Pratama,
Yeshil Bangera,
Adisai Na-Thalang
, et al. (67 additional authors not shown)
Abstract:
Southeast Asia (SEA) is a region of extraordinary linguistic and cultural diversity, yet it remains significantly underrepresented in vision-language (VL) research. This often results in artificial intelligence (AI) models that fail to capture SEA cultural nuances. To fill this gap, we present SEA-VL, an open-source initiative dedicated to developing high-quality, culturally relevant data for SEA languages. By involving contributors from SEA countries, SEA-VL aims to ensure better cultural relevance and diversity, fostering greater inclusivity of underrepresented languages in VL research. Beyond crowdsourcing, our initiative goes one step further in the exploration of the automatic collection of culturally relevant images through crawling and image generation. First, we find that image crawling achieves approximately 85% cultural relevance while being more cost- and time-efficient than crowdsourcing. Second, despite the substantial progress in generative vision models, synthetic images remain unreliable in accurately reflecting SEA cultures. The generated images often fail to reflect the nuanced traditions and cultural contexts of the region. Collectively, we gather 1.28M SEA culturally relevant images, more than 50 times larger than other existing datasets. Through SEA-VL, we aim to bridge the representation gap in SEA, fostering the development of more inclusive AI systems that authentically represent diverse cultures across SEA.
Submitted 18 March, 2025; v1 submitted 10 March, 2025;
originally announced March 2025.
-
MMTEB: Massive Multilingual Text Embedding Benchmark
Authors:
Kenneth Enevoldsen,
Isaac Chung,
Imene Kerboua,
Márton Kardos,
Ashwin Mathur,
David Stap,
Jay Gala,
Wissam Siblini,
Dominik Krzemiński,
Genta Indra Winata,
Saba Sturua,
Saiteja Utpala,
Mathieu Ciancone,
Marion Schaeffer,
Gabriel Sequeira,
Diganta Misra,
Shreeya Dhakal,
Jonathan Rystrøm,
Roman Solomatin,
Ömer Çağatan,
Akash Kundu,
Martin Bernstorff,
Shitao Xiao,
Akshita Sukhlecha,
Bhavish Pahwa
, et al. (61 additional authors not shown)
Abstract:
Text embeddings are typically evaluated on a limited set of tasks, which are constrained by language, domain, and task diversity. To address these limitations and provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB) - a large-scale, community-driven expansion of MTEB, covering over 500 quality-controlled evaluation tasks across 250+ languages. MMTEB includes a diverse set of challenging, novel tasks such as instruction following, long-document retrieval, and code retrieval, representing the largest multilingual collection of evaluation tasks for embedding models to date. Using this collection, we develop several highly multilingual benchmarks, which we use to evaluate a representative set of models. We find that while large language models (LLMs) with billions of parameters can achieve state-of-the-art performance on certain language subsets and task categories, the best-performing publicly available model is multilingual-e5-large-instruct with only 560 million parameters. To facilitate accessibility and reduce computational cost, we introduce a novel downsampling method based on inter-task correlation, ensuring a diverse selection while preserving relative model rankings. Furthermore, we optimize tasks such as retrieval by sampling hard negatives, creating smaller but effective splits. These optimizations allow us to introduce benchmarks that drastically reduce computational demands. For instance, our newly introduced zero-shot English benchmark maintains a ranking order similar to the full-scale version but at a fraction of the computational cost.
Submitted 8 June, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
FADE: Forecasting for Anomaly Detection on ECG
Authors:
Paula Ruiz-Barroso,
Francisco M. Castro,
José Miranda,
Denisa-Andreea Constantinescu,
David Atienza,
Nicolás Guil
Abstract:
Cardiovascular diseases, a leading cause of noncommunicable disease-related deaths, require early and accurate detection to improve patient outcomes. Taking advantage of advances in machine learning and deep learning, multiple approaches have been proposed in the literature to address the challenge of detecting ECG anomalies. Typically, these methods are based on the manual interpretation of ECG signals, which is time-consuming and depends on the expertise of healthcare professionals. The objective of this work is to propose a deep learning system, FADE, designed for normal ECG forecasting and anomaly detection, which reduces the need for extensive labeled datasets and manual interpretation. FADE has been trained in a self-supervised manner with a novel morphologically inspired loss function. Unlike conventional models that learn from labeled anomalous ECG waveforms, our approach predicts the future of normal ECG signals, thus avoiding the need for extensive labeled datasets. Using a novel distance function to compare forecasted ECG signals with actual sensor data, our method effectively identifies cardiac anomalies. Additionally, this approach can be adapted to new contexts through domain adaptation techniques. To evaluate our proposal, we performed a set of experiments using two publicly available datasets: MIT-BIH NSR and MIT-BIH Arrhythmia. The results demonstrate that our system achieves an average accuracy of 83.84% in anomaly detection, while correctly classifying normal ECG signals with an accuracy of 85.46%. Our proposed approach exhibited superior performance in the early detection of cardiac anomalies in ECG signals, surpassing previous methods that predominantly identify a limited range of anomalies. FADE effectively detects both abnormal heartbeats and arrhythmias, offering significant advantages in healthcare through cost reduction or processing of large-scale ECG data.
Submitted 11 February, 2025;
originally announced February 2025.
-
2 OLMo 2 Furious
Authors:
Team OLMo,
Pete Walsh,
Luca Soldaini,
Dirk Groeneveld,
Kyle Lo,
Shane Arora,
Akshita Bhagia,
Yuling Gu,
Shengyi Huang,
Matt Jordan,
Nathan Lambert,
Dustin Schwenk,
Oyvind Tafjord,
Taira Anderson,
David Atkinson,
Faeze Brahman,
Christopher Clark,
Pradeep Dasigi,
Nouha Dziri,
Michal Guerquin,
Hamish Ivison,
Pang Wei Koh,
Jiacheng Liu,
Saumya Malik,
William Merrill
, et al. (15 additional authors not shown)
Abstract:
We present OLMo 2, the next generation of our fully open language models. OLMo 2 includes dense autoregressive models with improved architecture and training recipe, pretraining data mixtures, and instruction tuning recipes. Our modified model architecture and training recipe achieve both better training stability and improved per-token efficiency. Our updated pretraining data mixture introduces a new, specialized data mix called Dolmino Mix 1124, which significantly improves model capabilities across many downstream task benchmarks when introduced via late-stage curriculum training (i.e. specialized data during the annealing phase of pretraining). Finally, we incorporate best practices from Tülu 3 to develop OLMo 2-Instruct, focusing on permissive data and extending our final-stage reinforcement learning with verifiable rewards (RLVR). Our OLMo 2 base models sit at the Pareto frontier of performance to compute, often matching or outperforming open-weight only models like Llama 3.1 and Qwen 2.5 while using fewer FLOPs and with fully transparent training data, code, and recipe. Our fully open OLMo 2-Instruct models are competitive with or surpassing open-weight only models of comparable size, including Qwen 2.5, Llama 3.1 and Gemma 2. We release all OLMo 2 artifacts openly -- models at 7B and 13B scales, both pretrained and post-trained, including their full training data, training code and recipes, training logs and thousands of intermediate checkpoints. The final instruction model is available on the Ai2 Playground as a free research demo.
Submitted 14 January, 2025; v1 submitted 31 December, 2024;
originally announced January 2025.
-
Bridging the Data Provenance Gap Across Text, Speech and Video
Authors:
Shayne Longpre,
Nikhil Singh,
Manuel Cherep,
Kushagra Tiwary,
Joanna Materzynska,
William Brannon,
Robert Mahari,
Naana Obeng-Marnu,
Manan Dey,
Mohammed Hamdy,
Nayan Saxena,
Ahmad Mustafa Anis,
Emad A. Alghamdi,
Vu Minh Chien,
Da Yin,
Kun Qian,
Yizhi Li,
Minnie Liang,
An Dinh,
Shrestha Mohanty,
Deividas Mataciunas,
Tobin South,
Jianguo Zhang,
Ariel N. Lee,
Campbell S. Lund
, et al. (18 additional authors not shown)
Abstract:
Progress in AI is driven largely by the scale and quality of training data. Despite this, there is a deficit of empirical analysis examining the attributes of well-established datasets beyond text. In this work we conduct the largest and first-of-its-kind longitudinal audit across modalities--popular text, speech, and video datasets--from their detailed sourcing trends and use restrictions to their geographical and linguistic representation. Our manual analysis covers nearly 4000 public datasets from 1990 to 2024, spanning 608 languages, 798 sources, 659 organizations, and 67 countries. First, we find that multimodal machine learning applications have overwhelmingly turned to web-crawled, synthetic, and social media platforms, such as YouTube, for their training sets, eclipsing all other sources since 2019. Second, tracing the chain of dataset derivations, we find that while less than 33% of datasets are restrictively licensed, over 80% of the source content in widely used text, speech, and video datasets carries non-commercial restrictions. Finally, counter to the rising number of languages and geographies represented in public AI training datasets, our audit demonstrates that measures of relative geographical and multilingual representation have failed to significantly improve their coverage since 2013. We believe the breadth of our audit enables us to empirically examine trends in data sourcing, restrictions, and Western-centricity at an ecosystem level, and that visibility into these questions is essential to progress in responsible AI. As a contribution to ongoing improvements in dataset transparency and responsible use, we release our entire multimodal audit, allowing practitioners to trace data provenance across text, speech, and video.
Submitted 18 February, 2025; v1 submitted 18 December, 2024;
originally announced December 2024.
-
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Authors:
Nathan Lambert,
Jacob Morrison,
Valentina Pyatkin,
Shengyi Huang,
Hamish Ivison,
Faeze Brahman,
Lester James V. Miranda,
Alisa Liu,
Nouha Dziri,
Shane Lyu,
Yuling Gu,
Saumya Malik,
Victoria Graf,
Jena D. Hwang,
Jiangjiang Yang,
Ronan Le Bras,
Oyvind Tafjord,
Chris Wilhelm,
Luca Soldaini,
Noah A. Smith,
Yizhong Wang,
Pradeep Dasigi,
Hannaneh Hajishirzi
Abstract:
Language model post-training is applied to refine behaviors and unlock new skills across a wide range of recent language models, but open recipes for applying these techniques lag behind proprietary ones. The underlying training data and recipes for post-training are simultaneously the most important pieces of the puzzle and the portion with the least transparency. To bridge this gap, we introduce Tulu 3, a family of fully-open state-of-the-art post-trained models, alongside its data, code, and training recipes, serving as a comprehensive guide for modern post-training techniques. Tulu 3, which builds on Llama 3.1 base models, achieves results surpassing the instruct versions of Llama 3.1, Qwen 2.5, Mistral, and even closed models such as GPT-4o-mini and Claude 3.5-Haiku. The training algorithms for our models include supervised finetuning (SFT), Direct Preference Optimization (DPO), and a novel method we call Reinforcement Learning with Verifiable Rewards (RLVR). With Tulu 3, we introduce a multi-task evaluation scheme for post-training recipes with development and unseen evaluations, standard benchmark implementations, and substantial decontamination of existing open datasets on said benchmarks. We conclude with analysis and discussion of training methods that did not reliably improve performance.
In addition to the Tulu 3 model weights and demo, we release the complete recipe -- including datasets for diverse core skills, a robust toolkit for data curation and evaluation, the training code and infrastructure, and, most importantly, a detailed report for reproducing and further adapting the Tulu 3 approach to more domains.
Submitted 14 April, 2025; v1 submitted 22 November, 2024;
originally announced November 2024.
-
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Authors:
Lester James V. Miranda,
Yizhong Wang,
Yanai Elazar,
Sachin Kumar,
Valentina Pyatkin,
Faeze Brahman,
Noah A. Smith,
Hannaneh Hajishirzi,
Pradeep Dasigi
Abstract:
Learning from human feedback has enabled the alignment of language models (LMs) with human preferences. However, collecting human preferences is expensive and time-consuming, with highly variable annotation quality. An appealing alternative is to distill preferences from LMs as a source of synthetic annotations, which is cost-effective and scalable, albeit susceptible to other biases and errors. In this work, we introduce HyPER, a Hybrid Preference routER that defers an annotation to either humans or LMs, achieving better annotation quality while reducing the cost of human-only annotation. We formulate this as an optimization problem: given a preference dataset and an evaluation metric, we (1) train a performance prediction model (PPM) to predict a reward model's (RM) performance on an arbitrary combination of human and LM annotations and (2) employ a routing strategy that selects a combination that maximizes the predicted performance. We train the PPM on MultiPref, a new preference dataset with 10k instances paired with human and LM labels. We show that the hybrid mixture of synthetic and direct human preferences selected by HyPER achieves better RM performance than using either one exclusively, by 7-13% on RewardBench, and generalizes across unseen preference datasets and other base models. We also observe the same trend in other benchmarks using Best-of-N reranking, where the hybrid mix has 2-3% better performance. Finally, we analyze features from HyPER and find that prompts with moderate safety concerns or complexity benefit the most from human feedback.
Submitted 30 May, 2025; v1 submitted 24 October, 2024;
originally announced October 2024.
-
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Authors:
Srishti Gureja,
Lester James V. Miranda,
Shayekh Bin Islam,
Rishabh Maheshwary,
Drishti Sharma,
Gusti Winata,
Nathan Lambert,
Sebastian Ruder,
Sara Hooker,
Marzieh Fadaee
Abstract:
Reward models (RMs) have driven the state-of-the-art performance of LLMs today by enabling the integration of human feedback into the language modeling process. However, RMs are primarily trained and evaluated in English, and their capabilities in multilingual settings remain largely understudied. In this work, we conduct a systematic evaluation of several reward models in multilingual settings. We first construct a first-of-its-kind multilingual RM evaluation benchmark, M-RewardBench, consisting of 2.87k preference instances for 23 typologically diverse languages, that tests the chat, safety, reasoning, and translation capabilities of RMs. We then rigorously evaluate a wide range of reward models on M-RewardBench, offering fresh insights into their performance across diverse languages. We identify a significant gap in RMs' performance between English and non-English languages and show that RM preferences can change substantially from one language to another. We also present several findings on how different multilingual aspects impact RM performance. Specifically, we show that RM performance improves with translation quality. Similarly, we demonstrate that the models exhibit better performance for high-resource languages. We release the M-RewardBench dataset and the codebase from this study to facilitate a better understanding of RM evaluation in multilingual settings.
Submitted 20 May, 2025; v1 submitted 20 October, 2024;
originally announced October 2024.
-
Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Deep learning time-series processing often relies on convolutional neural networks with overlapping windows. This overlap allows the network to produce an output faster than the window length. However, it introduces additional computations. This work explores the potential to optimize computational efficiency during inference by exploiting convolution's shift-invariance properties to skip the calculation of layer activations between successive overlapping windows. Although convolutions are shift-invariant, zero-padding and pooling operations, widely used in such networks, are not, which complicates efficient streaming inference. We introduce StreamiNNC, a strategy to deploy Convolutional Neural Networks for online streaming inference. We explore the adverse effects of zero padding and pooling on the accuracy of streaming inference, deriving theoretical error upper bounds for pooling during streaming. We address these limitations by proposing signal padding and pooling alignment and provide guidelines for designing and deploying models for StreamiNNC. We validate our method on simulated data and on three real-world biomedical signal processing applications. StreamiNNC achieves a low deviation between streaming output and normal inference for all three networks (2.03 - 3.55% NRMSE). This work demonstrates that it is possible to linearly speed up the inference of streaming CNNs processing overlapping windows, negating the additional computation typically incurred by overlapping windows.
Submitted 6 August, 2024;
originally announced August 2024.
-
DC is all you need: describing ReLU from a signal processing standpoint
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC helps to converge to a weight configuration that is close to the initial random weights.
Submitted 11 May, 2025; v1 submitted 23 July, 2024;
originally announced July 2024.
-
Perceived Importance of ICT Proficiency for Teaching, Learning, and Career Progression among Physical Education Teachers in Pampanga
Authors:
Kristine Joy D. Magallanes,
Mark Brianne C. Carreon,
Kristalyn C. Miclat,
Niña Vina V. Salita,
Gino A. Sumilhig,
Raymart Christopher C. Guevarra,
John Paul P. Miranda
Abstract:
The integration of information and communication technology (ICT) has become increasingly vital across various educational fields, including physical education (PE). This study aimed to evaluate the proficiency levels of PE teachers in using various ICT applications and to examine the relationship between the perceived importance of ICT proficiency for teaching and learning, career advancement, and actual proficiency among senior high school PE teachers in the municipality of Mexico, Pampanga. This study employed a quantitative descriptive approach. PE teachers from the municipality of Mexico, Pampanga, were selected as the respondents. This study used a two-part survey. The first section collected demographic data, such as age, gender, rank/position, and years of teaching experience, and the second section assessed ICT skill levels and the perceived importance of ICT in teaching, learning, and career progression. The results revealed that the majority of PE teachers had access to ICT resources. However, their proficiency levels with these tools varied significantly. Factors such as age, teaching experience, and professional position were found to significantly influence teachers' proficiency and their perceptions of the benefits of ICT integration in PE instruction. The study provided a glimpse of the current state of ICT integration among senior high school PE teachers in Mexico, Pampanga, Philippines, and highlighted areas for improvement. The study suggests that policymakers, administrators, and training program developers should focus on enhancing the ICT proficiency of PE teachers to improve teaching practices and student engagement. Enhancing the ICT proficiency of PE teachers is recommended to foster better teaching experiences, increase student engagement, and improve overall educational outcomes.
Submitted 16 July, 2024;
originally announced July 2024.
-
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Authors:
Holy Lovenia,
Rahmad Mahendra,
Salsabil Maulana Akbar,
Lester James V. Miranda,
Jennifer Santoso,
Elyanah Aco,
Akhdan Fadhilah,
Jonibek Mansurov,
Joseph Marvin Imperial,
Onno P. Kampman,
Joel Ruben Antony Moniz,
Muhammad Ravi Shulthan Habibi,
Frederikus Hudi,
Railey Montalan,
Ryan Ignatius,
Joanito Agili Lopo,
William Nixon,
Börje F. Karlsson,
James Jaya,
Ryandito Diandaru,
Yuze Gao,
Patrick Amadeus,
Bin Wang,
Jan Christian Blaise Cruz,
Chenxi Whitehouse,
et al. (36 additional authors not shown)
Abstract:
Southeast Asia (SEA) is a region rich in linguistic diversity and cultural variety, with over 1,300 indigenous languages and a population of 671 million people. However, prevailing AI models suffer from a significant lack of representation of texts, images, and audio datasets from SEA, compromising the quality of AI models for SEA languages. Evaluating models for SEA languages is challenging due to the scarcity of high-quality datasets, compounded by the dominance of English training data, raising concerns about potential cultural misrepresentation. To address these challenges, we introduce SEACrowd, a collaborative initiative that consolidates a comprehensive resource hub that fills the resource gap by providing standardized corpora in nearly 1,000 SEA languages across three modalities. Through our SEACrowd benchmarks, we assess the quality of AI models on 36 indigenous languages across 13 tasks, offering valuable insights into the current AI landscape in SEA. Furthermore, we propose strategies to facilitate greater AI advancements, maximizing potential utility and resource equity for the future of AI in SEA.
Submitted 10 March, 2025; v1 submitted 14 June, 2024;
originally announced June 2024.
-
KID-PPG: Knowledge Informed Deep Learning for Extracting Heart Rate from a Smartwatch
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Accurate extraction of heart rate from photoplethysmography (PPG) signals remains challenging due to motion artifacts and signal degradation. Although deep learning methods trained as a data-driven inference problem offer promising solutions, they often underutilize existing knowledge from the medical and signal processing community. In this paper, we address three shortcomings of deep learning models: motion artifact removal, degradation assessment, and physiologically plausible analysis of the PPG signal. We propose KID-PPG, a knowledge-informed deep learning model that integrates expert knowledge through adaptive linear filtering, deep probabilistic inference, and data augmentation. We evaluate KID-PPG on the PPGDalia dataset, achieving an average mean absolute error of 2.85 beats per minute, surpassing existing reproducible methods. Our results demonstrate a significant performance improvement in heart rate tracking through the incorporation of prior knowledge into deep learning models. This approach shows promise in enhancing various biomedical applications by incorporating existing expert knowledge in deep learning models.
Submitted 9 October, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
STRELA: STReaming ELAstic CGRA Accelerator for Embedded Systems
Authors:
Daniel Vazquez,
Jose Miranda,
Alfonso Rodriguez,
Andres Otero,
Pascuale Davide Schiavone,
David Atienza
Abstract:
Reconfigurable computing offers a good balance between flexibility and energy efficiency. When combined with software-programmable devices such as CPUs, it is possible to obtain higher performance by spatially distributing the parallelizable sections of an application throughout the reconfigurable device while the CPU is in charge of control-intensive sections. This work introduces an elastic Coarse-Grained Reconfigurable Architecture (CGRA) integrated into an energy-efficient RISC-V-based SoC designed for the embedded domain. The microarchitecture of the CGRA supports conditionals and irregular loops, making it adaptable to domain-specific applications. Additionally, we propose specific mapping strategies that enable the efficient utilization of the CGRA for both simple applications, where the fabric is only reconfigured once (one-shot kernels), and more complex ones, where it is necessary to reconfigure the CGRA multiple times to complete them (multi-shot kernels). Large kernels also benefit from the independent memory nodes incorporated to streamline data accesses. The integration of the CGRA as an accelerator of the RISC-V processor enables a versatile and efficient framework, providing adaptability, processing capacity, and overall performance across various applications.
The design has been implemented in TSMC 65 nm, achieving a maximum frequency of 250 MHz. It achieves a peak performance of 1.22 GOPs computing one-shot kernels and 1.17 GOPs computing multi-shot kernels. The best energy efficiency is 72.68 MOPs/mW for one-shot kernels and 115.96 MOPs/mW for multi-shot kernels. The design integrates power and clock-gating techniques to tailor the architecture to the embedded domain while maintaining performance. The best speed-ups are 17.63x and 18.61x for one-shot and multi-shot kernels. The best energy savings in the SoC are 9.05x and 11.10x for one-shot and multi-shot kernels.
Submitted 18 April, 2024;
originally announced April 2024.
-
calamanCy: A Tagalog Natural Language Processing Toolkit
Authors:
Lester James V. Miranda
Abstract:
We introduce calamanCy, an open-source toolkit for constructing natural language processing (NLP) pipelines for Tagalog. It is built on top of spaCy, enabling easy experimentation and integration with other frameworks. calamanCy addresses the development gap by providing a consistent API for building NLP applications and offering general-purpose multitask models with out-of-the-box support for dependency parsing, parts-of-speech (POS) tagging, and named entity recognition (NER). calamanCy aims to accelerate the progress of Tagalog NLP by consolidating disjointed resources in a unified framework. The calamanCy toolkit is available on GitHub: https://github.com/ljvmiranda921/calamanCy.
Submitted 13 November, 2023;
originally announced November 2023.
-
Developing a Named Entity Recognition Dataset for Tagalog
Authors:
Lester James V. Miranda
Abstract:
We present the development of a Named Entity Recognition (NER) dataset for Tagalog. This corpus helps fill the resource gap present in Philippine languages today, where NER resources are scarce. The texts were obtained from a pretraining corpus containing news reports and were labeled by native speakers in an iterative fashion. The resulting dataset contains ~7.8k documents across three entity types: Person, Organization, and Location. The inter-annotator agreement, as measured by Cohen's $κ$, is 0.81. We also conducted an extensive empirical evaluation of state-of-the-art methods across supervised and transfer learning settings. Finally, we released the data and processing code publicly to inspire future work on Tagalog NLP.
Submitted 13 November, 2023;
originally announced November 2023.
-
Personalised and Adjustable Interval Type-2 Fuzzy-Based PPG Quality Assessment for the Edge
Authors:
Jose A. Miranda,
Celia López-Ongil,
Javier Andreu-Perez
Abstract:
Most of today's wearable technology provides seamless cardiac activity monitoring. Specifically, the vast majority employ Photoplethysmography (PPG) sensors to acquire blood volume pulse information, which is further analysed to extract useful and physiologically related features. Nevertheless, PPG-based signal reliability presents different challenges that strongly affect such data processing. This is mainly due to distortion of the PPG waveform morphology caused by motion artefacts, which can lead to erroneous interpretation of the extracted cardiac-related features. On this basis, in this paper, we propose a novel personalised and adjustable Interval Type-2 Fuzzy Logic System (IT2FLS) for assessing the quality of PPG signals. The proposed system employs a personalised approach to adapt the IT2FLS parameters to the unique characteristics of each individual's PPG signals. Additionally, the system provides adjustable levels of personalisation, allowing healthcare providers to adjust the system to meet specific requirements for different applications. The proposed system achieved an average accuracy of up to 93.72% during validation. The presented system has the potential to enable ultra-low complexity and real-time PPG quality assessment, improving the accuracy and reliability of PPG-based health monitoring systems at the edge.
Submitted 23 September, 2023;
originally announced September 2023.
-
A Game-Based Learning Application to Help Learners to Practice Mathematical Patterns and Structures
Authors:
Adrian S. Lozano,
Reister Justine B. Canlas,
Kimberly M. Coronel,
Justin M. Canlas,
Jerico G. Duya,
Regina C. Macapagal,
Ericson M. Dungca,
John Paul P. Miranda
Abstract:
Purpose - The purpose of this study is to develop a game-based mobile application to help learners practice mathematical patterns and structures.
Method - The study followed a mixed-method research design and prototyping methodology to guide the study in developing the mobile application. An instrument based on the Octalysis framework was developed as an evaluation tool for the study.
Results - The study developed a mobile application based on the Octalysis framework. The application has fully achieved all its intended features based on the rating provided by the students and IT experts.
Conclusion - The study successfully developed a mobile learning application for mathematical patterns and structures. By incorporating GBL principles and the Octalysis framework, the app achieved its intended features and received positive evaluations from students and IT experts. This highlights the potential of the app in promoting mathematical learning.
Recommendations - This study recommends that the application be further enhanced to include other topics. Incorporating other game-based principles and approaches like timed questions and the difficulty level is also worth pursuing. Actual testing for end-users is also needed to verify the application's effectiveness.
Practical Implications - Successful development of a game-based mobile app for practicing mathematical patterns and structures can transform education technology by engaging learners and enhancing their experience. This study provides valuable insights for future researchers developing similar applications, highlighting the potential to revolutionize traditional approaches and create an interactive learning environment for improving mathematical abilities.
Submitted 22 June, 2023;
originally announced June 2023.
-
Multi hash embeddings in spaCy
Authors:
Lester James Miranda,
Ákos Kádár,
Adriane Boyd,
Sofie Van Landeghem,
Anders Søgaard,
Matthew Honnibal
Abstract:
The distributed representation of symbols is one of the key technologies in machine learning systems today, playing a pivotal role in modern natural language processing. Traditional word embeddings associate a separate vector with each word. While this approach is simple and leads to good performance, it requires a lot of memory for representing a large vocabulary. To reduce the memory footprint, the default embedding layer in spaCy is a hash embeddings layer. It is a stochastic approximation of traditional embeddings that provides unique vectors for a large number of words without explicitly storing a separate vector for each of them. To be able to compute meaningful representations for both known and unknown words, hash embeddings represent each word as a summary of the normalized word form, subword information and word shape. Together, these features produce a multi-embedding of a word. In this technical report we lay out a bit of history and introduce the embedding methods in spaCy in detail. Second, we critically evaluate the hash embedding architecture with multi-embeddings on Named Entity Recognition datasets from a variety of domains and languages. The experiments validate most key design choices behind spaCy's embedders, but we also uncover a few surprising results.
Submitted 19 December, 2022;
originally announced December 2022.
-
Datasets of Fire and Crime Incidents in Pampanga, Philippines
Authors:
John Paul P. Miranda,
Julieta M. Umali,
Aileen P. de Leon
Abstract:
The fire and crime incident datasets were requested and collected from two Philippine regional agencies (i.e., the Bureau of Fire Protection and the Philippine National Police). The datasets were used to initially analyze and map both fire and crime incidents within the province of Pampanga for a specific time frame. Several data preparation, normalization, and data cleaning steps were implemented to properly map and identify patterns within the datasets. The initial results indicate that the leading causes of fire and crime are rubbish and acts against property, respectively. Fires mostly occur during the dry season in the province. Crime is particularly high during December, and most of the fire and crime incidents occur during the time when people are most active. The dataset was able to present the temporal characteristics of the fire and crime incidents that occurred in the province of Pampanga. We recommend merging the existing dataset with datasets from other related agencies to get a bigger picture and produce more objective results that could be used for decision-making.
Submitted 1 November, 2022;
originally announced November 2022.
-
Development of Augmented Reality Application for Made-to-Order Furniture Industry in Pampanga, Philippines
Authors:
Jaymark A. Yambao,
John Paul P. Miranda,
Earl Lawrence B. Pelayo
Abstract:
The focus of the study was to develop a mobile application utilizing marker-less augmented reality (AR) for specific made-to-order products to support furniture and fixtures businesses. The study used a mixed-methods approach to properly identify the various stakeholders' considerations in developing the application. Interviews with key informants were conducted to ensure that the features were appropriate for the intended user needs, and selected ISO standards were used as evaluation criteria. The results indicate that the mobile application with marker-less AR technology was found to be highly acceptable by three groups of evaluators (i.e., customers, owners, and IT experts). The study also highlighted the use of AR-related technology in this case, where marker-less AR has the potential to improve the customer purchasing experience even further. Future studies may include using newer technologies to further improve the application. The study suggests that AR technology could be used to connect specific businesses directly to consumers regardless of setting or context.
Submitted 13 August, 2022;
originally announced August 2022.
-
WEMAC: Women and Emotion Multi-modal Affective Computing dataset
Authors:
Jose A. Miranda,
Esther Rituerto-González,
Laura Gutiérrez-Martín,
Clara Luis-Mingueza,
Manuel F. Canabal,
Alberto Ramírez Bárcenas,
Jose M. Lanza-Gutiérrez,
Carmen Peláez-Moreno,
Celia López-Ongil
Abstract:
Among the seventeen Sustainable Development Goals (SDGs) proposed within the 2030 Agenda and adopted by all the United Nations member states, the Fifth SDG is a call for action to turn Gender Equality into a fundamental human right and an essential foundation for a better world. It includes the eradication of all types of violence against women. Within this context, the UC3M4Safety research team aims to develop Bindi, a cyber-physical system that includes embedded Artificial Intelligence algorithms for real-time user monitoring and detection of affective states, with the ultimate goal of achieving the early detection of risk situations for women. On this basis, we make use of wearable affective computing including smart sensors, data encryption for secure and accurate collection of presumed crime evidence, as well as the remote connection to protecting agents. Towards the development of such a system, the recording of different laboratory and in-the-wild datasets is in progress. These are contained within the UC3M4Safety Database. This paper presents and details the first release of WEMAC, a novel multi-modal dataset, which comprises a laboratory-based experiment in which 47 women volunteers were exposed to validated audio-visual stimuli through a virtual reality headset to induce real emotions while physiological signals, speech signals, and self-reports were acquired and collected. We believe this dataset will serve and assist research on multi-modal affective computing using physiological and speech information.
Submitted 16 April, 2024; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Dataset of Philippine Presidents Speeches from 1935 to 2016
Authors:
John Paul P. Miranda
Abstract:
The dataset was collected to examine and identify possible key topics within these texts. Data preparation such as data cleaning, transformation, tokenization, removal of stop words from both English and Filipino, and word stemming was applied to the dataset before feeding it to sentiment analysis and the LDA model. The most frequently occurring word within the dataset is "development", and there are three (3) likely topics in the speeches of Philippine presidents: economic development, enhancement of public services, and addressing challenges. The dataset was able to provide valuable insights contained in official documents. While the study showed that presidents have used their annual address to express their visions for the country, it also showed that the presidents from 1935 to 2016 faced the same problems during their terms. Future researchers may collect other speeches made by presidents during their terms and combine them with the dataset used in this study to further investigate these important texts using the same methodology. The dataset may be requested from the authors and is recommended for further analysis, for example, to determine how the speeches of the presidents reflect the preamble or foundations of the Philippine constitution.
Submitted 12 November, 2021;
originally announced November 2021.
-
Towards the Development of 3D Engine Assembly Simulation Learning Module for Senior High School
Authors:
John Paul P. Miranda,
Jaymark A. Yambao,
Jhon Asley M. Marcelo,
Christopher Robert N. Gonzales,
Vee-jay T. Mungcal
Abstract:
The focus of the study is to develop a 3D engine assembly simulation learning module to address the lack of equipment in one senior high school in the Philippines. The study used a mixed-methods approach to determine the considerations needed in developing an application for educational use, particularly among laboratory/practical subjects like engine assembly. The study used ISO 25010 quality standards in evaluating the application (n = 153 students and 3 ICT experts). Results showed that the application is moderately acceptable (overall mean = 3.52) under ISO 25010 quality standards. The study created an engine assembly simulation learning module that teachers can use to augment their lessons. The study also highlights the applicability of 3D-related technologies for practical and laboratory subjects, particularly highly technical subjects. Future studies may develop a similar application in the same context using mobile and other emerging technologies (i.e., Virtual Reality, Augmented Reality) as well as making the content more customizable. The effectiveness of the system in an actual setting is also worth investigating. The study highlighted the potential use of 3D technology in a classroom setting.
Submitted 19 November, 2020;
originally announced November 2020.
-
Geomancer: An Open-Source Framework for Geospatial Feature Engineering
Authors:
Lester James V. Miranda,
Mark Steve Samson,
Alfiero K. Orden II,
Bianca S. Silmaro,
Ram K. De Guzman III,
Stephanie S. Sy
Abstract:
This paper presents Geomancer, an open-source framework for geospatial feature engineering. It simplifies the acquisition of geospatial attributes for downstream, large-scale machine learning tasks. Geomancer leverages any geospatial dataset stored in a data warehouse: users need only define the features (Spells) they want to create and cast them on any spatial dataset. In addition, these features can be exported into a JSON file (SpellBook) for sharing and reproducibility. Geomancer has been useful in some of our production use cases such as property value estimation, area valuation, and more. It is available on GitHub and can be installed from PyPI.
Submitted 12 October, 2019;
originally announced October 2019.
-
Fractal and Multifractal Properties of Electrographic Recordings of Human Brain Activity: Toward Its Use as a Signal Feature for Machine Learning in Clinical Applications
Authors:
Lucas G. S. França,
José G. V. Miranda,
Marco Leite,
Niraj K. Sharma,
Matthew C. Walker,
Louis Lemieux,
Yujiang Wang
Abstract:
The brain is a system operating on multiple time scales, and characterisation of dynamics across time scales remains a challenge. One framework to study such dynamics is that of fractal geometry. However, currently there exists no established method for the study of brain dynamics using fractal geometry, due to the many challenges in the conceptual and technical understanding of the methods. We aim to highlight some of the practical challenges of applying fractal geometry to brain dynamics and propose solutions to enable its wider use in neuroscience. Using intracranially recorded EEG and simulated data, we compared monofractal and multifractal methods with regards to their sensitivity to signal variance. We found that both correlate closely with signal variance, thus not offering new information about the signal. However, after applying an epoch-wise standardisation procedure to the signal, we found that multifractal measures could offer non-redundant information compared to signal variance, power and other established EEG signal measures. We also compared different multifractal estimation methods and found that the Chhabra-Jensen algorithm performed best. Finally, we investigated the impact of sampling frequency and epoch length on multifractal properties. Using epileptic seizures as an example event in the EEG, we show that there may be an optimal time scale for detecting temporal changes in multifractal properties around seizures. The practical issues we highlighted and our suggested solutions should help in developing a robust method for the application of fractal geometry in EEG signals. Our analyses and observations also aid the theoretical understanding of the multifractal properties of the brain and might provide grounds for new discoveries in the study of brain signals. These could be crucial for understanding of neurological function and for the developments of new treatments.
Submitted 11 December, 2018; v1 submitted 11 June, 2018;
originally announced June 2018.
-
Validity and reliability of free software for bidimensional gait analysis
Authors:
Ana Paula Quixadá,
Andrea Naomi Onodera,
Norberto Peña,
José Garcia Vivas Miranda,
Katia Nunes Sá
Abstract:
Although systems for evaluating human movement have advanced in recent decades, their use is not feasible in clinical practice because of their high cost and the scarcity of trained operators to interpret their results. An ideal videogrammetry system should be easy to use, low cost, quick to perform, and require minimal equipment. CvMob is a free tool for dynamic evaluation of human movements that expresses measurements in figures, tables, and graphics. This paper aims to determine whether CvMob is a reliable tool for the evaluation of two-dimensional human gait. This is a validity and reliability study. The sample was composed of 56 healthy individuals who walked on a 9-meter-long walkway and were simultaneously filmed by CvMob and Vicon system cameras. Linear trajectories and angular measurements were compared to validate the CvMob system, and inter- and intrarater findings of the same measurements were used to determine reliability. A strong correlation (rs mean = 0.988) was found for the linear trajectories between systems and in the inter- and intrarater analyses. According to the Bland-Altman method, the measurements that had good agreement between systems were maximum flexion and extension (stance and swing) of the knee, dorsiflexion range of motion, and stride length. CvMob is a reliable tool for analysis of linear motion and lengths in two-dimensional evaluations of human gait. The angular measurements demonstrate high agreement for the knee joint; however, the hip and ankle measurements were limited by differences between systems.
Submitted 14 February, 2016;
originally announced February 2016.
-
Spread-Spectrum Based on Finite Field Fourier Transforms
Authors:
H. M. de Oliveira,
J. P. C. L. Miranda,
R. M. Campello de Souza
Abstract:
Spread-spectrum systems based on Finite Field Fourier Transforms are presented. Orthogonal spreading sequences defined over a finite field are derived. New digital multiplex schemes based on such spread-spectrum systems are also introduced, which are multilevel Coding Division Multiplex. These schemes, termed Galois-field Division Multiplex (GDM), offer compact bandwidth requirements because only the leaders of cyclotomic cosets need to be transmitted.
Submitted 12 February, 2015;
originally announced March 2015.
-
Uplink Performance Evaluation of Massive MU-MIMO Systems
Authors:
Felipe A. P. de Figueiredo,
Joao Paulo Miranda,
Fabricio L. Figueiredo,
Fabbryccio A. C. M. Cardoso
Abstract:
The present paper deals with an OFDM-based uplink within a multi-user MIMO (MU-MIMO) system where a massive MIMO approach is employed. In this context, the linear detectors Minimum Mean-Squared Error (MMSE), Zero Forcing (ZF) and Maximum Ratio Combining (MRC) are considered and assessed. This paper includes Bit Error Rate (BER) results for uncoded QPSK/OFDM transmissions through a flat Rayleigh fading channel under the assumption of perfect power control and channel estimation. BER results are obtained through Monte Carlo simulations. Performance results are discussed in detail, and we confirm the achievable "massive MIMO" effects, even for a reduced-complexity detection technique, when the number of receive antennas at the BS is much larger than the number of transmit antennas.
Submitted 7 March, 2015;
originally announced March 2015.
-
On Galois-Division Multiple Access Systems: Figures of Merit and Performance Evaluation
Authors:
J. P. C. L. Miranda,
H. M. de Oliveira
Abstract:
A new approach to multiple access based on finite field transforms is investigated. These schemes, termed Galois-Division Multiple Access (GDMA), offer compact bandwidth requirements. A new digital transform, the Finite Field Hartley Transform (FFHT), requires dealing with fields of characteristic p, p \neq 2. A binary-to-p-ary (p \neq 2) mapping based on the opportunistic secondary channel is introduced. This allows the use of GDMA in conjunction with available digital systems. The performance of GDMA is also evaluated.
Submitted 12 February, 2015;
originally announced February 2015.
-
Massive MIMO and Waveform Design for 5th Generation Wireless Communication Systems
Authors:
Arman Farhang,
Nicola Marchetti,
Fabricio Figueiredo,
Joao Paulo Miranda
Abstract:
This article reviews existing related work and identifies the main challenges in the key 5G area at the intersection of waveform design and large-scale multiple antenna systems, also known as Massive MIMO. The property of self-equalization is introduced for Filter Bank Multicarrier (FBMC)-based Massive MIMO, which can reduce the number of subcarriers required by the system. It is also shown that the blind channel tracking property of FBMC can be used to address pilot contamination, one of the main limiting factors of Massive MIMO systems. Our findings shed light on and motivate an entirely new line of research towards a better understanding of waveform design, with emphasis on FBMC-based Massive MIMO networks.
Submitted 23 September, 2016; v1 submitted 1 January, 2015;
originally announced January 2015.
-
Free Instrument for Movement Measure
Authors:
Norberto Peña,
Bruno Cecílio Credidio,
Lorena Peixoto Nogueira Rodriguez Martinez Salles Corrêa,
Lucas Gabriel Souza França,
Marcelo do Vale Cunha,
Marcos Cavalcanti de Sousa,
João Paulo Bomfim Cruz Vieira,
José Garcia Vivas Miranda
Abstract:
This paper presents the validation of a computational tool that serves to obtain continuous measurements of moving objects. The software uses techniques of computer vision, pattern recognition and optical flow to enable tracking of objects in videos, generating trajectory, velocity, acceleration and angular movement data. The program was applied to track a ball around a simple pendulum. The validation methodology compares the values measured by the program with the theoretical values expected according to the model of a simple pendulum. The experiment is appropriate to the method because it was built within the limits of the linear harmonic oscillator, and energy losses due to friction were minimized, making it as close to ideal as possible. The results indicate that the tool is sensitive and accurate. Deviations of less than a millimeter along the extent of the trajectory support the applicability of the software in physics, whether in research or in teaching.
Submitted 29 June, 2013;
originally announced July 2013.
-
Analysis of communities in a mythological social network
Authors:
Pedro J. Miranda,
Murilo S. Baptista,
Sandro E. de S. Pinto
Abstract:
The intriguing nature of classical Homeric narratives has always fascinated occidental culture, contributing to philosophy, history, mythology and, straightforwardly, to literature. However, what is so intriguing about Homer's narratives? At first glance, we recognize the literary appeal and aesthetic pleasure presented on every page of Homer's chants in the Odyssey and rhapsodies in the Iliad. Secondly, we may perceive a biased aspect of the content of its stories, varying from real-historical to fictional-mythological. In line with this view, some new archaeological findings support the historicity of some events described in the Iliad, and consequently in the Odyssey. Considering these observations and using concepts from complex network theory, we built and analyzed a social network gathered from the classical epic, Homer's Odyssey. For further understanding, topological quantities were collected in order to classify its social network qualitatively as real or fictional. It turns out that most of the properties found belong to real social networks, except for assortativity and the size of the giant component. In order to test whether the network could be real, we removed some mythological members that could imprint a fictional aspect on the network. After this removal, the modified social network exhibited assortative mixing and a reduced giant component, as expected for real social networks. Overall, we observe that the Odyssey might be an amalgam of fictional elements and real human relations, which corroborates other authors' findings for the Iliad as well as archaeological evidence.
Submitted 19 June, 2013; v1 submitted 11 June, 2013;
originally announced June 2013.