Showing 1–38 of 38 results for author: Soudris, D

  1. MAx-DNN: Multi-Level Arithmetic Approximation for Energy-Efficient DNN Hardware Accelerators

    Authors: Vasileios Leon, Georgios Makris, Sotirios Xydis, Kiamal Pekmestzi, Dimitrios Soudris

    Abstract: Nowadays, the rapid growth of Deep Neural Network (DNN) architectures has established them as the de facto approach for providing advanced Machine Learning tasks with excellent accuracy. Targeting low-power DNN computing, this paper examines the interplay of fine-grained error resilience of DNN workloads in collaboration with hardware approximation techniques, to achieve higher levels of energy eff…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Presented at the 13th IEEE LASCAS Conference

    Journal ref: 13th IEEE Latin American Symposium on Circuits and Systems (LASCAS), 2022

  2. arXiv:2506.21073 [pdf, ps, other]

    cs.AR

    Post-Quantum and Blockchain-Based Attestation for Trusted FPGAs in B5G Networks

    Authors: Ilias Papalamprou, Nikolaos Fotos, Nikolaos Chatzivasileiadis, Anna Angelogianni, Dimosthenis Masouros, Dimitrios Soudris

    Abstract: The advent of 5G and beyond has brought increased performance networks, facilitating the deployment of services closer to the user. To meet performance requirements, such services require specialized hardware, such as Field Programmable Gate Arrays (FPGAs). However, FPGAs are often deployed in unprotected environments, leaving the user's applications vulnerable to multiple attacks. With the rise of…

    Submitted 26 June, 2025; originally announced June 2025.

  3. Combining Fault Tolerance Techniques and COTS SoC Accelerators for Payload Processing in Space

    Authors: Vasileios Leon, Elissaios Alexios Papatheofanous, George Lentaris, Charalampos Bezaitis, Nikolaos Mastorakis, Georgios Bampilis, Dionysios Reisis, Dimitrios Soudris

    Abstract: The ever-increasing demand for computational power and I/O throughput in space applications is transforming the landscape of on-board computing. A variety of Commercial-Off-The-Shelf (COTS) accelerators emerges as an attractive solution for payload processing to outperform the traditional radiation-hardened devices. Towards increasing the reliability of such COTS accelerators, the current paper ex…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: Presented at the 30th IFIP/IEEE VLSI-SoC Conference

    Journal ref: 30th IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), 2022

  4. Towards Employing FPGA and ASIP Acceleration to Enable Onboard AI/ML in Space Applications

    Authors: Vasileios Leon, George Lentaris, Dimitrios Soudris, Simon Vellas, Mathieu Bernou

    Abstract: The success of AI/ML in terrestrial applications and the commercialization of space are now paving the way for the advent of AI/ML in satellites. However, the limited processing power of classical onboard processors drives the community towards extending the use of FPGAs in space with both rad-hard and Commercial-Off-The-Shelf devices. The increased performance of FPGAs can be complemented with VP…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: Presented at the 30th IFIP/IEEE VLSI-SoC Conference

    Journal ref: 30th IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), 2022

  5. FPGA & VPU Co-Processing in Space Applications: Development and Testing with DSP/AI Benchmarks

    Authors: Vasileios Leon, Charalampos Bezaitis, George Lentaris, Dimitrios Soudris, Dionysios Reisis, Elissaios-Alexios Papatheofanous, Angelos Kyriakos, Aubrey Dunne, Arne Samuelsson, David Steenari

    Abstract: The advent of computationally demanding algorithms and high data rate instruments in new space applications pushes the space industry to explore disruptive solutions for on-board data processing. We examine heterogeneous computing architectures involving high-performance and low-power commercial SoCs. The current paper implements an FPGA with VPU co-processing architecture utilizing the CIF & LCD…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: Presented at the 28th IEEE ICECS Conference

    Journal ref: 28th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2021

  6. arXiv:2505.23553 [pdf, ps, other]

    cs.AR

    A Unified Framework for Mapping and Synthesis of Approximate R-Blocks CGRAs

    Authors: Georgios Alexandris, Panagiotis Chaidos, Alexis Maras, Barry de Bruin, Manil Dev Gomony, Henk Corporaal, Dimitrios Soudris, Sotirios Xydis

    Abstract: The ever-increasing complexity and operational diversity of modern Neural Networks (NNs) have caused the need for low-power and, at the same time, high-performance edge devices for AI applications. Coarse Grained Reconfigurable Architectures (CGRAs) form a promising design paradigm to address these challenges, delivering a close-to-ASIC performance while allowing for hardware programmability. In t…

    Submitted 29 May, 2025; originally announced May 2025.

  7. arXiv:2504.04874 [pdf, other]

    cs.OS cs.AI cs.PL

    Futureproof Static Memory Planning

    Authors: Christos Lamprakos, Panagiotis Xanthopoulos, Manolis Katsaragakis, Sotirios Xydis, Dimitrios Soudris, Francky Catthoor

    Abstract: The NP-complete combinatorial optimization task of assigning offsets to a set of buffers with known sizes and lifetimes so as to minimize total memory usage is called dynamic storage allocation (DSA). Existing DSA implementations bypass the theoretical state-of-the-art algorithms in favor of either fast but wasteful heuristics, or memory-efficient approaches that do not scale beyond one thousand b…

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: Submitted to ACM TOPLAS

  8. arXiv:2503.21671 [pdf, other]

    cs.AR

    A Bespoke Design Approach to Low-Power Printed Microprocessors for Machine Learning Applications

    Authors: Panagiotis Chaidos, Giorgos Armeniakos, Sotirios Xydis, Dimitrios Soudris

    Abstract: Printed electronics have gained significant traction in recent years, presenting a viable path to integrating computing into everyday items, from disposable products to low-cost healthcare. However, the adoption of computing in these domains is hindered by strict area and power constraints, limiting the effectiveness of general-purpose microprocessors. This paper proposes a bespoke microprocessor…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: Accepted for publication at the IEEE International Symposium on Circuits and Systems (ISCAS '25), May 25-28, London, United Kingdom

  9. arXiv:2409.16815 [pdf, other]

    cs.LG

    Accelerating TinyML Inference on Microcontrollers through Approximate Kernels

    Authors: Giorgos Armeniakos, Georgios Mentzos, Dimitrios Soudris

    Abstract: The rapid growth of microcontroller-based IoT devices has opened up numerous applications, from smart manufacturing to personalized healthcare. Despite the widespread adoption of energy-efficient microcontroller units (MCUs) in the Tiny Machine Learning (TinyML) domain, they still face significant limitations in terms of performance and memory (RAM, Flash). In this work, we combine approximate com…

    Submitted 25 September, 2024; originally announced September 2024.

  10. Accelerating AI and Computer Vision for Satellite Pose Estimation on the Intel Myriad X Embedded SoC

    Authors: Vasileios Leon, Panagiotis Minaidis, George Lentaris, Dimitrios Soudris

    Abstract: The challenging deployment of Artificial Intelligence (AI) and Computer Vision (CV) algorithms at the edge pushes the community of embedded computing to examine heterogeneous System-on-Chips (SoCs). Such novel computing platforms provide increased diversity in interfaces, processors and storage; however, the efficient partitioning and mapping of AI/CV workloads still remains an open issue. In this…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: Accepted for publication at Elsevier Microprocessors and Microsystems

    Journal ref: Elsevier Microprocessors and Microsystems, Vol. 103, Nov. 2023

  11. MPAI: A Co-Processing Architecture with MPSoC & AI Accelerators for Vision Applications in Space

    Authors: Vasileios Leon, Panagiotis Minaidis, Dimitrios Soudris, George Lentaris

    Abstract: The emerging need for fast and power-efficient AI/ML deployment on-board spacecraft has forced the space industry to examine specialized accelerators, which have been successfully used in terrestrial applications. Towards this direction, the current work introduces a very heterogeneous co-processing architecture that is built around UltraScale+ MPSoC and its programmable DPU, as well as commercial…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: Accepted for publication at the 31st IEEE ICECS Conference, 18-20 Nov, 2024, Nancy, France

    Journal ref: 31st IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2024

  12. Development of High-Performance DSP Algorithms on the European Rad-Hard NG-ULTRA SoC FPGA

    Authors: Vasileios Leon, Anastasios Xynos, Dimitrios Soudris, George Lentaris, Ruben Domingo, Arturo Perez, David Gonzalez-Arjona, Isabelle Conway, David Merodio Codinachs

    Abstract: The emergence of demanding space applications has modified the traditional landscape of computing systems in space. When reliability is a first-class concern, in addition to enhanced performance-per-Watt, radiation-hardened FPGAs are favored. In this context, the current paper evaluates the first European radiation-hardened SoC FPGA, i.e., NanoXplore's NG-ULTRA, for accelerating high-performance D…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: Accepted for publication at the 31st IEEE ICECS Conference, 18-20 Nov, 2024, Nancy, France

    Journal ref: 31st IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2024

  13. arXiv:2408.05235 [pdf, other]

    cs.DC cs.AI cs.AR cs.LG

    SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving

    Authors: Andreas Kosmas Kakolyris, Dimosthenis Masouros, Petros Vavaroutsos, Sotirios Xydis, Dimitrios Soudris

    Abstract: As Large Language Models (LLMs) gain traction, their reliance on power-hungry GPUs places ever-increasing energy demands, raising environmental and monetary concerns. Inference dominates LLM workloads, presenting a critical challenge for providers: minimizing energy costs under Service-Level Objectives (SLOs) that ensure optimal user experience. In this paper, we present throttLL'eM, a fr…

    Submitted 5 August, 2024; originally announced August 2024.

  14. arXiv:2407.18386 [pdf, other]

    cs.DC

    Leveraging Core and Uncore Frequency Scaling for Power-Efficient Serverless Workflows

    Authors: Achilleas Tzenetopoulos, Dimosthenis Masouros, Sotirios Xydis, Dimitrios Soudris

    Abstract: Serverless workflows have emerged in Function-as-a-Service (FaaS) platforms to represent the operational structure of traditional applications. With latency propagation effects becoming increasingly prominent, step-wise resource tuning is required to address Service-Level-Objectives (SLOs). Modern processors' allowance for fine-grained Dynamic Voltage and Frequency Scaling (DVFS), coupled with ser…

    Submitted 21 April, 2025; v1 submitted 25 July, 2024; originally announced July 2024.

  15. Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations

    Authors: Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris

    Abstract: Recent advancements in quantization and mixed-precision approaches offer substantial opportunities to improve the speed and energy efficiency of Neural Networks (NN). Research has shown that individual parameters with varying low precision can attain accuracies comparable to full-precision counterparts. However, modern embedded microprocessors provide very limited support for mixed-precision NNs…

    Submitted 13 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted for publication at the 43rd International Conference on Computer-Aided Design (ICCAD '24), Oct 27-31 2024, New Jersey, USA

  16. arXiv:2407.03711 [pdf, other]

    cs.AR

    Decoupled Access-Execute enabled DVFS for tinyML deployments on STM32 microcontrollers

    Authors: Elisavet Lydia Alvanaki, Manolis Katsaragakis, Dimosthenis Masouros, Sotirios Xydis, Dimitrios Soudris

    Abstract: Over the last years, the number of Machine Learning (ML) inference applications deployed on the Edge has been rapidly increasing. Recent Internet of Things (IoT) devices and microcontrollers (MCUs) become more and more mainstream in everyday activities. In this work we focus on the family of STM32 MCUs. We propose a novel methodology for CNN deployment on the STM32 family, focusing on power optimizat…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 6 pages, 6 figures, 1 listing, presented in IEEE DATE 2024

    Journal ref: 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) (pp. 1-6). IEEE

  17. arXiv:2405.16953 [pdf, other]

    cs.CV cs.DC cs.PF

    Evaluation of Resource-Efficient Crater Detectors on Embedded Systems

    Authors: Simon Vellas, Bill Psomas, Kalliopi Karadima, Dimitrios Danopoulos, Alexandros Paterakis, George Lentaris, Dimitrios Soudris, Konstantinos Karantzalos

    Abstract: Real-time analysis of Martian craters is crucial for mission-critical operations, including safe landings and geological exploration. This work leverages the latest breakthroughs for on-the-edge crater detection aboard spacecraft. We rigorously benchmark several YOLO networks using a Mars craters dataset, analyzing their performance on embedded systems with a focus on optimization for low-power de…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted at 2024 IEEE International Geoscience and Remote Sensing Symposium

  18. TF2AIF: Facilitating development and deployment of accelerated AI models on the cloud-edge continuum

    Authors: Aimilios Leftheriotis, Achilleas Tzenetopoulos, George Lentaris, Dimitrios Soudris, Georgios Theodoridis

    Abstract: The B5G/6G evolution relies on connect-compute technologies and highly heterogeneous clusters with HW accelerators, which require specialized coding to be efficiently utilized. The current paper proposes a custom tool for generating multiple SW versions of a certain AI function input in high-level language, e.g., Python TensorFlow, while targeting multiple diverse HW+SW platforms. TF2AIF builds up…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: to be published in EUCNC & 6G Summit 2024

  19. TransAxx: Efficient Transformers with Approximate Computing

    Authors: Dimitrios Danopoulos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel

    Abstract: Vision Transformer (ViT) models, which were recently introduced by the transformer architecture, have been shown to be very competitive and often become a popular alternative to Convolutional Neural Networks (CNNs). However, the high computational requirements of these models limit their practical applicability especially on low-power devices. Current state-of-the-art employs approximate multipliers to a…

    Submitted 7 May, 2025; v1 submitted 12 February, 2024; originally announced February 2024.

  20. On-sensor Printed Machine Learning Classification via Bespoke ADC and Decision Tree Co-Design

    Authors: Giorgos Armeniakos, Paula L. Duarte, Priyanjana Pal, Georgios Zervakis, Mehdi B. Tahoori, Dimitrios Soudris

    Abstract: Printed electronics (PE) technology provides cost-effective hardware with unmet customization, due to their low non-recurring engineering and fabrication costs. PE exhibit features such as flexibility, stretchability, porosity, and conformality, which make them a prominent candidate for enabling ubiquitous computing. Still, the large feature sizes in PE limit the realization of complex printed cir…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at the 27th Design, Automation and Test in Europe Conference (DATE'24), Mar 25-27 2024, Valencia, Spain

  21. arXiv:2307.11128 [pdf, other]

    cs.AR cs.AI cs.ET cs.PL

    Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications

    Authors: Vasileios Leon, Muhammad Abdullah Hanif, Giorgos Armeniakos, Xun Jiao, Muhammad Shafique, Kiamal Pekmestzi, Dimitrios Soudris

    Abstract: The challenging deployment of compute-intensive applications from domains such as Artificial Intelligence (AI) and Digital Signal Processing (DSP), forces the community of computing systems to explore new design approaches. Approximate Computing appears as an emerging solution, allowing to tune the quality of results in the design of a system in order to improve the energy efficiency and/or perfor…

    Submitted 19 March, 2025; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Published in ACM Computing Surveys (Volume 57, Issue 7, 2025)

    Journal ref: ACM Computing Surveys, Volume 57, Issue 7, Article 177, 2025

  22. arXiv:2307.11124 [pdf, other]

    cs.AR cs.ET cs.PL

    Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques

    Authors: Vasileios Leon, Muhammad Abdullah Hanif, Giorgos Armeniakos, Xun Jiao, Muhammad Shafique, Kiamal Pekmestzi, Dimitrios Soudris

    Abstract: The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus, typical computing paradigms in embedded systems and data centers are stressed to meet the worldwide demand for high performance. Concurrently, over the last 15 year…

    Submitted 19 March, 2025; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Published in ACM Computing Surveys (Volume 57, Issue 7, 2025)

    Journal ref: ACM Computing Surveys, Volume 57, Issue 7, Article 185, 2025

  23. The Unexpected Efficiency of Bin Packing Algorithms for Dynamic Storage Allocation in the Wild: An Intellectual Abstract

    Authors: Christos P. Lamprakos, Sotirios Xydis, Francky Catthoor, Dimitrios Soudris

    Abstract: Recent work has shown that viewing allocators as black-box 2DBP solvers bears meaning. For instance, there exists a 2DBP-based fragmentation metric which often correlates monotonically with maximum resident set size (RSS). Given the field's indeterminacy with respect to fragmentation definitions, as well as the immense value of physical memory savings, we are motivated to set allocator-generated p…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 13 pages, 10 figures, 3 tables. To appear in ISMM '23

  24. arXiv:2304.10862 [pdf, other]

    cs.PL

    Viewing Allocators as Bin Packing Solvers Demystifies Fragmentation

    Authors: Christos P. Lamprakos, Sotirios Xydis, Francky Catthoor, Dimitrios Soudris

    Abstract: This paper presents a trace-based simulation methodology for constructing representations of workload-allocator interaction. We use two-dimensional rectangular bin packing (2DBP) as our foundation. Classical 2DBP algorithms minimize their products' makespan, but virtual memory systems employing demand paging deem such a criterion inappropriate. We view an allocator's placement decisions as a solut…

    Submitted 24 April, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: 13 pages, 10 figures, 5 tables. Edit: removed "regular submission" subtitle, cleaned page headers

  25. arXiv:2304.00953 [pdf, other]

    cs.DB

    Energy Consumption Evaluation of Optane DC Persistent Memory for Indexing Data Structures

    Authors: Manolis Katsaragakis, Christos Baloukas, Lazaros Papadopoulos, Verena Kantere, Francky Catthoor, Dimitrios Soudris

    Abstract: The Intel Optane DC Persistent Memory (DCPM) is an attractive novel technology for building storage systems for data intensive HPC applications, as it provides lower cost per byte, low standby power and larger capacities than DRAM, with comparable latency. This work provides an in-depth evaluation of the energy consumption of the Optane DCPM, using well-established indexes specifically designed to…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 10 pages. Accepted and presented at the IEEE International Conference on High Performance Computing (HiPC) 2022, Bengaluru, India

  26. Model-to-Circuit Cross-Approximation For Printed Machine Learning Classifiers

    Authors: Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Mehdi B. Tahoori, Jörg Henkel

    Abstract: Printed electronics (PE) promises on-demand fabrication, low non-recurring engineering costs, and sub-cent fabrication costs. It also allows for high customization that would be infeasible in silicon, and bespoke architectures prevail to improve the efficiency of emerging PE machine learning (ML) applications. Nevertheless, large feature sizes in PE prohibit the realization of complex ML models in…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted for publication by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, March 2023. arXiv admin note: text overlap with arXiv:2203.05915

  27. Co-Design of Approximate Multilayer Perceptron for Ultra-Resource Constrained Printed Circuits

    Authors: Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Mehdi B. Tahoori, Jörg Henkel

    Abstract: Printed Electronics (PE) exhibits on-demand, extremely low-cost hardware due to its additive manufacturing process, enabling machine learning (ML) applications for domains that feature ultra-low cost, conformity, and non-toxicity requirements that silicon-based systems cannot deliver. Nevertheless, large feature sizes in PE prohibit the realization of complex printed ML circuits. In this work, we…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted for publication by IEEE Transactions on Computers, February 2023

  28. arXiv:2212.00873 [pdf, other]

    cs.AR

    CONVOLVE: Smart and seamless design of smart edge processors

    Authors: M. Gomony, F. Putter, A. Gebregiorgis, G. Paulin, L. Mei, V. Jain, S. Hamdioui, V. Sanchez, T. Grosser, M. Geilen, M. Verhelst, F. Zenke, F. Gurkaynak, B. Bruin, S. Stuijk, S. Davidson, S. De, M. Ghogho, A. Jimborean, S. Eissa, L. Benini, D. Soudris, R. Bishnoi, S. Ainsworth, F. Corradi, et al. (3 additional authors not shown)

    Abstract: With the rise of Deep Learning (DL), our world braces for AI in every edge device, creating an urgent need for edge-AI SoCs. This SoC hardware needs to support high throughput, reliable and secure AI processing at Ultra Low Power (ULP), with a very short time to market. With its strong legacy in edge solutions and open processing platforms, the EU is well-positioned to become a leader in this SoC…

    Submitted 2 May, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

  29. Towards making the most of NLP-based device mapping optimization for OpenCL kernels

    Authors: Petros Vavaroutsos, Ioannis Oroutzoglou, Dimosthenis Masouros, Dimitrios Soudris

    Abstract: Nowadays, we are living in an era of extreme device heterogeneity. Despite the high variety of conventional CPU architectures, accelerator devices, such as GPUs and FPGAs, also appear in the foreground exploding the pool of available solutions to execute applications. However, choosing the appropriate device per application needs is an extremely challenging task due to the abstract relationship be…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: Accepted at IEEE COINS 2022

    Journal ref: 2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), 2022, pp. 1-6

  30. arXiv:2203.08737 [pdf, other]

    cs.AR cs.LG

    Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey

    Authors: Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel

    Abstract: Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have brought beyond human accuracy in many tasks, but at the cost of high computational complexity. To enable efficient execution of DNN inference, more and more research works, therefore, exploit the inherent error resilience of DNNs and e…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted by ACM Computing Surveys (CSUR), 2022

    Journal ref: ACM Computing Surveys 2022

  31. Cross-Layer Approximation For Printed Machine Learning Circuits

    Authors: Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Mehdi B. Tahoori, Jörg Henkel

    Abstract: Printed electronics (PE) feature low non-recurring engineering costs and low per unit-area fabrication costs, enabling thus extremely low-cost and on-demand hardware. Such low-cost fabrication allows for high customization that would be infeasible in silicon, and bespoke architectures prevail to improve the efficiency of emerging PE machine learning (ML) applications. However, even with bespoke ar…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at the 25th Design, Automation and Test in Europe Conference (DATE'22), Mar 14-23 2022, Antwerp, Belgium

  32. AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch

    Authors: Dimitrios Danopoulos, Georgios Zervakis, Kostas Siozios, Dimitrios Soudris, Jörg Henkel

    Abstract: Current state-of-the-art employs approximate multipliers to address the highly increased power demands of DNN accelerators. However, evaluating the accuracy of approximate DNNs is cumbersome due to the lack of adequate support for approximate arithmetic in DNN frameworks. We address this inefficiency by presenting AdaPT, a fast emulation framework that extends PyTorch to support approximate infere…

    Submitted 11 October, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted for publication in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

  33. EDEN: A high-performance, general-purpose, NeuroML-based neural simulator

    Authors: Sotirios Panagiotou, Harry Sidiropoulos, Mario Negrello, Dimitrios Soudris, Christos Strydis

    Abstract: Modern neuroscience employs in silico experimentation on ever-increasing and more detailed neural networks. The high modelling detail goes hand in hand with the need for high model reproducibility, reusability and transparency. Besides, the size of the models and the long timescales under study mandate the use of a simulation system with high computational performance, so as to provide an acceptab…

    Submitted 12 June, 2021; originally announced June 2021.

    Comments: 29 pages, 9 figures

    Journal ref: Front. Neuroinform. 16 (2022)

  34. arXiv:2004.13873 [pdf, other]

    eess.SY

    Automated Physics-Derived Code Generation for Sensor Fusion and State Estimation

    Authors: Orestis Kaparounakis, Vasileios Tsoutsouras, Dimitrios Soudris, Phillip Stanley-Marbell

    Abstract: We present a new method for automatically generating the implementation of state-estimation algorithms from a machine-readable specification of the physics of a sensing system and physics of its signals and signal constraints. We implement the new state-estimator code generation method as a backend for a physics specification language and we apply the backend to generate complete C code implementa…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: 11 pages, 7 figures

  35. arXiv:1612.01501 [pdf, other]

    cs.NE cs.DC

    BrainFrame: A node-level heterogeneous accelerator platform for neuron simulations

    Authors: Georgios Smaragdos, Georgios Chatzikonstantis, Rahul Kukreja, Harry Sidiropoulos, Dimitrios Rodopoulos, Ioannis Sourdis, Zaid Al-Ars, Christoforos Kachris, Dimitrios Soudris, Chris I. De Zeeuw, Christos Strydis

    Abstract: Objective: The advent of High-Performance Computing (HPC) in recent years has led to its increasing use in brain study through computational models. The scale and complexity of such models are constantly increasing, leading to challenging computational requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit for a…

    Submitted 15 August, 2017; v1 submitted 5 December, 2016; originally announced December 2016.

    Comments: 16 pages, 18 figures, 5 tables

  36. arXiv:1406.0309 [pdf]

    cs.NI cs.AR

    Network Function Virtualization based on FPGAs: A Framework for all-Programmable network devices

    Authors: Christoforos Kachris, Georgios Sirakoulis, Dimitrios Soudris

    Abstract: Network Function Virtualization (NFV) refers to the use of commodity hardware resources as the basic platform to perform specialized network functions as opposed to specialized hardware devices. Currently, NFV is mainly implemented based on general purpose processors, or general purpose network processors. In this paper we propose the use of FPGAs as an ideal platform for NFV that can be used to p…

    Submitted 2 June, 2014; originally announced June 2014.

    Comments: Network function virtualizations, FPGA, dynamic reconfiguration

  37. arXiv:0710.4844 [pdf]

    cs.AR

    A Partitioning Methodology for Accelerating Applications in Hybrid Reconfigurable Platforms

    Authors: M. D. Galanis, A. Milidonis, G. Theodoridis, D. Soudris, C. E. Goutis

    Abstract: In this paper, we propose a methodology for partitioning and mapping computationally intensive applications in reconfigurable hardware blocks of different granularity. A generic hybrid reconfigurable architecture is considered so that the methodology is applicable to a large number of heterogeneous reconfigurable platforms. The methodology mainly consists of two stages, the analysis and the mapp…

    Submitted 25 October, 2007; originally announced October 2007.

    Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

    Journal ref: In Design, Automation and Test in Europe | Designers' Forum - DATE'05, Munich, Germany (2005)

  38. arXiv:0710.4656 [pdf]

    cs.AR

    A Memory Hierarchical Layer Assigning and Prefetching Technique to Overcome the Memory Performance/Energy Bottleneck

    Authors: Minas Dasygenis, Erik Brockmeyer, Bart Durinck, Francky Catthoor, Dimitrios Soudris, Antonios Thanailakis

    Abstract: The memory subsystem has always been a bottleneck in performance as well as a significant power contributor in memory intensive applications. Many researchers have presented multi-layered memory hierarchies as a means to design energy and performance efficient systems. However, most of the previous work does not explore trade-offs systematically. We fill this gap by proposing a formalized technique…

    Submitted 25 October, 2007; originally announced October 2007.

    Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

    Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)