Cartesian Forest Matching

Authors: Bastien Auvray, Julien David, Richard Groult, Thierry Lecroq

Abstract: In this paper, we introduce the notion of Cartesian Forest, which generalizes Cartesian Trees, in order to deal with partially ordered sequences. We show that algorithms that solve both exact and approximate Cartesian Tree Matching can be adapted to solve Cartesian Forest Matching in average linear time. We adapt the notion of Cartesian Tree Signature to Cartesian Forests and show how filters can be used to experimentally improve the algorithm for the exact matching. We also show a one-to-one correspondence between Cartesian Forests and Schröder Trees.

Submitted 3 June, 2025; originally announced June 2025.

Comments: Submitted to SPIRE 2025

arXiv:2505.09236 [pdf, ps, other]

Approximate Cartesian Tree Matching with One Difference

Authors: Bastien Auvray, Julien David, Samah Ghazawi, Richard Groult, Gad M. Landau, Thierry Lecroq

Abstract: Cartesian tree pattern matching consists of finding all the factors of a text that have the same Cartesian tree as a given pattern. There already exist theoretical and practical solutions for the exact case. In this paper, we propose the first algorithms for solving approximate Cartesian tree pattern matching with one difference, given a pattern of length m and a text of length n. We present a generic algorithm that finds all the factors of the text that have the same Cartesian tree as the pattern with one difference, using different notions of differences. We show that this algorithm has an O(nM) worst-case complexity and that, for several random models, the algorithm has a linear average-case complexity. We also present an automaton-based algorithm, adapting [PALP19], that can be generalized to deal with more than one difference.

Submitted 14 May, 2025; originally announced May 2025.

Comments: Submitted to Elsevier's Theoretical Computer Science (May 2025). arXiv admin note: text overlap with arXiv:2306.16065

arXiv:2408.15582 [pdf, other]

Spectral Masking with Explicit Time-Context Windowing for Neural Network-Based Monaural Speech Enhancement

Authors: Luan Vinícius Fiorio, Boris Karanov, Bruno Defraene, Johan David, Wim van Houtum, Frans Widdershoven, Ronald M. Aarts

Abstract: We propose and analyze the use of an explicit time-context window for neural network-based spectral masking speech enhancement to leverage signal context dependencies between neighboring frames. In particular, we concentrate on soft masking and loss computed on the time-frequency representation of the reconstructed speech. We show that the application of a time-context windowing function at both input and output of the neural network model improves the soft mask estimation process by combining multiple estimates taken from different contexts. The proposed approach is only applied as post-optimization in inference mode, not requiring additional layers or special training for the neural network model. Our results show that the method consistently increases both intelligibility and signal quality of the denoised speech, as demonstrated for two classes of convolutional-based speech enhancement models. Importantly, the proposed method requires only a negligible ($\leq1\%$) increase in the number of model parameters, making it suitable for hardware-constrained applications.

Submitted 28 August, 2024; originally announced August 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2404.11769 [pdf, other]

QGen: On the Ability to Generalize in Quantization Aware Training

Authors: MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David

Abstract: Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications for model performance. First, we develop a theoretical model for quantization in neural networks and demonstrate how quantization functions as a form of regularization. Second, motivated by recent work connecting the sharpness of the loss landscape and generalization, we derive an approximate bound for the generalization of quantized models conditioned on the amount of quantization noise. We then validate our hypothesis by experimenting with over 2000 models trained on the CIFAR-10, CIFAR-100, and ImageNet datasets, using convolutional and transformer-based models.

Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2309.05142 [pdf, other]

Large Language Models for Difficulty Estimation of Foreign Language Content with Application to Language Learning

Authors: Michalis Vlachos, Mircea Lungu, Yash Raj Shrestha, Johannes-Rudolf David

Abstract: We use large language models to help learners enhance their proficiency in a foreign language. This is accomplished by identifying content on topics that the user is interested in and that closely aligns with the learner's proficiency level in that foreign language. Our work centers on French content, but our approach is readily transferable to other languages. Our solution offers several distinctive characteristics that differentiate it from existing language-learning solutions, such as a) the discovery of content across topics that the learner cares about, thus increasing motivation, b) a more precise estimation of the linguistic difficulty of the content than traditional readability measures, and c) the availability of both textual and video-based content. The linguistic complexity of video content is derived from the video captions. It is our aspiration that such technology will enable learners to remain engaged in the language-learning process by continuously adapting the topics and the difficulty of the content to align with the learners' evolving interests and learning objectives.

Submitted 10 September, 2023; originally announced September 2023.

arXiv:2306.16065 [pdf, ps, other]

Approximate Cartesian Tree Matching: an Approach Using Swaps

Authors: Bastien Auvray, Julien David, Richard Groult, Thierry Lecroq

Abstract: Cartesian tree pattern matching consists of finding all the factors of a text that have the same Cartesian tree as a given pattern. There already exist theoretical and practical solutions for the exact case. In this paper, we propose the first algorithm for solving approximate Cartesian tree pattern matching. We consider Cartesian tree pattern matching with one swap: given a pattern of length m and a text of length n, we present two algorithms that find all the factors of the text that have the same Cartesian tree as the pattern after one transposition of two adjacent symbols. The first algorithm uses a characterization of a linear representation of the Cartesian trees called parent-distance after one swap and runs in time Theta(mn) using Theta(m) space. The second algorithm generates all the parent-distance tables of sequences that have the same Cartesian tree as the pattern after one swap. It runs in time O((m^2 + n)log m) and has O(m^2) space complexity.

Submitted 18 October, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: Submitted to SPIRE 2023
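
Illustrative note (not part of the listing): the parent-distance representation mentioned in the abstract above can be sketched as follows. This is a minimal sketch of exact matching only, not either of the paper's one-swap algorithms. It assumes the common convention that the parent distance of position i is the distance to the nearest position on its left holding a value less than or equal to the current one, and 0 if there is none; the function names `parent_distance` and `naive_ct_match` are hypothetical.

```python
def parent_distance(seq):
    """Return the parent-distance table of seq.

    Assumed convention: pd[i] = i - j, where j is the nearest index to the
    left of i with seq[j] <= seq[i], and pd[i] = 0 when no such j exists.
    """
    pd = []
    stack = []  # indices whose values are non-decreasing from bottom to top
    for i, value in enumerate(seq):
        # Discard indices holding strictly larger values; they cannot be
        # the answer for position i or for any later position.
        while stack and seq[stack[-1]] > value:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd


def naive_ct_match(text, pattern):
    """Return start positions of text windows whose parent-distance table
    equals that of the pattern (quadratic-time baseline)."""
    m = len(pattern)
    target = parent_distance(pattern)
    return [
        start
        for start in range(len(text) - m + 1)
        if parent_distance(text[start:start + m]) == target
    ]


if __name__ == "__main__":
    print(parent_distance([3, 1, 4, 1, 5]))
    print(naive_ct_match([5, 4, 6, 3, 7, 1], [2, 1, 3]))
```

Two sequences with equal parent-distance tables are treated here as having the same Cartesian tree, so the baseline recomputes the table for every window, which costs time proportional to mn overall.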

arXiv:2306.09905 [pdf, other]

Sparq: A Custom RISC-V Vector Processor for Efficient Sub-Byte Quantized Inference

Authors: Théo Dupuis, Yoan Fournier, MohammadHossein AskariHemmat, Nizar El Zarif, François Leduc-Primeau, Jean Pierre David, Yvon Savaria

Abstract: Convolutional Neural Networks (CNNs) are used in a wide range of applications, with full-precision CNNs achieving high accuracy at the expense of portability. Recent progress in quantization techniques has demonstrated that sub-byte Quantized Neural Networks (QNNs) achieve comparable or superior accuracy while significantly reducing the computational cost and memory footprint. However, sub-byte computation on commodity hardware is sub-optimal due to the lack of support for such precision. In this paper, we introduce Sparq, a Sub-byte vector Processor designed for the AcceleRation of QNN inference. This processor is based on a modified version of Ara, an open-source 64-bit RISC-V "V" compliant processor. Sparq is implemented in GlobalFoundries 22FDX FD-SOI technology and extends the Instruction Set Architecture (ISA) by adding a new multiply-shift-accumulate instruction to improve sub-byte computation efficiency. The floating-point unit is also removed to minimize area and power usage. To demonstrate Sparq performance, we implement an ultra-low-precision (1-bit to 4-bit) vectorized conv2d operation taking advantage of the dedicated hardware. We show that Sparq can significantly accelerate sub-byte computations, with 3.2 times and 1.7 times acceleration over an optimized 16-bit 2D convolution for 2-bit and 4-bit quantization, respectively.

Submitted 16 June, 2023; originally announced June 2023.

Comments: 5 pages, Accepted for publication in the 21st IEEE Interregional NEWCAS Conference (2023)

arXiv:2302.05996 [pdf, other]

Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

Authors: MohammadHossein AskariHemmat, Theo Dupuis, Yoan Fournier, Nizar El Zarif, Matheus Cavalcante, Matteo Perotti, Frank Gurkaynak, Luca Benini, Francois Leduc-Primeau, Yvon Savaria, Jean-Pierre David

Abstract: In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara by adding specialized vector instructions to perform sub-byte quantized operations. We also remove the floating-point unit from Quark's lanes and use the CVA6 RISC-V scalar core for the re-scaling operations that are required in quantized neural network inference. This makes each lane of Quark 2 times smaller and 1.9 times more power efficient than those of Ara. In this paper, we show that Quark can run quantized models at sub-byte precision. Notably, we show that for 1-bit and 2-bit quantized models, Quark can accelerate computation of Conv2d over various ranges of inputs and kernel sizes.

Submitted 12 February, 2023; originally announced February 2023.

Comments: 5 pages. Accepted for publication in the 56th International Symposium on Circuits and Systems (ISCAS 2023)

ACM Class: C.1.3; C.3

arXiv:2301.00290 [pdf, other]

doi 10.1145/3566097.3567872

BARVINN: Arbitrary Precision DNN Accelerator Controlled by a RISC-V CPU

Authors: Mohammadhossein Askarihemmat, Sean Wagner, Olexa Bilaniuk, Yassine Hariri, Yvon Savaria, Jean-Pierre David

Abstract: We present a DNN accelerator that allows inference at arbitrary precision with dedicated processing elements that are configurable at the bit level. Our DNN accelerator has 8 Processing Elements controlled by a RISC-V controller with a combined 8.2 TMACs of computational power when implemented with the recent Alveo U250 FPGA platform. We develop a code generator tool that ingests CNN models in ONNX format and generates an executable command stream for the RISC-V controller. We demonstrate the scalable throughput of our accelerator by running different DNN kernels and models when different quantization levels are selected. Compared to other low-precision accelerators, our accelerator provides run-time programmability without hardware reconfiguration and can accelerate DNNs with multiple quantization levels, regardless of the target FPGA size. BARVINN is an open-source project available at https://github.com/hossein1387/BARVINN.

Submitted 31 December, 2022; originally announced January 2023.

Comments: 7 pages. Accepted for publication in the 2023, 28th Asia and South Pacific Design Automation Conference (ASP-DAC 2023)

ACM Class: C.1.3; C.3

arXiv:2208.14807 [pdf]

Data Analysis in Social Networks for Agribusiness -- A Systematic Mapping Study

Authors: Nedson Soares, Regina Braga, Jose Maria David, Kennya Siqueira, Victor Stroele

Abstract: The ability of companies to react to changes imposed by the market is related to information acquisition and knowledge generation. Big data technologies, crowdsourcing, and Online Social Networks (OSNs) are used for knowledge generation. These technologies have assumed a significant position in agribusiness. This work investigates how social network analysis can promote agribusiness to provide a basis for future applications and evaluations. We adopted a hybrid systematic mapping to conduct the investigation. Two hundred twenty-three works that propose solutions for agribusiness were found and categorized. Results showed the most used techniques and OSNs, and revealed an increase in the number of studies in this area. The information obtained indicates how social media monitoring can complement traditional methods for decision-making on the management and regulation of agricultural systems. However, agribusiness still lacks studies using data analysis tools on social networks. Based on our results, we discuss some challenges and research directions.

Submitted 25 August, 2022; originally announced August 2022.

Comments: 33 pages, a SLM review on OSNs for Agribusiness

ACM Class: D.2

arXiv:2206.12372 [pdf, other]

QReg: On Regularization Effects of Quantization

Authors: MohammadHossein AskariHemmat, Reyhane Askari Hemmat, Alex Hoffman, Ivan Lazarevich, Ehsan Saboori, Olivier Mastropietro, Sudhakar Sah, Yvon Savaria, Jean-Pierre David

Abstract: In this paper, we study the effects of quantization in DNN training. We hypothesize that weight quantization is a form of regularization and that the amount of regularization is correlated with the quantization level (precision). We confirm our hypothesis by providing an analytical study and empirical results. By modeling weight quantization as a form of additive noise to weights, we explore how this noise propagates through the network at training time. We then show that the magnitude of this noise is correlated with the level of quantization. To confirm our analytical study, we performed an extensive list of experiments, summarized in this paper, in which we show that the regularization effects of quantization can be seen in various vision tasks and models, over various datasets. Based on our study, we propose that 8-bit quantization provides a reliable form of regularization in different vision tasks and models.

Submitted 26 June, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

arXiv:2006.15624 [pdf, other]

Application of Statistical Methods in Software Engineering: Theory and Practice

Authors: T. F. M. Sirqueira, M. A. Miguel, H. L. O. Dalpra, M. A. P. Araujo, J. M. N. David

Abstract: The experimental evaluation of the methods and concepts covered in software engineering has been increasingly valued. This value indicates the constant search for new forms of assessment and validation of the results obtained in Software Engineering research. Results are validated in studies through evaluations, which in turn become increasingly stringent. As an alternative to aid in the verification of the results, that is, whether they are positive or negative, we suggest the use of statistical methods. This article presents some of the main statistical techniques available, as well as their use in carrying out data analysis in experimental studies in Software Engineering. It presents a practical approach proving statistical techniques through a decision tree, which was created in order to facilitate the understanding of the appropriate statistical method for each data analysis situation. Actual data from software projects were employed to demonstrate the use of these statistical methods. Although it is not the aim of this work, basic experimentation and statistics concepts will be presented, as well as a concrete indication of the applicability of these techniques.

Submitted 28 June, 2020; originally announced June 2020.

arXiv:2002.09794 [pdf, other]

PoET-BiN: Power Efficient Tiny Binary Neurons

Authors: Sivakumar Chidambaram, J. M. Pierre Langlois, Jean Pierre David

Abstract: The success of neural networks in image classification has inspired various hardware implementations on embedded platforms such as Field Programmable Gate Arrays, embedded processors and Graphics Processing Units. These embedded platforms are constrained in terms of power, which is mainly consumed by the Multiply Accumulate operations and the memory accesses for weight fetching. Quantization and pruning have been proposed to address this issue. Though effective, these techniques do not take into account the underlying architecture of the embedded hardware. In this work, we propose PoET-BiN, a Look-Up Table based power-efficient implementation on resource-constrained embedded devices. A modified Decision Tree approach forms the backbone of the proposed implementation in the binary domain. A LUT access consumes far less power than the equivalent Multiply Accumulate operation it replaces, and the modified Decision Tree algorithm eliminates the need for memory accesses. We applied the PoET-BiN architecture to implement the classification layers of networks trained on the MNIST, SVHN and CIFAR-10 datasets, with near state-of-the-art results. The energy reduction for the classifier portion reaches up to six orders of magnitude compared to floating-point implementations and up to three orders of magnitude when compared to recent binary quantized neural networks.

Submitted 22 February, 2020; originally announced February 2020.

Comments: Accepted in MLSys 2020 conference

arXiv:1908.01073 [pdf, other]

U-Net Fixed-Point Quantization for Medical Image Segmentation

Authors: MohammadHossein AskariHemmat, Sina Honari, Lucas Rouhier, Christian S. Perone, Julien Cohen-Adad, Yvon Savaria, Jean-Pierre David

Abstract: Model quantization is leveraged to reduce the memory consumption and the computation time of deep neural networks. This is achieved by representing weights and activations with a lower bit resolution when compared to their high-precision floating-point counterparts. The suitable level of quantization is directly related to the model performance. Lowering the quantization precision (e.g. 2 bits) reduces the amount of memory required to store model parameters and the amount of logic required to implement computational blocks, which contributes to reducing the power consumption of the entire system. These benefits typically come at the cost of reduced accuracy. The main challenge is to quantize a network as much as possible, while maintaining the performance accuracy. In this work, we present a quantization method for the U-Net architecture, a popular model in medical image segmentation. We then apply our quantization algorithm to three datasets: (1) the Spinal Cord Gray Matter Segmentation (GM), (2) the ISBI challenge for segmentation of neuronal structures in Electron Microscopic (EM), and (3) the public National Institute of Health (NIH) dataset for pancreas segmentation in abdominal CT scans. The reported results demonstrate that with only 4 bits for weights and 6 bits for activations, we obtain an 8-fold reduction in memory requirements while losing only 2.21%, 0.57% and 2.09% dice overlap score for the EM, GM and NIH datasets, respectively. Our fixed-point quantization provides a flexible trade-off between accuracy and memory requirement, which is not provided by previous quantization methods for U-Net such as TernaryNet.

Submitted 9 September, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

Comments: Accepted to MICCAI 2019's Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention
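
Illustrative note (not part of the listing): the idea of representing a value with a small fixed-point bit budget, as described in the abstract above, can be sketched as follows. This is a generic signed fixed-point rounding scheme, not the paper's quantization method; the function name `quantize_fixed_point`, the split into `total_bits` and `frac_bits`, and the 32-bit floating-point baseline used for the memory ratio are all assumptions made for the example.

```python
def quantize_fixed_point(x, total_bits, frac_bits):
    """Round x to a signed fixed-point value with total_bits bits,
    frac_bits of which are fractional, saturating at the range limits."""
    scale = 1 << frac_bits                  # number of steps per unit
    q_min = -(1 << (total_bits - 1))        # smallest representable integer code
    q_max = (1 << (total_bits - 1)) - 1     # largest representable integer code
    code = max(q_min, min(q_max, round(x * scale)))
    return code / scale


def memory_reduction(baseline_bits, quantized_bits):
    """Ratio of storage needed per value before and after quantization."""
    return baseline_bits / quantized_bits


if __name__ == "__main__":
    # 4-bit values with 2 fractional bits cover [-2.0, 1.75] in steps of 0.25.
    print(quantize_fixed_point(0.3, total_bits=4, frac_bits=2))
    print(quantize_fixed_point(5.0, total_bits=4, frac_bits=2))
    # Assuming 32-bit floating-point weights as the baseline.
    print(memory_reduction(32, 4))
```

Under the assumed 32-bit baseline, storing weights in 4 bits gives the 8-fold reduction quoted in the abstract; fewer fractional bits widen the representable range at the cost of coarser steps.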

arXiv:1805.03563 [pdf, other]

Personal space of autonomous car's passengers sitting in the driver's seat

Authors: Eleonore Ferrier-Barbut, Dominique Vaufreydaz, Jean-Alix David, Jérôme Lussereau, Anne Spalanzani

Abstract: This article deals with the specific context of an autonomous car navigating in an urban center within a shared space between pedestrians and cars. The driver delegates the control to the autonomous system while remaining seated in the driver's seat. The proposed study aims at giving a first insight into the definition of human perception of space applied to vehicles by testing the existence of a personal space around the car. It aims at measuring proxemic information about the driver's comfort zone in such conditions. Proxemics, or human perception of space, has been largely explored when applied to humans or to robots, leading to the concept of personal space, but poorly when applied to vehicles. In this article, we highlight the existence and the characteristics of a zone of comfort around the car which is not correlated to the risk of a collision between the car and other road users. Our experiment includes 19 volunteers using a virtual reality headset to look at 30 scenarios filmed in 360° from the point of view of a passenger sitting in the driver's seat of an autonomous car. They were asked to say "stop" when they felt discomfort visualizing the scenarios. The scenarios deliberately avoid any collision effect, as we want to measure discomfort rather than fear. The scenarios involve one or three pedestrians walking past the car at different distances from the wings of the car, relative to the direction of motion of the car, on both sides. The car is either static or moving straight forward at different speeds. The results indicate the existence of a comfort zone around the car in which intrusion causes discomfort. The size of the comfort zone is sensitive neither to the side of the car where the pedestrian passes nor to the number of pedestrians. In contrast, the feeling of discomfort is relative to the car's motion (static or moving). Another outcome from this study is an illustration of the usage of first-person 360° video and a virtual reality headset to evaluate feelings of a passenger within an autonomous car.

Submitted 9 May, 2018; originally announced May 2018.

Journal ref: The 2018 IEEE Intelligent Vehicles Symposium (IV'18), Jun 2018, Changshu, Suzhou, China

arXiv:1606.06629 [pdf, other]

Parallel Galton Watson Process

Authors: Olivier Bodini, Camille Coti, Julien David

Abstract: In this paper, we study a parallel version of Galton-Watson processes for the random generation of tree-shaped structures. Random trees are useful in many situations (testing, binary search, simulation of physics phenomena, ...), as attested by more than 49000 citations on Google Scholar. Using standard analytic combinatorics, we first give a theoretical, average-case study of the random process in order to evaluate how parallelism can be extracted from this process, and we deduce a parallel generation algorithm. Then we present how it can be implemented in a task-based parallel paradigm for shared memory (here, Intel Cilk). This implementation faces several challenges, among which are efficient, thread-safe random bit generation, memory management and algorithmic modifications for small-grain parallelism. Finally, we evaluate the performance of our implementation and the impact of different choices and parameters. We obtain a significant efficiency improvement for the generation of big trees. We also conduct empirical and theoretical studies of the average behaviour of our algorithm.

Submitted 21 June, 2016; originally announced June 2016.

arXiv:1511.00363 [pdf, other]

BinaryConnect: Training Deep Neural Networks with binary weights during propagations

Authors: Matthieu Courbariaux, Yoshua Bengio, Jean-Pierre David

Abstract: Deep Neural Networks (DNNs) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep Learning (DL). Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and power-hungry components of the digital implementation of neural networks. We introduce BinaryConnect, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights in which gradients are accumulated. We show that, like other dropout schemes, BinaryConnect acts as a regularizer, and we obtain near state-of-the-art results with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.

Submitted 18 April, 2016; v1 submitted 1 November, 2015; originally announced November 2015.

Comments: Accepted at NIPS 2015, 9 pages, 3 figures

arXiv:1412.7024 [pdf, other]

Training deep neural networks with low precision multiplications

Authors: Matthieu Courbariaux, Yoshua Bengio, Jean-Pierre David

Abstract: Multipliers are the most space and power-hungry arithmetic operators of the digital implementation of deep neural networks. We train a set of state-of-the-art neural networks (Maxout networks) on three benchmark datasets: MNIST, CIFAR-10 and SVHN. They are trained with three distinct formats: floating point, fixed point and dynamic fixed point. For each of those datasets and for each of those formats, we assess the impact of the precision of the multiplications on the final error after training. We find that very low precision is sufficient not just for running trained networks but also for training them. For example, it is possible to train Maxout networks with 10-bit multiplications.

Submitted 22 September, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

Comments: 10 pages, 5 figures, Accepted as a workshop contribution at ICLR 2015

arXiv:1109.5683 [pdf, ps, other]

Asymptotic enumeration of Minimal Automata

Authors: Frédérique Bassino, Julien David, Andrea Sportiello

Abstract: We determine the asymptotic proportion of minimal automata, within n-state accessible deterministic complete automata over a k-letter alphabet, with the uniform distribution over the possible transition structures, and a binomial distribution over terminal states, with arbitrary parameter b. It turns out that a fraction ~ 1-C(k,b) n^{-k+2} of automata is minimal, with C(k,b) a function, explicitly determined, involving the solution of a transcendental equation.

Submitted 26 September, 2011; originally announced September 2011.

Comments: 12+5 pages, 2 figures, submitted to STACS 2012

ACM Class: F.2

arXiv:0902.1048 [pdf, ps, other]

On the Average Complexity of Moore's State Minimization Algorithm

Authors: Frédérique Bassino, Julien David, Cyril Nicaud

Abstract: We prove that, for an arbitrary finite alphabet and for the uniform distribution over deterministic and accessible automata with n states, the average complexity of Moore's state minimization algorithm is in O(n log n). Moreover, this bound is tight in the case of unary automata.

Submitted 6 February, 2009; originally announced February 2009.

Journal ref: 26th International Symposium on Theoretical Aspects of Computer Science STACS 2009 (2009) 123-134

Showing 1–20 of 20 results for author: David, J