-
Finer characterization of bounded languages described by GF(2)-grammars
Authors:
Vladislav Makarov,
Marat Movsin
Abstract:
GF(2)-grammars are a somewhat recently introduced grammar family that has some unusual algebraic properties and is closely connected to unambiguous grammars. In "Bounded languages described by GF(2)-grammars", Makarov proved a necessary condition for subsets of $a_1^* a_2^* \cdots a_k^*$ to be described by some GF(2)-grammar. By extending these methods further, we prove an even stronger upper bound for these languages. Moreover, we establish a lower bound that closely matches the proven upper bound. We also prove an exact characterization for the special case of linear GF(2)-grammars. Finally, by using the previous result, we show that the class of languages described by linear GF(2)-grammars is not closed under GF(2)-concatenation.
Submitted 11 December, 2023;
originally announced December 2023.
-
Agile gesture recognition for capacitive sensing devices: adapting on-the-job
Authors:
Ying Liu,
Liucheng Guo,
Valeri A. Makarov,
Yuxiang Huang,
Alexander Gorban,
Evgeny Mirkes,
Ivan Y. Tyukin
Abstract:
Automated hand gesture recognition has been a focus of the AI community for decades. Traditionally, work in this domain revolved largely around scenarios assuming the availability of a flow of images of the user's hands. This has partly been due to the prevalence of camera-based devices and the wide availability of image data. However, there is growing demand for gesture recognition technology that can be implemented on low-power devices using limited sensor data instead of high-dimensional inputs like hand images. In this work, we demonstrate a hand gesture recognition system and method that uses signals from capacitive sensors embedded into the etee hand controller. The controller generates real-time signals from each of the wearer's five fingers. We use a machine learning technique to analyse the time series signals and identify three features that can represent 5 fingers within 500 ms. The analysis is composed of a two-stage training strategy, including dimension reduction through principal component analysis and classification with K nearest neighbour. Remarkably, we found that this combination showed a level of performance which was comparable to more advanced methods such as a supervised variational autoencoder. The base system can also be equipped with the capability to learn from occasional errors by providing it with an additional adaptive error correction mechanism. The results showed that the error corrector improves the classification performance in the base system without compromising its performance. The system requires no more than 1 ms of computing time per input sample, and is smaller than deep neural networks, demonstrating the feasibility of agile gesture recognition systems based on this technology.
Submitted 12 May, 2023;
originally announced May 2023.
-
Cocke--Younger--Kasami--Schwartz--Zippel algorithm and relatives
Authors:
Vladislav Makarov
Abstract:
The equivalence problem for unambiguous grammars is an important, but very difficult open question in formal language theory. Consider the \emph{limited} equivalence problem for unambiguous grammars -- for two unambiguous grammars $G_1$ and $G_2$, tell whether or not they describe the same set of words of length $n$. Obviously, the naive approach requires exponential time with respect to $n$. By combining two classic algorithmic ideas, I introduce an $O({\rm poly}(n, |G_1|, |G_2|))$ algorithm for this problem. Moreover, the ideas behind the algorithm prove useful in various other scenarios.
Submitted 7 December, 2022;
originally announced December 2022.
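The title of the entry above suggests pairing a Cocke-Younger-Kasami table with a Schwartz-Zippel randomized identity test. The following is only a minimal sketch of how such a pairing might look, not the paper's algorithm. It assumes both grammars are unambiguous and in Chomsky normal form, and that $n \geq 1$. The dictionary encoding of grammars, the modulus `P`, and the names `weighted_count` and `probably_same_words` are all hypothetical choices made for illustration.

```python
import random

P = (1 << 61) - 1  # a large prime modulus (assumed choice)

def weighted_count(grammar, start, n, weights):
    """CYK-style table over integers mod P.

    grammar maps a nonterminal to a list of bodies: a 1-tuple (terminal,)
    or a 2-tuple (B, C) of nonterminals. The result is the sum, over all
    parse trees of length-n words, of the product of weights[i][letter].
    For an unambiguous grammar each word has exactly one parse tree.
    """
    table = {}
    for i in range(n):
        cell = {}
        for head, bodies in grammar.items():
            total = 0
            for body in bodies:
                if len(body) == 1:
                    total += weights[i].get(body[0], 0)
            cell[head] = total % P
        table[(i, i + 1)] = cell
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = {}
            for head, bodies in grammar.items():
                total = 0
                for body in bodies:
                    if len(body) == 2:
                        b, c = body
                        for k in range(i + 1, j):
                            total += table[(i, k)].get(b, 0) * table[(k, j)].get(c, 0)
                cell[head] = total % P
            table[(i, j)] = cell
    return table[(0, n)].get(start, 0)

def probably_same_words(g1, s1, g2, s2, n, alphabet, trials=5, seed=0):
    """False means the length-n word sets certainly differ;
    True means they agree with high probability."""
    rng = random.Random(seed)
    for _ in range(trials):
        # One random weight per (position, letter) pair.
        weights = [{a: rng.randrange(P) for a in alphabet} for _ in range(n)]
        if weighted_count(g1, s1, n, weights) != weighted_count(g2, s2, n, weights):
            return False
    return True

# Hypothetical example grammars: two for a^+, one for a^* b.
RIGHT = {"S": [("A", "S"), ("a",)], "A": [("a",)]}
LEFT = {"S": [("S", "A"), ("a",)], "A": [("a",)]}
ENDS_B = {"S": [("A", "S"), ("b",)], "A": [("a",)]}
```

With these example grammars, `probably_same_words(RIGHT, "S", LEFT, "S", 3, "ab")` compares two grammars for the same language, while replacing `LEFT` with `ENDS_B` compares grammars whose length-3 words differ. The table has $O(n^2)$ cells, each filled in time polynomial in $n$ and the grammar size.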
-
Why the equivalence problem for unambiguous grammars has not been solved back in 1966?
Authors:
Vladislav Makarov
Abstract:
In 1966, Semenov, by using a technique based on power series, suggested an algorithm that tells apart the languages described by an unambiguous grammar and a DFA. At first glance, it may appear that the algorithm can be easily modified to yield a full solution of the equivalence problem for unambiguous grammars. This article shows why this hunch is, in fact, incorrect.
Submitted 7 December, 2022;
originally announced December 2022.
-
Counting ternary square-free words quickly
Authors:
Vladislav Makarov
Abstract:
An efficient, when compared to exhaustive enumeration, algorithm for computing the number of square-free words of length $n$ over the alphabet $\{a, b, c\}$ is presented.
Submitted 10 May, 2021; v1 submitted 7 December, 2020;
originally announced December 2020.
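For reference, the exhaustive-enumeration baseline that the abstract above compares against can be sketched as follows. This is not the paper's algorithm, only a plain backtracking count; the function name `count_square_free` is a hypothetical choice. It relies on the fact that appending a letter to a square-free word can only create a square that ends at the new last letter.

```python
def count_square_free(n, alphabet="abc"):
    """Count square-free words of length n by backtracking (exponential time)."""

    def ends_with_square(word):
        # Check every suffix of even length 2*half for the form xx.
        m = len(word)
        for half in range(1, m // 2 + 1):
            if word[m - 2 * half:m - half] == word[m - half:]:
                return True
        return False

    def extend(word):
        if len(word) == n:
            return 1
        total = 0
        for ch in alphabet:
            candidate = word + ch
            # The prefix is already square-free, so only suffixes need checking.
            if not ends_with_square(candidate):
                total += extend(candidate)
        return total

    return extend("")
```

For example, `count_square_free(4)` returns 18: each of the 12 square-free words of length 3 can be extended by two letters, and 6 of the resulting 24 words have the form $xyxy$.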
-
Playing odds and evens with finite automata
Authors:
Vladislav Makarov
Abstract:
This paper is concerned with asymptotic behaviour of a repeated game of "odds and evens", with strategies of both players represented by finite automata. It is proved that, for every $n$, there is an automaton with $2^n \cdot \mathrm{poly}(n)$ states which defeats every $n$-state automaton, in the sense that it wins all rounds except for finitely many. Moreover, every such automaton has at least $2^n \cdot (1 - o(1))$ states, meaning that the upper bound is tight up to polynomial factors. This is a significant improvement over a classic result of Ben-Porath in the special case of "odds and evens". Moreover, I conjecture that the approach can be generalised to arbitrary zero-sum games.
Submitted 29 June, 2020; v1 submitted 9 May, 2020;
originally announced May 2020.
-
High--Dimensional Brain in a High-Dimensional World: Blessing of Dimensionality
Authors:
Alexander N. Gorban,
Valery A. Makarov,
Ivan Y. Tyukin
Abstract:
High-dimensional data and high-dimensional representations of reality are inherent features of modern Artificial Intelligence systems and applications of machine learning. The well-known phenomenon of the "curse of dimensionality" states: many problems become exponentially difficult in high dimensions. Recently, the other side of the coin, the "blessing of dimensionality", has attracted much attention. It turns out that generic high-dimensional datasets exhibit fairly simple geometric properties. Thus, there is a fundamental tradeoff between complexity and simplicity in high dimensional spaces. Here we present a brief explanatory review of recent ideas, results and hypotheses about the blessing of dimensionality and related simplifying effects relevant to machine learning and neuroscience.
Submitted 14 January, 2020;
originally announced January 2020.
-
Bounded languages described by GF(2)-grammars
Authors:
Vladislav Makarov
Abstract:
GF(2)-grammars are a recently introduced grammar family with some unusual algebraic properties. They are closely connected to unambiguous grammars. By using the method of formal power series, we establish strong conditions that are necessary for subsets of $a^* b^*$ and $a^* b^* c^*$ to be described by some GF(2)-grammar. By further applying the established results, we settle the long-standing open question of proving inherent ambiguity of the language $\{a^n b^m c^k \mid n \neq m \text{ or } m \neq k\}$, as well as give a new purely algebraic proof of the inherent ambiguity of the language $\{a^n b^m c^k \mid n = m \text{ or } m = k\}$.
Submitted 20 November, 2023; v1 submitted 31 December, 2019;
originally announced December 2019.
-
Symphony of high-dimensional brain
Authors:
Alexander N. Gorban,
Valeri A. Makarov,
Ivan Y. Tyukin
Abstract:
This paper is the final part of the scientific discussion organised by the Journal "Physics of Life Reviews" about the simplicity revolution in neuroscience and AI. This discussion was initiated by the review paper "The unreasonable effectiveness of small neural ensembles in high-dimensional brain". Phys Life Rev 2019, doi 10.1016/j.plrev.2018.09.005, arXiv:1809.07656. The topics of the discussion varied from the necessity to take into account the difference between the theoretical random distributions and "extremely non-random" real distributions and revise the common machine learning theory, to different forms of the curse of dimensionality and high-dimensional pitfalls in neuroscience. V. Kůrková, A. Tozzi and J.F. Peters, R. Quian Quiroga, P. Varona, R. Barrio, G. Kreiman, L. Fortuna, C. van Leeuwen, R. Quian Quiroga, and V. Kreinovich, A.N. Gorban, V.A. Makarov, and I.Y. Tyukin participated in the discussion. In this paper we analyse the symphony of opinions and the possible outcomes of the simplicity revolution for machine learning and neuroscience.
Submitted 27 June, 2019;
originally announced June 2019.
-
Application of Machine Learning to accidents detection at directional drilling
Authors:
Ekaterina Gurina,
Nikita Klyuchnikov,
Alexey Zaytsev,
Evgenya Romanenkova,
Ksenia Antipova,
Igor Simon,
Victor Makarov,
Dmitry Koroteev
Abstract:
We present a data-driven algorithm and mathematical model for anomaly alarming at directional drilling. The algorithm is based on machine learning. It compares the real-time drilling telemetry with telemetry corresponding to past accidents and analyses the level of similarity. The model performs a time-series comparison using aggregated statistics and Gradient Boosting classification. It is trained on historical data containing the drilling telemetry of $80$ wells drilled within $19$ oilfields. The model can detect an anomaly and identify its type by comparing the real-time measurements while drilling with the ones from the database of past accidents. Validation tests show that our algorithm identifies half of the anomalies with about $0.53$ false alarms per day on average. The model performance ensures sufficient time and cost savings as it enables partial prevention of the failures and accidents at the well construction.
Submitted 12 December, 2019; v1 submitted 6 June, 2019;
originally announced June 2019.
-
The unreasonable effectiveness of small neural ensembles in high-dimensional brain
Authors:
A. N. Gorban,
V. A. Makarov,
I. Y. Tyukin
Abstract:
Despite the widespread consensus on the brain's complexity, sprouts of the single neuron revolution emerged in neuroscience in the 1970s. They brought many unexpected discoveries, including grandmother or concept cells and sparse coding of information in the brain.
In machine learning, the famous curse of dimensionality seemed for a long time to be an unsolvable problem. Nevertheless, the idea of the blessing of dimensionality is gradually becoming more popular. Ensembles of non-interacting or weakly interacting simple units prove to be an effective tool for solving essentially multidimensional problems. This approach is especially useful for one-shot (non-iterative) correction of errors in large legacy artificial intelligence systems.
These simplicity revolutions in the era of complexity have deep fundamental reasons grounded in geometry of multidimensional data spaces. To explore and understand these reasons we revisit the background ideas of statistical physics. In the course of the 20th century they were developed into the concentration of measure theory. New stochastic separation theorems reveal the fine structure of the data clouds.
We review and analyse biological, physical, and mathematical problems at the core of the fundamental question: how can a high-dimensional brain organise reliable and fast learning in a high-dimensional world of data using simple tools?
Two critical applications are reviewed to exemplify the approach: one-shot correction of errors in intellectual systems and emergence of static and associative memories in ensembles of single neurons.
Submitted 10 November, 2018; v1 submitted 20 September, 2018;
originally announced September 2018.
-
Homodyne-detector-blinding attack in continuous-variable quantum key distribution
Authors:
Hao Qin,
Rupesh Kumar,
Vadim Makarov,
Romain Alléaume
Abstract:
We propose an efficient strategy to attack a continuous-variable quantum key distribution (CV-QKD) system, which we call homodyne detector blinding. This attack strategy takes advantage of a generic vulnerability of homodyne receivers: a bright light pulse sent on the signal port can lead to a saturation of the detector electronics. While detector saturation has already been proposed to attack CV-QKD, the attack we study in this paper has the additional advantage of not requiring an eavesdropper to be phase locked with the homodyne receiver. We show that under certain conditions, an attacker can use a simple laser, incoherent with the homodyne receiver, to generate bright pulses and bias the excess noise to arbitrarily small values, fully compromising CV-QKD security. These results highlight the feasibility and the impact of the detector blinding attack. We finally discuss how to design countermeasures in order to protect against this attack.
Submitted 5 July, 2018; v1 submitted 4 May, 2018;
originally announced May 2018.
-
Implementation of the Programming Language Dino -- A Case Study in Dynamic Language Performance
Authors:
Vladimir N. Makarov
Abstract:
The article gives a brief overview of the current state of the programming language Dino in order to see where it stands among other dynamic programming languages. Then it describes the current implementation, the tools used, and major implementation decisions, including how to implement a stable, portable and simple JIT compiler.
We study the effect of major implementation decisions on the performance of Dino on x86-64, AARCH64, and Powerpc64. In brief, the performance of some model benchmark on x86-64 was improved by $\textbf{3.1}$ times after moving from a stack based virtual machine to a register-transfer architecture, a further $\textbf{1.5}$ times by adding byte code combining, a further $\textbf{2.3}$ times through the use of JIT, and a further $\textbf{4.4}$ times by performing type inference with byte code specialization, with a resulting overall performance improvement of about $\textbf{47}$ times. To put these results in context, we include performance comparisons of Dino with widely used implementations of Ruby, Python 3, PyPy and JavaScript on the three platforms mentioned above.
The goal of this article is to share the experience of implementing Dino with other dynamic language implementors, in the hope that it can help them improve implementations of popular dynamic languages, making them possibly faster and more portable while using fewer developer resources, and perhaps avoid some of the mistakes and wrong directions encountered during Dino development.
Submitted 5 April, 2016;
originally announced April 2016.
-
Attacks exploiting deviation of mean photon number in quantum key distribution and coin tossing
Authors:
Shihan Sajeed,
Igor Radchenko,
Sarah Kaiser,
Jean-Philippe Bourgoin,
Anna Pappa,
Laurent Monat,
Matthieu Legre,
Vadim Makarov
Abstract:
The security of quantum communication using a weak coherent source requires accurate knowledge of the source's mean photon number. Finite calibration precision or an active manipulation by an attacker may cause the actual emitted photon number to deviate from the known value. We model effects of this deviation on the security of three quantum communication protocols: the Bennett-Brassard 1984 (BB84) quantum key distribution (QKD) protocol without decoy states, Scarani-Acin-Ribordy-Gisin 2004 (SARG04) QKD protocol, and a coin-tossing protocol. For QKD, we model both a strong attack using technology possible in principle, and a realistic attack bounded by today's technology. To maintain the mean photon number in two-way systems, such as plug-and-play and relativistic quantum cryptography schemes, bright pulse energy incoming from the communication channel must be monitored. Implementation of a monitoring detector has largely been ignored so far, except for ID Quantique's commercial QKD system Clavis2. We scrutinize this implementation for security problems, and show that designing a hack-proof pulse-energy-measuring detector is far from trivial. Indeed, the first implementation has three serious flaws confirmed experimentally, each of which may be exploited in a cleverly constructed Trojan-horse attack. We discuss requirements for a loophole-free implementation of the monitoring detector.
Submitted 30 March, 2015; v1 submitted 27 December, 2014;
originally announced December 2014.
-
Quantum cryptography
Authors:
Dag Roar Hjelme,
Lars Lydersen,
Vadim Makarov
Abstract:
This is a chapter on quantum cryptography for the book "A Multidisciplinary Introduction to Information Security" to be published by CRC Press in 2011/2012. The chapter aims to introduce the topic to undergraduate-level and continuing-education students specializing in information and communication technology.
Submitted 8 August, 2011;
originally announced August 2011.
-
Predicate Logic with Definitions
Authors:
Victor Makarov
Abstract:
Predicate Logic with Definitions (PLD or D-logic) is a modification of first-order logic intended mostly for practical formalization of mathematics. The main syntactic constructs of D-logic are terms, formulas and definitions. A definition is a definition of variables, a definition of constants, or a composite definition (D-logic also has abbreviation definitions, called abbreviations). Definitions can be used inside terms and formulas. This possibility alleviates introducing new quantifier-like names. Composite definitions allow constructing new definitions from existing ones.
Submitted 7 June, 1999;
originally announced June 1999.