-
Strategies for political-statement segmentation and labelling in unstructured text
Authors:
Dmitry Nikolaev,
Sean Papay
Abstract:
Analysis of parliamentary speeches and political-party manifestos has become an integral area of computational study of political texts. While speeches have been overwhelmingly analysed using unsupervised methods, a large corpus of manifestos with by-statement political-stance labels has been created by the participants of the MARPOR project. It has been recently shown that these labels can be pre…
▽ More
Analysis of parliamentary speeches and political-party manifestos has become an integral area of computational study of political texts. While speeches have been overwhelmingly analysed using unsupervised methods, a large corpus of manifestos with by-statement political-stance labels has been created by the participants of the MARPOR project. It has been recently shown that these labels can be predicted by a neural model; however, the current approach relies on provided statement boundaries, limiting out-of-domain applicability. In this work, we propose and test a range of unified split-and-label frameworks -- based on linear-chain CRFs, fine-tuned text-to-text models, and the combination of in-context learning with constrained decoding -- that can be used to jointly segment and classify statements from raw textual data. We show that our approaches achieve competitive accuracy when applied to raw text of political manifestos, and then demonstrate the research potential of our method by applying it to the records of the UK House of Commons and tracing the political trajectories of four major parties in the last three decades.
△ Less
Submitted 10 March, 2025;
originally announced March 2025.
-
Generalization of Brady-Yong Algorithm for Fast Hough Transform to Arbitrary Image Size
Authors:
Danil Kazimirov,
Dmitry Nikolaev,
Ekaterina Rybakova,
Arseniy Terekhin
Abstract:
Nowadays, the Hough (discrete Radon) transform (HT/DRT) has proved to be an extremely powerful and widespread tool harnessed in a number of application areas, ranging from general image processing to X-ray computed tomography. Efficient utilization of the HT to solve applied problems demands its acceleration and increased accuracy. Along with this, most fast algorithms for computing the HT, especi…
▽ More
Nowadays, the Hough (discrete Radon) transform (HT/DRT) has proved to be an extremely powerful and widespread tool harnessed in a number of application areas, ranging from general image processing to X-ray computed tomography. Efficient utilization of the HT to solve applied problems demands its acceleration and increased accuracy. Along with this, most fast algorithms for computing the HT, especially the pioneering Brady-Yong algorithm, operate on power-of-two size input images and are not adapted for arbitrary size images. This paper presents a new algorithm for calculating the HT for images of arbitrary size. It generalizes the Brady-Yong algorithm from which it inherits the optimal computational complexity. Moreover, the algorithm allows to compute the HT with considerably higher accuracy compared to the existing algorithm. Herewith, the paper provides a theoretical analysis of the computational complexity and accuracy of the proposed algorithm. The conclusions of the performed experiments conform with the theoretical results.
△ Less
Submitted 11 November, 2024;
originally announced November 2024.
-
Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task
Authors:
Dmitry Nikolaev,
Jorke Grotenhuis,
Haleli Harel,
Orly Goldwasser
Abstract:
The complex Ancient Egyptian (AE) writing system was characterised by widespread use of graphemic classifiers (determinatives): silent (unpronounced) hieroglyphic signs clarifying the meaning or indicating the pronunciation of the host word. The study of classifiers has intensified in recent years with the launch and quick growth of the iClassifier project, a web-based platform for annotation and…
▽ More
The complex Ancient Egyptian (AE) writing system was characterised by widespread use of graphemic classifiers (determinatives): silent (unpronounced) hieroglyphic signs clarifying the meaning or indicating the pronunciation of the host word. The study of classifiers has intensified in recent years with the launch and quick growth of the iClassifier project, a web-based platform for annotation and analysis of classifiers in ancient and modern languages. Thanks to the data contributed by the project participants, it is now possible to formulate the identification of classifiers in AE texts as an NLP task. In this paper, we make first steps towards solving this task by implementing a series of sequence-labelling neural models, which achieve promising performance despite the modest amount of training data. We discuss tokenisation and operationalisation issues arising from tackling AE texts and contrast our approach with frequency-based baselines.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs
Authors:
Tanise Ceron,
Neele Falk,
Ana Barić,
Dmitry Nikolaev,
Sebastian Padó
Abstract:
Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific "worldview" and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanings (Feng et al., 2023; Motoki et al., 2024). However, it is as yet unclear whether these leanings are reliable (robust to prompt variations) and wheth…
▽ More
Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific "worldview" and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanings (Feng et al., 2023; Motoki et al., 2024). However, it is as yet unclear whether these leanings are reliable (robust to prompt variations) and whether the leaning is consistent across policies and political leaning. We propose a series of tests which assess the reliability and consistency of LLMs' stances on political statements based on a dataset of voting-advice questionnaires collected from seven EU countries and annotated for policy issues. We study LLMs ranging in size from 7B to 70B parameters and find that their reliability increases with parameter count. Larger models show overall stronger alignment with left-leaning parties but differ among policy programs: They show a (left-wing) positive stance towards environment protection, social welfare state and liberal society but also (right-wing) law and order, with no consistent preferences in the areas of foreign policy and migration.
△ Less
Submitted 8 August, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Approximate Attributions for Off-the-Shelf Siamese Transformers
Authors:
Lucas Möller,
Dmitry Nikolaev,
Sebastian Padó
Abstract:
Siamese encoders such as sentence transformers are among the least understood deep models. Established attribution methods cannot tackle this model class since it compares two inputs rather than processing a single one. To address this gap, we have recently proposed an attribution method specifically for Siamese encoders (Möller et al., 2023). However, it requires models to be adjusted and fine-tu…
▽ More
Siamese encoders such as sentence transformers are among the least understood deep models. Established attribution methods cannot tackle this model class since it compares two inputs rather than processing a single one. To address this gap, we have recently proposed an attribution method specifically for Siamese encoders (Möller et al., 2023). However, it requires models to be adjusted and fine-tuned and therefore cannot be directly applied to off-the-shelf models. In this work, we reassess these restrictions and propose (i) a model with exact attribution ability that retains the original model's predictive performance and (ii) a way to compute approximate attributions for off-the-shelf models. We extensively compare approximate and exact attributions and use them to analyze the models' attendance to different linguistic aspects. We gain insights into which syntactic roles Siamese transformers attend to, confirm that they mostly ignore negation, explore how they judge semantically opposite adjectives, and find that they exhibit lexical bias.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Unfolder: Fast localization and image rectification of a document with a crease from folding in half
Authors:
A. M. Ershov,
D. V. Tropin,
E. E. Limonova,
D. P. Nikolaev,
V. V. Arlazarov
Abstract:
Presentation of folded documents is not an uncommon case in modern society. Digitizing such documents by capturing them with a smartphone camera can be tricky since a crease can divide the document contents into separate planes. To unfold the document, one could hold the edges potentially obscuring it in a captured image. While there are many geometrical rectification methods, they were usually de…
▽ More
Presentation of folded documents is not an uncommon case in modern society. Digitizing such documents by capturing them with a smartphone camera can be tricky since a crease can divide the document contents into separate planes. To unfold the document, one could hold the edges potentially obscuring it in a captured image. While there are many geometrical rectification methods, they were usually developed for arbitrary bends and folds. We consider such algorithms and propose a novel approach Unfolder developed specifically for images of documents with a crease from folding in half. Unfolder is robust to projective distortions of the document image and does not fragment the image in the vicinity of a crease after rectification. A new Folded Document Images dataset was created to investigate the rectification accuracy of folded (2, 3, 4, and 8 folds) documents. The dataset includes 1600 images captured when document placed on a table and when held in hand. The Unfolder algorithm allowed for a recognition error rate of 0.33, which is better than the advanced neural network methods DocTr (0.44) and DewarpNet (0.57). The average runtime for Unfolder was only 0.25 s/image on an iPhone XR.
△ Less
Submitted 1 December, 2023;
originally announced December 2023.
-
Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering
Authors:
Ofir Arviv,
Dmitry Nikolaev,
Taelin Karidi,
Omri Abend
Abstract:
Despite the impressive growth of the abilities of multilingual language models, such as XLM-R and mT5, it has been shown that they still face difficulties when tackling typologically-distant languages, particularly in the low-resource setting. One obstacle for effective cross-lingual transfer is variability in word-order patterns. It can be potentially mitigated via source- or target-side word reo…
▽ More
Despite the impressive growth of the abilities of multilingual language models, such as XLM-R and mT5, it has been shown that they still face difficulties when tackling typologically-distant languages, particularly in the low-resource setting. One obstacle for effective cross-lingual transfer is variability in word-order patterns. It can be potentially mitigated via source- or target-side word reordering, and numerous approaches to reordering have been proposed. However, they rely on language-specific rules, work on the level of POS tags, or only target the main clause, leaving subordinate clauses intact. To address these limitations, we present a new powerful reordering method, defined in terms of Universal Dependencies, that is able to learn fine-grained word-order patterns conditioned on the syntactic context from a small amount of annotated data and can be applied at all levels of the syntactic tree. We conduct experiments on a diverse set of tasks and show that our method consistently outperforms strong baselines over different language pairs and model architectures. This performance advantage holds true in both zero-shot and few-shot scenarios.
△ Less
Submitted 20 October, 2023;
originally announced October 2023.
-
Multilingual estimation of political-party positioning: From label aggregation to long-input Transformers
Authors:
Dmitry Nikolaev,
Tanise Ceron,
Sebastian Padó
Abstract:
Scaling analysis is a technique in computational political science that assigns a political actor (e.g. politician or party) a score on a predefined scale based on a (typically long) body of text (e.g. a parliamentary speech or an election manifesto). For example, political scientists have often used the left--right scale to systematically analyse political landscapes of different countries. NLP m…
▽ More
Scaling analysis is a technique in computational political science that assigns a political actor (e.g. politician or party) a score on a predefined scale based on a (typically long) body of text (e.g. a parliamentary speech or an election manifesto). For example, political scientists have often used the left--right scale to systematically analyse political landscapes of different countries. NLP methods for automatic scaling analysis can find broad application provided they (i) are able to deal with long texts and (ii) work robustly across domains and languages. In this work, we implement and compare two approaches to automatic scaling analysis of political-party manifestos: label aggregation, a pipeline strategy relying on annotations of individual statements from the manifestos, and long-input-Transformer-based models, which compute scaling values directly from raw text. We carry out the analysis of the Comparative Manifestos Project dataset across 41 countries and 27 languages and find that the task can be efficiently solved by state-of-the-art models, with label aggregation producing the best results.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing
Authors:
Dmitry Nikolaev,
Sebastian Padó
Abstract:
The question of what kinds of linguistic information are encoded in different layers of Transformer-based language models is of considerable interest for the NLP community. Existing work, however, has overwhelmingly focused on word-level representations and encoder-only language models with the masked-token training objective. In this paper, we present experiments with semantic structural probing,…
▽ More
The question of what kinds of linguistic information are encoded in different layers of Transformer-based language models is of considerable interest for the NLP community. Existing work, however, has overwhelmingly focused on word-level representations and encoder-only language models with the masked-token training objective. In this paper, we present experiments with semantic structural probing, a method for studying sentence-level representations via finding a subspace of the embedding space that provides suitable task-specific pairwise distances between data-points. We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks, semantic textual similarity and natural-language inference. We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
△ Less
Submitted 18 October, 2023;
originally announced October 2023.
-
An Attribution Method for Siamese Encoders
Authors:
Lucas Möller,
Dmitry Nikolaev,
Sebastian Padó
Abstract:
Despite the success of Siamese encoder models such as sentence transformers (ST), little is known about the aspects of inputs they pay attention to. A barrier is that their predictions cannot be attributed to individual features, as they compare two inputs rather than processing a single one. This paper derives a local attribution method for Siamese encoders by generalizing the principle of integr…
▽ More
Despite the success of Siamese encoder models such as sentence transformers (ST), little is known about the aspects of inputs they pay attention to. A barrier is that their predictions cannot be attributed to individual features, as they compare two inputs rather than processing a single one. This paper derives a local attribution method for Siamese encoders by generalizing the principle of integrated gradients to models with multiple inputs. The solution takes the form of feature-pair attributions, and can be reduced to a token-token matrix for STs. Our method involves the introduction of integrated Jacobians and inherits the advantageous formal properties of integrated gradients: it accounts for the model's full computation graph and is guaranteed to converge to the actual prediction. A pilot study shows that in an ST few token-pairs can often explain large fractions of predictions, and it focuses on nouns and verbs. For accurate predictions, it however needs to attend to the majority of tokens and parts of speech.
△ Less
Submitted 29 November, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Adverbs, Surprisingly
Authors:
Dmitry Nikolaev,
Collin F. Baker,
Miriam R. L. Petruck,
Sebastian Padó
Abstract:
This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word meaning, as in FrameNet, provides a promising appro…
▽ More
This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word meaning, as in FrameNet, provides a promising approach to adverb analysis, given its ability to describe ambiguity, semantic roles, and null instantiation.
△ Less
Submitted 31 May, 2023;
originally announced May 2023.
-
Additive manifesto decomposition: A policy domain aware method for understanding party positioning
Authors:
Tanise Ceron,
Dmitry Nikolaev,
Sebastian Padó
Abstract:
Automatic extraction of party (dis)similarities from texts such as party election manifestos or parliamentary speeches plays an increasing role in computational political science. However, existing approaches are fundamentally limited to targeting only global party (dis)-similarity: they condense the relationship between a pair of parties into a single figure, their similarity. In aggregating over…
▽ More
Automatic extraction of party (dis)similarities from texts such as party election manifestos or parliamentary speeches plays an increasing role in computational political science. However, existing approaches are fundamentally limited to targeting only global party (dis)-similarity: they condense the relationship between a pair of parties into a single figure, their similarity. In aggregating over all policy domains (e.g., health or foreign policy), they do not provide any qualitative insights into which domains parties agree or disagree on. This paper proposes a workflow for estimating policy domain aware party similarity that overcomes this limitation. The workflow covers (a) definition of suitable policy domains; (b) automatic labeling of domains, if no manual labels are available; (c) computation of domain-level similarities and aggregation at a global level; (d) extraction of interpretable party positions on major policy axes via multidimensional scaling. We evaluate our workflow on manifestos from the German federal elections. We find that our method (a) yields high correlation when predicting party similarity at a global level and (b) provides accurate party-specific positions, even with automatically labelled policy domains.
△ Less
Submitted 17 May, 2023;
originally announced May 2023.
-
Representation biases in sentence transformers
Authors:
Dmitry Nikolaev,
Sebastian Padó
Abstract:
Variants of the BERT architecture specialised for producing full-sentence representations often achieve better performance on downstream tasks than sentence embeddings extracted from vanilla BERT. However, there is still little understanding of what properties of inputs determine the properties of such representations. In this study, we construct several sets of sentences with pre-defined lexical…
▽ More
Variants of the BERT architecture specialised for producing full-sentence representations often achieve better performance on downstream tasks than sentence embeddings extracted from vanilla BERT. However, there is still little understanding of what properties of inputs determine the properties of such representations. In this study, we construct several sets of sentences with pre-defined lexical and syntactic structures and show that SOTA sentence transformers have a strong nominal-participant-set bias: cosine similarities between pairs of sentences are more strongly determined by the overlap in the set of their noun participants than by having the same predicates, lengthy nominal modifiers, or adjuncts. At the same time, the precise syntactic-thematic functions of the participants are largely irrelevant.
△ Less
Submitted 30 January, 2023;
originally announced January 2023.
-
Word-order typology in Multilingual BERT: A case study in subordinate-clause detection
Authors:
Dmitry Nikolaev,
Sebastian Padó
Abstract:
The capabilities and limitations of BERT and similar models are still unclear when it comes to learning syntactic abstractions, in particular across languages. In this paper, we use the task of subordinate-clause detection within and across languages to probe these properties. We show that this task is deceptively simple, with easy gains offset by a long tail of harder cases, and that BERT's zero-…
▽ More
The capabilities and limitations of BERT and similar models are still unclear when it comes to learning syntactic abstractions, in particular across languages. In this paper, we use the task of subordinate-clause detection within and across languages to probe these properties. We show that this task is deceptively simple, with easy gains offset by a long tail of harder cases, and that BERT's zero-shot performance is dominated by word-order effects, mirroring the SVO/VSO/SOV typology.
△ Less
Submitted 24 May, 2022;
originally announced May 2022.
-
Fast matrix multiplication for binary and ternary CNNs on ARM CPU
Authors:
Anton Trusov,
Elena Limonova,
Dmitry Nikolaev,
Vladimir V. Arlazarov
Abstract:
Low-bit quantized neural networks are of great interest in practical applications because they significantly reduce the consumption of both memory and computational resources. Binary neural networks are memory and computationally efficient as they require only one bit per weight and activation and can be computed using Boolean logic and bit count operations. QNNs with ternary weights and activatio…
▽ More
Low-bit quantized neural networks are of great interest in practical applications because they significantly reduce the consumption of both memory and computational resources. Binary neural networks are memory and computationally efficient as they require only one bit per weight and activation and can be computed using Boolean logic and bit count operations. QNNs with ternary weights and activations and binary weights and ternary activations aim to improve recognition quality compared to BNNs while preserving low bit-width. However, their efficient implementation is usually considered on ASICs and FPGAs, limiting their applicability in real-life tasks. At the same time, one of the areas where efficient recognition is most in demand is recognition on mobile devices using their CPUs. However, there are no known fast implementations of TBNs and TNN, only the daBNN library for BNNs inference. In this paper, we propose novel fast algorithms of ternary, ternary-binary, and binary matrix multiplication for mobile devices with ARM architecture. In our algorithms, ternary weights are represented using 2-bit encoding and binary - using one bit. It allows us to replace matrix multiplication with Boolean logic operations that can be computed on 128-bits simultaneously, using ARM NEON SIMD extension. The matrix multiplication results are accumulated in 16-bit integer registers. We also use special reordering of values in left and right matrices. All that allows us to efficiently compute a matrix product while minimizing the number of loads and stores compared to the algorithm from daBNN. Our algorithms can be used to implement inference of convolutional and fully connected layers of TNNs, TBNs, and BNNs. We evaluate them experimentally on ARM Cortex-A73 CPU and compare their inference speed to efficient implementations of full-precision, 8-bit, and 4-bit quantized matrix multiplications.
△ Less
Submitted 18 May, 2022;
originally announced May 2022.
-
On the properties of some low-parameter models for color reproduction in terms of spectrum transformations and coverage of a color triangle
Authors:
Alexey Kroshnin,
Viacheslav Vasilev,
Egor Ershov,
Denis Shepelev,
Dmitry Nikolaev,
Mikhail Tchobanou
Abstract:
One of the classical approaches to solving color reproduction problems, such as color adaptation or color space transform, is the use of low-parameter spectral models. The strength of this approach is the ability to choose a set of properties that the model should have, be it a large coverage area of a color triangle, an accurate description of the addition or multiplication of spectra, knowing on…
▽ More
One of the classical approaches to solving color reproduction problems, such as color adaptation or color space transform, is the use of low-parameter spectral models. The strength of this approach is the ability to choose a set of properties that the model should have, be it a large coverage area of a color triangle, an accurate description of the addition or multiplication of spectra, knowing only the tristimulus corresponding to them. The disadvantage is that some of the properties of the mentioned spectral models are confirmed only experimentally. This work is devoted to the theoretical substantiation of various properties of spectral models. In particular, we prove that the banded model is the only model that simultaneously possesses the properties of closure under addition and multiplication. We also show that the Gaussian model is the limiting case of the von Mises model and prove that the set of protomers of the von Mises model unambiguously covers the color triangle in both the case of convex and non-convex spectral locus.
△ Less
Submitted 21 October, 2021;
originally announced October 2021.
-
On the Relation between Syntactic Divergence and Zero-Shot Performance
Authors:
Ofir Arviv,
Dmitry Nikolaev,
Taelin Karidi,
Omri Abend
Abstract:
We explore the link between the extent to which syntactic relations are preserved in translation and the ease of correctly constructing a parse tree in a zero-shot setting. While previous work suggests such a relation, it tends to focus on the macro level and not on the level of individual edges-a gap we aim to address. As a test case, we take the transfer of Universal Dependencies (UD) parsing fr…
▽ More
We explore the link between the extent to which syntactic relations are preserved in translation and the ease of correctly constructing a parse tree in a zero-shot setting. While previous work suggests such a relation, it tends to focus on the macro level and not on the level of individual edges-a gap we aim to address. As a test case, we take the transfer of Universal Dependencies (UD) parsing from English to a diverse set of languages and conduct two sets of experiments. In one, we analyze zero-shot performance based on the extent to which English source edges are preserved in translation. In another, we apply three linguistically motivated transformations to UD, creating more cross-lingually stable versions of it, and assess their zero-shot parsability. In order to compare parsing performance across different schemes, we perform extrinsic evaluation on the downstream task of cross-lingual relation extraction (RE) using a subset of a popular English RE benchmark translated to Russian and Korean. In both sets of experiments, our results suggest a strong relation between cross-lingual stability and zero-shot parsing performance.
△ Less
Submitted 9 October, 2021;
originally announced October 2021.
-
Advanced Hough-based method for on-device document localization
Authors:
D. V. Tropin,
A. M. Ershov,
D. P. Nikolaev,
V. V. Arlazarov
Abstract:
The demand for on-device document recognition systems increases in conjunction with the emergence of more strict privacy and security requirements. In such systems, there is no data transfer from the end device to a third-party information processing servers. The response time is vital to the user experience of on-device document recognition. Combined with the unavailability of discrete GPUs, powe…
▽ More
The demand for on-device document recognition systems increases in conjunction with the emergence of more strict privacy and security requirements. In such systems, there is no data transfer from the end device to a third-party information processing servers. The response time is vital to the user experience of on-device document recognition. Combined with the unavailability of discrete GPUs, powerful CPUs, or a large RAM capacity on consumer-grade end devices such as smartphones, the time limitations put significant constraints on the computational complexity of the applied algorithms for on-device execution.
In this work, we consider document location in an image without prior knowledge of the document content or its internal structure. In accordance with the published works, at least 5 systems offer solutions for on-device document location. All these systems use a location method which can be considered Hough-based. The precision of such systems seems to be lower than that of the state-of-the-art solutions which were not designed to account for the limited computational resources.
We propose an advanced Hough-based method. In contrast with other approaches, it accounts for the geometric invariants of the central projection model and combines both edge and color features for document boundary detection. The proposed method allowed for the second best result for SmartDoc dataset in terms of precision, surpassed by U-net like neural network. When evaluated on a more challenging MIDV-500 dataset, the proposed algorithm guaranteed the best precision compared to published methods. Our method retained the applicability to on-device computations.
△ Less
Submitted 18 June, 2021;
originally announced June 2021.
-
Part of Speech and Universal Dependency effects on English Arabic Machine Translation
Authors:
Ofek Rafaeli,
Omri Abend,
Leshem Choshen,
Dmitry Nikolaev
Abstract:
In this research paper, I will elaborate on a method to evaluate machine translation models based on their performance on underlying syntactical phenomena between English and Arabic languages. This method is especially important as such "neural" and "machine learning" are hard to fine-tune and change. Thus, finding a way to evaluate them easily and diversely would greatly help the task of betterin…
▽ More
In this research paper, I will elaborate on a method to evaluate machine translation models based on their performance on underlying syntactical phenomena between English and Arabic languages. This method is especially important as such "neural" and "machine learning" are hard to fine-tune and change. Thus, finding a way to evaluate them easily and diversely would greatly help the task of bettering them.
△ Less
Submitted 3 June, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
SERRANT: a syntactic classifier for English Grammatical Error Types
Authors:
Leshem Choshen,
Matanel Oren,
Dmitry Nikolaev,
Omri Abend
Abstract:
SERRANT is a system and code for automatic classification of English grammatical errors that combines SErCl and ERRANT. SERRANT uses ERRANT's annotations when they are informative and those provided by SErCl otherwise.
SERRANT is a system and code for automatic classification of English grammatical errors that combines SErCl and ERRANT. SERRANT uses ERRANT's annotations when they are informative and those provided by SErCl otherwise.
△ Less
Submitted 7 April, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.
-
Illumination Estimation Challenge: experience of past two years
Authors:
Egor Ershov,
Alex Savchik,
Ilya Semenkov,
Nikola Banić,
Karlo Koscević,
Marko Subašić,
Alexander Belokopytov,
Zhihao Li,
Arseniy Terekhin,
Daria Senshina,
Artem Nikonorov,
Yanlin Qian,
Marco Buzzelli,
Riccardo Riva,
Simone Bianco,
Raimondo Schettini,
Sven Lončarić,
Dmitry Nikolaev
Abstract:
Illumination estimation is the essential step of computational color constancy, one of the core parts of various image processing pipelines of modern digital cameras. Having an accurate and reliable illumination estimation is important for reducing the illumination influence on the image colors. To motivate the generation of new ideas and the development of new algorithms in this field, the 2nd Il…
▽ More
Illumination estimation is the essential step of computational color constancy, one of the core parts of various image processing pipelines of modern digital cameras. Having an accurate and reliable illumination estimation is important for reducing the illumination influence on the image colors. To motivate the generation of new ideas and the development of new algorithms in this field, the 2nd Illumination estimation challenge~(IEC\#2) was conducted. The main advantage of testing a method on a challenge over testing in on some of the known datasets is the fact that the ground-truth illuminations for the challenge test images are unknown up until the results have been submitted, which prevents any potential hyperparameter tuning that may be biased.
The challenge had several tracks: general, indoor, and two-illuminant with each of them focusing on different parameters of the scenes. Other main features of it are a new large dataset of images (about 5000) taken with the same camera sensor model, a manual markup accompanying each image, diverse content with scenes taken in numerous countries under a huge variety of illuminations extracted by using the SpyderCube calibration object, and a contest-like markup for the images from the Cube+ dataset that was used in IEC\#1.
This paper focuses on the description of the past two challenges, algorithms which won in each track, and the conclusions that were drawn based on the results obtained during the 1st and 2nd challenge that can be useful for similar future developments.
△ Less
Submitted 31 December, 2020;
originally announced December 2020.
-
ProLab: perceptually uniform projective colour coordinate system
Authors:
Ivan A. Konovalenko,
Anna A. Smagina,
Dmitry P. Nikolaev,
Petr P. Nikolaev
Abstract:
In this work, we propose proLab: a new colour coordinate system derived as a 3D projective transformation of CIE XYZ. We show that proLab is far ahead of the widely used CIELAB coordinate system (though inferior to the modern CAM16-UCS) according to perceptual uniformity evaluated by the STRESS metric in reference to the CIEDE2000 colour difference formula. At the same time, angular errors of chro…
▽ More
In this work, we propose proLab: a new colour coordinate system derived as a 3D projective transformation of CIE XYZ. We show that proLab is far ahead of the widely used CIELAB coordinate system (though inferior to the modern CAM16-UCS) according to perceptual uniformity evaluated by the STRESS metric in reference to the CIEDE2000 colour difference formula. At the same time, angular errors of chromaticity estimation that are standard for linear colour spaces can also be used in proLab since projective transformations preserve the linearity of manifolds. Unlike in linear spaces, angular errors for different hues are normalized according to human colour discrimination thresholds within proLab. We also demonstrate that shot noise in proLab is more homoscedastic than in CAM16-UCS or other standard colour spaces. This makes proLab a convenient coordinate system in which to perform linear colour analysis.
△ Less
Submitted 11 January, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Classifying Syntactic Errors in Learner Language
Authors:
Leshem Choshen,
Dmitry Nikolaev,
Yevgeni Berzak,
Omri Abend
Abstract:
We present a method for classifying syntactic errors in learner language, namely errors whose correction alters the morphosyntactic structure of a sentence.
The methodology builds on the established Universal Dependencies syntactic representation scheme, and provides complementary information to other error-classification systems.
Unlike existing error classification methods, our method is app…
▽ More
We present a method for classifying syntactic errors in learner language, namely errors whose correction alters the morphosyntactic structure of a sentence.
The methodology builds on the established Universal Dependencies syntactic representation scheme, and provides complementary information to other error-classification systems.
Unlike existing error classification methods, our method is applicable across languages, which we showcase by producing a detailed picture of syntactic errors in learner English and learner Russian. We further demonstrate the utility of the methodology for analyzing the outputs of leading Grammatical Error Correction (GEC) systems.
△ Less
Submitted 27 October, 2020; v1 submitted 21 October, 2020;
originally announced October 2020.
-
ResNet-like Architecture with Low Hardware Requirements
Authors:
Elena Limonova,
Daniil Alfonso,
Dmitry Nikolaev,
Vladimir V. Arlazarov
Abstract:
One of the most computationally intensive parts in modern recognition systems is an inference of deep neural networks that are used for image classification, segmentation, enhancement, and recognition. The growing popularity of edge computing makes us look for ways to reduce its time for mobile and embedded devices. One way to decrease the neural network inference time is to modify a neuron model…
▽ More
One of the most computationally intensive parts in modern recognition systems is an inference of deep neural networks that are used for image classification, segmentation, enhancement, and recognition. The growing popularity of edge computing makes us look for ways to reduce its time for mobile and embedded devices. One way to decrease the neural network inference time is to modify a neuron model to make it moreefficient for computations on a specific device. The example ofsuch a model is a bipolar morphological neuron model. The bipolar morphological neuron is based on the idea of replacing multiplication with addition and maximum operations. This model has been demonstrated for simple image classification with LeNet-like architectures [1]. In the paper, we introduce a bipolar morphological ResNet (BM-ResNet) model obtained from a much more complex ResNet architecture by converting its layers to bipolar morphological ones. We apply BM-ResNet to image classification on MNIST and CIFAR-10 datasets with only a moderate accuracy decrease from 99.3% to 99.1% and from 85.3% to 85.1%. We also estimate the computational complexity of the resulting model. We show that for the majority of ResNet layers, the considered model requires 2.1-2.9 times fewer logic gates for implementation and 15-30% lower latency.
△ Less
Submitted 21 October, 2020; v1 submitted 15 September, 2020;
originally announced September 2020.
-
Fast Implementation of 4-bit Convolutional Neural Networks for Mobile Devices
Authors:
Anton Trusov,
Elena Limonova,
Dmitry Slugin,
Dmitry Nikolaev,
Vladimir V. Arlazarov
Abstract:
Quantized low-precision neural networks are very popular because they require less computational resources for inference and can provide high performance, which is vital for real-time and embedded recognition systems. However, their advantages are apparent for FPGA and ASIC devices, while general-purpose processor architectures are not always able to perform low-bit integer computations efficientl…
▽ More
Quantized low-precision neural networks are very popular because they require less computational resources for inference and can provide high performance, which is vital for real-time and embedded recognition systems. However, their advantages are apparent for FPGA and ASIC devices, while general-purpose processor architectures are not always able to perform low-bit integer computations efficiently. The most frequently used low-precision neural network model for mobile central processors is an 8-bit quantized network. However, in a number of cases, it is possible to use fewer bits for weights and activations, and the only problem is the difficulty of efficient implementation. We introduce an efficient implementation of 4-bit matrix multiplication for quantized neural networks and perform time measurements on a mobile ARM processor. It shows 2.9 times speedup compared to standard floating-point multiplication and is 1.5 times faster than 8-bit quantized one. We also demonstrate a 4-bit quantized neural network for OCR recognition on the MIDV-500 dataset. 4-bit quantization gives 95.0% accuracy and 48% overall inference speedup, while an 8-bit quantized network gives 95.4% accuracy and 39% speedup. The results show that 4-bit quantization perfectly suits mobile devices, yielding good enough accuracy and low inference time.
△ Less
Submitted 20 October, 2020; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Line detection via a lightweight CNN with a Hough Layer
Authors:
Lev Teplyakov,
Kirill Kaymakov,
Evgeny Shvets,
Dmitry Nikolaev
Abstract:
Line detection is an important computer vision task traditionally solved by Hough Transform. With the advance of deep learning, however, trainable approaches to line detection became popular. In this paper we propose a lightweight CNN for line detection with an embedded parameter-free Hough layer, which allows the network neurons to have global strip-like receptive fields. We argue that traditiona…
▽ More
Line detection is an important computer vision task traditionally solved by Hough Transform. With the advance of deep learning, however, trainable approaches to line detection became popular. In this paper we propose a lightweight CNN for line detection with an embedded parameter-free Hough layer, which allows the network neurons to have global strip-like receptive fields. We argue that traditional convolutional networks have two inherent problems when applied to the task of line detection and show how insertion of a Hough layer into the network solves them. Additionally, we point out some major inconsistencies in the current datasets used for line detection.
△ Less
Submitted 16 October, 2020; v1 submitted 20 August, 2020;
originally announced August 2020.
-
Approach for Document Detection by Contours and Contrasts
Authors:
Daniil V. Tropin,
Sergey A. Ilyuhin,
Dmitry P. Nikolaev,
Vladimir V. Arlazarov
Abstract:
This paper considers arbitrary document detection performed on a mobile device. The classical contour-based approach often fails in cases featuring occlusion, complex background, or blur. The region-based approach, which relies on the contrast between object and background, does not have application limitations, however, its known implementations are highly resource-consuming. We propose a modific…
▽ More
This paper considers arbitrary document detection performed on a mobile device. The classical contour-based approach often fails in cases featuring occlusion, complex background, or blur. The region-based approach, which relies on the contrast between object and background, does not have application limitations, however, its known implementations are highly resource-consuming. We propose a modification of the contour-based method, in which the competing contour location hypotheses are ranked according to the contrast between the areas inside and outside the border. In the experiments, such modification allows for the decrease of alternatives ordering errors by 40% and the decrease of the overall detection errors by 10%. The proposed method provides unmatched state-of-the-art performance on the open MIDV-500 dataset, and it demonstrates results comparable with state-of-the-art performance on the SmartDoc dataset.
△ Less
Submitted 19 October, 2020; v1 submitted 6 August, 2020;
originally announced August 2020.
-
Accelerated FBP for computed tomography image reconstruction
Authors:
Anastasiya Dolmatova,
Marina Chukalina,
Dmitry Nikolaev
Abstract:
Filtered back projection (FBP) is a commonly used technique in tomographic image reconstruction demonstrating acceptable quality. The classical direct implementations of this algorithm require the execution of $Θ(N^3)$ operations, where $N$ is the linear size of the 2D slice. Recent approaches including reconstruction via the Fourier slice theorem require $Θ(N^2\log N)$ multiplication operations.…
▽ More
Filtered back projection (FBP) is a commonly used technique in tomographic image reconstruction demonstrating acceptable quality. The classical direct implementations of this algorithm require the execution of $Θ(N^3)$ operations, where $N$ is the linear size of the 2D slice. Recent approaches including reconstruction via the Fourier slice theorem require $Θ(N^2\log N)$ multiplication operations. In this paper, we propose a novel approach that reduces the computational complexity of the algorithm to $Θ(N^2\log N)$ addition operations avoiding Fourier space. For speeding up the convolution, ramp filter is approximated by a pair of causal and anticausal recursive filters, also known as Infinite Impulse Response filters. The back projection is performed with the fast discrete Hough transform. Experimental results on simulated data demonstrate the efficiency of the proposed approach.
△ Less
Submitted 13 July, 2020;
originally announced July 2020.
-
Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences
Authors:
Dmitry Nikolaev,
Ofir Arviv,
Taelin Karidi,
Neta Kenneth,
Veronika Mitnik,
Lilja Maria Saeboe,
Omri Abend
Abstract:
The patterns in which the syntax of different languages converges and diverges are often used to inform work on cross-lingual transfer. Nevertheless, little empirical work has been done on quantifying the prevalence of different syntactic divergences across language pairs. We propose a framework for extracting divergence patterns for any language pair from a parallel corpus, building on Universal…
▽ More
The patterns in which the syntax of different languages converges and diverges are often used to inform work on cross-lingual transfer. Nevertheless, little empirical work has been done on quantifying the prevalence of different syntactic divergences across language pairs. We propose a framework for extracting divergence patterns for any language pair from a parallel corpus, building on Universal Dependencies. We show that our framework provides a detailed picture of cross-language divergences, generalizes previous approaches, and lends itself to full automation. We further present a novel dataset, a manually word-aligned subset of the Parallel UD corpus in five languages, and use it to perform a detailed corpus study. We demonstrate the usefulness of the resulting analysis by showing that it can help account for performance patterns of a cross-lingual parser.
△ Less
Submitted 13 July, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Fast Implementation of Morphological Filtering Using ARM NEON Extension
Authors:
Elena Limonova,
Arseny Terekhin,
Dmitry Nikolaev,
Vladimir Arlazarov
Abstract:
In this paper we consider speedup potential of morphological image filtering on ARM processors. Morphological operations are widely used in image analysis and recognition and their speedup in some cases can significantly reduce overall execution time of recognition. More specifically, we propose fast implementation of erosion and dilation using ARM SIMD extension NEON. These operations with the re…
▽ More
In this paper we consider speedup potential of morphological image filtering on ARM processors. Morphological operations are widely used in image analysis and recognition and their speedup in some cases can significantly reduce overall execution time of recognition. More specifically, we propose fast implementation of erosion and dilation using ARM SIMD extension NEON. These operations with the rectangular structuring element are separable. They were implemented using the advantages of separability as sequential horizontal and vertical passes. Each pass was implemented using van Herk/Gil-Werman algorithm for large windows and low-constant linear complexity algorithm for small windows. Final implementation was improved with SIMD and used a combination of these methods. We also considered fast transpose implementation of 8x8 and 16x16 matrices using ARM NEON to get additional computational gain for morphological operations. Experiments showed 3 times efficiency increase for final implementation of erosion and dilation compared to van Herk/Gil-Werman algorithm without SIMD, 5.7 times speedup for 8x8 matrix transpose and 12 times speedup for 16x16 matrix transpose compared to transpose without SIMD.
△ Less
Submitted 19 February, 2020;
originally announced February 2020.
-
Computational optimization of convolutional neural networks using separated filters architecture
Authors:
Elena Limonova,
Alexander Sheshkus,
Dmitry Nikolaev
Abstract:
This paper considers a convolutional neural network transformation that reduces computation complexity and thus speedups neural network processing. Usage of convolutional neural networks (CNN) is the standard approach to image recognition despite the fact they can be too computationally demanding, for example for recognition on mobile platforms or in embedded systems. In this paper we propose CNN…
▽ More
This paper considers a convolutional neural network transformation that reduces computation complexity and thus speedups neural network processing. Usage of convolutional neural networks (CNN) is the standard approach to image recognition despite the fact they can be too computationally demanding, for example for recognition on mobile platforms or in embedded systems. In this paper we propose CNN structure transformation which expresses 2D convolution filters as a linear combination of separable filters. It allows to obtain separated convolutional filters by standard training algorithms. We study the computation efficiency of this structure transformation and suggest fast implementation easily handled by CPU or GPU. We demonstrate that CNNs designed for letter and digit recognition of proposed structure show 15% speedup without accuracy loss in industrial image recognition system. In conclusion, we discuss the question of possible accuracy decrease and the application of proposed transformation to different recognition problems. convolutional neural networks, computational optimization, separable filters, complexity reduction.
△ Less
Submitted 18 February, 2020;
originally announced February 2020.
-
Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network
Authors:
A. Sheshkus,
A. Chirvonaya,
D. Matveev,
D. Nikolaev,
V. L. Arlazarov
Abstract:
In this paper, we suggest a new neural network architecture for vanishing point detection in images. The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with standard activation functions. It allows us to get the answer in the coordinates of the input image at the output of the network and thus to calculate the coordinates of the va…
▽ More
In this paper, we suggest a new neural network architecture for vanishing point detection in images. The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with standard activation functions. It allows us to get the answer in the coordinates of the input image at the output of the network and thus to calculate the coordinates of the vanishing point by simply selecting the maximum. Besides, it was proved that calculation of the transposed Fast Hough Transform can be performed using the direct one. The use of integral operators enables the neural network to rely on global rectilinear features in the image, and so it is ideal for detecting vanishing points. To demonstrate the effectiveness of the proposed architecture, we use a set of images from a DVR and show its superiority over existing methods. Note, in addition, that the proposed neural network architecture essentially repeats the process of direct and back projection used, for example, in computed tomography.
△ Less
Submitted 7 July, 2020; v1 submitted 4 February, 2020;
originally announced February 2020.
-
A Document Skew Detection Method Using Fast Hough Transform
Authors:
Pavel Bezmaternykh,
Dmitry Nikolaev
Abstract:
The majority of document image analysis systems use a document skew detection algorithm to simplify all its further processing stages. A huge amount of such algorithms based on Hough transform (HT) analysis has already been proposed. Despite this, we managed to find only one work where the Fast Hough Transform (FHT) usage was suggested to solve the indicated problem. Unfortunately, no study of tha…
▽ More
The majority of document image analysis systems use a document skew detection algorithm to simplify all its further processing stages. A huge amount of such algorithms based on Hough transform (HT) analysis has already been proposed. Despite this, we managed to find only one work where the Fast Hough Transform (FHT) usage was suggested to solve the indicated problem. Unfortunately, no study of that method was provided. In this work, we propose and study a skew detection algorithm for the document images which relies on FHT analysis. To measure this algorithm quality we use the dataset from the problem oriented DISEC'13 contest and its evaluation methodology. Obtained values for AED, TOP80, and CE criteria are equal to 0.086, 0.056, 68.80 respectively.
△ Less
Submitted 5 December, 2019;
originally announced December 2019.
-
A Low Computational Approach for Price Tag Recognition
Authors:
M. A. Aliev,
D. A. Bocharov,
I. A. Kunina,
D. P. Nikolaev
Abstract:
In this work we discuss the task of search, localization and recognition of price zone within a photograph of the price tag. The task is being addressed for the case when image is acquired by small-scale digital camera and calculation device has significant resource constraints. The proposed approach is based on Niblack binarization algorithm, analysis and clasterization of connected components in…
▽ More
In this work we discuss the task of search, localization and recognition of price zone within a photograph of the price tag. The task is being addressed for the case when image is acquired by small-scale digital camera and calculation device has significant resource constraints. The proposed approach is based on Niblack binarization algorithm, analysis and clasterization of connected components in conditions of known price tag geometrical model. The algorithm was tested on a private dataset and has shown high quality.
△ Less
Submitted 4 December, 2019;
originally announced December 2019.
-
A Method of Detecting End-To-End Curves of Limited Curvature
Authors:
Ekaterina Panfilova,
Mikhail Aliev,
Irina Kunina,
Vasiliy Postnikov,
Dmitry Nikolaev
Abstract:
In this paper we consider a method for detecting end-to-end curves of limited curvature like the k-link polylines with bending angle between adjacent segments in a given range. The approximation accuracy is achieved by maximization of the quality function in the image matrix. The method is based on a dynamic programming scheme constructed over Fast Hough Transform calculation results for image ban…
▽ More
In this paper we consider a method for detecting end-to-end curves of limited curvature like the k-link polylines with bending angle between adjacent segments in a given range. The approximation accuracy is achieved by maximization of the quality function in the image matrix. The method is based on a dynamic programming scheme constructed over Fast Hough Transform calculation results for image bands. The proposed method asymptotic complexity is $O(h \cdot (w+ \frac{h}{k}) \cdot log(\frac{h}{k}))$, where $h$ and $w$ are the image size, and $k$ is the approximating polyline links number, which is an analogue of the complexity of the fast Fourier transform or the fast Hough transform. We also show the results of the proposed method on synthetic and real data.
△ Less
Submitted 4 December, 2019;
originally announced December 2019.
-
Bipolar Morphological Neural Networks: Convolution Without Multiplication
Authors:
Elena Limonova,
Daniil Matveev,
Dmitry Nikolaev,
Vladimir V. Arlazarov
Abstract:
In the paper we introduce a novel bipolar morphological neuron and bipolar morphological layer models. The models use only such operations as addition, subtraction and maximum inside the neuron and exponent and logarithm as activation functions for the layer. The proposed models unlike previously introduced morphological neural networks approximate the classical computations and show better recogn…
▽ More
In the paper we introduce a novel bipolar morphological neuron and bipolar morphological layer models. The models use only such operations as addition, subtraction and maximum inside the neuron and exponent and logarithm as activation functions for the layer. The proposed models unlike previously introduced morphological neural networks approximate the classical computations and show better recognition results. We also propose layer-by-layer approach to train the bipolar morphological networks, which can be further developed to an incremental approach for separate neurons to get higher accuracy. Both these approaches do not require special training algorithms and can use a variety of gradient descent methods. To demonstrate efficiency of the proposed model we consider classical convolutional neural networks and convert the pre-trained convolutional layers to the bipolar morphological layers. Seeing that the experiments on recognition of MNIST and MRZ symbols show only moderate decrease of accuracy after conversion and training, bipolar neuron model can provide faster inference and be very useful in mobile and embedded systems.
△ Less
Submitted 5 November, 2019;
originally announced November 2019.
-
HoughNet: neural network architecture for vanishing points detection
Authors:
Alexander Sheshkus,
Anastasia Ingacheva,
Vladimir Arlazarov,
Dmitry Nikolaev
Abstract:
In this paper we introduce a novel neural network architecture based on Fast Hough Transform layer. The layer of this type allows our neural network to accumulate features from linear areas across the entire image instead of local areas. We demonstrate its potential by solving the problem of vanishing points detection in the images of documents. Such problem occurs when dealing with camera shots o…
▽ More
In this paper we introduce a novel neural network architecture based on Fast Hough Transform layer. The layer of this type allows our neural network to accumulate features from linear areas across the entire image instead of local areas. We demonstrate its potential by solving the problem of vanishing points detection in the images of documents. Such problem occurs when dealing with camera shots of the documents in uncontrolled conditions. In this case, the document image can suffer several specific distortions including projective transform. To train our model, we use MIDV-500 dataset and provide testing results. The strong generalization ability of the suggested method is proven with its applying to a completely different ICDAR 2011 dewarping contest. In previously published papers considering these dataset authors measured the quality of vanishing point detection by counting correctly recognized words with open OCR engine Tesseract. To compare with them, we reproduce this experiment and show that our method outperforms the state-of-the-art result.
△ Less
Submitted 6 October, 2019; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Linear colour segmentation revisited
Authors:
Anna Smagina,
Valentina Bozhkova,
Sergey Gladilin,
Dmitry Nikolaev
Abstract:
In this work we discuss the known algorithms for linear colour segmentation based on a physical approach and propose a new modification of segmentation algorithm. This algorithm is based on a region adjacency graph framework without a pre-segmentation stage. Proposed edge weight functions are defined from linear image model with normal noise. The colour space projective transform is introduced as…
▽ More
In this work we discuss the known algorithms for linear colour segmentation based on a physical approach and propose a new modification of segmentation algorithm. This algorithm is based on a region adjacency graph framework without a pre-segmentation stage. Proposed edge weight functions are defined from linear image model with normal noise. The colour space projective transform is introduced as a novel pre-processing technique for better handling of shadow and highlight areas. The resulting algorithm is tested on a benchmark dataset consisting of the images of 19 natural scenes selected from the Barnard's DXC-930 SFU dataset and 12 natural scene images newly published for common use. The dataset is provided with pixel-by-pixel ground truth colour segmentation for every image. Using this dataset, we show that the proposed algorithm modifications lead to qualitative advantages over other model-based segmentation algorithms, and also show the positive effect of each proposed modification. The source code and datasets for this work are available for free access at http://github.com/visillect/segmentation.
△ Less
Submitted 2 January, 2019;
originally announced January 2019.
-
On the use of FHT, its modification for practical applications and the structure of Hough image
Authors:
M. Aliev,
E. I. Ershov,
D. P. Nikolaev
Abstract:
This work focuses on the Fast Hough Transform (FHT) algorithm proposed by M.L. Brady. We propose how to modify the standard FHT to calculate sums along lines within any given range of their inclination angles. We also describe a new way to visualise Hough-image based on regrouping of accumulator space around its center. Finally, we prove that using Brady parameterization transforms any line into a…
▽ More
This work focuses on the Fast Hough Transform (FHT) algorithm proposed by M.L. Brady. We propose how to modify the standard FHT to calculate sums along lines within any given range of their inclination angles. We also describe a new way to visualise Hough-image based on regrouping of accumulator space around its center. Finally, we prove that using Brady parameterization transforms any line into a figure of type "angle".
△ Less
Submitted 14 November, 2018;
originally announced November 2018.