Learning to Move Objects with Fluid Streams in a Differentiable Simulation

Authors: Karlis Freivalds, Laura Leja, Oskars Teikmanis

Abstract: We introduce a method for manipulating objects in three-dimensional space using controlled fluid streams. To achieve this, we train a neural network controller in a differentiable simulation and evaluate it in a simulated environment consisting of an 8x8 grid of vertical emitters. By carrying out various horizontal displacement tasks such as moving objects to specific positions while reacting to external perturbations, we demonstrate that a controller, trained with a limited number of iterations, can generalise to longer episodes and learn the complex dynamics of fluid-solid interactions. Importantly, our approach requires only the observation of the manipulated object's state, paving the way for the development of physical systems that enable contactless manipulation of objects using air streams.

Submitted 28 April, 2024; originally announced April 2024.

Comments: 8 pages, 7 figures

arXiv:2309.05295

Discrete Denoising Diffusion Approach to Integer Factorization

Authors: Karlis Freivalds, Emils Ozolins, Guntis Barzdins

Abstract: Integer factorization is a famous computational problem for which it is unknown whether it can be solved in polynomial time. With the rise of deep neural networks, it is interesting to ask whether they can facilitate faster factorization. We present an approach to factorization utilizing deep neural networks and discrete denoising diffusion that works by iteratively correcting errors in a partially-correct solution. To this end, we develop a new seq2seq neural network architecture, employ relaxed categorical distribution and adapt the reverse diffusion process to cope better with inaccuracies in the denoising step. The approach is able to find factors for integers up to 56 bits long. Our analysis indicates that investment in training leads to an exponential decrease of sampling steps required at inference to achieve a given success rate, thus counteracting an exponential run-time increase depending on the bit-length.

Submitted 11 September, 2023; originally announced September 2023.

Comments: International Conference on Artificial Neural Networks ICANN 2023

arXiv:2212.00121

Denoising Diffusion for Sampling SAT Solutions

Authors: Karlis Freivalds, Sergejs Kozlovics

Abstract: Generating diverse solutions to the Boolean Satisfiability Problem (SAT) is a hard computational problem with practical applications for testing and functional verification of software and hardware designs. We explore a way to generate such solutions using Denoising Diffusion coupled with a Graph Neural Network to implement the denoising function. We find that the obtained accuracy is similar to the currently best purely neural method and the produced SAT solutions are highly diverse, even if the system is trained with non-random solutions from a standard solver.

Submitted 30 November, 2022; originally announced December 2022.

Comments: NeurIPS 2022 Workshop on Score-Based Methods

arXiv:2207.13667

Unsupervised Training for Neural TSP Solver

Authors: Elīza Gaile, Andis Draguns, Emīls Ozoliņš, Kārlis Freivalds

Abstract: There has been a growing number of machine learning methods for approximately solving the travelling salesman problem. However, these methods often require solved instances for training or use complex reinforcement learning approaches that need a large amount of tuning. To avoid these problems, we introduce a novel unsupervised learning approach. We use a relaxation of an integer linear program for TSP to construct a loss function that does not require correct instance labels. With variable discretization, its minimum coincides with the optimal or near-optimal solution. Furthermore, this loss function is differentiable and thus can be used to train neural networks directly. We use our loss function with a Graph Neural Network and design controlled experiments on both Euclidean and asymmetric TSP. Our approach has the advantage over supervised learning of not requiring large labelled datasets. In addition, the performance of our approach surpasses reinforcement learning for asymmetric TSP and is comparable to reinforcement learning for Euclidean instances. Our approach is also more stable and easier to train than reinforcement learning.

Submitted 27 July, 2022; originally announced July 2022.

arXiv:2108.00527

Gates Are Not What You Need in RNNs

Authors: Ronalds Zakovskis, Andis Draguns, Eliza Gaile, Emils Ozolins, Karlis Freivalds

Abstract: Recurrent neural networks have flourished in many areas. Consequently, we can see new RNN cells being developed continuously, usually by creating or using gates in a new, original way. But what if we told you that gates in RNNs are redundant? In this paper, we propose a new recurrent cell called Residual Recurrent Unit (RRU) which beats traditional cells and does not employ a single gate. It is based on the residual shortcut connection, linear transformations, ReLU, and normalization. To evaluate our cell's effectiveness, we compare its performance against the widely-used GRU and LSTM cells and the recently proposed Mogrifier LSTM on several tasks including polyphonic music modeling, language modeling, and sentiment analysis. Our experiments show that RRU outperforms the traditional gated units on most of these tasks. Also, it has better robustness to parameter selection, allowing immediate application in new tasks without much tuning. We have implemented the RRU in TensorFlow, and the code is made available at https://github.com/LUMII-Syslab/RRU .

Submitted 21 November, 2023; v1 submitted 1 August, 2021; originally announced August 2021.

Comments: Published in Artificial Intelligence and Soft Computing, ICAISC 2023, Lecture Notes in Computer Science, vol 14125, Springer, Cham. Available online at https://doi.org/10.1007/978-3-031-42505-9_27

arXiv:2106.07162

doi 10.1109/IJCNN55064.2022.9892733

Goal-Aware Neural SAT Solver

Authors: Emils Ozolins, Karlis Freivalds, Andis Draguns, Eliza Gaile, Ronalds Zakovskis, Sergejs Kozlovics

Abstract: Modern neural networks obtain information about the problem and calculate the output solely from the input values. We argue that it is not always optimal, and the network's performance can be significantly improved by augmenting it with a query mechanism that allows the network at run time to make several solution trials and get feedback on the loss value on each trial. To demonstrate the capabilities of the query mechanism, we formulate an unsupervised (not depending on labels) loss function for the Boolean Satisfiability Problem (SAT) and theoretically show that it allows the network to extract rich information about the problem. We then propose a neural SAT solver with a query mechanism called QuerySAT and show that it outperforms the neural baseline on a wide range of SAT tasks.

Submitted 30 May, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

arXiv:2006.15892

Matrix Shuffle-Exchange Networks for Hard 2D Tasks

Authors: Emīls Ozoliņš, Kārlis Freivalds, Agris Šostaks

Abstract: Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents their application to more complex 2D tasks. We propose a new neural model, called Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has comparable speed to a convolutional neural network. It is derived from Neural Shuffle-Exchange network and has $\mathcal{O}(\log{n})$ layers and $\mathcal{O}(n^2 \log{n})$ total time and space complexity for processing an $n \times n$ data matrix. We show that the Matrix Shuffle-Exchange network is well-suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances, much larger than could be processed with models equipped with a dense attention mechanism.

Submitted 5 October, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

arXiv:2004.04662

Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences

Authors: Andis Draguns, Emīls Ozoliņš, Agris Šostaks, Matīss Apinis, Kārlis Freivalds

Abstract: Attention is a commonly used mechanism in sequence processing, but it is of O(n^2) complexity which prevents its application to long sequences. The recently introduced neural Shuffle-Exchange network offers a computation-efficient alternative, enabling the modelling of long-range dependencies in O(n log n) time. The model, however, is quite complex, involving a sophisticated gating mechanism derived from the Gated Recurrent Unit. In this paper, we present a simple and lightweight variant of the Shuffle-Exchange network, which is based on a residual network employing GELU and Layer Normalization. The proposed architecture not only scales to longer sequences but also converges faster and provides better accuracy. It surpasses the Shuffle-Exchange network on the LAMBADA language modelling task and achieves state-of-the-art performance on the MusicNet dataset for music transcription while being efficient in the number of parameters. We show how to combine the improved Shuffle-Exchange network with convolutional layers, establishing it as a useful building block in long sequence processing applications.

Submitted 14 January, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

Comments: 35th AAAI Conference on Artificial Intelligence (AAAI-21)

arXiv:1907.07897

Neural Shuffle-Exchange Networks -- Sequence Processing in O(n log n) Time

Authors: Kārlis Freivalds, Emīls Ozoliņš, Agris Šostaks

Abstract: A key requirement in sequence to sequence processing is the modeling of long range dependencies. To this end, a vast majority of the state-of-the-art models use an attention mechanism, which is of O(n^2) complexity and leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence to sequence tasks which has O(log n) depth and O(n log n) total complexity. We show that this model is powerful enough to infer efficient algorithms for common algorithmic benchmarks including sorting, addition and multiplication. We evaluate our architecture on the challenging LAMBADA question answering dataset and compare it with the state-of-the-art models which use attention. Our model achieves competitive accuracy and scales to sequences with more than a hundred thousand elements. We are confident that the proposed model has the potential for building more efficient architectures for processing large interrelated data in language modeling, music generation and other application domains.

Submitted 28 October, 2019; v1 submitted 18 July, 2019; originally announced July 2019.

Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

arXiv:1807.09368

doi 10.1007/978-3-319-09174-7_22

Graph Compact Orthogonal Layout Algorithm

Authors: Karlis Freivalds, Jans Glagolevs

Abstract: There exist many orthogonal graph drawing algorithms that minimize edge crossings or edge bends; however, they produce unsatisfactory drawings in many practical cases. In this paper we present a grid-based algorithm for drawing orthogonal graphs with nodes of prescribed size. It is distinguished by creating pleasant and compact drawings in a relatively short running time. The main idea is to minimize the total edge length, which implicitly minimizes crossings and makes the drawing easy to comprehend. The algorithm is based on combining local and global improvements. Local improvements are moving each node to a new place and swapping of nodes. Global improvement is based on a constrained quadratic programming approach that minimizes the total edge length while keeping node relative positions.

Submitted 24 July, 2018; originally announced July 2018.

Journal ref: Freivalds K., Glagolevs J. (2014) Graph Compact Orthogonal Layout Algorithm. In: Fouilhoux P., Gouveia L., Mahjoub A., Paschos V. (eds) Combinatorial Optimization. ISCO 2014. Lecture Notes in Computer Science, vol 8596. Springer, Cham

arXiv:1807.08335

doi 10.1145/3121360.3121364

A Statistical Method for Object Counting

Authors: Jans Glagolevs, Karlis Freivalds

Abstract: In this paper we present a new object counting method that is intended for counting similarly sized and mostly round objects. Unlike many other algorithms with the same purpose, the proposed method does not rely on identifying every object; it uses statistical data obtained from the image instead. The method is evaluated on images with human bone cells, oranges and pills, achieving good accuracy. Its strengths are the ability to deal with touching and partly overlapping objects, the ability to work with different kinds of objects without prior configuration, and good performance.

Submitted 22 July, 2018; originally announced July 2018.

Journal ref: In Proceedings of the International Conference on Graphics and Signal Processing (ICGSP '17). ACM, New York, NY, USA, 61-64 (2017)

arXiv:1702.08727

Improving the Neural GPU Architecture for Algorithm Learning

Authors: Karlis Freivalds, Renars Liepins

Abstract: Algorithm learning is a core problem in artificial intelligence with significant implications for the level of automation that can be achieved by machines. Recently, deep learning methods have been emerging for synthesizing an algorithm from its input-output examples, the most successful being the Neural GPU, capable of learning multiplication. We present several improvements to the Neural GPU that substantially reduce training time and improve generalization. We introduce a new technique, hard nonlinearities with saturation costs, that has general applicability. We also introduce a technique of diagonal gates that can be applied to active-memory models. The proposed architecture is the first capable of learning decimal multiplication end-to-end.

Submitted 4 July, 2018; v1 submitted 28 February, 2017; originally announced February 2017.

Comments: Minor edits

Journal ref: NAMPI v2 - Neural Abstract Machines & Program Induction v2, 2018

Showing 1–12 of 12 results for author: Freivalds, K