-
Learning to Move Objects with Fluid Streams in a Differentiable Simulation
Authors:
Karlis Freivalds,
Laura Leja,
Oskars Teikmanis
Abstract:
We introduce a method for manipulating objects in three-dimensional space using controlled fluid streams. To achieve this, we train a neural network controller in a differentiable simulation and evaluate it in a simulated environment consisting of an 8x8 grid of vertical emitters. By carrying out various horizontal displacement tasks such as moving objects to specific positions while reacting to e…
▽ More
We introduce a method for manipulating objects in three-dimensional space using controlled fluid streams. To achieve this, we train a neural network controller in a differentiable simulation and evaluate it in a simulated environment consisting of an 8x8 grid of vertical emitters. By carrying out various horizontal displacement tasks such as moving objects to specific positions while reacting to external perturbations, we demonstrate that a controller, trained with a limited number of iterations, can generalise to longer episodes and learn the complex dynamics of fluid-solid interactions. Importantly, our approach requires only the observation of the manipulated object's state, paving the way for the development of physical systems that enable contactless manipulation of objects using air streams.
△ Less
Submitted 28 April, 2024;
originally announced April 2024.
-
Discrete Denoising Diffusion Approach to Integer Factorization
Authors:
Karlis Freivalds,
Emils Ozolins,
Guntis Barzdins
Abstract:
Integer factorization is a famous computational problem unknown whether being solvable in the polynomial time. With the rise of deep neural networks, it is interesting whether they can facilitate faster factorization. We present an approach to factorization utilizing deep neural networks and discrete denoising diffusion that works by iteratively correcting errors in a partially-correct solution. T…
▽ More
Integer factorization is a famous computational problem unknown whether being solvable in the polynomial time. With the rise of deep neural networks, it is interesting whether they can facilitate faster factorization. We present an approach to factorization utilizing deep neural networks and discrete denoising diffusion that works by iteratively correcting errors in a partially-correct solution. To this end, we develop a new seq2seq neural network architecture, employ relaxed categorical distribution and adapt the reverse diffusion process to cope better with inaccuracies in the denoising step. The approach is able to find factors for integers of up to 56 bits long. Our analysis indicates that investment in training leads to an exponential decrease of sampling steps required at inference to achieve a given success rate, thus counteracting an exponential run-time increase depending on the bit-length.
△ Less
Submitted 11 September, 2023;
originally announced September 2023.
-
Denoising Diffusion for Sampling SAT Solutions
Authors:
Karlis Freivalds,
Sergejs Kozlovics
Abstract:
Generating diverse solutions to the Boolean Satisfiability Problem (SAT) is a hard computational problem with practical applications for testing and functional verification of software and hardware designs. We explore the way to generate such solutions using Denoising Diffusion coupled with a Graph Neural Network to implement the denoising function. We find that the obtained accuracy is similar to…
▽ More
Generating diverse solutions to the Boolean Satisfiability Problem (SAT) is a hard computational problem with practical applications for testing and functional verification of software and hardware designs. We explore the way to generate such solutions using Denoising Diffusion coupled with a Graph Neural Network to implement the denoising function. We find that the obtained accuracy is similar to the currently best purely neural method and the produced SAT solutions are highly diverse, even if the system is trained with non-random solutions from a standard solver.
△ Less
Submitted 30 November, 2022;
originally announced December 2022.
-
Unsupervised Training for Neural TSP Solver
Authors:
Elīza Gaile,
Andis Draguns,
Emīls Ozoliņš,
Kārlis Freivalds
Abstract:
There has been a growing number of machine learning methods for approximately solving the travelling salesman problem. However, these methods often require solved instances for training or use complex reinforcement learning approaches that need a large amount of tuning. To avoid these problems, we introduce a novel unsupervised learning approach. We use a relaxation of an integer linear program fo…
▽ More
There has been a growing number of machine learning methods for approximately solving the travelling salesman problem. However, these methods often require solved instances for training or use complex reinforcement learning approaches that need a large amount of tuning. To avoid these problems, we introduce a novel unsupervised learning approach. We use a relaxation of an integer linear program for TSP to construct a loss function that does not require correct instance labels. With variable discretization, its minimum coincides with the optimal or near-optimal solution. Furthermore, this loss function is differentiable and thus can be used to train neural networks directly. We use our loss function with a Graph Neural Network and design controlled experiments on both Euclidean and asymmetric TSP. Our approach has the advantage over supervised learning of not requiring large labelled datasets. In addition, the performance of our approach surpasses reinforcement learning for asymmetric TSP and is comparable to reinforcement learning for Euclidean instances. Our approach is also more stable and easier to train than reinforcement learning.
△ Less
Submitted 27 July, 2022;
originally announced July 2022.
-
Gates Are Not What You Need in RNNs
Authors:
Ronalds Zakovskis,
Andis Draguns,
Eliza Gaile,
Emils Ozolins,
Karlis Freivalds
Abstract:
Recurrent neural networks have flourished in many areas. Consequently, we can see new RNN cells being developed continuously, usually by creating or using gates in a new, original way. But what if we told you that gates in RNNs are redundant? In this paper, we propose a new recurrent cell called Residual Recurrent Unit (RRU) which beats traditional cells and does not employ a single gate. It is ba…
▽ More
Recurrent neural networks have flourished in many areas. Consequently, we can see new RNN cells being developed continuously, usually by creating or using gates in a new, original way. But what if we told you that gates in RNNs are redundant? In this paper, we propose a new recurrent cell called Residual Recurrent Unit (RRU) which beats traditional cells and does not employ a single gate. It is based on the residual shortcut connection, linear transformations, ReLU, and normalization. To evaluate our cell's effectiveness, we compare its performance against the widely-used GRU and LSTM cells and the recently proposed Mogrifier LSTM on several tasks including, polyphonic music modeling, language modeling, and sentiment analysis. Our experiments show that RRU outperforms the traditional gated units on most of these tasks. Also, it has better robustness to parameter selection, allowing immediate application in new tasks without much tuning. We have implemented the RRU in TensorFlow, and the code is made available at https://github.com/LUMII-Syslab/RRU .
△ Less
Submitted 21 November, 2023; v1 submitted 1 August, 2021;
originally announced August 2021.
-
Goal-Aware Neural SAT Solver
Authors:
Emils Ozolins,
Karlis Freivalds,
Andis Draguns,
Eliza Gaile,
Ronalds Zakovskis,
Sergejs Kozlovics
Abstract:
Modern neural networks obtain information about the problem and calculate the output solely from the input values. We argue that it is not always optimal, and the network's performance can be significantly improved by augmenting it with a query mechanism that allows the network at run time to make several solution trials and get feedback on the loss value on each trial. To demonstrate the capabili…
▽ More
Modern neural networks obtain information about the problem and calculate the output solely from the input values. We argue that it is not always optimal, and the network's performance can be significantly improved by augmenting it with a query mechanism that allows the network at run time to make several solution trials and get feedback on the loss value on each trial. To demonstrate the capabilities of the query mechanism, we formulate an unsupervised (not depending on labels) loss function for Boolean Satisfiability Problem (SAT) and theoretically show that it allows the network to extract rich information about the problem. We then propose a neural SAT solver with a query mechanism called QuerySAT and show that it outperforms the neural baseline on a wide range of SAT tasks.
△ Less
Submitted 30 May, 2022; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Authors:
Emīls Ozoliņš,
Kārlis Freivalds,
Agris Šostaks
Abstract:
Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents its applications to more complex 2D tasks. We propose a new neural model, called Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has comparable speed to a convoluti…
▽ More
Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents its applications to more complex 2D tasks. We propose a new neural model, called Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has comparable speed to a convolutional neural network. It is derived from Neural Shuffle-Exchange network and has $\mathcal{O}( \log{n})$ layers and $\mathcal{O}( n^2 \log{n})$ total time and space complexity for processing a $n \times n$ data matrix. We show that the Matrix Shuffle-Exchange network is well-suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances - much larger than could be processed with models equipped with a dense attention mechanism.
△ Less
Submitted 5 October, 2020; v1 submitted 29 June, 2020;
originally announced June 2020.
-
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences
Authors:
Andis Draguns,
Emīls Ozoliņš,
Agris Šostaks,
Matīss Apinis,
Kārlis Freivalds
Abstract:
Attention is a commonly used mechanism in sequence processing, but it is of O(n^2) complexity which prevents its application to long sequences. The recently introduced neural Shuffle-Exchange network offers a computation-efficient alternative, enabling the modelling of long-range dependencies in O(n log n) time. The model, however, is quite complex, involving a sophisticated gating mechanism deriv…
▽ More
Attention is a commonly used mechanism in sequence processing, but it is of O(n^2) complexity which prevents its application to long sequences. The recently introduced neural Shuffle-Exchange network offers a computation-efficient alternative, enabling the modelling of long-range dependencies in O(n log n) time. The model, however, is quite complex, involving a sophisticated gating mechanism derived from the Gated Recurrent Unit. In this paper, we present a simple and lightweight variant of the Shuffle-Exchange network, which is based on a residual network employing GELU and Layer Normalization. The proposed architecture not only scales to longer sequences but also converges faster and provides better accuracy. It surpasses the Shuffle-Exchange network on the LAMBADA language modelling task and achieves state-of-the-art performance on the MusicNet dataset for music transcription while being efficient in the number of parameters. We show how to combine the improved Shuffle-Exchange network with convolutional layers, establishing it as a useful building block in long sequence processing applications.
△ Less
Submitted 14 January, 2021; v1 submitted 6 April, 2020;
originally announced April 2020.
-
Neural Shuffle-Exchange Networks -- Sequence Processing in O(n log n) Time
Authors:
Kārlis Freivalds,
Emīls Ozoliņš,
Agris Šostaks
Abstract:
A key requirement in sequence to sequence processing is the modeling of long range dependencies. To this end, a vast majority of the state-of-the-art models use attention mechanism which is of O($n^2$) complexity that leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence to sequence tasks which have O(log n) depth and O(n log n) total com…
▽ More
A key requirement in sequence to sequence processing is the modeling of long range dependencies. To this end, a vast majority of the state-of-the-art models use attention mechanism which is of O($n^2$) complexity that leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence to sequence tasks which have O(log n) depth and O(n log n) total complexity. We show that this model is powerful enough to infer efficient algorithms for common algorithmic benchmarks including sorting, addition and multiplication. We evaluate our architecture on the challenging LAMBADA question answering dataset and compare it with the state-of-the-art models which use attention. Our model achieves competitive accuracy and scales to sequences with more than a hundred thousand of elements. We are confident that the proposed model has the potential for building more efficient architectures for processing large interrelated data in language modeling, music generation and other application domains.
△ Less
Submitted 28 October, 2019; v1 submitted 18 July, 2019;
originally announced July 2019.
-
Graph Compact Orthogonal Layout Algorithm
Authors:
Karlis Freivalds,
Jans Glagolevs
Abstract:
There exist many orthogonal graph drawing algorithms that minimize edge crossings or edge bends, however they produce unsatisfactory drawings in many practical cases. In this paper we present a grid-based algorithm for drawing orthogonal graphs with nodes of prescribed size. It distinguishes by creating pleasant and compact drawings in relatively small running time. The main idea is to minimize th…
▽ More
There exist many orthogonal graph drawing algorithms that minimize edge crossings or edge bends, however they produce unsatisfactory drawings in many practical cases. In this paper we present a grid-based algorithm for drawing orthogonal graphs with nodes of prescribed size. It distinguishes by creating pleasant and compact drawings in relatively small running time. The main idea is to minimize the total edge length that implicitly minimizes crossings and makes the drawing easy to comprehend. The algorithm is based on combining local and global improvements. Local improvements are moving each node to a new place and swapping of nodes. Global improvement is based on constrained quadratic programming approach that minimizes the total edge length while keeping node relative positions.
△ Less
Submitted 24 July, 2018;
originally announced July 2018.
-
A Statistical Method for Object Counting
Authors:
Jans Glagolevs,
Karlis Freivalds
Abstract:
In this paper we present a new object counting method that is intended for counting similarly sized and mostly round objects. Unlike many other algorithms of the same purpose, the proposed method does not rely on identifying every object, it uses statistical data obtained from the image instead. The method is evaluated on images with human bone cells, oranges and pills achieving good accuracy. Its…
▽ More
In this paper we present a new object counting method that is intended for counting similarly sized and mostly round objects. Unlike many other algorithms of the same purpose, the proposed method does not rely on identifying every object, it uses statistical data obtained from the image instead. The method is evaluated on images with human bone cells, oranges and pills achieving good accuracy. Its strengths are ability to deal with touching and partly overlapping objects, ability to work with different kinds of objects without prior configuration and good performance.
△ Less
Submitted 22 July, 2018;
originally announced July 2018.
-
Improving the Neural GPU Architecture for Algorithm Learning
Authors:
Karlis Freivalds,
Renars Liepins
Abstract:
Algorithm learning is a core problem in artificial intelligence with significant implications on automation level that can be achieved by machines. Recently deep learning methods are emerging for synthesizing an algorithm from its input-output examples, the most successful being the Neural GPU, capable of learning multiplication. We present several improvements to the Neural GPU that substantially…
▽ More
Algorithm learning is a core problem in artificial intelligence with significant implications on automation level that can be achieved by machines. Recently deep learning methods are emerging for synthesizing an algorithm from its input-output examples, the most successful being the Neural GPU, capable of learning multiplication. We present several improvements to the Neural GPU that substantially reduces training time and improves generalization. We introduce a new technique - hard nonlinearities with saturation costs- that has general applicability. We also introduce a technique of diagonal gates that can be applied to active-memory models. The proposed architecture is the first capable of learning decimal multiplication end-to-end.
△ Less
Submitted 4 July, 2018; v1 submitted 28 February, 2017;
originally announced February 2017.