-
An Improved Rapidly Exploring Random Tree Algorithm for Path Planning in Configuration Spaces with Narrow Channels
Authors:
Mathew Mithra Noel,
Akshay Chawla
Abstract:
Rapidly-exploring Random Tree (RRT) algorithms have been applied successfully to challenging robot motion planning and under-actuated nonlinear control problems. However, a fundamental limitation of the RRT approach is its slow convergence in configuration spaces with narrow channels, because the probability of generating test points inside a narrow channel is small. This paper presents an improved RRT algorithm that takes advantage of narrow channels between the initial and goal states to find shorter paths by improving the exploration of narrow regions in the configuration space. The proposed algorithm detects the presence of a narrow channel by checking for collisions of neighborhood points with the infeasible set and attempts to add points within narrow channels with a predetermined bias. This approach is compared with the classical RRT and its variants on a variety of benchmark planning problems. Simulation results indicate that the algorithm presented in this paper computes a significantly shorter path in spaces with narrow channels.
Submitted 1 November, 2024;
originally announced November 2024.
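The narrow-channel detection described in the RRT abstract above can be illustrated with a small sketch. The abstract does not give the actual test, so everything below is an assumption: the rectangular obstacle set, the names `in_collision` and `looks_like_narrow_channel`, the probe radius, and the rule that a free point counts as "inside a channel" when two opposite neighborhood points both collide with the infeasible set.

```python
import math

# Hypothetical infeasible set: axis-aligned rectangles (xmin, ymin, xmax, ymax).
# The two rectangles leave a horizontal channel at 4 < y < 5.
OBSTACLES = [(0.0, 0.0, 10.0, 4.0), (0.0, 5.0, 10.0, 9.0)]


def in_collision(p, obstacles=OBSTACLES):
    """Return True if point p = (x, y) lies inside any obstacle rectangle."""
    x, y = p
    return any(x0 <= x <= x1 and y0 <= y <= y1 for x0, y0, x1, y1 in obstacles)


def looks_like_narrow_channel(p, radius=1.0, n_dirs=8, obstacles=OBSTACLES):
    """Assumed test: p is free, and for some direction the two neighborhood
    points at distance `radius` on opposite sides of p both collide."""
    if in_collision(p, obstacles):
        return False
    for k in range(n_dirs):
        theta = math.pi * k / n_dirs  # opposite pairs cover the full circle
        dx, dy = radius * math.cos(theta), radius * math.sin(theta)
        a = (p[0] + dx, p[1] + dy)
        b = (p[0] - dx, p[1] - dy)
        if in_collision(a, obstacles) and in_collision(b, obstacles):
            return True
    return False


inside_channel = looks_like_narrow_channel((5.0, 4.5))
open_space = looks_like_narrow_channel((15.0, 4.5))
```

A planner could then keep candidate samples that pass this test with some fixed probability, which is one way to read the "predetermined bias" mentioned in the abstract.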
-
A Significantly Better Class of Activation Functions Than ReLU Like Activation Functions
Authors:
Mathew Mithra Noel,
Yug Oswal
Abstract:
This paper introduces a significantly better class of activation functions than the almost universally used ReLU-like and sigmoidal classes of activation functions. Two new activation functions, referred to as the Cone and Parabolic-Cone, are proposed; they differ drastically from popular activation functions and significantly outperform them on the CIFAR-10 and Imagenette benchmarks. The cone activation functions are positive only on a finite interval and are strictly negative except at the end-points of the interval, where they become zero. Thus the set of inputs that produce a positive output for a neuron with cone activation functions is a hyperstrip and not a half-space, as is the usual case. Since a hyperstrip is the region between two parallel hyperplanes, it allows neurons to divide the input feature space into positive and negative classes more finely than with infinitely wide half-spaces. In particular, the XOR function can be learned by a single neuron with cone-like activation functions. Both the cone and parabolic-cone activation functions are shown to achieve higher accuracies with significantly fewer neurons on benchmarks. The results presented in this paper indicate that many nonlinear real-world datasets may be separated with fewer hyperstrips than half-spaces. The Cone and Parabolic-Cone activation functions have larger derivatives than ReLU and are shown to significantly speed up training.
Submitted 7 May, 2024;
originally announced May 2024.
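The hyperstrip idea in the cone-activation abstract above can be made concrete with a short sketch. The abstract does not state the formulas, so the definitions below are assumptions chosen only to match its description (positive on a finite interval, zero at the end-points, negative elsewhere): `cone(z) = 1 - |z - 1|` and `parabolic_cone(z) = z * (2 - z)`, both positive exactly for `0 < z < 2`. The weights and bias are hand-picked for illustration, not learned.

```python
def cone(z):
    """Assumed cone-shaped activation: positive only for 0 < z < 2."""
    return 1.0 - abs(z - 1.0)


def parabolic_cone(z):
    """Assumed parabolic variant: also positive only for 0 < z < 2."""
    return z * (2.0 - z)


# Hypothetical single neuron: z = w1*x1 + w2*x2 + b.
W = (1.5, 1.5)
B = -0.5


def pre_activation(x1, x2):
    return W[0] * x1 + W[1] * x2 + B


def classify(x1, x2, activation=cone):
    """Label 1 if the input falls inside the hyperstrip 0 < z < 2, else 0."""
    return 1 if activation(pre_activation(x1, x2)) > 0 else 0


xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
cone_labels = [classify(a, b, cone) for a, b in xor_inputs]
parabolic_labels = [classify(a, b, parabolic_cone) for a, b in xor_inputs]
```

With these values the inputs (0, 1) and (1, 0) map to z = 1, inside the strip, while (0, 0) and (1, 1) map to z = -0.5 and z = 2.5, on either side of it. A half-space neuron cannot exclude both outer points with one threshold.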
-
Efficient Vectorized Backpropagation Algorithms for Training Feedforward Networks Composed of Quadratic Neurons
Authors:
Mathew Mithra Noel,
Venkataraman Muthiah-Nakarajan,
Yug D Oswal
Abstract:
Higher order artificial neurons whose outputs are computed by applying an activation function to a higher order multinomial function of the inputs have been considered in the past, but did not gain acceptance due to the extra parameters and computational cost. However, higher order neurons have significantly greater learning capabilities since the decision boundaries of higher order neurons can be complex surfaces instead of just hyperplanes. The boundary of a single quadratic neuron can be a general hyper-quadric surface, allowing it to learn many nonlinearly separable datasets. Since quadratic forms can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional parameters are needed instead of $n^2$. A quadratic logistic regression model is first presented. Solutions to the XOR problem with a single quadratic neuron are considered. The complete vectorized equations for both forward and backward propagation in feedforward networks composed of quadratic neurons are derived. A reduced parameter quadratic neural network model with just $n$ additional parameters per neuron that provides a compromise between learning ability and computational cost is presented. Comparisons on benchmark classification datasets are used to demonstrate that a final layer of quadratic neurons enables networks to achieve higher accuracy with significantly fewer hidden layer neurons. In particular, this paper shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
Submitted 21 April, 2025; v1 submitted 4 October, 2023;
originally announced October 2023.
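The parameter count and the single-neuron XOR claim in the quadratic-neuron abstract above can be checked with a small sketch. It assumes a neuron of the form sigmoid(x^T A x + w^T x + b) with symmetric A. The function names and the particular A, w, and b are hand-picked illustrations, not the solutions or the vectorized equations derived in the paper.

```python
import math


def n_quadratic_params(n, symmetric=True):
    """Extra parameters for the quadratic term: n(n+1)/2 if A is symmetric, else n^2."""
    return n * (n + 1) // 2 if symmetric else n * n


def quadratic_neuron(x, A, w, b):
    """Assumed form: sigmoid(x^T A x + w^T x + b)."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(w[i] * x[i] for i in range(n))
    z = quad + lin + b
    return 1.0 / (1.0 + math.exp(-z))


# Hand-picked parameters: z = x1 + x2 - 2*x1*x2 - 0.5.
A = [[0.0, -1.0], [-1.0, 0.0]]  # symmetric; contributes -2*x1*x2
w = [1.0, 1.0]
b = -0.5

xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [1 if quadratic_neuron(x, A, w, b) > 0.5 else 0 for x in xor_inputs]
```

For n = 10 inputs the symmetric form needs 55 extra parameters rather than 100, which is the saving the abstract refers to.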
-
Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks
Authors:
Mathew Mithra Noel,
Arindam Banerjee,
Yug Oswal,
Geraldine Bessie Amali D,
Venkataraman Muthiah-Nakarajan
Abstract:
All machine learning algorithms use a loss, cost, utility or reward function to encode the learning objective and oversee the learning process. This function that supervises learning is a frequently unrecognized hyperparameter that determines how incorrect outputs are penalized and can be tuned to improve performance. This paper shows that the training speed and final accuracy of neural networks can depend significantly on the loss function used to train them. In particular, derivative values can be significantly different with different loss functions, leading to significantly different performance after gradient descent based Backpropagation (BP) training. This paper explores the effect on performance of using new loss functions that are also convex but penalize errors differently compared to the popular Cross-entropy loss. Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed. A new loss function called smooth absolute error, which outperforms the Squared error, Huber and Log-Cosh losses on datasets with a significant number of outliers, is proposed. This smooth absolute error loss function is infinitely differentiable and more closely approximates the absolute error loss compared to the Huber and Log-Cosh losses used for robust regression.
Submitted 5 November, 2024; v1 submitted 17 March, 2023;
originally announced March 2023.
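As background for the robust-regression comparison in the loss-function abstract above, the sketch below evaluates the standard Squared error, Huber, and Log-Cosh losses against the absolute error on a large residual. The smooth absolute error loss itself is not defined in the abstract, so it is deliberately not reproduced here. The Huber threshold `delta = 1.0` and the 0.5 scaling of the squared error are assumptions made for illustration.

```python
import math


def squared_error(e):
    return 0.5 * e * e


def huber(e, delta=1.0):
    """Quadratic near zero, linear for |e| > delta."""
    a = abs(e)
    return 0.5 * e * e if a <= delta else delta * (a - 0.5 * delta)


def log_cosh(e):
    """log(cosh(e)); fine for moderate |e|, overflows for very large |e|."""
    return math.log(math.cosh(e))


residual = 5.0  # a large residual, as an outlier would produce
gap_huber = abs(residual) - huber(residual)        # tends to delta/2 = 0.5
gap_log_cosh = abs(residual) - log_cosh(residual)  # tends to ln 2
```

For large residuals both robust losses grow linearly but sit a constant offset below the absolute error (0.5 for Huber with delta = 1, about 0.693 for Log-Cosh), while the squared error grows quadratically and is dominated by outliers.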
-
Biologically Inspired Oscillating Activation Functions Can Bridge the Performance Gap between Biological and Artificial Neurons
Authors:
Matthew Mithra Noel,
Shubham Bharadwaj,
Venkataraman Muthiah-Nakarajan,
Praneet Dutta,
Geraldine Bessie Amali
Abstract:
The recent discovery of special human neocortical pyramidal neurons that can individually learn the XOR function highlights the significant performance gap between biological and artificial neurons. The output of these pyramidal neurons first increases to a maximum with input and then decreases. Artificial neurons with similar characteristics can be designed with oscillating activation functions. Oscillating activation functions have multiple zeros, allowing single neurons to have multiple hyperplanes in their decision boundary. This enables even single neurons to learn the XOR function. This paper proposes four new oscillating activation functions inspired by human pyramidal neurons that can also individually learn the XOR function. Oscillating activation functions are non-saturating for all inputs, unlike popular activation functions, leading to improved gradient flow and faster convergence. Using oscillating activation functions instead of popular monotonic or non-monotonic single-zero activation functions enables neural networks to train faster and solve classification problems with fewer layers. An extensive comparison of 23 activation functions on the CIFAR-10, CIFAR-100, and Imagenette benchmarks is presented, and the oscillating activation functions proposed in this paper are shown to outperform all known popular activation functions.
Submitted 10 May, 2023; v1 submitted 7 November, 2021;
originally announced November 2021.
-
Growing Cosine Unit: A Novel Oscillatory Activation Function That Can Speedup Training and Reduce Parameters in Convolutional Neural Networks
Authors:
Mathew Mithra Noel,
Arunkumar L,
Advait Trivedi,
Praneet Dutta
Abstract:
Convolutional neural networks have been successful in solving many socially important and economically significant problems. This ability to learn complex high-dimensional functions hierarchically can be attributed to the use of nonlinear activation functions. A key discovery that made training deep networks feasible was the adoption of the Rectified Linear Unit (ReLU) activation function to alleviate the vanishing gradient problem caused by using saturating activation functions. Since then, many improved variants of the ReLU activation have been proposed. However, a majority of activation functions used today are non-oscillatory and monotonically increasing due to their biological plausibility. This paper demonstrates that oscillatory activation functions can improve gradient flow and reduce network size. Two theorems on limits of non-oscillatory activation functions are presented. A new oscillatory activation function called the Growing Cosine Unit (GCU), defined as $C(z) = z\cos z$, that outperforms Sigmoids, Swish, Mish and ReLU on a variety of architectures and benchmarks is presented. The GCU activation has multiple zeros, enabling single GCU neurons to have multiple hyperplanes in the decision boundary. This allows single GCU neurons to learn the XOR function without feature engineering. Experimental results indicate that replacing the activation function in the convolution layers with the GCU activation function significantly improves performance on CIFAR-10, CIFAR-100 and Imagenette.
Submitted 12 January, 2023; v1 submitted 29 August, 2021;
originally announced August 2021.
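The GCU abstract above gives the definition $C(z) = z\cos z$ and claims that a single GCU neuron can represent XOR. The sketch below checks that claim for one set of hand-picked weights. The weights, bias, and the rule "label 1 when the output is positive" are illustrative assumptions, not values or conventions taken from the paper, and no training is performed.

```python
import math


def gcu(z):
    """Growing Cosine Unit as defined in the abstract: C(z) = z * cos(z)."""
    return z * math.cos(z)


# Hypothetical single neuron: z = 2*x1 + 2*x2 - 1.
W = (2.0, 2.0)
B = -1.0


def predict(x1, x2):
    """Label 1 when the GCU output is positive, else 0 (assumed convention)."""
    z = W[0] * x1 + W[1] * x2 + B
    return 1 if gcu(z) > 0 else 0


xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [predict(a, b) for a, b in xor_inputs]
```

The four inputs map to z = -1, 1, 1, and 3. Because C(z) changes sign at z = 0 and again at z = pi/2, the two middle inputs fall in the positive band while the outer two fall in negative bands. These are the multiple hyperplanes in the decision boundary that the abstract describes.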