-
3DPyranet Features Fusion for Spatio-temporal Feature Learning
Authors:
Ihsan Ullah,
Alfredo Petrosino
Abstract:
A convolutional neural network (CNN) slides a kernel over the whole image to produce an output map. This kernel scheme reduces the number of parameters with respect to a fully connected neural network (NN). While CNNs have proven effective in recognizing handwritten characters, traffic signs, and similar targets, their deep variants have more recently proven effective in similar as well as more challenging applications such as object, scene, and action recognition. Deep CNNs add more layers and kernels to the classical CNN, increasing the number of parameters and partly eroding the main advantage of CNNs, which is having fewer parameters. In this paper, a 3D pyramidal neural network called 3DPyraNet and a discriminative approach for spatio-temporal feature learning based on it, called 3DPyraNet-F, are proposed. 3DPyraNet introduces a new weighting scheme that learns features from both spatial and temporal dimensions by analyzing multiple adjacent frames while keeping a biologically plausible structure. It preserves the spatial topology of the input image and has fewer parameters and lower computational and memory costs than both fully connected NNs and recent deep CNNs. 3DPyraNet-F extracts the feature maps of the highest layer of the learned network, fuses them into a single vector, and provides it as input to a linear SVM classifier, which enhances the recognition of human actions and dynamic scenes in videos. Encouraging results are reported with 3DPyraNet in real-world environments, especially in the presence of camera-induced motion. Further, 3DPyraNet-F clearly outperforms the state of the art on three benchmark datasets and shows comparable results on the fourth.
Submitted 26 April, 2025;
originally announced April 2025.
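The parameter-count argument in the first sentences of the abstract above can be illustrated with a small sketch. It covers only the classical comparison between a shared convolution kernel and a fully connected layer; it does not model the 3DPyraNet weighting scheme. The layer sizes (a 32x32 input, a 5x5 kernel, a 28x28 output map) and the function names are hypothetical, chosen only for illustration.

```python
def dense_params(in_h, in_w, out_units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return in_h * in_w * out_units + out_units


def conv_params(kernel_h, kernel_w, in_channels, out_maps):
    """Weights plus biases of a convolutional layer with shared kernels."""
    return kernel_h * kernel_w * in_channels * out_maps + out_maps


# Hypothetical sizes: a 32x32 single-channel image mapped to a 28x28 output.
# The dense layer needs one weight per (input pixel, output unit) pair.
dense = dense_params(32, 32, 28 * 28)
# The convolution reuses one 5x5 kernel at every output position.
conv = conv_params(5, 5, 1, 1)

print(f"fully connected: {dense} parameters")
print(f"convolutional:   {conv} parameters")
```

Adding layers and kernels multiplies the convolutional count, which is the growth the abstract attributes to deep CNNs.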
-
An Analysis of Frequent Patterns in the World Trade Web
Authors:
Maddalena D'Anna,
Alfredo Petrosino
Abstract:
This paper employs a weighted network approach to study the empirical properties of the web of trade relationships among world countries, and its evolution over time. We show that most countries are characterized by weak trade links; yet, there exists a group of countries featuring a large number of strong relationships, thus hinting at a core-periphery structure. The World Trade Web (WTW) is represented as a directed graph connecting world countries through trade relationships, with the aim of finding its topological characterization in terms of motifs and isolating the key factors underlying its evolution. Frequent patterns can identify channels or infrastructures to be strengthened and can help in choosing the most suitable message routing schema or network protocol. In general, frequent patterns have been called motifs, and overrepresented motifs have been recognized as the low-level building blocks of networks, useful for explaining many of their properties and playing a relevant role in determining their dynamics and evolution. In this paper, triadic motifs are found by first partitioning a network by strength of connections and then analyzing the partitions separately. The WTW has been split based on the weights of the graph to highlight structural differences between the big players, in terms of trade volume, and the rest of the world. As a test case, the period 2003-2010 has been analyzed to show the structural effect of the economic crisis in the year 2007.
Submitted 20 March, 2018;
originally announced March 2018.
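The partition-then-analyze idea described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: the weight threshold, the toy trade graph, and the choice of directed 3-cycles as the single triadic pattern are all assumptions. A full motif analysis would also compare the counts against a randomized null model, which is omitted here.

```python
from itertools import permutations


def split_by_weight(edges, threshold):
    """Partition directed weighted edges into strong and weak edge sets."""
    strong = {pair for pair, weight in edges.items() if weight >= threshold}
    weak = {pair for pair, weight in edges.items() if weight < threshold}
    return strong, weak


def count_directed_3_cycles(edge_set):
    """Count triads a -> b -> c -> a in a set of directed edges."""
    nodes = {node for edge in edge_set for node in edge}
    hits = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (c, a) in edge_set:
            hits += 1
    # Each cycle is seen once per rotation of (a, b, c).
    return hits // 3


# Hypothetical trade volumes between countries A..E (arbitrary units).
trade = {
    ("A", "B"): 10, ("B", "C"): 12, ("C", "A"): 9,  # strong cycle
    ("A", "D"): 1, ("D", "E"): 2, ("E", "A"): 1,    # weak cycle
    ("D", "B"): 1,                                  # weak, closes no cycle
}

strong, weak = split_by_weight(trade, threshold=5)
print("strong partition 3-cycles:", count_directed_3_cycles(strong))
print("weak partition 3-cycles:", count_directed_3_cycles(weak))
```

Enumerating all ordered triples is cubic in the number of nodes, which is acceptable for a graph the size of the world's countries but would need a smarter enumeration for larger networks.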
-
About Pyramid Structure in Convolutional Neural Networks
Authors:
Ihsan Ullah,
Alfredo Petrosino
Abstract:
Deep convolutional neural networks (CNNs) have undoubtedly revolutionized various challenging tasks, mainly in computer vision. However, their model design still requires attention to reduce the number of learnable parameters without a meaningful reduction in performance. In this paper we investigate to what extent CNNs may take advantage of the pyramid structure typical of biological neurons. A generalized statement over convolutional layers, from the input to the fully connected layer, is introduced that helps further in understanding and designing a successful deep network. It reduces ambiguity, the number of parameters, and their size on disk without degrading overall accuracy. Performance is shown on state-of-the-art models for the MNIST, CIFAR-10, CIFAR-100, and ImageNet-12 datasets. Despite a reduction of more than 80% in parameters for Caffe_LENET, challenging results are obtained. Further, despite a 10-20% reduction in training data along with a 10-40% reduction in parameters for the AlexNet model and its variations, competitive results are achieved when compared to similar well-engineered deeper architectures.
Submitted 14 August, 2016;
originally announced August 2016.
-
Towards Benchmarking Scene Background Initialization
Authors:
Lucia Maddalena,
Alfredo Petrosino
Abstract:
Given a set of images of a scene taken at different times, the availability of an initial background model that describes the scene without foreground objects is the prerequisite for a wide range of applications, ranging from video surveillance to computational photography. Even though several methods have been proposed for scene background initialization, the lack of a common ground-truthed dataset and of a common set of metrics makes it difficult to compare their performance. To take the first steps towards an easy and fair comparison of these methods, we assembled a dataset of sequences frequently adopted for background initialization, selected or created ground truths for quantitative evaluation through a selected suite of metrics, and compared the results obtained by some existing methods, making all the material publicly available.
Submitted 12 June, 2015;
originally announced June 2015.
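To make the task in the abstract above concrete, here is a minimal sketch of background initialization and its evaluation against a ground truth. The per-pixel temporal median is a common baseline and is used here as an assumption; it is not claimed to be one of the methods compared in the paper. Likewise, mean absolute error stands in for the paper's suite of metrics, and the tiny 2x2 grayscale frames are invented for illustration.

```python
from statistics import median


def median_background(frames):
    """Estimate the background as the per-pixel median over all frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def mean_abs_error(estimate, ground_truth):
    """Average absolute gray-level difference between two images."""
    diffs = [
        abs(e - g)
        for est_row, gt_row in zip(estimate, ground_truth)
        for e, g in zip(est_row, gt_row)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical sequence: background value 10, with a bright foreground
# object (value 200) covering a different pixel in each frame.
frames = [
    [[200, 10], [10, 10]],
    [[10, 200], [10, 10]],
    [[10, 10], [200, 10]],
]
ground_truth = [[10, 10], [10, 10]]

background = median_background(frames)
print("estimated background:", background)
print("mean absolute error:", mean_abs_error(background, ground_truth))
```

The median recovers the background only where each pixel is unoccluded in more than half of the frames, which is one reason a shared dataset and metrics are needed to compare more robust methods.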