-
LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation
Authors:
Xinrui He,
Yikun Ban,
Jiaru Zou,
Tianxin Wei,
Curtiss B. Cook,
Jingrui He
Abstract:
Missing data imputation is a critical challenge in various domains, such as healthcare and finance, where data completeness is vital for accurate analysis. Large language models (LLMs), trained on vast corpora, have shown strong potential in data generation, making them a promising tool for data imputation. However, challenges persist in designing effective prompts for a finetuning-free process and in mitigating the risk of LLM hallucinations. To address these issues, we propose a novel framework, LLM-Forest, which introduces a "forest" of few-shot learning LLM "trees" with confidence-based weighted voting, inspired by ensemble learning (Random Forest). This framework is established on a new concept of bipartite information graphs to identify high-quality relevant neighboring entries with both feature and value granularity. Extensive experiments on 9 real-world datasets demonstrate the effectiveness and efficiency of LLM-Forest.
Submitted 4 January, 2025; v1 submitted 28 October, 2024;
originally announced October 2024.
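The confidence-based weighted voting mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic confidence-weighted vote over categorical candidates, not the authors' implementation: the abstract does not specify how confidences are obtained or combined, and the function name `weighted_vote` and the `(value, confidence)` input format are assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions):
    """Return the value with the largest summed confidence.

    predictions: iterable of (value, confidence) pairs, one per "tree".
    Hypothetical input format; the paper's actual scheme may differ.
    """
    totals = defaultdict(float)
    for value, confidence in predictions:
        totals[value] += confidence  # accumulate confidence per candidate value
    if not totals:
        raise ValueError("no predictions to aggregate")
    # Ties resolve to the candidate that was seen first.
    return max(totals, key=totals.get)
```

A single high-confidence tree can outvote several low-confidence ones, which is the point of weighting the vote rather than counting heads.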
-
Colorimetric skin tone scale for improved accuracy and reduced perceptual bias of human skin tone annotations
Authors:
Cynthia M. Cook,
John J. Howard,
Laura R. Rabbitt,
Isabelle M. Shuggi,
Yevgeniy B. Sirotin,
Jerry L. Tipton,
Arun R. Vemury
Abstract:
Human image datasets used to develop and evaluate technology should represent the diversity of human phenotypes, including skin tone. Datasets that include skin tone information frequently rely on manual skin tone ratings based on the Fitzpatrick Skin Type (FST) or the Monk Skin Tone (MST) scales in lieu of the actual measured skin tone of the image dataset subjects. However, perceived skin tone is subject to known biases and skin tone appearance in digital images can vary substantially depending on the capture camera and environment, confounding manual ratings. Surprisingly, the relationship between skin-tone ratings and measured skin tone has not been explored. To close this research gap, we measured the relationship between skin tone ratings from existing scales (FST, MST) and skin tone values measured by a calibrated colorimeter. We also propose and assess a novel Colorimetric Skin Tone (CST) scale developed based on prior colorimetric measurements. Using experiments requiring humans to rate their own skin tone and the skin tone of subjects in images, we show that the new CST scale is more sensitive, consistent, and colorimetrically accurate. While skin tone ratings appeared to correct for some color variation across images, they introduced biases related to race and other factors. These biases must be considered before using manual skin-tone ratings in technology evaluations or for engineering decisions.
Submitted 28 October, 2024;
originally announced October 2024.
-
Uncertainty-preserving deep knowledge tracing with state-space models
Authors:
S. Thomas Christie,
Carson Cook,
Anna N. Rafferty
Abstract:
A central goal of both knowledge tracing and traditional assessment is to quantify student knowledge and skills at a given point in time. Deep knowledge tracing flexibly considers a student's response history but does not quantify epistemic uncertainty, while IRT and CDM compute measurement error but only consider responses to individual tests in isolation from a student's past responses. Elo and BKT could bridge this divide, but the simplicity of the underlying models limits information sharing across skills and imposes strong inductive biases. To overcome these limitations, we introduce Dynamic LENS, a modeling paradigm that combines the flexible uncertainty-preserving properties of variational autoencoders with the principled information integration of Bayesian state-space models. Dynamic LENS allows information from student responses to be collected across time, while treating responses from the same test as exchangeable observations generated by a shared latent state. It represents student knowledge as Gaussian distributions in high-dimensional space and combines estimates both within tests and across time using Bayesian updating. We show that Dynamic LENS has similar predictive performance to competing models, while preserving the epistemic uncertainty - the deep learning analogue to measurement error - that DKT models lack. This approach provides a conceptual bridge across an important divide between models designed for formative practice and summative assessment.
Submitted 9 July, 2024;
originally announced July 2024.
-
A Classification-Based Adaptive Segmentation Pipeline: Feasibility Study Using Polycystic Liver Disease and Metastases from Colorectal Cancer CT Images
Authors:
Peilong Wang,
Timothy L. Kline,
Andy D. Missert,
Cole J. Cook,
Matthew R. Callstrom,
Alex Chan,
Robert P. Hartman,
Zachary S. Kelm,
Panagiotis Korfiatis
Abstract:
Automated segmentation tools often encounter accuracy and adaptability issues when applied to images of different pathology. The purpose of this study is to explore the feasibility of building a workflow to efficiently route images to specifically trained segmentation models. By implementing a deep learning classifier to automatically classify the images and route them to appropriate segmentation models, we hope that our workflow can segment the images with different pathology accurately. The data we used in this study are 350 CT images from patients affected by polycystic liver disease and 350 CT images from patients presenting with liver metastases from colorectal cancer. All images had the liver manually segmented by trained imaging analysts. Our proposed adaptive segmentation workflow achieved a statistically significant improvement for the task of total liver segmentation compared to the generic single segmentation model (non-parametric Wilcoxon signed rank test, n=100, p-value << 0.001). This approach is applicable in a wide range of scenarios and should prove useful in clinical implementations of segmentation pipelines.
Submitted 2 May, 2024;
originally announced May 2024.
-
Reproducibility in medical image radiomic studies: contribution of dynamic histogram binning
Authors:
Darryl E. Wright,
Cole Cook,
Jason Klug,
Panagiotis Korfiatis,
Timothy L. Kline
Abstract:
The de facto standard of dynamic histogram binning for radiomic feature extraction leads to an elevated sensitivity to fluctuations in annotated regions. This may impact the majority of radiomic studies published recently and contribute to issues regarding poor reproducibility of radiomic-based machine learning that has led to significant efforts for data harmonization; however, we believe the issues highlighted here are comparatively neglected, but often remedied by choosing static binning.
The field of radiomics has improved through the development of community standards and open-source libraries such as PyRadiomics. But differences in image acquisition, systematic differences between observers' annotations, and preprocessing steps still pose challenges. These can change the distribution of voxels altering extracted features and can be exacerbated with dynamic binning.
Submitted 9 November, 2022;
originally announced November 2022.
-
Bioblox 2.5D -- Developing an Educational Game Based on Protein Docking
Authors:
Frederic Fol Leymarie,
William Latham,
Guido Salimbeni,
Suhail A. Islam,
Christopher Reynolds,
Charlie Cook,
Luis Armas Suarez,
Richard Leinfellner,
Michael J. E. Sternberg
Abstract:
We present the development process of Bioblox2-5D, an educational biology game aimed at teenagers. The game content refers to protein docking and aims to improve learning about molecular shape complexity, the roles of charges in molecular docking and the scoring function to calculate binding affinity. We developed the game as part of a collaboration between the Computing Department at Goldsmiths, University of London, and the Structural Bioinformatics group at Imperial College London. The team at Imperial provided the content requirements and validated the technical solution adopted in the game. The team at Goldsmiths designed and implemented the content requirements into a fun and stimulating educational puzzle game that supports teaching and motivates students to engage with biology. We illustrate the game design choices, the compromises and solutions that we applied to accomplish the desired learning outcomes. This paper aims to illustrate useful insights and inspirations in the context of educational game development for biology students.
Submitted 3 May, 2022; v1 submitted 26 April, 2022;
originally announced April 2022.
-
Awe Versus Aww: The Effectiveness of Two Kinds of Positive Emotional Stimulation on Stress Reduction for Online Content Moderators
Authors:
Christine L. Cook,
Jie Cai,
Donghee Yvette Wohn
Abstract:
When people have the freedom to create and post content on the internet, particularly anonymously, they do not always respect the rules and regulations of the websites on which they post, leaving other unsuspecting users vulnerable to sexism, racism, threats, and other unacceptable content in their daily cyberspace diet. However, content moderators witness the worst of humanity on a daily basis in place of the average netizen. This takes its toll on moderators, causing stress, fatigue, and emotional distress akin to the symptomology of post-traumatic stress disorder (PTSD). The goal of the present study was to explore whether adding positive stimuli to break times (images of baby animals or beautiful, awe-inspiring landscapes) could help reduce the negative side-effects of being a content moderator. To test this, we had over 300 experienced content moderators read and decide whether 200 fake text-based social media posts were acceptable or not for public consumption. Although we set out to test positive emotional stimulation, we actually found that it is the cumulative nature of the negative emotions that likely negates most of the effects of the intervention: the longer the person had practiced content moderation, the stronger their negative experience. Connections to compassion fatigue and how best to spend work breaks as a content moderator are discussed.
Submitted 11 February, 2022;
originally announced February 2022.
-
Learning Context-Aware Representations of Subtrees
Authors:
Cedric Cook
Abstract:
This thesis tackles the problem of learning efficient representations of complex, structured data with a natural application to web page and element classification. We hypothesise that the context around the element inside the web page is of high value to the problem and is currently underexploited. This thesis aims to solve the problem of classifying web elements as subtrees of a DOM tree by also considering their context.
To achieve this, first we discuss current expert knowledge systems that work on structures, such as Tree-LSTM. Then, we propose context-aware extensions to this model. We show that the new model achieves an average F1-score of 0.7973 on a multi-class web classification task. This model generates better representations for various subtrees and may be used for applications such as element classification, state estimators in reinforcement learning over the Web and more.
Submitted 8 November, 2021;
originally announced November 2021.
-
Multi-facet Contextual Bandits: A Neural Network Perspective
Authors:
Yikun Ban,
Jingrui He,
Curtiss B. Cook
Abstract:
Contextual multi-armed bandit has been shown to be an effective tool in recommender systems. In this paper, we study a novel problem of multi-facet bandits involving a group of bandits, each characterizing the users' needs from one unique aspect. In each round, for the given user, we need to select one arm from each bandit, such that the combination of all arms maximizes the final reward. This problem can find immediate applications in E-commerce, healthcare, etc. To address this problem, we propose a novel algorithm, named MuFasa, which utilizes an assembled neural network to jointly learn the underlying reward functions of multiple bandits. It estimates an Upper Confidence Bound (UCB) linked with the expected reward to balance between exploitation and exploration. Under mild assumptions, we provide the regret analysis of MuFasa. It can achieve the near-optimal $\widetilde{\mathcal{O}}((K+1)\sqrt{T})$ regret bound where $K$ is the number of bandits and $T$ is the number of played rounds. Furthermore, we conduct extensive experiments to show that MuFasa outperforms strong baselines on real-world data sets.
Submitted 30 June, 2021; v1 submitted 6 June, 2021;
originally announced June 2021.
-
Deep Learning and Bayesian Deep Learning Based Gender Prediction in Multi-Scale Brain Functional Connectivity
Authors:
Gengyan Zhao,
Gyujoon Hwang,
Cole J. Cook,
Fang Liu,
Mary E. Meyerand,
Rasmus M. Birn
Abstract:
Brain gender differences have been known for a long time and are the possible reason for many psychological, psychiatric and behavioral differences between males and females. Predicting genders from brain functional connectivity (FC) can build the relationship between brain activities and gender, and extracting important gender related FC features from the prediction model offers a way to investigate the brain gender difference. Current predictive models applied to gender prediction demonstrate good accuracies, but usually extract individual functional connections instead of connectivity patterns in the whole connectivity matrix as features. In addition, current models often omit the effect of the input brain FC scale on prediction and cannot give any model uncertainty information. Hence, in this study we propose to predict gender from multiple scales of brain FC with deep learning, which can extract full FC patterns as features. We further develop the understanding of the feature extraction mechanism in deep neural network (DNN) and propose a DNN feature ranking method to extract the highly important features based on their contributions to the prediction. Moreover, we apply Bayesian deep learning to the brain FC gender prediction, which as a probabilistic model can not only make accurate predictions but also generate model uncertainty for each prediction. Experiments were done on the high-quality Human Connectome Project S1200 release dataset comprising the resting state functional MRI data of 1003 healthy adults. First, DNN reaches 83.0%, 87.6%, 92.0%, 93.5% and 94.1% accuracies respectively with the FC input derived from 25, 50, 100, 200, 300 independent component analysis (ICA) components. DNN outperforms the conventional machine learning methods on the 25-ICA-component scale FC, but the linear machine learning method catches up as the number of ICA components increases...
Submitted 17 May, 2020;
originally announced May 2020.
-
Deep Independently Recurrent Neural Network (IndRNN)
Authors:
Shuai Li,
Wanqing Li,
Chris Cook,
Yanbo Gao
Abstract:
Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems and thus difficult to learn long-term patterns and construct deep networks. To address these problems, this paper proposes a new type of RNN with the recurrent connection formulated as Hadamard product, referred to as independently recurrent neural network (IndRNN), where neurons in the same layer are independent of each other and connected across layers. Due to the better-behaved gradient backpropagation, IndRNN with regulated recurrent weights effectively addresses the gradient vanishing and exploding problems and thus long-term dependencies can be learned. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and still be trained robustly. Different deeper IndRNN architectures, including the basic stacked IndRNN, residual IndRNN and densely connected IndRNN, have been investigated, all of which can be much deeper than the existing RNNs. Furthermore, IndRNN reduces the computation at each time step and can be over 10 times faster than the commonly used Long short-term memory (LSTM). Experimental results have shown that the proposed IndRNN is able to process very long sequences and construct very deep networks. Better performance has been achieved on various tasks with IndRNNs compared with the traditional RNN, LSTM and the popular Transformer.
Submitted 9 December, 2020; v1 submitted 11 October, 2019;
originally announced October 2019.
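For reference, the Hadamard-product recurrence that the abstract above describes can be written out as follows. The symbol names ($\mathbf{W}$, $\mathbf{u}$, $\mathbf{b}$, $\sigma$) are notation chosen here for illustration; the abstract itself gives no formula.

```latex
% Layer-level recurrence: u is a vector, so the recurrent term is element-wise.
\mathbf{h}_t = \sigma\left(\mathbf{W}\mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b}\right)

% Equivalent per-neuron form: neuron n sees only its own previous state.
h_{n,t} = \sigma\left(\mathbf{w}_n \mathbf{x}_t + u_n h_{n,t-1} + b_n\right)
```

Because each neuron has a single scalar recurrent weight $u_n$, constraining the magnitude of $u_n$ is one way to read the abstract's "regulated recurrent weights".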
-
GPU-based Ising Computing for Solving Balanced Min-Cut Graph Partitioning Problem
Authors:
Chase Cook,
Wentian Jin,
Sheldon X. -D. Tan
Abstract:
Ising computing provides a new computing paradigm for many hard combinatorial optimization problems. Ising computing essentially tries to solve the quadratic unconstrained binary optimization problem, which is also described by the Ising spin glass model and is also the basis for so-called Quantum Annealing computers. In this work, we propose a novel General Purpose Graphics Processing Unit (GPGPU) solver for the balanced min-cut graph partitioning problem, which has many applications in the area of design automation and others. Ising model solvers for the balanced min-cut partitioning problem have been proposed in the past. However, they have rarely been demonstrated in existing quantum computers for many meaningful problem sizes. One difficulty is the fact that the balancing constraint in the balanced min-cut problem can result in a complete graph in the Ising model, which makes each local update a global update. Such a global update from each GPU thread will diminish the efficiency of GPU computing, which favors many localized memory accesses for each thread. To mitigate this problem, we propose a novel Global Decoupled Ising (GDI) model and the corresponding annealing algorithm, in which the local update is still preserved to maintain the efficiency. As a result, the new Ising solver essentially eliminates the need for the fully connected graph and will use a more efficient method to track and update global balance without sacrificing cut quality. Experimental results show that the proposed Ising-based min-cut partitioning method outperforms the state-of-the-art partitioning tool, METIS, on G-set graph benchmarks in terms of partitioning quality with similar CPU/GPU times.
Submitted 1 August, 2019;
originally announced August 2019.
-
A simulated annealing approach to the student-project allocation problem
Authors:
Abigail H. Chown,
Christopher J. Cook,
Nigel B. Wilding
Abstract:
We describe a solution to the student-project allocation problem using simulated annealing. The problem involves assigning students to projects, where each student has ranked a fixed number of projects in order of preference. Each project is offered by a specific supervisor (or supervisors), and the goal is to find an optimal matching of students to projects taking into account the students' preferences, the constraint that only one student can be assigned to a given project, and the constraint that supervisors have a maximum workload. We show that when applied to a real dataset from a university physics department, simulated annealing allows the rapid determination of high quality solutions to this allocation problem. The quality of the solution is quantified by a satisfaction metric derived from empirical student survey data. Our approach provides high quality allocations in a matter of minutes that are as good as those found previously by the course organizer using a laborious trial-and-error approach. We investigate how the quality of the allocation is affected by the ratio of the number of projects offered to the number of students and the number of projects ranked by each student. We briefly discuss how our approach can be generalized to include other types of constraints and discuss its potential applicability to wider allocation problems.
Submitted 22 October, 2018;
originally announced October 2018.
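The allocation problem in the abstract above (ranked preferences, one student per project, a maximum workload per supervisor) can be sketched as a simulated annealing loop. This is a simplified illustration under stated assumptions, not the authors' method: the cost is the sum of preference ranks rather than their survey-derived satisfaction metric, the only move is reassigning one student to another project on their own list, the cooling schedule is geometric, and all names (`anneal`, `prefs`, `supervisor_of`, `max_load`, `initial`) are hypothetical.

```python
import math
import random


def anneal(prefs, supervisor_of, max_load, initial,
           steps=20000, t0=2.0, t1=0.01, seed=0):
    """Simulated annealing for a toy student-project allocation.

    prefs[s]         ranked list of project ids for student s (best first)
    supervisor_of[p] supervisor offering project p
    max_load[v]      maximum number of students supervisor v may take
    initial[s]       a feasible starting project for s, taken from prefs[s]
    Returns (best_assignment, best_cost); cost = sum of preference ranks.
    """
    rng = random.Random(seed)
    assign = dict(initial)
    taken = set(assign.values())
    load = {}
    for p in assign.values():
        load[supervisor_of[p]] = load.get(supervisor_of[p], 0) + 1
    cost = sum(prefs[s].index(assign[s]) for s in assign)
    best, best_cost = dict(assign), cost
    students = list(prefs)

    for k in range(steps):
        # Geometric cooling from t0 down to t1.
        temp = t0 * (t1 / t0) ** (k / max(1, steps - 1))
        s = rng.choice(students)
        new, old = rng.choice(prefs[s]), assign[s]
        if new == old or new in taken:
            continue  # no-op, or project already has a student
        sup_new, sup_old = supervisor_of[new], supervisor_of[old]
        if sup_new != sup_old and load.get(sup_new, 0) >= max_load[sup_new]:
            continue  # move would exceed the supervisor's workload
        delta = prefs[s].index(new) - prefs[s].index(old)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            assign[s] = new
            taken.remove(old)
            taken.add(new)
            load[sup_old] -= 1
            load[sup_new] = load.get(sup_new, 0) + 1
            cost += delta
            if cost < best_cost:
                best, best_cost = dict(assign), cost
    return best, best_cost
```

Infeasible moves are rejected outright, so every visited state satisfies both constraints; a penalty term in the cost would be an alternative design that lets the search pass through infeasible states.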
-
A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Authors:
Shuai Li,
Dinei Florencio,
Wanqing Li,
Yaqin Zhao,
Chris Cook
Abstract:
Detecting camouflaged moving foreground objects has been known to be difficult due to the similarity between the foreground objects and the background. Conventional methods cannot distinguish the foreground from background due to the small differences between them and thus suffer from under-detection of the camouflaged foreground objects. In this paper, we present a fusion framework to address this problem in the wavelet domain. We first show that the small differences in the image domain can be highlighted in certain wavelet bands. Then the likelihood of each wavelet coefficient being foreground is estimated by formulating foreground and background models for each wavelet band. The proposed framework effectively aggregates the likelihoods from different wavelet bands based on the characteristics of the wavelet transform. Experimental results demonstrated that the proposed method significantly outperformed existing methods in detecting camouflaged foreground objects. Specifically, the average F-measure for the proposed algorithm was 0.87, compared to 0.71 to 0.8 for the other state-of-the-art methods.
Submitted 16 April, 2018;
originally announced April 2018.
-
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
Authors:
Shuai Li,
Wanqing Li,
Chris Cook,
Ce Zhu,
Yanbo Gao
Abstract:
Recurrent neural networks (RNNs) have been widely used for processing sequential data. However, RNNs are commonly difficult to train due to the well-known gradient vanishing and exploding problems and hard to learn long-term patterns. Long short-term memory (LSTM) and gated recurrent unit (GRU) were developed to address these problems, but the use of hyperbolic tangent and the sigmoid activation functions results in gradient decay over layers. Consequently, construction of an efficiently trainable deep network is challenging. In addition, all the neurons in an RNN layer are entangled together and their behaviour is hard to interpret. To address these problems, a new type of RNN, referred to as independently recurrent neural network (IndRNN), is proposed in this paper, where neurons in the same layer are independent of each other and they are connected across layers. We have shown that an IndRNN can be easily regulated to prevent the gradient exploding and vanishing problems while allowing the network to learn long-term dependencies. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and still be trained robustly. Multiple IndRNNs can be stacked to construct a network that is deeper than the existing RNNs. Experimental results have shown that the proposed IndRNN is able to process very long sequences (over 5000 time steps), can be used to construct very deep networks (21 layers used in the experiment) and still be trained robustly. Better performances have been achieved on various tasks by using IndRNNs compared with the traditional RNN and LSTM. The code is available at https://github.com/Sunnydreamrain/IndRNN_Theano_Lasagne.
Submitted 22 May, 2018; v1 submitted 13 March, 2018;
originally announced March 2018.
-
Foreground Detection in Camouflaged Scenes
Authors:
Shuai Li,
Dinei Florencio,
Yaqin Zhao,
Chris Cook,
Wanqing Li
Abstract:
Foreground detection has been widely studied for decades due to its importance in many practical applications. Most of the existing methods assume foreground and background show visually distinct characteristics and thus the foreground can be detected once a good background model is obtained. However, there are many situations where this is not the case. Of particular interest in video surveillance is the camouflage case. For example, an active attacker camouflages by intentionally wearing clothes that are visually similar to the background. In such cases, even given a decent background model, it is not trivial to detect foreground objects. This paper proposes a texture guided weighted voting (TGWV) method which can efficiently detect foreground objects in camouflaged scenes. The proposed method employs the stationary wavelet transform to decompose the image into frequency bands. We show that the small and hardly noticeable differences between foreground and background in the image domain can be effectively captured in certain wavelet frequency bands. To make the final foreground decision, a weighted voting scheme is developed based on intensity and texture of all the wavelet bands with weights carefully designed. Experimental results demonstrate that the proposed method achieves superior performance compared to the current state-of-the-art results.
Submitted 11 July, 2017;
originally announced July 2017.
-
A Fully Trainable Network with RNN-based Pooling
Authors:
Shuai Li,
Wanqing Li,
Chris Cook,
Ce Zhu,
Yanbo Gao
Abstract:
Pooling is an important component in convolutional neural networks (CNNs) for aggregating features and reducing computational burden. Compared with other components such as convolutional layers and fully connected layers which are completely learned from data, the pooling component is still handcrafted such as max pooling and average pooling. This paper proposes a learnable pooling function using recurrent neural networks (RNN) so that the pooling can be fully adapted to data and other components of the network, leading to an improved performance. Such a network with learnable pooling function is referred to as a fully trainable network (FTN). Experimental results have demonstrated that the proposed RNN-based pooling can well approximate the existing pooling functions and improve the performance of the network. Especially for small networks, the proposed FTN can improve the performance by seven percentage points in terms of error rate on the CIFAR-10 dataset compared with the traditional CNN.
Submitted 16 June, 2017;
originally announced June 2017.