-
Improving statistical learning methods via features selection without replacement sampling and random projection
Authors:
Sulaiman Khan,
Muhammad Ahmad,
Fida Ullah,
Carlos Aguilar Ibañez,
José Eduardo Valdez Rodriguez
Abstract:
Cancer is fundamentally a genetic disease characterized by genetic and epigenetic alterations that disrupt normal gene expression, leading to uncontrolled cell growth and metastasis. High-dimensional microarray datasets pose challenges for classification models due to the "small n, large p" problem, resulting in overfitting. This study makes three key contributions: 1) We propose a machine learning-based approach integrating the Feature Selection Without Replacement (FSWOR) technique and a projection method to improve classification accuracy. 2) We apply the Kendall statistical test to identify the most significant genes from the brain cancer microarray dataset (GSE50161), reducing the feature space from 54,675 to 20,890 genes. 3) We apply machine learning models using k-fold cross-validation, in which our model incorporates ensemble classifiers with LDA projection and Naïve Bayes, achieving a test score of 96%, outperforming existing methods by 9.09%. The results demonstrate the effectiveness of our approach in high-dimensional gene expression analysis, improving classification accuracy while mitigating overfitting. This study contributes to cancer biomarker discovery, offering a robust computational method for analyzing microarray data.
Submitted 28 May, 2025;
originally announced June 2025.
-
Bayesian estimation of Unit-Weibull distribution based on dual generalized order statistics with application to the Cotton Production Data
Authors:
Qazi J. Azhad,
Abdul Nasir Khan,
Bhagwati Devi,
Jahangir Sabbir Khan,
Ayush Tripathi
Abstract:
The Unit Weibull distribution with parameters $α$ and $β$ is studied in the context of dual generalized order statistics. Bayes estimators based on symmetric and asymmetric loss functions are obtained. The methods used for Bayesian estimation are approximation and simulation tools such as the Lindley, Tierney-Kadane, and Markov chain Monte Carlo methods. The authors consider the squared error loss function as the symmetric loss and the LINEX and general entropy loss functions as the asymmetric losses. After the mathematical results are presented, a simulation study is conducted to exhibit the performance of the derived estimators. Because dual generalized order statistics unify models based on distinct ordered random variables, such as order statistics and record values, the results are flexible. Accordingly, the cotton production data of the USA are analyzed for both submodels of ordered random variables: order statistics and record values.
Submitted 5 February, 2025;
originally announced February 2025.
-
On the Effects of Irrelevant Variables in Treatment Effect Estimation with Deep Disentanglement
Authors:
Ahmad Saeed Khan,
Erik Schaffernicht,
Johannes Andreas Stork
Abstract:
Estimating treatment effects from observational data is paramount in healthcare, education, and economics, but current deep disentanglement-based methods for addressing selection bias handle irrelevant variables insufficiently. We demonstrate in experiments that this leads to prediction errors. We disentangle pre-treatment variables with a deep embedding method and explicitly identify and represent irrelevant variables, in addition to instrumental, confounding, and adjustment latent factors. To this end, we introduce a reconstruction objective and create an embedding space for irrelevant variables using an attached autoencoder. Instead of relying on serendipitous suppression of irrelevant variables as in previous deep disentanglement approaches, we explicitly force irrelevant variables into this embedding space and employ orthogonalization to prevent irrelevant information from leaking into the latent space representations of the other factors. Our experiments with synthetic and real-world benchmark datasets show that we can better identify irrelevant variables and more precisely predict treatment effects than previous methods, while prediction quality degrades less when additional irrelevant variables are introduced.
Submitted 26 August, 2024; v1 submitted 29 July, 2024;
originally announced July 2024.
-
Low-order outcomes and clustered designs: combining design and analysis for causal inference under network interference
Authors:
Matthew Eichhorn,
Samir Khan,
Johan Ugander,
Christina Lee Yu
Abstract:
Variance reduction for causal inference in the presence of network interference is often achieved through either outcome modeling, which is typically analyzed under unit-randomized Bernoulli designs, or clustered experimental designs, which are typically analyzed without strong parametric assumptions. In this work, we study the intersection of these two approaches and consider the problem of estimation in low-order outcome models using data from a general experimental design. Our contributions are threefold. First, we present an estimator of the total treatment effect (also called the global average treatment effect) in a low-degree outcome model when the data are collected under general experimental designs, generalizing previous results for Bernoulli designs. We refer to this estimator as the pseudoinverse estimator and give bounds on its bias and variance in terms of properties of the experimental design. Second, we evaluate these bounds for the case of cluster randomized designs with both Bernoulli and complete randomization. For clustered Bernoulli randomization, we find that our estimator is always unbiased and that its variance scales like the smaller of the variance obtained from a low-order assumption and the variance obtained from cluster randomization, showing that combining these variance reduction strategies is preferable to using either individually. For clustered complete randomization, we find a notable bias-variance trade-off mediated by specific features of the clustering. Third, when choosing a clustered experimental design, our bounds can be used to select a clustering from a set of candidate clusterings. Across a range of graphs and clustering algorithms, we show that our method consistently selects clusterings that perform well on a range of response models, suggesting that our bounds are useful to practitioners.
Submitted 11 July, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Individual participant data from digital sources informed and improved precision in the evaluation of predictive biomarkers in Bayesian network meta-analysis
Authors:
Chinyereugo M Umemneku-Chikere,
Lorna Wheaton,
Heather Poad,
Devleena Ray,
Ilse Cuevas Andrade,
Sam Khan,
Paul Tappenden,
Keith R Abrams,
Rhiannon K Owen,
Sylwia Bujkiewicz
Abstract:
Objective: We aimed to develop a meta-analytic model for evaluation of predictive biomarkers and targeted therapies, utilising data from digital sources when individual participant data (IPD) from randomised controlled trials (RCTs) are unavailable.
Methods: A Bayesian network meta-regression model, combining aggregate data (AD) from RCTs and IPD, was developed for modelling time-to-event data to evaluate predictive biomarkers. IPD were sourced from electronic health records, using a target trial emulation approach, or from digitised Kaplan-Meier curves. The model is illustrated using two examples: breast cancer with a hormone receptor biomarker, and metastatic colorectal cancer with the Kirsten Rat Sarcoma (KRAS) biomarker.
Results: The model developed allowed for estimation of treatment effects in two subgroups of patients defined by their biomarker status. Effectiveness of taxane did not differ in hormone receptor positive and negative breast cancer patients. Epidermal growth factor receptor (EGFR) inhibitors were more effective than chemotherapy in KRAS wild type colorectal cancer patients but not in patients with KRAS mutant status. Use of IPD reduced uncertainty of the sub-group specific treatment effect estimates by up to 49%.
Conclusion: Utilisation of IPD allowed for more detailed evaluation of predictive biomarkers and cancer therapies and improved precision of the estimates compared to use of AD alone.
Submitted 7 August, 2023;
originally announced August 2023.
-
Off-policy evaluation beyond overlap: partial identification through smoothness
Authors:
Samir Khan,
Martin Saveski,
Johan Ugander
Abstract:
Off-policy evaluation (OPE) is the problem of estimating the value of a target policy using historical data collected under a different logging policy. OPE methods typically assume overlap between the target and logging policy, enabling solutions based on importance weighting and/or imputation. In this work, we approach OPE without assuming either overlap or a well-specified model by considering a strategy based on partial identification under non-parametric assumptions on the conditional mean function, focusing especially on Lipschitz smoothness. Under such smoothness assumptions, we formulate a pair of linear programs whose optimal values upper and lower bound the contributions of the no-overlap region to the off-policy value. We show that these linear programs have a concise closed form solution that can be computed efficiently and that their solutions converge, under the Lipschitz assumption, to the sharp partial identification bounds on the off-policy value. Furthermore, we show that the rate of convergence is minimax optimal, up to log factors. We deploy our methods on two semi-synthetic examples, and obtain informative and valid bounds that are tighter than those possible without smoothness assumptions.
Submitted 8 March, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
From Misalignment to Synergy: Analysis of Patents from Indian Universities & Research Institutions
Authors:
Shoyeb Khan,
Satyendra Kumar Sharma,
Arnab Kumar Laha
Abstract:
Indian Universities and Research Institutions have been the cornerstone of human resource development in the country, nurturing bright minds and shaping the leaders of tomorrow. Their unwavering commitment to excellence in education and research has not only empowered individuals but has also made significant contributions to the overall growth and progress of the nation. Despite the significant strides made by Indian universities and research institutions, the country still lags behind many developed nations in terms of the number of patents filed as well as in the commercialization of the granted patents. With 34 percent of students choosing STEM fields in India, and over 750 Universities and nearly 40,000 colleges, the concentration of patent applications in only a few top 10 institutions raises concerns.
Innovation and technological advancement have become key drivers of economic growth and development in modern times. Therefore, our study aims to unravel the patent landscape of Indian Universities and Research Institutions, examining it through the lens of supply and demand for innovations and ideas. Delving into the dynamics of patent filing and innovation trends, this study seeks to shed light on the current state of intellectual property generation in the country's academic and research ecosystem.
Submitted 24 April, 2023;
originally announced April 2023.
-
Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation
Authors:
Florinel-Alin Croitoru,
Nicolae-Catalin Ristea,
Dana Dascalescu,
Radu Tudor Ionescu,
Fahad Shahbaz Khan,
Mubarak Shah
Abstract:
We propose a very fast frame-level model for anomaly detection in video, which learns to detect anomalies by distilling knowledge from multiple highly accurate object-level teacher models. To improve the fidelity of our student, we distill the low-resolution anomaly maps of the teachers by jointly applying standard and adversarial distillation, introducing an adversarial discriminator for each teacher to distinguish between target and generated anomaly maps. We conduct experiments on three benchmarks (Avenue, ShanghaiTech, UCSD Ped2), showing that our method is over 7 times faster than the fastest competing method, and between 28 and 62 times faster than object-centric models, while obtaining comparable results to recent methods. Our evaluation also indicates that our model achieves the best trade-off between speed and accuracy, due to its previously unheard-of speed of 1480 FPS. In addition, we carry out a comprehensive ablation study to justify our architectural design choices. Our code is freely available at: https://github.com/ristea/fast-aed.
Submitted 17 July, 2024; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Doubly-robust and heteroscedasticity-aware sample trimming for causal inference
Authors:
Samir Khan,
Johan Ugander
Abstract:
A popular method for variance reduction in observational causal inference is propensity-based trimming, the practice of removing units with extreme propensities from the sample. This practice has theoretical grounding when the data are homoscedastic and the propensity model is parametric (Yang and Ding, 2018; Crump et al. 2009), but in modern settings where heteroscedastic data are analyzed with non-parametric models, existing theory fails to support current practice. In this work, we address this challenge by developing new methods and theory for sample trimming. Our contributions are three-fold: first, we describe novel procedures for selecting which units to trim. Our procedures differ from previous work in that we trim not only units with small propensities, but also units with extreme conditional variances. Second, we give new theoretical guarantees for inference after trimming. In particular, we show how to perform inference on the trimmed subpopulation without requiring that our regressions converge at parametric rates. Instead, we make only fourth-root rate assumptions like those in the double machine learning literature. This result applies to conventional propensity-based trimming as well and thus may be of independent interest. Finally, we propose a bootstrap-based method for constructing simultaneously valid confidence intervals for multiple trimmed sub-populations, which are valuable for navigating the trade-off between sample size and variance reduction inherent in trimming. We validate our methods in simulation, on the 2007-2008 National Health and Nutrition Examination Survey, and on a semi-synthetic Medicare dataset and find promising results in all settings.
Submitted 29 January, 2024; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Estimating a new panel MSK dataset for comparative analyses of national absorptive capacity systems, economic growth, and development in low and middle income economies
Authors:
Muhammad Salar Khan
Abstract:
Within the national innovation system literature, empirical analyses are severely lacking for developing economies. In particular, the low- and middle-income countries (LMICs) eligible for the World Bank's International Development Association (IDA) support are rarely part of any empirical discourse on growth, development, and innovation. One major issue hindering panel analyses of LMICs, and thus their inclusion in empirical discussion, is the lack of complete data. This work offers a new complete panel dataset with no missing values for LMICs eligible for IDA's support. I use a standard, widely respected multiple imputation technique (specifically, Predictive Mean Matching) developed by Rubin (1987). This technique respects the structure of multivariate continuous panel data at the country level. I employ this technique to create a large dataset consisting of many variables drawn from publicly available established sources. These variables, in turn, capture six crucial country-level capacities: technological capacity, financial capacity, human capital capacity, infrastructural capacity, public policy capacity, and social capacity. Such capacities are part and parcel of the National Absorptive Capacity Systems (NACS). The dataset (MSK dataset) thus produced contains data on 47 variables for 82 LMICs between 2005 and 2019. The dataset has passed a quality and reliability check and can thus be used for comparative analyses of national absorptive capacities and development, transition, and convergence analyses among LMICs.
Submitted 12 September, 2021;
originally announced September 2021.
-
Efficacy of Statistical and Artificial Intelligence-based False Information Cyberattack Detection Models for Connected Vehicles
Authors:
Sakib Mahmud Khan,
Gurcan Comert,
Mashrur Chowdhury
Abstract:
Connected vehicles (CVs), because of their external connectivity with other CVs and connected infrastructure, are vulnerable to cyberattacks that can instantly compromise the safety of the vehicle itself, other connected vehicles, and roadway infrastructure. One such cyberattack is the false information attack, where an external attacker injects inaccurate information into the connected vehicles and can eventually cause catastrophic consequences by compromising safety-critical applications like the forward collision warning. The occurrence and target of such attack events can be very dynamic, making real-time and near-real-time detection challenging. Change point models can be used for real-time detection of anomalies caused by the false information attack. In this paper, we have evaluated three change point-based statistical models for cyberattack detection in the CV data: the Expectation Maximization, Cumulative Summation, and Bayesian Online Change Point algorithms. Also, data-driven artificial intelligence (AI) models, which can be used to detect known and unknown underlying patterns in the dataset, have the potential to detect a real-time anomaly in the CV data. We have used six AI models to detect false information attacks and compared their detection performance with that of our developed change point models. Our study shows that change point models performed better in real-time false information attack detection than the AI models. Change point models, having the advantage of no training requirements, can be a feasible and computationally efficient alternative to AI models for false information attack detection in connected vehicles.
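As an illustration of the Cumulative Summation idea named in the abstract, here is a minimal two-sided CUSUM sketch in Python. It is a textbook form, not the authors' implementation; the function name `cusum_alarms`, the parameters `target_mean`, `slack`, and `threshold`, and the speed readings are all hypothetical.

```python
def cusum_alarms(values, target_mean, slack, threshold):
    """Return the indices at which a two-sided CUSUM statistic exceeds threshold.

    Assumed textbook form: the upper sum accumulates deviations above
    target_mean + slack, the lower sum accumulates deviations below
    target_mean - slack, and both reset to zero after an alarm.
    """
    upper = 0.0
    lower = 0.0
    alarms = []
    for index, value in enumerate(values):
        upper = max(0.0, upper + (value - target_mean - slack))
        lower = max(0.0, lower + (target_mean - value - slack))
        if upper > threshold or lower > threshold:
            alarms.append(index)
            upper = 0.0
            lower = 0.0
    return alarms


# Hypothetical speed readings: a sustained upward shift starts at index 5.
speeds = [30.0, 30.2, 29.9, 30.1, 30.0, 33.0, 33.2, 32.9, 33.1]
print(cusum_alarms(speeds, target_mean=30.0, slack=0.5, threshold=4.0))
```

No training step is involved: the detector only needs a reference mean, a slack value, and an alarm threshold, which is consistent with the abstract's remark that change point models have no training requirements.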
Submitted 2 August, 2021;
originally announced August 2021.
-
Optimizing Cost per Click for Digital Advertising Campaigns
Authors:
Aditya Jain,
Sahil Khan
Abstract:
Cost per click is a common metric to judge digital advertising campaign performance. In this paper we discuss an approach that generates a feature targeting recommendation to optimise cost per click. We also discuss a technique to assign bid prices to features without compromising on the number of features recommended.
Our approach utilises impression and click stream data sets corresponding to real time auctions that we have won. The data contains information about device type, website, and RTB Exchange ID. We leverage data across all campaigns that we have access to while ensuring that recommendations are sensitive to both individual campaign level features and globally well performing features. We model bid recommendation around the hypothesis that a click is a Bernoulli trial and that the click stream follows a Binomial distribution, which is then updated based on live performance, ensuring week over week improvement.
This approach has been live tested over 10 weeks across 5 campaigns. We see Cost per click gains of 16-60% and click through rate improvement of 42-137%. At the same time, the campaign delivery was competitive.
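The abstract does not say how the Binomial click model is updated from live performance. One common way to do this, shown here purely as an assumption, is a conjugate Beta prior on the click probability. The function name `update_click_rate` and all numbers below are hypothetical.

```python
def update_click_rate(prior_clicks, prior_misses, clicks, impressions):
    """Beta-Binomial update of a click probability (assumed, not from the paper).

    prior_clicks and prior_misses are the Beta prior parameters; clicks out of
    impressions are the newly observed Bernoulli outcomes. Returns the
    posterior parameters and the posterior mean click-through rate.
    """
    if clicks > impressions:
        raise ValueError("clicks cannot exceed impressions")
    alpha = prior_clicks + clicks
    beta = prior_misses + (impressions - clicks)
    return alpha, beta, alpha / (alpha + beta)


# Hypothetical week of data for one targeting feature: 3 clicks in 10 impressions.
alpha, beta, rate = update_click_rate(1, 1, clicks=3, impressions=10)
print(alpha, beta, round(rate, 4))
```

Feeding each week's posterior back in as the next week's prior gives a simple week-over-week update of the estimated click rate per feature.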
Submitted 2 August, 2021;
originally announced August 2021.
-
Bi-Directional Grid Constrained Stochastic Processes' Link to Multi-Skew Brownian Motion
Authors:
Aldo Taranto,
Ron Addie,
Shahjahan Khan
Abstract:
Bi-Directional Grid Constrained (BGC) stochastic processes (BGCSPs) constrain the random movement toward the origin steadily more and more, the further they deviate from the origin, rather than all at once imposing reflective barriers, as does the well-established theory of Itô diffusions with such reflective barriers. We identify that BGCSPs are a variant rather than a special case of the multi-skew Brownian motion (M-SBM). This is because they have their own complexities, such as the barriers being hidden (not known in advance) and not necessarily constant over time. We provide an M-SBM theoretical framework and also a simulation framework to elaborate deeper properties of BGCSPs. The simulation framework is then applied by generating numerous simulations of the constrained paths and the results are analysed. BGCSPs have applications in finance and indeed many other fields requiring graduated constraining, from both above and below the initial position.
Submitted 26 July, 2021;
originally announced July 2021.
-
Adaptive normalization for IPW estimation
Authors:
Samir Khan,
Johan Ugander
Abstract:
Inverse probability weighting (IPW) is a general tool in survey sampling and causal inference, used both in Horvitz-Thompson estimators, which normalize by the sample size, and Hájek/self-normalized estimators, which normalize by the sum of the inverse probability weights. In this work we study a family of IPW estimators, first proposed by Trotter and Tukey in the context of Monte Carlo problems, that are normalized by an affine combination of these two terms. We show how selecting an estimator from this family in a data-dependent way to minimize asymptotic variance leads to an iterative procedure that converges to an estimator with connections to regression control methods. We refer to this estimator as an adaptively normalized estimator. For mean estimation in survey sampling, this estimator has asymptotic variance that is never worse than the Horvitz-Thompson or Hájek estimators, and is smaller except in edge cases. Going further, we show that adaptive normalization can be used to propose improvements of the augmented IPW (AIPW) estimator, average treatment effect (ATE) estimators, and policy learning objectives. Appealingly, these proposals preserve both the asymptotic efficiency of AIPW and the regret bounds for policy learning with IPW objectives, and deliver consistent finite sample improvements in simulations for all three of mean estimation, ATE estimation, and policy learning.
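The family of estimators described in the abstract can be sketched as follows. This is a minimal illustration with a fixed mixing weight, not the paper's adaptive procedure for choosing that weight from the data. The function name `affine_normalized_mean`, the parameter names, and the sample values are hypothetical.

```python
def affine_normalized_mean(outcomes, probabilities, n_units, weight):
    """IPW mean estimate normalized by an affine combination of two terms.

    outcomes and probabilities describe the observed units only; n_units is
    the total number of units (the term the abstract calls the sample size).
    weight = 0 gives the Horvitz-Thompson estimator and weight = 1 gives the
    Hajek (self-normalized) estimator.
    """
    weighted_total = sum(y / p for y, p in zip(outcomes, probabilities))
    weight_sum = sum(1.0 / p for p in probabilities)
    denominator = (1.0 - weight) * n_units + weight * weight_sum
    return weighted_total / denominator


# Hypothetical sample: two observed units out of five.
outcomes = [2.0, 4.0]
probabilities = [0.5, 0.25]
for weight in (0.0, 0.5, 1.0):
    print(weight, affine_normalized_mean(outcomes, probabilities, 5, weight))
```

Keeping the mixing weight as an explicit argument makes the two classical estimators visible as the endpoints of the family.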
Submitted 9 July, 2021; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Comparative Analysis of Machine Learning Approaches to Analyze and Predict the Covid-19 Outbreak
Authors:
Muhammad Naeem,
Jian Yu,
Muhammad Aamir,
Sajjad Ahmad Khan,
Olayinka Adeleye,
Zardad Khan
Abstract:
Background. Forecasting the timing of a forthcoming pandemic reduces the impact of the disease by enabling precautionary steps such as public health messaging and raising awareness among doctors. With the continuous and rapid increase in the cumulative incidence of COVID-19, statistical and outbreak prediction models, including various machine learning (ML) models, are being used by the research community to track and predict the trend of the epidemic and to develop appropriate strategies to combat and manage its spread. Methods. In this paper, we present a comparative analysis of various ML approaches, including Support Vector Machine, Random Forest, K-Nearest Neighbor, and Artificial Neural Network, in predicting the COVID-19 outbreak in the epidemiological domain. We first apply the autoregressive distributed lag (ARDL) method to identify and model the short- and long-run relationships of the time-series COVID-19 datasets. That is, we determine the lags between a response variable and its respective explanatory time series variables as independent variables. Then, the resulting significant variables and their lags are used in the regression model selected by the ARDL for predicting and forecasting the trend of the epidemic. Results. Statistical measures, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), are used to assess model accuracy. The values of MAPE for the best selected models for confirmed, recovered, and death cases are 0.407, 0.094, and 0.124 respectively, which fall under the category of highly accurate forecasts. In addition, we computed fifteen-day-ahead forecasts for the daily death, recovered, and confirmed cases, and the cases fluctuated across time in all aspects. Besides, the results reveal the advantages of ML algorithms for supporting decision making on evolving short-term policies.
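The three accuracy measures named in the abstract have standard definitions, sketched below in Python. The function name `forecast_errors` and the sample values are hypothetical, and MAPE is reported here as a percentage, which may not match the convention behind the values quoted above.

```python
import math


def forecast_errors(actual, predicted):
    """Return (RMSE, MAE, MAPE in percent) for paired actual and predicted values.

    MAPE is undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    errors = [a - p for a, p in zip(actual, predicted)]
    count = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / count)
    mae = sum(abs(e) for e in errors) / count
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / count
    return rmse, mae, mape


# Hypothetical daily case counts and forecasts.
rmse, mae, mape = forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(round(rmse, 3), round(mae, 3), round(mape, 3))
```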
Submitted 11 February, 2021;
originally announced February 2021.
-
Robust normalizing flows using Bernstein-type polynomials
Authors:
Sameera Ramasinghe,
Kasun Fernando,
Salman Khan,
Nick Barnes
Abstract:
Modeling real-world distributions can often be challenging due to sample data that are subjected to perturbations, e.g., instrumentation errors, or added random noise. Since flow models are typically nonlinear algorithms, they amplify these initial errors, leading to poor generalizations. This paper proposes a framework to construct Normalizing Flows (NF), which demonstrates higher robustness against such initial errors. To this end, we utilize Bernstein-type polynomials inspired by the optimal stability of the Bernstein basis. Further, compared to the existing NF frameworks, our method provides compelling advantages like theoretical upper bounds for the approximation error, higher interpretability, suitability for compactly supported densities, and the ability to employ higher degree polynomials without training instability. We conduct a thorough theoretical analysis and empirically demonstrate the efficacy of the proposed technique using experiments on both real-world and synthetic datasets.
Submitted 9 October, 2022; v1 submitted 5 February, 2021;
originally announced February 2021.
-
GSSMD: A new standardized effect size measure to improve robustness and interpretability in biological applications
Authors:
Seongyong Park,
Shujaat Khan,
Muhammad Moinuddin,
Ubaid M. Al-Saggaf
Abstract:
In many biological applications, the primary objective of a study is to quantify the magnitude of the treatment effect between two groups. Cohen's d or the strictly standardized mean difference (SSMD) can be used to measure effect size; however, both are sensitive to violations of the normality assumption. Here, we propose an alternative standardized effect size measure that improves robustness and interpretability, based on the overlap between two sample distributions. The proposed method is a non-parametric generalized variant of SSMD. We characterized the proposed measure in various simulation settings to illustrate its behavior. We also investigated finite-sample properties of the effect size estimate and drew some guidelines. As a case study, we applied our measure to the hit selection problem in an RNAi experiment and showed the superiority of the proposed method.
Submitted 14 November, 2020;
originally announced November 2020.
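For context, the two parametric baselines named in the abstract above can be sketched as below. This assumes the usual textbook definitions (Cohen's d with a pooled standard deviation; SSMD for two independent groups as the mean difference over the square root of the summed variances). It does not implement the proposed GSSMD, whose definition is not given in the abstract.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))

def ssmd(a, b):
    """SSMD for two independent groups: mean difference over sqrt(var_a + var_b)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float((a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) + b.var(ddof=1)))

# Hypothetical measurements for two groups, for illustration only.
treated = [1.0, 2.0, 3.0]
control = [4.0, 5.0, 6.0]
print(cohens_d(treated, control), ssmd(treated, control))
```

Both statistics depend only on sample means and variances, which is why a few outliers or a skewed background distribution can move them substantially.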
-
Switchable Deep Beamformer
Authors:
Shujaat Khan,
Jaeyoung Huh,
Jong Chul Ye
Abstract:
Recent proposals of deep beamformers using deep neural networks have attracted significant attention as computationally efficient alternatives to adaptive and compressive beamformers. Moreover, deep beamformers are versatile in that image post-processing algorithms can be combined with the beamforming. Unfortunately, with the current technology, a separate beamformer must be trained and stored for each application, demanding significant scanner resources. To address this problem, here we propose a switchable deep beamformer that can produce various types of output such as DAS, speckle removal, deconvolution, etc., using a single network with a simple switch. In particular, the switch is implemented through Adaptive Instance Normalization (AdaIN) layers, so that various outputs can be generated by merely changing the AdaIN code. Experimental results using B-mode focused ultrasound confirm the flexibility and efficacy of the proposed methods for various applications.
Submitted 4 September, 2020; v1 submitted 31 August, 2020;
originally announced August 2020.
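The AdaIN operation mentioned in the abstract above can be illustrated with a short sketch. This shows only the generic operation (per-channel normalization followed by a scale and shift); the array shapes and the way the `scale` and `shift` vectors would be derived from an "AdaIN code" are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def adain(features, scale, shift, eps=1e-5):
    """Adaptive Instance Normalization on a (channels, height, width) array.

    Each channel is normalized to zero mean and unit standard deviation,
    then rescaled and shifted by the per-channel `scale` and `shift` vectors.
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return scale[:, None, None] * normalized + shift[:, None, None]

# Hypothetical feature map and two "codes" selecting different output statistics.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(2, 4, 4))
out_a = adain(feature_map, scale=np.array([1.0, 1.0]), shift=np.array([0.0, 0.0]))
out_b = adain(feature_map, scale=np.array([2.0, 2.0]), shift=np.array([3.0, 3.0]))
print(out_a.mean(axis=(1, 2)), out_b.mean(axis=(1, 2)))
```

Changing only the scale and shift vectors changes the statistics of the same feature map, which is the sense in which a single network can be switched between outputs.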
-
OT-driven Multi-Domain Unsupervised Ultrasound Image Artifact Removal using a Single CNN
Authors:
Jaeyoung Huh,
Shujaat Khan,
Jong Chul Ye
Abstract:
Ultrasound imaging (US) often suffers from distinct image artifacts from various sources. Classic approaches for solving these problems are usually model-based iterative approaches that have been developed specifically for each type of artifact, and they are often computationally intensive. Recently, deep learning approaches have been proposed as computationally efficient and high-performance alternatives. Unfortunately, in current deep learning approaches, a dedicated neural network must be trained with matched training data for each specific artifact type. This poses a fundamental limitation in the practical use of deep learning for US, since a large number of models must be stored to deal with various US image artifacts. Inspired by the recent success of multi-domain image transfer, here we propose a novel, unsupervised, deep learning approach in which a single neural network can be used to deal with different types of US artifacts simply by changing a mask vector that switches between different target domains. Our algorithm is rigorously derived using optimal transport (OT) theory for cascaded probability measures. Experimental results using phantom and in vivo data demonstrate that the proposed method can generate high-quality images by removing distinct artifacts, and that these images are comparable to those obtained by multiple separately trained neural networks.
Submitted 10 July, 2020;
originally announced July 2020.
-
Multi-Kernel Fusion for RBF Neural Networks
Authors:
Syed Muhammad Atif,
Shujaat Khan,
Imran Naseem,
Roberto Togneri,
Mohammed Bennamoun
Abstract:
A simple yet effective architectural design of radial basis function neural networks (RBFNN) makes them amongst the most popular conventional neural networks. The current generation of radial basis function neural networks is equipped with multiple kernels, which provide significant performance benefits compared to the previous generation using only a single kernel. In existing multi-kernel RBF algorithms, the multi-kernel is formed by a convex combination of the base/primary kernels. In this paper, we propose a novel multi-kernel RBFNN in which every base kernel has its own (local) weight. This novel flexibility in the network provides better performance, such as a faster convergence rate, better local minima and resilience against getting stuck in poor local minima. These performance gains are achieved at a competitive computational complexity compared to contemporary multi-kernel RBF algorithms. The proposed algorithm is thoroughly analysed for performance gain using mathematical and graphical illustrations and is also evaluated on three different types of problems, namely: (i) pattern classification, (ii) system identification and (iii) function approximation. Empirical results clearly show the superiority of the proposed algorithm compared to the existing state-of-the-art multi-kernel approaches.
Submitted 6 July, 2020;
originally announced July 2020.
-
Pushing the Limit of Unsupervised Learning for Ultrasound Image Artifact Removal
Authors:
Shujaat Khan,
Jaeyoung Huh,
Jong Chul Ye
Abstract:
Ultrasound (US) imaging is a fast and non-invasive imaging modality which is widely used for real-time clinical imaging applications without concern about radiation hazards. Unfortunately, it often suffers from poor visual quality from various origins, such as speckle noise, blurring, multi-line acquisition (MLA), limited RF channels, a small number of view angles in the case of plane wave imaging, etc. Classical methods to deal with these problems include image-domain signal processing approaches using various adaptive filtering and model-based approaches. Recently, deep learning approaches have been successfully used in the ultrasound imaging field. However, one of the limitations of these approaches is that paired high-quality images for supervised training are difficult to obtain in many practical applications. In this paper, inspired by the recent theory of unsupervised learning using optimal transport driven cycleGAN (OT-cycleGAN), we investigate the applicability of unsupervised deep learning for US artifact removal problems without matched reference data. Experimental results for various tasks such as deconvolution, speckle removal, limited data artifact removal, etc. confirmed that our unsupervised learning method provides comparable results to supervised learning for many practical applications.
Submitted 25 June, 2020;
originally announced June 2020.
-
Federated Multi-view Matrix Factorization for Personalized Recommendations
Authors:
Adrian Flanagan,
Were Oyomno,
Alexander Grigorievskiy,
Kuan Eeik Tan,
Suleiman A. Khan,
Muhammad Ammad-Ud-Din
Abstract:
We introduce the federated multi-view matrix factorization method that extends the federated learning framework to matrix factorization with multiple data sources. Our method is able to learn the multi-view model without transferring the user's personal data to a central server. As far as we are aware, this is the first federated model to provide recommendations using multi-view matrix factorization. The model is rigorously evaluated on three datasets in production settings. Empirical validation confirms that federated multi-view matrix factorization outperforms simpler methods that do not take into account the multi-view structure of the data. In addition, it demonstrates the usefulness of the proposed method for the challenging prediction tasks of cold-start federated recommendations.
Submitted 8 April, 2020;
originally announced April 2020.
-
iTAML: An Incremental Task-Agnostic Meta-learning Approach
Authors:
Jathushan Rajasegaran,
Salman Khan,
Munawar Hayat,
Fahad Shahbaz Khan,
Mubarak Shah
Abstract:
Humans can continuously learn new knowledge as their experience grows. In contrast, previous learning in deep neural networks can quickly fade out when they are trained on a new task. In this paper, we hypothesize this problem can be avoided by learning a set of generalized parameters, that are neither specific to old nor new tasks. In this pursuit, we introduce a novel meta-learning approach that seeks to maintain an equilibrium between all the encountered tasks. This is ensured by a new meta-update rule which avoids catastrophic forgetting. In comparison to previous meta-learning techniques, our approach is task-agnostic. When presented with a continuum of data, our model automatically identifies the task and quickly adapts to it with just a single update. We perform extensive experiments on five datasets in a class-incremental setting, leading to significant improvements over the state of the art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks). Specifically, on large-scale datasets that generally prove difficult cases for incremental learning, our approach delivers absolute gains as high as 19.1% and 7.4% on ImageNet and MS-Celeb datasets, respectively.
Submitted 25 March, 2020;
originally announced March 2020.
-
Deep Object Detection based Mitosis Analysis in Breast Cancer Histopathological Images
Authors:
Anabia Sohail,
Muhammad Ahsan Mukhtar,
Asifullah Khan,
Muhammad Mohsin Zafar,
Aneela Zameer,
Saranjam Khan
Abstract:
Empirical evaluation of breast tissue biopsies for mitotic nuclei detection is considered an important prognostic biomarker in tumor grading and cancer progression. However, automated mitotic nuclei detection poses several challenges because of the unavailability of pixel-level annotations, different morphological configurations of mitotic nuclei, their sparse representation, and close resemblance with non-mitotic nuclei. These challenges undermine the precision of the automated detection model and thus make detection difficult in a single phase. This work proposes an end-to-end detection system for mitotic nuclei identification in breast cancer histopathological images. Deep object detection-based Mask R-CNN is adapted for mitotic nuclei detection that initially selects the candidate mitotic region with maximum recall. However, in the second phase, these candidate regions are refined by multi-object loss function to improve the precision. The performance of the proposed detection model shows improved discrimination ability (F-score of 0.86) for mitotic nuclei with significant precision (0.86) as compared to the two-stage detection models (F-score of 0.701) on TUPAC16 dataset. Promising results suggest that the deep object detection-based model has the potential to learn the characteristic features of mitotic nuclei from weakly annotated data and suggests that it can be adapted for the identification of other nuclear bodies in histopathological images.
Submitted 16 March, 2020;
originally announced March 2020.
-
Incremental Object Detection via Meta-Learning
Authors:
K J Joseph,
Jathushan Rajasegaran,
Salman Khan,
Fahad Shahbaz Khan,
Vineeth N Balasubramanian
Abstract:
In a real-world setting, object instances from new classes can be continuously encountered by object detectors. When existing object detectors are applied to such scenarios, their performance on old classes deteriorates significantly. A few efforts have been reported to address this limitation, all of which apply variants of knowledge distillation to avoid catastrophic forgetting. We note that although distillation helps to retain previous learning, it obstructs fast adaptability to new tasks, which is a critical requirement for incremental learning. In this pursuit, we propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared. This ensures a seamless information transfer via a meta-learned gradient preconditioning that minimizes forgetting and maximizes knowledge transfer. In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection. We evaluate our approach on a variety of incremental learning settings defined on PASCAL-VOC and MS COCO datasets, where our approach performs favourably well against state-of-the-art methods.
Submitted 15 December, 2021; v1 submitted 17 March, 2020;
originally announced March 2020.
-
Towards Robust and Reproducible Active Learning Using Neural Networks
Authors:
Prateek Munjal,
Nasir Hayat,
Munawar Hayat,
Jamshid Sourati,
Shadab Khan
Abstract:
Active learning (AL) is a promising ML paradigm that has the potential to parse through large unlabeled data and help reduce annotation cost in domains where labeling data can be prohibitive. Recently proposed neural network based AL methods use different heuristics to accomplish this goal. In this study, we demonstrate that under identical experimental settings, different types of AL algorithms (uncertainty based, diversity based, and committee based) produce an inconsistent gain over random sampling baseline. Through a variety of experiments, controlling for sources of stochasticity, we show that variance in performance metrics achieved by AL algorithms can lead to results that are not consistent with the previously reported results. We also found that under strong regularization, AL methods show marginal or no advantage over the random sampling baseline under a variety of experimental conditions. Finally, we conclude with a set of recommendations on how to assess the results using a new AL algorithm to ensure results are reproducible and robust under changes in experimental conditions. We share our codes to facilitate AL evaluations. We believe our findings and recommendations will help advance reproducible research in AL using neural networks. We open source our code at https://github.com/PrateekMunjal/TorchAL
Submitted 15 June, 2022; v1 submitted 21 February, 2020;
originally announced February 2020.
-
GSSMD: New metric for robust and interpretable assay quality assessment and hit selection
Authors:
Seongyong Park,
Shujaat Khan
Abstract:
In high-throughput screening (HTS) campaigns, the Z'-factor and strictly standardized mean difference (SSMD) are commonly used to assess the quality of assays and to select hits. However, these measures are vulnerable to outliers and their performance is highly sensitive to background distributions. Here, we propose an alternative measure for assay quality assessment and hit selection. The proposed method is a non-parametric generalized variant of SSMD (GSSMD). In this paper, we show that the proposed method provides a more robust and intuitive way of performing assay quality assessment and hit selection.
Submitted 20 January, 2020; v1 submitted 17 January, 2020;
originally announced January 2020.
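As a point of reference for the assay-quality measure named in the abstract above, the Z'-factor can be sketched as follows. This assumes the standard definition, 1 - 3(sd_pos + sd_neg) / |mean_pos - mean_neg|, computed from positive and negative control wells; the control readings are made up for illustration.

```python
import numpy as np

def z_prime_factor(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    pos = np.asarray(positive_controls, float)
    neg = np.asarray(negative_controls, float)
    spread = pos.std(ddof=1) + neg.std(ddof=1)
    separation = abs(pos.mean() - neg.mean())
    return float(1.0 - 3.0 * spread / separation)

# Hypothetical control-well readings, for illustration only.
pos_wells = [9.0, 10.0, 11.0]
neg_wells = [0.0, 1.0, 2.0]
print(z_prime_factor(pos_wells, neg_wells))
```

Because the statistic is built from sample means and standard deviations, a single outlying well can shift it noticeably, which is the sensitivity the abstract refers to.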
-
Contextual Minimum-Norm Estimates (CMNE): A Deep Learning Method for Source Estimation in Neuronal Networks
Authors:
Christoph Dinh,
John GW Samuelsson,
Alexander Hunold,
Matti S Hämäläinen,
Sheraz Khan
Abstract:
Magnetoencephalography (MEG) and Electroencephalography (EEG) source estimates have thus far mostly been derived sample by sample, i.e., independent of each other in time. However, neuronal assemblies are heavily interconnected, constraining the temporal evolution of neural activity in space as detected by MEG and EEG. The observed neural currents are thus highly context dependent. Here, a new method is presented which integrates predictive deep learning networks with the Minimum-Norm Estimates (MNE) approach. Specifically, we employ Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, for predicting brain activity. Because we use past activity (context) in the estimation, we call our method Contextual MNE (CMNE). We demonstrate that these contextual algorithms can be used for predicting activity based on previous brain states and when used in conjunction with MNE, they lead to more accurate source estimation. To evaluate the performance of CMNE, it was tested on simulated and experimental data from human auditory evoked response experiments.
Submitted 5 September, 2019;
originally announced September 2019.
-
Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes
Authors:
Sameera Ramasinghe,
Salman Khan,
Nick Barnes,
Stephen Gould
Abstract:
Existing networks directly learn feature representations on 3D point clouds for shape analysis. We argue that 3D point clouds are highly redundant and hold irregular (permutation-invariant) structure, which makes it difficult to achieve inter-class discrimination efficiently. In this paper, we propose a two-faceted solution to this problem that is seamlessly integrated in a single 'Blended Convolution and Synthesis' layer. This fully differentiable layer performs two critical tasks in succession. In the first step, it projects the input 3D point clouds into a latent 3D space to synthesize a highly compact and more inter-class discriminative point cloud representation. Since 3D point clouds do not follow a Euclidean topology, standard 2/3D Convolutional Neural Networks offer limited representation capability. Therefore, in the second step, it uses a novel 3D convolution operator functioning inside the unit ball ($\mathbb{B}^3$) to extract useful volumetric features. We extensively derive formulae to achieve both translation and rotation of our novel convolution kernels. Finally, using the proposed techniques, we present an extremely light-weight, end-to-end architecture that achieves compelling results on 3D shape recognition and retrieval.
Submitted 19 July, 2020; v1 submitted 24 August, 2019;
originally announced August 2019.
-
Chaotic Time Series Prediction using Spatio-Temporal RBF Neural Networks
Authors:
Alishba Sadiq,
Muhammad Sohail Ibrahim,
Muhammad Usman,
Muhammad Zubair,
Shujaat Khan
Abstract:
Due to their dynamic nature, chaotic time series are difficult to predict. In conventional signal processing approaches, signals are treated in either the time or the space domain only. Spatio-temporal analysis of a signal provides advantages over conventional uni-dimensional approaches by harnessing the information from both the temporal and spatial domains. Herein, we propose a spatio-temporal extension of RBF neural networks for the prediction of chaotic time series. The proposed algorithm utilizes the concept of time-space orthogonality and separately deals with the temporal dynamics and spatial non-linearity (complexity) of the chaotic series. The proposed RBF architecture is explored for the prediction of the Mackey-Glass time series and results are compared with the standard RBF. The spatio-temporal RBF is shown to outperform the standard RBFNN by achieving significantly reduced estimation error.
Submitted 17 August, 2019;
originally announced August 2019.
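The Mackey-Glass benchmark named in the abstract above can be generated with a short sketch. This uses a simple unit-step Euler discretization of the delay differential equation dx/dt = beta * x(t - tau) / (1 + x(t - tau)^n) - gamma * x(t). The parameter values (tau = 17, beta = 0.2, gamma = 0.1, n = 10) and the constant initial history are commonly used choices and are assumptions here, not settings taken from the paper.

```python
import numpy as np

def mackey_glass(n_samples, tau=17, beta=0.2, gamma=0.1, n=10, x0=1.2):
    """Generate a Mackey-Glass series with a unit-step Euler update.

    The first tau + 1 values are a constant history equal to x0 and are
    dropped from the returned series.
    """
    x = [x0] * (tau + 1)
    for _ in range(n_samples):
        delayed = x[-(tau + 1)]  # value tau steps before the current one
        x.append(x[-1] + beta * delayed / (1.0 + delayed ** n) - gamma * x[-1])
    return np.array(x[tau + 1:])

series = mackey_glass(500)
print(series[:5])
```

A one-step-ahead prediction task is then typically framed as mapping a window of past samples to the next sample of this series.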
-
Spatio-Temporal RBF Neural Networks
Authors:
Shujaat Khan,
Jawwad Ahmad,
Alishba Sadiq,
Imran Naseem,
Muhammad Moinuddin
Abstract:
Herein, we propose a spatio-temporal extension of RBFNN for the nonlinear system identification problem. The proposed algorithm employs the concept of time-space orthogonality and separately models the dynamics and nonlinear complexities of the system. The proposed RBF architecture is explored for the estimation of a highly nonlinear system and results are compared with the standard architecture for both the conventional and fractional gradient descent-based learning rules. The spatio-temporal RBF is shown to perform better than the standard and fractional RBFNNs by achieving fast convergence and significantly reduced estimation error.
Submitted 4 August, 2019;
originally announced August 2019.
-
Spatio-Temporal Adversarial Learning for Detecting Unseen Falls
Authors:
Shehroz S. Khan,
Jacob Nogas,
Alex Mihailidis
Abstract:
Fall detection is an important problem from both the health and machine learning perspectives. A fall can lead to severe injuries, long-term impairments or even death in some cases. In terms of machine learning, it presents a severe class imbalance problem with very few or no training data for falls, owing to the fact that falls occur rarely. In this paper, we take an alternative philosophy to detect falls in the absence of their training data, by training the classifier on only the normal activities (that are available in abundance) and identifying a fall as an anomaly. To realize such a classifier, we use an adversarial learning framework, which comprises a spatio-temporal autoencoder for reconstructing input video frames and a spatio-temporal convolution network to discriminate them against original video frames. 3D convolutions are used to learn spatial and temporal features from the input video frames. The adversarial learning of the spatio-temporal autoencoder enables efficient reconstruction of the normal activities of daily living, thus rendering the detection of unseen falls plausible within this framework. We tested the performance of the proposed framework on camera sensing modalities that may preserve an individual's privacy (fully or partially), such as thermal and depth cameras. Our results on three publicly available datasets show that the proposed spatio-temporal adversarial framework performed better than other baseline frame-based (or spatial) adversarial learning methods.
Submitted 2 March, 2020; v1 submitted 19 May, 2019;
originally announced May 2019.
-
A Novel Adaptive Kernel for the RBF Neural Networks
Authors:
Shujaat Khan,
Imran Naseem,
Roberto Togneri,
Mohammed Bennamoun
Abstract:
In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method thereby alleviating the need for predetermined weights. The proposed method is shown to outperform the manual fusion of the kernels on three major problems of estimation namely nonlinear system identification, pattern classification and function approximation.
Submitted 9 May, 2019;
originally announced May 2019.
-
KNN and ANN-based Recognition of Handwritten Pashto Letters using Zoning Features
Authors:
Sulaiman Khan,
Hazrat Ali,
Zahid Ullah,
Nasru Minallah,
Shahid Maqsood,
Abdul Hafeez
Abstract:
This paper presents a recognition system for handwritten Pashto letters. Handwritten character recognition is a challenging task: these letters not only differ in shape and style but also vary among individuals. The recognition becomes further daunting due to the lack of standard datasets for inscribed Pashto letters. In this work, we have designed a database of moderate size, which encompasses a total of 4488 images, stemming from 102 distinct samples for each of the 44 letters in Pashto. The recognition framework uses a zoning feature extractor followed by K-Nearest Neighbour (KNN) and Neural Network (NN) classifiers for classifying individual letters. Based on the evaluation of the proposed system, an overall classification accuracy of approximately 70.05% is achieved by using KNN, while 72% is achieved by using NN.
Submitted 8 June, 2019; v1 submitted 6 April, 2019;
originally announced April 2019.
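The zoning feature extraction mentioned in the abstract above can be sketched as below. This is one common form of zoning (split the image into a grid and use the mean pixel value of each zone as a feature); the grid size and the use of mean intensity are assumptions for illustration, since the abstract does not specify the exact variant used.

```python
import numpy as np

def zoning_features(image, zones=(4, 4)):
    """Split a 2D image into a grid of zones and return each zone's mean intensity.

    For a binary image (ink = 1, background = 0) this is the ink density per zone.
    """
    image = np.asarray(image, float)
    rows, cols = zones
    features = []
    for row_block in np.array_split(image, rows, axis=0):
        for block in np.array_split(row_block, cols, axis=1):
            features.append(block.mean())
    return np.array(features)

# Hypothetical 4x4 binary glyph with ink only in the top-left quadrant.
glyph = np.zeros((4, 4))
glyph[:2, :2] = 1.0
print(zoning_features(glyph, zones=(2, 2)))
```

The resulting fixed-length vector can be passed to any standard classifier, such as a nearest-neighbour or neural network model.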
-
Deep Learning-based Universal Beamformer for Ultrasound Imaging
Authors:
Shujaat Khan,
Jaeyoung Huh,
Jong Chul Ye
Abstract:
In ultrasound (US) imaging, individual channel RF measurements are back-propagated and accumulated to form an image after applying specific delays. While this time reversal is usually implemented using a hardware- or software-based delay-and-sum (DAS) beamformer, the performance of DAS decreases rapidly in situations where data acquisition is not ideal. Herein, for the first time, we demonstrate that a single data-driven adaptive beamformer designed as a deep neural network can generate high-quality images robustly for various detector channel configurations and subsampling rates. The proposed deep beamformer is evaluated for two distinct acquisition schemes: focused ultrasound imaging and planewave imaging. Experimental results showed that the proposed deep beamformer exhibits significant performance gains for both focused and planar imaging schemes, in terms of contrast-to-noise ratio and structural similarity.
Submitted 15 July, 2019; v1 submitted 4 April, 2019;
originally announced April 2019.
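For readers unfamiliar with the delay-and-sum (DAS) baseline mentioned in the abstract above, the following is a minimal sketch of the idea: each channel is shifted by its own delay and the aligned samples are summed. It assumes integer sample delays and ignores apodization, interpolation, and focusing geometry; the function name `delay_and_sum` is hypothetical and this is not the deep beamformer proposed in the paper.

```python
def delay_and_sum(channels, delays):
    """Sum per-channel signals after shifting each by an integer delay.

    channels: list of equal-length sample lists, one per receive channel.
    delays:   list of integer delays (in samples), one per channel.
    Output sample n is the sum over channels of channel[n + delay];
    indices that fall outside the recording contribute nothing.
    """
    n_samples = len(channels[0])
    output = []
    for n in range(n_samples):
        accumulated = 0.0
        for signal, delay in zip(channels, delays):
            index = n + delay
            if 0 <= index < n_samples:
                accumulated += signal[index]
        output.append(accumulated)
    return output
```

When the delays match the arrival-time differences of an echo, the echo adds coherently across channels; dropping channels removes terms from the sum, which is the degradation the abstract refers to.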
-
initKmix -- A Novel Initial Partition Generation Algorithm for Clustering Mixed Data using k-means-based Clustering
Authors:
Amir Ahmad,
Shehroz S. Khan
Abstract:
Mixed datasets consist of both numeric and categorical attributes. Various k-means-based clustering algorithms have been developed for these datasets. Generally, these algorithms use a random partition as a starting point, which tends to produce different clustering results for different runs. In this paper, we propose initKmix, a novel algorithm for finding an initial partition for k-means-based clustering algorithms for mixed datasets. In the initKmix algorithm, a k-means-based clustering algorithm is run many times, and in each run, one of the attributes is used to create initial clusters for that run. The clustering results of various runs are combined to produce the initial partition. This initial partition is then used as a seed to a k-means-based clustering algorithm to cluster mixed data. Experiments with various categorical and mixed datasets showed that initKmix produced accurate and consistent results, and outperformed the random initial partition method and other state-of-the-art initialization methods. Experiments also showed that k-means-based clustering for mixed datasets with initKmix performed similarly to or better than many state-of-the-art clustering algorithms for categorical and mixed datasets.
Submitted 22 July, 2020; v1 submitted 31 January, 2019;
originally announced February 2019.
-
Federated Collaborative Filtering for Privacy-Preserving Personalized Recommendation System
Authors:
Muhammad Ammad-ud-din,
Elena Ivannikova,
Suleiman A. Khan,
Were Oyomno,
Qiang Fu,
Kuan Eeik Tan,
Adrian Flanagan
Abstract:
The increasing interest in user privacy is leading to new privacy-preserving machine learning paradigms. In the Federated Learning paradigm, a master machine learning model is distributed to user clients; the clients use their locally stored data and model for both inference and calculating model updates. The model updates are sent back and aggregated on the server to update the master model, which is then redistributed to the clients. In this paradigm, the user data never leaves the client, greatly enhancing the user's privacy, in contrast to the traditional paradigm of collecting, storing and processing user data on a backend server beyond the user's control. In this paper we introduce, as far as we are aware, the first federated implementation of a Collaborative Filter. The federated updates to the model are based on a stochastic gradient approach. As a classical case study in machine learning, we explore a personalized recommendation system based on users' implicit feedback and demonstrate the method's applicability to both the MovieLens and an in-house dataset. Empirical validation confirms a collaborative filter can be federated without a loss of accuracy compared to a standard implementation, hence enhancing the user's privacy in a widely used recommender application while maintaining recommender performance.
Submitted 29 January, 2019;
originally announced January 2019.
-
Transfer Learning and Meta Classification Based Deep Churn Prediction System for Telecom Industry
Authors:
Uzair Ahmed,
Asifullah Khan,
Saddam Hussain Khan,
Abdul Basit,
Irfan Ul Haq,
Yeon Soo Lee
Abstract:
A churn prediction system guides telecom service providers to reduce revenue loss. However, the development of a churn prediction system for the telecom industry is a challenging task, mainly due to the large size of the data, high-dimensional features, and imbalanced distribution of the data. In this paper, we present a solution to the inherent problems of churn prediction, using the concept of Transfer Learning (TL) and Ensemble-based Meta-Classification. The proposed method TL-DeepE is applied in two stages. The first stage employs TL by fine-tuning multiple pre-trained Deep Convolution Neural Networks (CNNs). Telecom datasets are normally in vector form, which is converted into 2D images because Deep CNNs have high learning capacity on images. In the second stage, predictions from these Deep CNNs are appended to the original feature vector and thus are used to build a final feature vector for the high-level Genetic Programming (GP) and AdaBoost based ensemble classifier. Thus, the experiments are conducted using various CNNs as base classifiers and the GP-AdaBoost as a meta-classifier. By using 10-fold cross-validation, the performance of the proposed TL-DeepE system is compared with existing techniques on two standard telecommunication datasets: Orange and Cell2cell. On the Orange and Cell2cell datasets, the prediction accuracy obtained was 75.4% and 68.2%, while the area under the curve was 0.83 and 0.74, respectively.
Submitted 5 March, 2019; v1 submitted 18 January, 2019;
originally announced January 2019.
-
Universal Deep Beamformer for Variable Rate Ultrasound Imaging
Authors:
Shujaat Khan,
Jaeyoung Huh,
Jong Chul Ye
Abstract:
Ultrasound (US) imaging is based on the time-reversal principle, in which individual channel RF measurements are back-propagated and accumulated to form an image after applying specific delays. While this time reversal is usually implemented as a delay-and-sum (DAS) beamformer, the image quality quickly degrades as the number of measurement channels decreases. To address this problem, various types of adaptive beamforming techniques have been proposed using predefined models of the signals. However, the performance of these adaptive beamforming approaches degrades when the underlying model is not sufficiently accurate. Here, we demonstrate for the first time that a single universal deep beamformer trained in a purely data-driven way can generate significantly improved images over widely varying aperture and channel subsampling patterns. In particular, we design an end-to-end deep learning framework that can directly process sub-sampled RF data acquired at different subsampling rates and detector configurations to generate high-quality ultrasound images using a single beamformer. Experimental results using B-mode focused ultrasound confirm the efficacy of the proposed methods.
Submitted 7 January, 2019;
originally announced January 2019.
-
Volumetric Convolution: Automatic Representation Learning in Unit Ball
Authors:
Sameera Ramasinghe,
Salman Khan,
Nick Barnes
Abstract:
Convolution is an efficient technique to obtain abstract feature representations using hierarchical layers in deep networks. Although performing convolution in Euclidean geometries is fairly straightforward, its extension to other topological spaces, such as a sphere ($\mathbb{S}^2$) or a unit ball ($\mathbb{B}^3$), entails unique challenges. In this work, we propose a novel "volumetric convolution" operation that can effectively convolve arbitrary functions in $\mathbb{B}^3$. We develop a theoretical framework for volumetric convolution based on Zernike polynomials and efficiently implement it as a differentiable and easily pluggable layer for deep networks. Furthermore, our formulation leads to the derivation of a novel formula to measure the symmetry of a function in $\mathbb{B}^3$ around an arbitrary axis, which is useful in 3D shape analysis tasks. We demonstrate the efficacy of the proposed volumetric convolution operation on a possible use case, i.e., a 3D object recognition task.
Submitted 3 January, 2019;
originally announced January 2019.
-
Learning to Unlearn: Building Immunity to Dataset Bias in Medical Imaging Studies
Authors:
Ahmed Ashraf,
Shehroz Khan,
Nikhil Bhagwat,
Mallar Chakravarty,
Babak Taati
Abstract:
Medical imaging machine learning algorithms are usually evaluated on a single dataset. Although training and testing are performed on different subsets of the dataset, models built on one study show limited capability to generalize to other studies. While database bias has been recognized as a serious problem in the computer vision community, it has remained largely unnoticed in medical imaging research. Transfer learning thus remains confined to the re-use of feature representations requiring re-training on the new dataset. As a result, machine learning models do not generalize even when trained on imaging datasets that were captured to study the same variable of interest. The ability to transfer knowledge gleaned from one study to another, without the need for re-training, if possible, would provide reassurance that the models are learning knowledge fundamental to the problem under study instead of latching onto the idiosyncrasies of a dataset. In this paper, we situate the problem of dataset bias in the context of medical imaging studies. We show empirical evidence that such a problem exists in medical datasets. We then present a framework to unlearn study membership as a means to handle the problem of database bias. Our main idea is to take the data from the original feature space to an intermediate space where the data points are indistinguishable in terms of which study they come from, while maintaining the recognition capability with respect to the variable of interest. This will promote models which learn the more general properties of the etiology under study instead of aligning to dataset-specific peculiarities. Essentially, our proposed model learns to unlearn the dataset bias.
Submitted 2 December, 2018;
originally announced December 2018.
-
Survey of state-of-the-art mixed data clustering algorithms
Authors:
Amir Ahmad,
Shehroz S. Khan
Abstract:
Mixed data comprises both numeric and categorical features, and mixed datasets occur frequently in many domains, such as health, finance, and marketing. Clustering is often applied to mixed datasets to find structures and to group similar objects for further analysis. However, clustering mixed data is challenging because it is difficult to directly apply mathematical operations, such as summation or averaging, to the feature values of these datasets. In this paper, we present a taxonomy for the study of mixed data clustering algorithms by identifying five major research themes. We then present a state-of-the-art review of the research works within each research theme. We analyze the strengths and weaknesses of these methods with pointers for future research directions. Lastly, we present an in-depth analysis of the overall challenges in this field, highlight open research questions and discuss guidelines to make progress in the field.
Submitted 18 March, 2019; v1 submitted 11 November, 2018;
originally announced November 2018.
-
Statistical modeling of rates and trends in Holocene relative sea level
Authors:
Erica L. Ashe,
Niamh Cahill,
Carling Hay,
Nicole S. Khan,
Andrew Kemp,
Simon Engelhart,
Benjamin P. Horton,
Andrew Parnell,
Robert E. Kopp
Abstract:
Characterizing the spatio-temporal variability of relative sea level (RSL) and estimating local, regional, and global RSL trends requires statistical analysis of RSL data. Formal statistical treatments, needed to account for the spatially and temporally sparse distribution of data and for geochronological and elevational uncertainties, have advanced considerably over the last decade. Time-series models have adopted more flexible and physically-informed specifications with more rigorous quantification of uncertainties. Spatio-temporal models have evolved from simple regional averaging to frameworks that more richly represent the correlation structure of RSL across space and time. More complex statistical approaches enable rigorous quantification of spatial and temporal variability, the combination of geographically disparate data, and the separation of the RSL field into various components associated with different driving processes. We review the range of statistical modeling and analysis choices used in the literature, reformulating them for ease of comparison in a common hierarchical statistical framework. The hierarchical framework separates each model into different levels, clearly partitioning measurement and inferential uncertainty from process variability. Placing models in a hierarchical framework enables us to highlight both the similarities and differences among modeling and analysis choices. We illustrate the implications of some modeling and analysis choices currently used in the literature by comparing the results of their application to common datasets within a hierarchical framework. In light of the complex patterns of spatial and temporal variability exhibited by RSL, we recommend non-parametric approaches for modeling temporal and spatio-temporal RSL.
Submitted 24 October, 2018;
originally announced October 2018.
-
RAFP-Pred: Robust Prediction of Antifreeze Proteins using Localized Analysis of n-Peptide Compositions
Authors:
Shujaat Khan,
Imran Naseem,
Roberto Togneri,
Mohammed Bennamoun
Abstract:
In extreme cold weather, living organisms produce Antifreeze Proteins (AFPs) to counter the otherwise lethal intracellular formation of ice. Structures and sequences of various AFPs exhibit a high degree of heterogeneity; consequently, the prediction of AFPs is considered to be a challenging task. In this research, we propose to handle this arduous manifold learning task using the notion of localized processing. In particular, an AFP sequence is segmented into two sub-segments, each of which is analyzed for amino acid and di-peptide compositions. We propose to use only the most significant features using the concept of information gain (IG) followed by a random forest classification approach. The proposed RAFP-Pred achieved an excellent performance on a number of standard datasets. We report a high Youden's index (sensitivity+specificity-1) value of 0.75 on the standard independent test data set, outperforming the AFP-PseAAC, AFP_PSSM, AFP-Pred and iAFP by a margin of 0.05, 0.06, 0.14 and 0.68 respectively. The verification rate on the UniProKB dataset is found to be 83.19%, which is substantially superior to the 57.18% reported for the iAFP method.
Submitted 25 September, 2018;
originally announced September 2018.
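The Youden's index quoted in the abstract above is defined there as sensitivity + specificity - 1. A minimal sketch of that computation from confusion-matrix counts follows; the function name and the example counts are illustrative assumptions and are not taken from the paper's data.

```python
def youden_index(true_pos, false_neg, true_neg, false_pos):
    """Youden's index J = sensitivity + specificity - 1.

    sensitivity = TP / (TP + FN), the fraction of positives detected.
    specificity = TN / (TN + FP), the fraction of negatives rejected.
    J ranges from -1 to 1; 0 corresponds to chance-level performance.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity + specificity - 1


# Hypothetical counts chosen only to illustrate the arithmetic:
# sensitivity 0.90 and specificity 0.85 give J of about 0.75.
example = youden_index(true_pos=90, false_neg=10, true_neg=85, false_pos=15)
```

Because the index weighs sensitivity and specificity equally, it is less sensitive to class imbalance than plain accuracy, which is one reason it is commonly reported for skewed test sets.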
-
DeepFall -- Non-invasive Fall Detection with Deep Spatio-Temporal Convolutional Autoencoders
Authors:
Jacob Nogas,
Shehroz S. Khan,
Alex Mihailidis
Abstract:
Human falls rarely occur; however, detecting falls is very important from the health and safety perspective. Due to the rarity of falls, it is difficult to employ supervised classification techniques to detect them. Moreover, in these highly skewed situations it is also difficult to extract domain-specific features to identify falls. In this paper, we present a novel framework, DeepFall, which formulates the fall detection problem as an anomaly detection problem. The DeepFall framework presents the novel use of deep spatio-temporal convolutional autoencoders to learn spatial and temporal features from normal activities using non-invasive sensing modalities. We also present a new anomaly scoring method that combines the reconstruction score of frames across a temporal window to detect unseen falls. We tested the DeepFall framework on three publicly available datasets collected through non-invasive sensing modalities, thermal camera and depth cameras, and show superior results in comparison to traditional autoencoder methods to identify unseen falls.
Submitted 27 April, 2020; v1 submitted 30 August, 2018;
originally announced September 2018.
-
Development and Evaluation of Recurrent Neural Network based Models for Hourly Traffic Volume and AADT Prediction
Authors:
MD Zadid Khan,
Sakib Mahmud Khan,
Mashrur Chowdhury,
Kakan Dey
Abstract:
The prediction of high-resolution hourly traffic volumes of a given roadway is essential for transportation planning. Traditionally, Automatic Traffic Recorders (ATR) are used to collect this hourly volume data. These large datasets are time series data characterized by long-term temporal dependencies and missing values. Regarding the temporal dependencies, all roadways are characterized by seasonal variations that can be weekly, monthly or yearly, depending on the cause of the variation. Regarding the missing data in a time-series sequence, traditional time series forecasting models perform poorly under the influence of seasonal variations. To address this limitation, robust, Recurrent Neural Network (RNN) based, multi-step ahead forecasting models are developed for time-series in this study. The simple RNN, the Gated Recurrent Unit (GRU) and the Long Short-Term Memory (LSTM) units are used to develop the model and evaluate its performance. Two approaches are used to address the missing value issue: masking and imputation, in conjunction with the RNN models. Six different imputation algorithms are then used to identify the best model. The analysis indicates that the LSTM model performs better than simple RNN and GRU models, and imputation performs better than masking to predict future traffic volume. Based on analysis using 92 ATRs, the LSTM-Median model is deemed the best model in all scenarios for hourly traffic volume and AADT prediction, with an average RMSE of 274 and MAPE of 18.91% for hourly traffic volume prediction and average RMSE of 824 and MAPE of 2.10% for AADT prediction.
△ Less
Submitted 25 November, 2018; v1 submitted 15 August, 2018;
originally announced August 2018.
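The RMSE and MAPE figures reported in the abstract above are standard forecast error measures. The sketch below shows the usual textbook definitions; the function names and the sample numbers are illustrative assumptions, and the paper may aggregate across ATR stations differently.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Undefined when any actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

RMSE scales with the magnitude of the quantity being predicted while MAPE is relative, which is consistent with the abstract reporting a larger RMSE but a much smaller MAPE for AADT than for hourly volume.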
-
Supervised classification for object identification in urban areas using satellite imagery
Authors:
Hazrat Ali,
Adnan Ali Awan,
Sanaullah Khan,
Omer Shafique,
Atiq ur Rahman,
Shahid Khan
Abstract:
This paper presents a useful method to achieve classification in satellite imagery. The approach is based on a pixel-level study employing various features such as correlation, homogeneity, energy and contrast. In this study, gray-scale images are used for training the classification model. For supervised classification, two classification techniques are employed, namely the Support Vector Machine (SVM) and Naive Bayes. With textural features used for gray-scale images, Naive Bayes performs better, with an overall accuracy of 76% compared to 68% achieved by SVM. The computational time is evaluated while performing the experiment with two different window sizes, i.e., 50x50 and 70x70. The required computational time on a single image is found to be 27 seconds for a window size of 70x70 and 45 seconds for a window size of 50x50.
Submitted 2 August, 2018;
originally announced August 2018.
-
Comments on "Momentum fractional LMS for power signal parameter estimation"
Authors:
Shujaat Khan,
Imran Naseem,
Alishba Sadiq,
Jawwad Ahmad,
Muhammad Moinuddin
Abstract:
The purpose of this paper is to indicate that the recently proposed Momentum fractional least mean squares (mFLMS) algorithm has some serious flaws in its design and analysis. Our apprehensions are based on the evidence we found in the derivation and analysis in the paper titled "Momentum fractional LMS for power signal parameter estimation". In addition to the theoretical basis, our claims are also verified through extensive simulation results. The experiments clearly show that the new method does not have any advantage over the classical least mean square (LMS) method.
Submitted 19 May, 2018;
originally announced May 2018.
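The classical LMS method that the abstract above uses as its point of comparison can be sketched in a few lines. This is the textbook update w <- w + mu * e * x applied to a hypothetical FIR identification setup; the function name, tap count, and step size are assumptions for illustration and do not reproduce the experiments of the paper or the mFLMS algorithm it critiques.

```python
def lms_identify(inputs, desired, n_taps, step_size):
    """Estimate FIR filter weights with the classical LMS rule.

    inputs:  input samples x[n] fed to the unknown system.
    desired: output samples d[n] observed from the unknown system.
    Returns the weight vector after one pass over the data.
    """
    weights = [0.0] * n_taps
    for n in range(len(inputs)):
        # Regressor: the n_taps most recent inputs, zero-padded at the start.
        regressor = [inputs[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        estimate = sum(w * x for w, x in zip(weights, regressor))
        error = desired[n] - estimate
        # Stochastic gradient step on the instantaneous squared error.
        weights = [w + step_size * error * x for w, x in zip(weights, regressor)]
    return weights
```

The step size trades convergence speed against steady-state error, and a fixed value is the design constraint that variable-step and fractional variants attempt to relax.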
-
Maturation Trajectories of Cortical Resting-State Networks Depend on the Mediating Frequency Band
Authors:
Sheraz Khan,
Javeria Hashmi,
Fahimeh Mamashli,
Konstantinos Michmizos,
Manfred Kitzbichler,
Hari Bharadwaj,
Yousra Bekhti,
Santosh Ganesan,
Keri A Garel,
Susan Whitfield-Gabrieli,
Randy Gollub,
Jian Kong,
Lucia M Vaina,
Kunjan Rana,
Steven Stufflebeam,
Matti Hamalainen,
Tal Kenet
Abstract:
The functional significance of resting state networks and their abnormal manifestations in psychiatric disorders are firmly established, as is the importance of the cortical rhythms in mediating these networks. Resting state networks are known to undergo substantial reorganization from childhood to adulthood, but whether distinct cortical rhythms, which are generated by separable neural mechanisms and are often manifested abnormally in psychiatric conditions, mediate maturation differentially, remains unknown. Using magnetoencephalography (MEG) to map frequency band specific maturation of resting state networks from age 7 to 29 in 162 participants (31 independent), we found significant changes with age in networks mediated by the beta (13-30Hz) and gamma (31-80Hz) bands. More specifically, gamma band mediated networks followed an expected asymptotic trajectory, but beta band mediated networks followed a linear trajectory. Network integration increased with age in gamma band mediated networks, while local segregation increased with age in beta band mediated networks. Spatially, the hubs that changed in importance with age in the beta band mediated networks had relatively little overlap with those that showed the greatest changes in the gamma band mediated networks. These findings are relevant for our understanding of the neural mechanisms of cortical maturation, in both typical and atypical development.
Submitted 12 February, 2018;
originally announced March 2018.
-
Enhanced ${q}$-Least Mean Square
Authors:
Shujaat Khan,
Alishba Sadiq,
Imran Naseem,
Roberto Togneri,
Mohammed Bennamoun
Abstract:
In this work, a new class of stochastic gradient algorithms is developed based on $q$-calculus. Unlike the existing $q$-LMS algorithm, the proposed approach fully utilizes the concept of $q$-calculus by incorporating a time-varying $q$ parameter. The proposed enhanced $q$-LMS ($Eq$-LMS) algorithm utilizes a novel, parameterless concept of error-correlation energy and normalization of the signal to ensure high convergence, stability and low steady-state error. The proposed algorithm automatically adapts the learning rate with respect to the error. For evaluation, the system identification problem is considered. Extensive experiments show better performance of the proposed $Eq$-LMS algorithm compared to the standard $q$-LMS approach.
Submitted 1 January, 2018;
originally announced January 2018.