Search | arXiv e-print repository

Arbitrary-Length Generalization for Addition in a Tiny Transformer

Abstract: This paper introduces a novel training methodology that enables a Transformer model to generalize the addition of two-digit numbers to numbers with unseen lengths of digits. The proposed approach employs an autoregressive generation technique, processing from right to left, which mimics a common manual method for adding large numbers. To the best of my knowledge, this methodology has not been previously explored in the literature. All results are reproducible, and the corresponding R code is available at github.com/AGPatriota/ALGA-R/.

Submitted 11 June, 2024; v1 submitted 30 May, 2024; originally announced June 2024.

Comments: Testing-Digits.R output with 50-digit numbers (8 pages, 1 figure)

arXiv:1512.07059 [pdf, ps, other]

Improved hypothesis testing in a general multivariate elliptical model

Authors: T. F. N. Melo, S. L. P. Ferrari, A. G. Patriota

Abstract: This paper investigates improved testing inferences under a general multivariate elliptical regression model. The model is very flexible in terms of the specification of the mean vector and the dispersion matrix, and of the choice of the error distribution. The error terms are allowed to follow a multivariate distribution in the class of the elliptical distributions, which has the multivariate normal and Student-t distributions as special cases. We obtain Skovgaard's adjusted likelihood ratio statistics and Barndorff-Nielsen's adjusted signed likelihood ratio statistics and we conduct a simulation study. The simulations suggest that the proposed tests display superior finite sample behavior as compared to the standard tests. Two applications are presented in order to illustrate the methods.

Submitted 31 October, 2016; v1 submitted 22 December, 2015; originally announced December 2015.

Comments: 20 pages, 3 figures

arXiv:1510.02950 [pdf, other]

A measure of evidence based on the likelihood-ratio statistics

Authors: Alexandre G. Patriota

Abstract: In this paper, we show that the likelihood-ratio measure (a) is invariant with respect to dominating sigma-finite measures, (b) satisfies logical consequences which are not satisfied by standard $p$-values, (c) respects frequentist properties, i.e., the type I error can be properly controlled, and, under mild regularity conditions, (d) can be used as an upper bound for posterior probabilities. We also discuss a generic application to test whether the genotype frequencies of a given population are under the Hardy-Weinberg equilibrium, under inbreeding restrictions or under outbreeding restrictions.

Submitted 5 April, 2021; v1 submitted 10 October, 2015; originally announced October 2015.

Comments: 25 double-spaced pages, 4 figures. Rewritten version with more results

arXiv:1508.05994 [pdf, ps, other]

Improved estimation in a general multivariate elliptical model

Authors: Tatiane F. N. Melo, Silvia L. P. Ferrari, Alexandre G. Patriota

Abstract: The problem of reducing the bias of the maximum likelihood estimator in a general multivariate elliptical regression model is considered. The model is very flexible and allows the mean vector and the dispersion matrix to have parameters in common. Many frequently used models are special cases of this general formulation, namely: errors-in-variables models, nonlinear mixed-effects models, heteroscedastic nonlinear models, among others. In any of these models, the vector of the errors may have any multivariate elliptical distribution. We obtain the second-order bias of the maximum likelihood estimator, a bias-corrected estimator, and a bias-reduced estimator. Simulation results indicate the effectiveness of the bias correction and bias reduction schemes.

Submitted 29 January, 2016; v1 submitted 24 August, 2015; originally announced August 2015.

arXiv:1311.6732 [pdf, other]

A statistical test to identify differences in clustering structures

Authors: André Fujita, Daniel Y. Takahashi, Alexandre G. Patriota, João R. Sato

Abstract: Statistical inference on functional magnetic resonance imaging (fMRI) data is an important task in brain imaging. One major hypothesis is that the presence or absence of a psychiatric disorder can be explained by the differential clustering of neurons in the brain. In view of this fact, it is clearly of interest to address the question of whether the properties of the clusters have changed between groups of patients and controls. The normal method of approaching group differences in brain imaging is to carry out a voxel-wise univariate analysis for a difference between the mean group responses using an appropriate test (e.g. a t-test) and to assemble the resulting "significantly different voxels" into clusters, testing again at cluster level. In this approach, of course, the primary voxel-level test is blind to any cluster structure. Direct assessments of differences between groups (or reproducibility within groups) at the cluster level have been rare in brain imaging. For this reason, we introduce a novel statistical test called ANOCVA - ANalysis Of Cluster structure Variability, which statistically tests whether two or more populations are equally clustered using specific features. The proposed method allows us to compare the clustering structure of multiple groups simultaneously, and also to identify features that contribute to the differential clustering. We illustrate the performance of ANOCVA through simulations and an application to an fMRI data set composed of children with ADHD and controls. Results show that there are several differences in the brain's clustering structure between them, corroborating the hypothesis in the literature. Furthermore, we identified some brain regions previously not described, generating new hypotheses to be tested empirically.

Submitted 26 November, 2013; originally announced November 2013.

arXiv:1205.5039 [pdf, other]

Modified likelihood ratio tests in heteroskedastic multivariate regression models with measurement error

Authors: Tatiane F. N. Melo, Silvia L. P. Ferrari, Alexandre G. Patriota

Abstract: In this paper, we develop modified versions of the likelihood ratio test for multivariate heteroskedastic errors-in-variables regression models. The error terms are allowed to follow a multivariate distribution in the elliptical class of distributions, which has the normal distribution as a special case. We derive the Skovgaard adjusted likelihood ratio statistics, which follow a chi-squared distribution with a high degree of accuracy. We conduct a simulation study and show that the proposed tests display superior finite sample behavior as compared to the standard likelihood ratio test. We illustrate the usefulness of our results in applied settings using a data set from the WHO MONICA Project on cardiovascular disease.

Submitted 15 March, 2013; v1 submitted 22 May, 2012; originally announced May 2012.

Comments: 22 pages, 3 figures

arXiv:1201.0400 [pdf, ps, other]

doi 10.1016/j.fss.2013.03.007

A classical measure of evidence for general null hypotheses

Authors: Alexandre G. Patriota

Abstract: In science, the most widespread statistical quantities are perhaps $p$-values. Typical advice is to reject the null hypothesis $H_0$ if the corresponding p-value is sufficiently small (usually smaller than 0.05). Many criticisms regarding p-values have arisen in the scientific literature. The main issue is that in general optimal p-values (based on likelihood ratio statistics) are not measures of evidence over the parameter space $Θ$. Here, we propose an \emph{objective} measure of evidence for very general null hypotheses that satisfies logical requirements (i.e., operations on the subsets of $Θ$) that are not met by p-values (e.g., it is a possibility measure). We study the proposed measure in the light of the abstract belief calculus formalism and we conclude that it can be used to establish objective states of belief on the subsets of $Θ$. Based on its properties, we strongly recommend this measure as an additional summary of significance tests. At the end of the paper we give a short listing of possible open problems.

Submitted 15 November, 2013; v1 submitted 1 January, 2012; originally announced January 2012.

Comments: 26 pages, one figure and one table. Corrected version

Journal ref: Fuzzy sets and Systems, 233, 74-88, 2013

arXiv:1103.6166 [pdf, ps, other]

A note on Carathéodory's Extension Theorem

Authors: Alexandre G Patriota

Abstract: In this note, we show that Carathéodory's extension theorem is still valid for a class of subsets of $Ω$ less restricted than a semi-ring, which we call quasi-semi-ring.

Submitted 31 March, 2011; v1 submitted 31 March, 2011; originally announced March 2011.

Comments: 8 pages

arXiv:0911.5628 [pdf, ps, other]

doi 10.1016/j.stamet.2010.02.001

Vector Autoregressive Models With Measurement Errors for Testing Granger Causality

Authors: Alexandre G. Patriota, Joao R. Sato, Betsabe G. Blas

Abstract: This paper develops a method for estimating parameters of a vector autoregression (VAR) observed in white noise. The estimation method assumes the noise variance matrix is known and does not require any iterative process. This study provides consistent estimators and shows the asymptotic distribution of the parameters required for conducting tests of Granger causality. Methods in the existing statistical literature cannot be used for testing Granger causality, since under the null hypothesis the model becomes unidentifiable. Measurement error effects on parameter estimates were evaluated by using computational simulations. The results show that the proposed approach produces empirical false positive rates close to the adopted nominal level (even for small samples) and has a good performance around the null hypothesis. The applicability and usefulness of the proposed approach are illustrated using a functional magnetic resonance imaging dataset.

Submitted 30 November, 2009; originally announced November 2009.

Comments: manuscript submitted for possible publication

arXiv:0906.0113 [pdf, ps, other]

doi 10.1016/j.csda.2010.06.007

A note on Influence diagnostics in nonlinear mixed-effects elliptical models

Authors: Alexandre G. Patriota

Abstract: This paper provides general matrix formulas for computing the score function, the (expected and observed) Fisher information and the $Δ$ matrices (required for the assessment of local influence) for a quite general model which includes the one proposed by Russo et al. (2009). Additionally, we present an expression for the generalized leverage. The matrix formulation has a considerable advantage, since, despite the complexity of the postulated model, all general formulas are compact, clear and have nice forms.

Submitted 16 September, 2021; v1 submitted 1 June, 2009; originally announced June 2009.

Comments: Paper submitted for possible publication, 6 pages (formulas corrected)

arXiv:0903.3146 [pdf, ps, other]

doi 10.1007/s00362-009-0243-7

Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model

Authors: Alexandre G. Patriota, Artur J. Lemonte, Heleno Bolfarine

Abstract: This paper develops a bias correction scheme for a multivariate heteroskedastic errors-in-variables model. The applicability of this model is justified in areas such as astrophysics, epidemiology and analytical chemistry, where the variables are subject to measurement errors and the variances vary with the observations. We conduct Monte Carlo simulations to investigate the performance of the corrected estimators. The numerical results show that the bias correction scheme yields nearly unbiased estimates. We also give an application to a real data set.

Submitted 26 August, 2015; v1 submitted 18 March, 2009; originally announced March 2009.

Comments: 12 pages. Statistical Papers

arXiv:0812.3612 [pdf, ps, other]

doi 10.1016/j.spl.2009.04.018

Bias correction in a multivariate normal regression model with general parameterization

Authors: Alexandre G. Patriota, Artur J. Lemonte

Abstract: This paper develops a bias correction scheme for a multivariate normal model under a general parameterization. In the model, the mean vector and the covariance matrix share the same parameters. It includes many important regression models available in the literature as special cases, such as (non)linear regression, errors-in-variables models, and so forth. Moreover, heteroscedastic situations may also be studied within our framework. We derive a general expression for the second-order biases of maximum likelihood estimates of the model parameters and show that it is always possible to obtain the second-order bias by means of ordinary weighted least-squares regressions. We illustrate this general expression with an errors-in-variables model and also conduct some simulations in order to verify the performance of the corrected estimates. The simulation results show that the bias correction scheme yields nearly unbiased estimators. We also present an empirical illustration.

Submitted 16 January, 2009; v1 submitted 18 December, 2008; originally announced December 2008.

Comments: 1 Figure, 17 pages

Showing 1–12 of 12 results for author: Patriota, A G