-
Ethical Challenges in Gamified Education Research and Development: An Umbrella Review and Potential Directions
Authors:
Ana Carolina Tomé Klock,
Brenda Salenave Santana,
Juho Hamari
Abstract:
Gamification is a technological, economic, cultural, and societal development toward promoting a more game-like reality. As this emergent phenomenon has been gradually consolidated into our daily lives, especially in educational settings, many scholars and practitioners face a major challenge ahead: how to understand and mitigate the unethical impacts of gamification when researching and developing such educational technologies? Thus, this study explores ethical challenges in gamified educational applications and proposes potential solutions to address them based on an umbrella review. After analysing secondary studies, this study details and proposes recommendations on addressing some ethical challenges in gamified education, such as power dynamics and paternalism, lack of voluntarity and confidentiality, cognitive manipulation, and social comparison. Research and development decision-making processes affected by such challenges are also elaborated, and potential actions to mitigate their effects in gamification planning, conducting and communication are further introduced. Thus, this chapter provides an understanding of ethical challenges posed by the literature in gamified education and a set of guidelines for future research and development.
Submitted 26 September, 2023;
originally announced September 2023.
-
Gradient is All You Need?
Authors:
Konstantin Riedl,
Timo Klock,
Carina Geldhauser,
Massimo Fornasier
Abstract:
In this paper we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently proposed multi-particle derivative-free optimization method, as a stochastic relaxation of gradient descent. Remarkably, we observe that through communication of the particles, CBO exhibits a stochastic gradient descent (SGD)-like behavior despite solely relying on evaluations of the objective function. The fundamental value of such a link between CBO and SGD lies in the fact that CBO is provably globally convergent to global minimizers for ample classes of nonsmooth and nonconvex objective functions, hence, on the one side, offering a novel explanation for the success of stochastic relaxations of gradient descent. On the other side, contrary to the conventional wisdom that zero-order methods ought to be inefficient or lack generalization abilities, our results unveil an intrinsic gradient descent nature of such heuristics. This viewpoint furthermore complements previous insights into the working principles of CBO, which describe the dynamics in the mean-field limit through a nonlinear nonlocal partial differential equation that allows one to alleviate complexities of the nonconvex function landscape. Our proofs leverage a completely nonsmooth analysis, which combines a novel quantitative version of the Laplace principle (log-sum-exp trick) and the minimizing movement scheme (proximal iteration). In doing so, we furnish useful and precise insights that explain how stochastic perturbations of gradient descent overcome energy barriers and reach deep levels of nonconvex functions. Instructive numerical illustrations support the provided theoretical insights.
Submitted 16 June, 2023;
originally announced June 2023.
-
Finite Sample Identification of Wide Shallow Neural Networks with Biases
Authors:
Massimo Fornasier,
Timo Klock,
Marco Mondelli,
Michael Rauchensteiner
Abstract:
Artificial neural networks are functions depending on a finite number of parameters typically encoded as weights and biases. The identification of the parameters of the network from finite samples of input-output pairs is often referred to as the \emph{teacher-student model}, and this model has represented a popular framework for understanding training and generalization. Even if the problem is NP-complete in the worst case, a rapidly growing literature -- after adding suitable distributional assumptions -- has established finite sample identification of two-layer networks with a number of neurons $m=\mathcal O(D)$, $D$ being the input dimension. For the range $D<m<D^2$ the problem becomes harder, and truly little is known for networks parametrized by biases as well. This paper fills the gap by providing constructive methods and theoretical guarantees of finite sample identification for such wider shallow networks with biases. Our approach is based on a two-step pipeline: first, we recover the direction of the weights, by exploiting second order information; next, we identify the signs by suitable algebraic evaluations, and we recover the biases by empirical risk minimization via gradient descent. Numerical results demonstrate the effectiveness of our approach.
Submitted 8 November, 2022;
originally announced November 2022.
-
Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders
Authors:
Stefan C. Schonsheck,
Scott Mahan,
Timo Klock,
Alexander Cloninger,
Rongjie Lai
Abstract:
Autoencoding is a popular method in representation learning. Conventional autoencoders employ symmetric encoding-decoding procedures and a simple Euclidean latent space to detect hidden low-dimensional structures in an unsupervised way. Some modern approaches to novel data generation, such as generative adversarial networks, depart from this symmetry but still employ a pair of massive networks: one to generate the image and another to judge the image's quality based on priors learned from a training set. This work introduces a chart autoencoder with an asymmetric encoding-decoding process that can incorporate additional semi-supervised information such as class labels. Besides enhancing the capability for handling data with complicated topological and geometric structures, the proposed model can successfully differentiate nearby but disjoint manifolds and intersecting manifolds with only a small amount of supervision. Moreover, this model only requires a low-complexity encoding operation, such as a locally defined linear projection. We discuss the approximation power of such networks and derive a bound that essentially depends on the intrinsic dimension of the data manifold rather than the dimension of the ambient space. Next, we incorporate bounds for the sampling rate of training data needed to faithfully represent a given data manifold. We present numerical experiments that verify that the proposed model can effectively manage data with nearby but disjoint manifolds of different classes, overlapping manifolds, and manifolds with non-trivial topology. Finally, we conclude with some experiments on computer vision and molecular dynamics problems which showcase the efficacy of our methods on real-world data.
Submitted 4 October, 2024; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Landscape analysis of an improved power method for tensor decomposition
Authors:
Joe Kileel,
Timo Klock,
João M. Pereira
Abstract:
In this work, we consider the optimization formulation for symmetric tensor decomposition recently introduced in the Subspace Power Method (SPM) of Kileel and Pereira. Unlike popular alternative functionals for tensor decomposition, the SPM objective function has the desirable properties that its maximal value is known in advance, and its global optima are exactly the rank-1 components of the tensor when the input is sufficiently low-rank. We analyze the non-convex optimization landscape associated with the SPM objective. Our analysis accounts for working with noisy tensors. We derive quantitative bounds such that any second-order critical point with SPM objective value exceeding the bound must equal a tensor component in the noiseless case, and must approximate a tensor component in the noisy case. For decomposing tensors of size $D^{\times m}$, we obtain a near-global guarantee up to rank $\widetilde{o}(D^{\lfloor m/2 \rfloor})$ under a random tensor model, and a global guarantee up to rank $\mathcal{O}(D)$ assuming deterministic frame conditions. This implies that SPM with suitable initialization is a provable, efficient, robust algorithm for low-rank symmetric tensor decomposition. We conclude with numerics that show a practical preferability for using the SPM functional over a more established counterpart.
Submitted 29 October, 2021;
originally announced October 2021.
-
Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples
Authors:
Christian Fiedler,
Massimo Fornasier,
Timo Klock,
Michael Rauchensteiner
Abstract:
In this paper we approach the problem of unique and stable identifiability of generic deep artificial neural networks with pyramidal shape and smooth activation functions from a finite number of input-output samples. More specifically, we introduce the so-called entangled weights, which compose weights of successive layers intertwined with suitable diagonal and invertible matrices depending on the activation functions and their shifts. We prove that entangled weights are completely and stably approximated by an efficient and robust algorithm as soon as $\mathcal O(D^2 \times m)$ nonadaptive input-output samples of the network are collected, where $D$ is the input dimension and $m$ is the number of neurons of the network. Moreover, we empirically observe that the approach applies to networks with up to $\mathcal O(D \times m_L)$ neurons, where $m_L$ is the number of output neurons at layer $L$. Provided knowledge of layer assignments of entangled weights and of remaining scaling and shift parameters, which may be further heuristically obtained by least squares, the entangled weights identify the network completely and uniquely. To highlight the relevance of the theoretical result of stable recovery of entangled weights, we present numerical experiments, which demonstrate that multilayered networks with generic weights can be robustly identified and therefore uniformly approximated by the presented algorithmic pipeline. In contrast, backpropagation does not generalize stably in this setting and is always limited by a relatively large uniform error. In terms of practical impact, our study shows that we can relate input-output information uniquely and stably to network parameters, providing a form of explainability. Moreover, our method paves the way for compression of overparametrized networks and for the training of minimal complexity networks.
Submitted 18 January, 2021;
originally announced January 2021.
-
Analysing gamification elements in educational environments using an existing Gamification taxonomy
Authors:
Armando M. Toda,
Ana C. T. Klock,
Wilk Oliveira,
Paula T. Palomino,
Luiz Rodrigues,
Lei Shi,
Ig Bittencourt,
Isabela Gasparini,
Seiji Isotani,
Alexandra I. Cristea
Abstract:
Gamification has been widely employed in the educational domain over the past eight years, since the term became a trend. However, the literature states that gamification still lacks formal definitions to support the design and analysis of gamified strategies. This paper analysed the game elements employed in gamified learning environments through a previously proposed and evaluated taxonomy while detailing and expanding this taxonomy. In the current paper, we describe our taxonomy in depth as well as expand it. The extended taxonomy that results from this process is divided into five dimensions related to the learner and the learning environment. Our main contribution is the detailed taxonomy, which can be used to design and evaluate gamification in learning environments.
Submitted 12 August, 2020;
originally announced August 2020.
-
A deep network construction that adapts to intrinsic dimensionality beyond the domain
Authors:
Alexander Cloninger,
Timo Klock
Abstract:
We study the approximation of two-layer compositions $f(x) = g(φ(x))$ via deep networks with ReLU activation, where $φ$ is a geometrically intuitive, dimensionality reducing feature map. We focus on two intuitive and practically relevant choices for $φ$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets. We achieve near optimal approximation rates, which depend only on the complexity of the dimensionality reducing map $φ$ rather than the ambient dimension. Since $φ$ encapsulates all nonlinear features that are material to the function $f$, this suggests that deep nets are faithful to an intrinsic dimension governed by $f$ rather than the complexity of the domain of $f$. In particular, the prevalent assumption of approximating functions on low-dimensional manifolds can be significantly relaxed using functions of type $f(x) = g(φ(x))$ with $φ$ representing an orthogonal projection onto the same manifold.
Submitted 26 April, 2021; v1 submitted 6 August, 2020;
originally announced August 2020.
-
Robust and Resource Efficient Identification of Two Hidden Layer Neural Networks
Authors:
Massimo Fornasier,
Timo Klock,
Michael Rauchensteiner
Abstract:
We address the structure identification and the uniform approximation of two fully nonlinear layer neural networks of the type $f(x)=1^T h(B^T g(A^T x))$ on $\mathbb R^d$ from a small number of query samples. We approach the problem by actively sampling finite difference approximations to Hessians of the network. Gathering several approximate Hessians allows us to reliably approximate the matrix subspace $\mathcal W$ spanned by symmetric tensors $a_1 \otimes a_1 ,\dots,a_{m_0}\otimes a_{m_0}$ formed by weights of the first layer together with the entangled symmetric tensors $v_1 \otimes v_1 ,\dots,v_{m_1}\otimes v_{m_1}$, formed by suitable combinations of the weights of the first and second layer as $v_\ell=A G_0 b_\ell/\|A G_0 b_\ell\|_2$, $\ell \in [m_1]$, for a diagonal matrix $G_0$ depending on the activation functions of the first layer. The identification of the rank-1 symmetric tensors within $\mathcal W$ is then performed by the solution of a robust nonlinear program. We provide guarantees of stable recovery under a posteriori verifiable conditions. We further address the correct attribution of approximate weights to the first or second layer. By using a suitably adapted gradient descent iteration, it is then possible to estimate, up to intrinsic symmetries, the shifts of the activation functions of the first layer and compute exactly the matrix $G_0$. Our method of identification of the weights of the network is fully constructive, with quantifiable sample complexity, and therefore contributes to reducing the black-box nature of the network training phase. We corroborate our theoretical results by extensive numerical experiments.
Submitted 30 June, 2019;
originally announced July 2019.
-
Nonlinear generalization of the monotone single index model
Authors:
Zeljko Kereta,
Timo Klock,
Valeriya Naumova
Abstract:
The single index model is a powerful yet simple model, widely used in statistics, machine learning, and other scientific fields. It models the regression function as $g(\langle a,x \rangle)$, where $a$ is an unknown index vector and $x$ are the features. This paper deals with a nonlinear generalization of this framework to allow for a regressor that uses multiple index vectors, adapting to local changes in the responses. To do so, we exploit the conditional distribution over function-driven partitions and use linear regression to locally estimate index vectors. We then regress by applying a kNN-type estimator that uses a localized proxy of the geodesic metric. We present theoretical guarantees for estimation of local index vectors and out-of-sample prediction, and demonstrate the performance of our method with experiments on synthetic and real-world data sets, comparing it with state-of-the-art methods.
Submitted 5 September, 2019; v1 submitted 24 February, 2019;
originally announced February 2019.
-
Adaptive multi-penalty regularization based on a generalized Lasso path
Authors:
Markus Grasmair,
Timo Klock,
Valeriya Naumova
Abstract:
For many algorithms, parameter tuning remains a challenging and critical task, which becomes tedious and infeasible in a multi-parameter setting. Multi-penalty regularization, successfully used for solving underdetermined sparse regression problems of unmixing type, where signal and noise are additively mixed, is one such example. In this paper, we propose a novel algorithmic framework for an adaptive parameter choice in multi-penalty regularization with a focus on the correct support recovery. Building upon the theory of regularization paths and algorithms for single-penalty functionals, we extend these ideas to a multi-penalty framework by providing an efficient procedure for the construction of regions containing structurally similar solutions, i.e., solutions with the same sparsity and sign pattern, over the whole range of parameters. Combining this with a model selection criterion, we can choose regularization parameters in a data-adaptive manner. Another advantage of our algorithm is that it provides an overview of the solution stability over the whole range of parameters. This can be further exploited to obtain additional insights into the problem of interest. We provide a numerical analysis of our method and compare it to the state-of-the-art single-penalty algorithms for compressed sensing problems in order to demonstrate the robustness and power of the proposed algorithm.
Submitted 11 October, 2017;
originally announced October 2017.