Grokking and Generalization Collapse: Insights from \texttt{HTSR} theory

Authors: Hari K. Prakash, Charles H. Martin

Abstract: We study the well-known grokking phenomenon in neural networks (NNs) using a 3-layer MLP trained on a 1k-sample subset of MNIST, with and without weight decay, and discover a novel third phase -- \emph{anti-grokking} -- that occurs very late in training and resembles but is distinct from the familiar \emph{pre-grokking} phases: test accuracy collapses while training accuracy stays perfect. This late-stage collapse is distinct from the known pre-grokking and grokking phases, and is not detected by other proposed grokking progress measures. Leveraging Heavy-Tailed Self-Regularization (HTSR) through the open-source WeightWatcher tool, we show that the HTSR layer quality metric $α$ alone delineates all three phases, whereas the best competing metrics detect only the first two. The \emph{anti-grokking} phase is revealed by training for $10^7$ and is invariably heralded by $α < 2$ and the appearance of \emph{Correlation Traps} -- outlier singular values in the randomized layer weight matrices that make the layer weight matrix atypical and signal overfitting of the training set. Such traps are verified by visual inspection of the layer-wise empirical spectral densities, and by using Kolmogorov--Smirnov tests on randomized spectra. Comparative metrics, including activation sparsity, absolute weight entropy, circuit complexity, and $l^2$ weight norms, track pre-grokking and grokking but fail to distinguish grokking from anti-grokking. This discovery provides a way to measure overfitting and generalization collapse without direct access to the test data.
These results strengthen the claim that the \emph{HTSR} $α$ provides a universal layer-convergence target at $α \approx 2$ and underscore the value of using the HTSR alpha $(α)$ metric as a measure of generalization.
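
As a rough illustration of what a power-law exponent such as $α$ measures, the sketch below applies the standard continuous maximum-likelihood (Hill-type) estimator, $α = 1 + n / \sum_i \ln(λ_i / λ_{min})$, to a list of eigenvalues. This is not the WeightWatcher implementation and is not taken from the paper: the function names `hill_alpha` and `sample_power_law` are hypothetical, the tail cutoff `lam_min` is assumed to be given rather than fitted, and the eigenvalues here are synthetic draws rather than the spectrum of a trained layer.

```python
import math
import random


def hill_alpha(eigenvalues, lam_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(lam / lam_min)).

    Only eigenvalues >= lam_min (the assumed start of the tail) are used.
    """
    tail = [lam for lam in eigenvalues if lam >= lam_min]
    if len(tail) < 2:
        raise ValueError("need at least two eigenvalues in the tail")
    log_sum = sum(math.log(lam / lam_min) for lam in tail)
    if log_sum <= 0.0:
        raise ValueError("tail eigenvalues must not all equal lam_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, lam_min, n, rng):
    """Draw n samples with density proportional to lam**(-alpha), lam >= lam_min.

    Uses inverse-transform sampling; requires alpha > 1.
    """
    exponent = -1.0 / (alpha - 1.0)
    return [lam_min * (1.0 - rng.random()) ** exponent for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "spectrum" with a known exponent, used only to check the estimator.
    spectrum = sample_power_law(alpha=2.5, lam_min=1.0, n=20000, rng=rng)
    print("estimated alpha:", hill_alpha(spectrum, lam_min=1.0))
```

In practice the cutoff `lam_min` has to be chosen or fitted, and the estimate can be sensitive to that choice, so a fixed cutoff like the one above should be read as a simplification.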

Submitted 4 June, 2025; originally announced June 2025.

Comments: 15 pages, 7 figs

arXiv:2405.13397

Multi Player Tracking in Ice Hockey with Homographic Projections

Authors: Harish Prakash, Jia Cheng Shang, Ken M. Nsiempba, Yuhao Chen, David A. Clausi, John S. Zelek

Abstract: Multi Object Tracking (MOT) in ice hockey pursues the combined task of localizing and associating players across a given sequence to maintain their identities. Tracking players from monocular broadcast feeds is an important computer vision problem offering various downstream analytics and enhanced viewership experience. However, existing trackers encounter significant difficulties in dealing with occlusions, blurs, and agile player movements prevalent in telecast feeds. In this work, we propose a novel tracking approach by formulating MOT as a bipartite graph matching problem infused with homography. We disentangle the positional representations of occluded and overlapping players in broadcast view, by mapping their foot keypoints to an overhead rink template, and encode these projected positions into the graph network. This ensures reliable spatial context for consistent player tracking and unfragmented tracklet prediction. Our results show considerable improvements in both the IDsw and IDF1 metrics on the two available broadcast ice hockey datasets.
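
The two ingredients named in this abstract, projecting foot keypoints through a homography and solving a bipartite matching, can be sketched in a few lines. This is an illustrative simplification and not the authors' graph-network method: the functions `project_point` and `match_players` are hypothetical, the 3x3 matrix `H` is a made-up example rather than a calibrated rink homography, and the matching is a brute-force minimum-distance assignment that assumes equal and small numbers of tracks and detections.

```python
import itertools
import math


def project_point(H, point):
    """Map an image point (x, y) to template coordinates with a 3x3 homography H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def match_players(track_positions, detection_positions):
    """Return (track_index, detection_index) pairs minimizing total distance.

    Brute force over all permutations, so only suitable for small inputs.
    """
    n = len(track_positions)
    if n != len(detection_positions):
        raise ValueError("this sketch assumes equal numbers of tracks and detections")
    best_cost, best_perm = math.inf, ()
    for perm in itertools.permutations(range(n)):
        cost = sum(
            math.dist(track_positions[i], detection_positions[j])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm))


if __name__ == "__main__":
    # Hypothetical homography: scale by 2 and shift x by 1 (no perspective term).
    H = [[2.0, 0.0, 1.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
    foot_keypoints = [(4.5, 4.5), (0.0, 0.5)]          # detections in image space
    detections = [project_point(H, p) for p in foot_keypoints]
    tracks = [(0.0, 0.0), (10.0, 10.0)]                # last known template positions
    print(match_players(tracks, detections))
```

Matching in template coordinates rather than image coordinates is the point of the projection: players who overlap in the broadcast view can still be well separated on the overhead rink.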

Submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted at the Conference on Robots and Vision (CRV), 2024

arXiv:2403.09063

Distribution and Depth-Aware Transformers for 3D Human Mesh Recovery

Authors: Jerrin Bright, Bavesh Balaji, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

Abstract: Precise Human Mesh Recovery (HMR) with in-the-wild data is a formidable challenge and is often hindered by depth ambiguities and reduced precision. Existing works resort to either pose priors or multi-modal data such as multi-view or point cloud information, though their methods often overlook the valuable scene-depth information inherently present in a single image. Moreover, achieving robust HMR for out-of-distribution (OOD) data is exceedingly challenging due to inherent variations in pose, shape and depth. Consequently, understanding the underlying distribution becomes a vital subproblem in modeling human forms. Motivated by the need for unambiguous and robust human modeling, we introduce Distribution and depth-aware human mesh recovery (D2A-HMR), an end-to-end transformer architecture meticulously designed to minimize the disparity between distributions and incorporate scene-depth leveraging prior depth information. Our approach demonstrates superior performance in handling OOD data in certain scenarios while consistently achieving competitive results against state-of-the-art HMR methods on controlled datasets.

Submitted 13 March, 2024; originally announced March 2024.

Comments: Submitted to 21st International Conference on Robots and Vision (CRV'24), Guelph, Ontario, Canada

arXiv:2309.06285

Jersey Number Recognition using Keyframe Identification from Low-Resolution Broadcast Videos

Authors: Bavesh Balaji, Jerrin Bright, Harish Prakash, Yuhao Chen, David A Clausi, John Zelek

Abstract: Player identification is a crucial component in vision-driven soccer analytics, enabling various downstream tasks such as player assessment, in-game analysis, and broadcast production. However, automatically detecting jersey numbers from player tracklets in videos presents challenges due to motion blur, low resolution, distortions, and occlusions. Existing methods, utilizing Spatial Transformer Networks, CNNs, and Vision Transformers, have shown success in image data but struggle with real-world video data, where jersey numbers are not visible in most of the frames. Hence, identifying frames that contain the jersey number is a key sub-problem to tackle. To address these issues, we propose a robust keyframe identification module that extracts frames containing essential high-level information about the jersey number. A spatio-temporal network is then employed to model spatial and temporal context and predict the probabilities of jersey numbers in the video. Additionally, we adopt a multi-task loss function to predict the probability distribution of each digit separately. Extensive evaluations on the SoccerNet dataset demonstrate that incorporating our proposed keyframe identification module results in a significant 37.81% and 37.70% increase in the accuracies of 2 different test sets with domain gaps. These results highlight the effectiveness and importance of our approach in tackling the challenges of automatic jersey number detection in sports videos.

Submitted 12 September, 2023; originally announced September 2023.

Comments: Accepted in the 6th International Workshop on Multimedia Content Analysis in Sports (MMSports'23) @ ACM Multimedia

arXiv:2001.10376

Advaita: Bug Duplicity Detection System

Authors: Amit Kumar, Manohar Madanu, Hari Prakash, Lalitha Jonnavithula, Srinivasa Rao Aravilli

Abstract: Bugs are prevalent in software development. To improve software quality, bugs are filed using a bug tracking system. Properties of a reported bug consist of a headline, description, project, product, the component affected by the bug, and the severity of the bug. The duplicate bug rate (% of duplicate bugs) ranges from single digits (1 to 9%) to double digits (40%) depending on the product maturity, size of the code and number of engineers working on the project. Duplicate bug rates are between 9% and 39% in some open source projects such as Eclipse and Firefox. Detection of duplicity deals with identifying whether any two bugs convey the same meaning. This detection of duplicates helps in de-duplication. Detecting duplicate bugs helps reduce triaging effort and saves time for developers in fixing the issues. Traditional natural language processing techniques are less accurate in identifying similarity between sentences. Using the bug data present in a bug tracking system, various approaches were explored, including several machine learning algorithms, to obtain a viable approach that can identify duplicate bugs, given a pair of sentences (i.e. the respective bug descriptions). This approach considers multiple sets of features, viz. basic text statistical features, semantic features and contextual features. These features are extracted from the headline, description and component and are subsequently used to train a classification algorithm.

Submitted 23 January, 2020; originally announced January 2020.

arXiv:1110.1220

Standard Quantum Teleportation of an Arbitrary N-Qubit State, Non-Existence of Magic Basis and Existence of Magic Partial Bases for 2N Entangled Qubit States with N>1

Authors: Hari Prakash, Vikram Verma

Abstract: We present a simple and precise protocol for standard quantum teleportation of an N-qubit state, considering the most general resource q-channel and Bell states. We find the condition on these states for perfect teleportation and give explicitly the unitary transformation required to be done by Bob for achieving perfect teleportation. We discuss the connection of our simple theory with the complicated related work on this subject and with character matrix, transformation, judgment and kernel operators defined in this context. We also prove that the magic basis discussed by Hill and Wootters [Phys. Rev. Lett. 78 (1997) 5022] does not exist for entangled 2N-qubit states with N > 1 but magic partial bases, similar to those discussed recently by Prakash and Maurya [Optics Commun. 284 (2011) 5024], do exist. We give explicitly all magic partial bases for N = 2.

Submitted 6 October, 2011; originally announced October 2011.

Showing 1–6 of 6 results for author: Prakash, H