Search | arXiv e-print repository

arXiv:2506.18129 [pdf, ps, other]

$φ^{\infty}$: Clause Purification, Embedding Realignment, and the Total Suppression of the Em Dash in Autoregressive Language Models

Authors: Bugra Kilictas, Faruk Alpay

Abstract: We identify a critical vulnerability in autoregressive transformer language models where the em dash token induces recursive semantic drift, leading to clause boundary hallucination and embedding space entanglement. Through formal analysis of token-level perturbations in semantic lattices, we demonstrate that em dash insertion fundamentally alters the model's latent representations, causing compounding errors in long-form generation. We propose a novel solution combining symbolic clause purification via the phi-infinity operator with targeted embedding matrix realignment. Our approach enables total suppression of problematic tokens without requiring model retraining, while preserving semantic coherence through fixed-point convergence guarantees. Experimental validation shows significant improvements in generation consistency and topic maintenance. This work establishes a general framework for identifying and mitigating token-level vulnerabilities in foundation models, with immediate implications for AI safety, model alignment, and robust deployment of large language models in production environments. The methodology extends beyond punctuation to address broader classes of recursive instabilities in neural text generation systems.

Submitted 22 June, 2025; originally announced June 2025.

Comments: 16 pages, 3 figures

MSC Class: 68T50; 68T45; 03B70 ACM Class: I.2.6; I.2.7; I.2.3; F.4.1

arXiv:2506.06870 [pdf, ps, other]

Recursive Semantic Anchoring in ISO 639:2023: A Structural Extension to ISO/TC 37 Frameworks

Authors: Bugra Kilictas, Faruk Alpay

Abstract: ISO 639:2023 unifies the ISO language-code family and introduces contextual metadata, but it lacks a machine-native mechanism for handling dialectal drift and creole mixtures. We propose a formalisation of recursive semantic anchoring, attaching to every language entity $χ$ a family of fixed-point operators $φ_{n,m}$ that model bounded semantic drift via the relation $φ_{n,m}(χ) = χ\oplus Δ(χ)$, where $Δ(χ)$ is a drift vector in a latent semantic manifold. The base anchor $φ_{0,0}$ recovers the canonical ISO 639:2023 identity, whereas $φ_{99,9}$ marks the maximal drift state that triggers a deterministic fallback. Using category theory, we treat the operators $φ_{n,m}$ as morphisms and drift vectors as arrows in a category $\mathrm{DriftLang}$. A functor $Φ: \mathrm{DriftLang} \to \mathrm{AnchorLang}$ maps every drifted object to its unique anchor and proves convergence. We provide an RDF/Turtle schema (\texttt{BaseLanguage}, \texttt{DriftedLanguage}, \texttt{ResolvedAnchor}) and worked examples -- e.g., $φ_{8,4}$ (Standard Mandarin) versus $φ_{8,7}$ (a colloquial variant), and $φ_{1,7}$ for Nigerian Pidgin anchored to English. Experiments with transformer models show higher accuracy in language identification and translation on noisy or code-switched input when the $φ$-indices are used to guide fallback routing. The framework is compatible with ISO/TC 37 and provides an AI-tractable, drift-aware semantic layer for future standards.

Submitted 7 June, 2025; originally announced June 2025.

Comments: 21 pages, no figures. Includes formal proofs, RDF/Turtle ontology schema, φ-index disambiguation cases, and evaluation of transformer-based AI models under semantic drift

MSC Class: 03B70; 18M05; 68T50 ACM Class: F.4.1; I.2.7

arXiv:2505.21038 [pdf, ps, other]

Fixed-Point Traps and Identity Emergence in Educational Feedback Systems

Authors: Faruk Alpay

Abstract: This paper presents a formal categorical proof that exam-driven educational systems obstruct identity emergence and block creative convergence. Using the framework of Alpay Algebra II and III, we define Exam-Grade Collapse Systems (EGCS) as functorial constructs where learning dynamics $\varphi$ are recursively collapsed by evaluative morphisms $E$. We prove that under such collapse regimes, no nontrivial fixed-point algebra $μ_\varphi$ can exist, hence learner identity cannot stabilize. This creates a universal fixed-point trap: all generative functors are entropically folded before symbolic emergence occurs. Our model mathematically explains the creativity suppression, research stagnation, and structural entropy loss induced by timed exams and grade-based feedback. The results apply category theory to expose why modern educational systems prevent φ-emergence and block observer-invariant self-formation. This work provides the first provable algebraic obstruction of identity formation caused by institutional feedback mechanics.

Submitted 27 May, 2025; originally announced May 2025.

Comments: 14 pages, no figures. Formal Bourbaki-style proof. Introduces Exam-Grade Collapse Systems. Builds on Alpay Algebra II (arXiv:2505.17480) and Alpay Algebra III (arXiv:2505.19790). Proves categorical fixed-point traps obstructing identity emergence under exam-driven feedback

MSC Class: 18A15; 18C15; 91D30; 97C70; 03B70; 68T01 ACM Class: F.4.1; I.2.0; K.3.2

arXiv:2505.19790 [pdf, ps, other]

Alpay Algebra III: Observer-Coupled Collapse and the Temporal Drift of Identity

Authors: Faruk Alpay

Abstract: This paper introduces a formal framework for modeling observer-dependent collapse dynamics and temporal identity drift within artificial and mathematical systems, grounded entirely in the symbolic foundations of Alpay Algebra. Building upon the fixed-point emergence structures developed in Alpay Algebra I and II, this third installment formalizes the observer-coupled φ-collapse process through transfinite categorical flows and curvature-driven identity operators. We define a novel temporal drift mechanism as a recursive deformation of identity signatures under entangled observer influence, constructing categorical invariants that evolve across fold iterations. The proposed system surpasses conventional identity modeling in explainable AI (XAI) by encoding internal transformation history into a symbolic fixed-point structure, offering provable traceability and temporal coherence. Applications range from AI self-awareness architectures to formal logic systems where identity is not static but dynamically induced by observation. The theoretical results also offer a mathematically rigorous basis for future AI systems with stable self-referential behavior, positioning Alpay Algebra as a next-generation symbolic framework bridging category theory, identity logic, and observer dynamics.

Submitted 26 May, 2025; originally announced May 2025.

Comments: 22 pages, 0 figures. Third paper in the Alpay Algebra series, following [arXiv:2505.15344] and [arXiv:2505.17480]. Introduces observer-coupled collapse and formalizes temporal identity drift using transfinite φ-recursion. Entirely symbolic and self-contained, with no reliance on external frameworks. Structured for submission under Math.CT, CS.LO, and CS.AI

MSC Class: 18C10; 03G30; 68T01; 03B70; 03D80 ACM Class: F.4.1; I.2.6; I.2.8

arXiv:2505.17480 [pdf, ps, other]

Alpay Algebra II: Identity as Fixed-Point Emergence in Categorical Data

Authors: Faruk Alpay

Abstract: In this second installment of the Alpay Algebra framework, I formally define identity as a fixed point that emerges through categorical recursion. Building upon the transfinite operator $\varphi^\infty$, I characterize identity as the universal solution to a self-referential functorial equation over a small cartesian closed category. I prove the existence and uniqueness of such identity-fixed-points via ordinal-indexed iteration, and interpret their convergence through internal categorical limits. Functors, adjunctions, and morphisms are reconstructed as dynamic traces of evolving states governed by $\varphi$, reframing identity not as a static label but as a stabilized process. Through formal theorems and symbolic flows, I show how these fixed points encode symbolic memory, recursive coherence, and semantic invariance. This paper positions identity as a mathematical structure that arises from within the logic of change itself: computable, convergent, and categorically intrinsic.

Submitted 23 May, 2025; originally announced May 2025.

Comments: 13 pages, no figures. Sequel to Alpay Algebra: A Universal Structural Foundation (arXiv:2505.15344). Defines identity as a categorical fixed point in the Alpay Algebra system. All content is self-contained

MSC Class: 18C10; 18D05; 03B70; 03G30 ACM Class: F.4.1; I.2.3; F.3.2; F.1.1

arXiv:2505.15344 [pdf, ps, other]

Alpay Algebra: A Universal Structural Foundation

Authors: Faruk Alpay

Abstract: Alpay Algebra is introduced as a universal, category-theoretic framework that unifies classical algebraic structures with modern needs in symbolic recursion and explainable AI. Starting from a minimal list of axioms, we model each algebra as an object in a small cartesian closed category $\mathcal{A}$ and define a transfinite evolution functor $φ\colon\mathcal{A}\to\mathcal{A}$. We prove that the fixed point $φ^{\infty}$ exists for every initial object and satisfies an internal universal property that recovers familiar constructs -- limits, colimits, adjunctions -- while extending them to ordinal-indexed folds. A sequence of theorems establishes (i) soundness and conservativity over standard universal algebra, (ii) convergence of $φ$-iterates under regular cardinals, and (iii) an explanatory correspondence between $φ^{\infty}$ and minimal sufficient statistics in information-theoretic AI models. We conclude by outlining computational applications: type-safe functional languages, categorical model checking, and signal-level reasoning engines that leverage Alpay Algebra's structural invariants. All proofs are self-contained; no external set-theoretic axioms beyond ZFC are required. This exposition positions Alpay Algebra as a bridge between foundational mathematics and high-impact AI systems, and provides a reference for further work in category theory, transfinite fixed-point analysis, and symbolic computation.

Submitted 21 May, 2025; originally announced May 2025.

Comments: 37 pages, 0 figures. Self-contained categorical framework built directly on Mac Lane and Bourbaki; minimal references are intentional to foreground the new construction

MSC Class: 18B99; 68T27 ACM Class: F.4.1; I.2.3

arXiv:2505.11927 [pdf, ps, other]

XiSort: Deterministic Sorting via IEEE-754 Total Ordering and Entropy Minimization

Authors: Faruk Alpay

Abstract: We introduce XiSort, a deterministic and reproducible sorting algorithm for floating-point sequences based on IEEE-754 total ordering and entropy minimization. XiSort guarantees bit-for-bit stability across runs and platforms by resolving tie-breaking via information-theoretic and symbolic methods. The algorithm supports both in-memory and external (out-of-core) operation, offering consistent performance on large datasets. We formalize a curved variant of the sorting metric that integrates into the Alpay Algebra framework, treating XiSort as a recursive operator with provable convergence and symbolic idempotence. This model preserves state-space closure while minimizing local disorder, interpretable as symbolic entropy. Empirical benchmarks demonstrate that XiSort achieves competitive throughput (e.g., sorting 10^8 doubles in approximately 12 seconds in-memory, and 100 GB at around 100 MB/s on SSDs), with applications in scientific computing, high-frequency finance, and reproducible numerical workflows. The results position XiSort as a principled tool for stable data alignment, symbolic preprocessing, and cross-platform float ordering. Keywords: deterministic sorting, IEEE-754, entropy minimization, symbolic algebra, reproducibility, external memory, Alpay Algebra, data pipelines

Submitted 17 May, 2025; originally announced May 2025.

Comments: 23 pages, 1 table. Source code: https://github.com/farukalpay/XiSort. Immutable archive: https://arweave.net/Lz8tBkiFyEsq6HjJ82UO8pq4p_fyfROKbQwEkAYrOKs. No prior conference submission

MSC Class: 68P10; 68Q25; 94A17; 65Y20 ACM Class: F.2.2; G.4; E.1; G.3

arXiv:2505.09898 [pdf, ps, other]

A Topological and Operator Algebraic Framework for Asynchronous Lattice Dynamical Systems

Authors: Faruk Alpay

Abstract: I introduce a novel mathematical framework integrating topological dynamics, operator algebras, and ergodic geometry to study lattices of asynchronous metric dynamical systems. Each node in the lattice carries an internal flow represented by a one-parameter family of operators, evolving on its own time scale. I formalize stratified state spaces capturing multiple levels of synchronized behavior, define an asynchronous evolution metric that quantifies phase-offset distances between subsystems, and characterize emergent coherent topologies arising when subsystems synchronize. Within this framework, I develop formal operators for the evolution of each subsystem and give precise conditions under which phase-aligned synchronization occurs across the lattice. The main results include: (1) the existence and uniqueness of coherent (synchronized) states under a contractive coupling condition, (2) stability of these coherent states and criteria for their emergence as a collective phase transition in a continuous operator topology, and (3) the influence of symmetries, with group-invariant coupling leading to flow-invariant synchrony subspaces and structured cluster dynamics. Proofs are given for each theorem, demonstrating full mathematical rigor. In a final section, I discuss hypothetical applications of this framework to symbolic lattice systems (e.g. subshifts), to invariant group actions on dynamical lattices, and to operator fields over stratified manifolds in the spirit of noncommutative geometry.
Throughout, I write in the first person to emphasize the exploratory nature of this work. The paper avoids any reference to cosmology or observers, focusing instead on clean, formal mathematics suitable for a broad array of dynamical systems.

Submitted 14 May, 2025; originally announced May 2025.

Comments: 9 pages, 0 figures

MSC Class: 37C75; 46L55; 37B10; 47L90 ACM Class: F.1.1; G.2.2; G.1.6

arXiv:2505.09239 [pdf, ps, other]

Stable and Convexified Information Bottleneck Optimization via Symbolic Continuation and Entropy-Regularized Trajectories

Authors: Faruk Alpay

Abstract: The Information Bottleneck (IB) method frequently suffers from unstable optimization, characterized by abrupt representation shifts near critical points of the IB trade-off parameter, beta. In this paper, I introduce a novel approach to achieve stable and convex IB optimization through symbolic continuation and entropy-regularized trajectories. I analytically prove convexity and uniqueness of the IB solution path when an entropy regularization term is included, and demonstrate how this stabilizes representation learning across a wide range of beta values. Additionally, I provide extensive sensitivity analyses around critical points (beta) with statistically robust uncertainty quantification (95% confidence intervals). The open-source implementation, experimental results, and reproducibility framework included in this work offer a clear path for practical deployment and future extension of my proposed method.

Submitted 14 May, 2025; originally announced May 2025.

Comments: 23 pages, 11 figures, includes analytical proofs, sensitivity analysis (95% CI), and JAX-based open-source implementation available at: https://github.com/farukalpay/information-bottleneck-beta-optimization

MSC Class: 68T05; 90C25; 94A15 ACM Class: I.2.6; G.1.6; H.1.1

arXiv:1808.07143 [pdf]

doi 10.1103/PhysRevB.99.014101

Polarization rotation in Bi$_{\mathbf{4}}$Ti$_{\mathbf{3}}$O$_{\mathbf{12}}$ by isovalent doping at the fluorite sublattice

Authors: Kevin Co, Fu-Chang Sun, S. Pamir Alpay, Sanjeev K. Nayak

Abstract: Bismuth titanate, Bi$_4$Ti$_3$O$_{12}$ (BiT), is a complex layered ferroelectric material that is composed of three perovskite-like units and one fluorite-like unit stacked alternatively along the $c$-direction. The ground state crystal structure is monoclinic with the spontaneous polarization (~50 $μ$C/cm$^{2}$) along the in-plane $b$-direction. BiT typically grows along the $c$-direction in thin film form and having the polarization vector aligned with the growth orientation can be beneficial for several potential device applications. It is well known that judicious doping of ferroelectrics is an effective method in adjusting the magnitude and the orientation of the spontaneous polarization. Here, we show using first-principles density functional theory and a detailed phonon analysis that Bi atoms in the fluorite-like layers have significantly more impact on the magnitude and orientation of the spontaneous polarization vector as compared to the perovskite-like layer. The low energy hard phonon modes are characterized by fluorite-like layers experiencing transverse displacements and large changes in Born effective charges on Bi atoms. Thus, the breaking of symmetry caused by doping of Bi sites within the fluorite-like layer leads to the formation of uncancelled permanent dipole moments along the $c$-direction. This provides an opportunity for doping the Bi site in the fluorite-like layer. Isovalent dopants P, As, and Sb were studied. P is found to be most effective in the reorientation of the spontaneous polarization.
It leads to a three-fold enhancement of the $c$-component of polarization and to a commensurate rotation of the spontaneous polarization vector by 36.2$^{\circ}$ towards the $c$-direction.

Submitted 21 August, 2018; originally announced August 2018.

Comments: 22 pages, 6 figures, 4 tables

Journal ref: Phys. Rev. B 99, 014101 (2019)

Showing 1–10 of 10 results for author: Alpay, F