-
N$^2$: A Unified Python Package and Test Bench for Nearest Neighbor-Based Matrix Completion
Authors:
Caleb Chin,
Aashish Khubchandani,
Harshvardhan Maskara,
Kyuseong Choi,
Jacob Feitelberg,
Albert Gong,
Manit Paul,
Tathagata Sadhukhan,
Anish Agarwal,
Raaz Dwivedi
Abstract:
Nearest neighbor (NN) methods have re-emerged as competitive tools for matrix completion, offering strong empirical performance and recent theoretical guarantees, including entry-wise error bounds, confidence intervals, and minimax optimality. Despite their simplicity, recent work has shown that NN approaches are robust to a range of missingness patterns and effective across diverse applications. This paper introduces N$^2$, a unified Python package and testbed that consolidates a broad class of NN-based methods through a modular, extensible interface. Built for both researchers and practitioners, N$^2$ supports rapid experimentation and benchmarking. Using this framework, we introduce a new NN variant that achieves state-of-the-art results in several settings. We also release a benchmark suite of real-world datasets, from healthcare and recommender systems to causal inference and LLM evaluation, designed to stress-test matrix completion methods beyond synthetic scenarios. Our experiments demonstrate that while classical methods excel on idealized data, NN-based techniques consistently outperform them in real-world settings.
Submitted 4 June, 2025;
originally announced June 2025.
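To make the idea of nearest-neighbor matrix completion concrete, the following is a minimal row-wise sketch in plain Python. It is not the N$^2$ package's interface: the function name `nn_impute`, the mean-squared distance over commonly observed columns, and the toy `ratings` matrix are all assumptions chosen for illustration.

```python
def nn_impute(matrix, row, col, k=2):
    """Estimate matrix[row][col] by averaging column `col` over the k rows
    nearest to `row`; distance uses only columns observed in both rows."""
    candidates = []
    for other, values in enumerate(matrix):
        # A neighbor must be a different row with the target entry observed.
        if other == row or values[col] is None:
            continue
        shared = [j for j in range(len(values))
                  if j != col
                  and values[j] is not None
                  and matrix[row][j] is not None]
        if not shared:
            continue
        # Mean squared difference over the commonly observed columns.
        dist = sum((matrix[row][j] - values[j]) ** 2 for j in shared) / len(shared)
        candidates.append((dist, other))
    if not candidates:
        return None  # no usable neighbor
    candidates.sort()
    nearest = candidates[:k]
    return sum(matrix[i][col] for _, i in nearest) / len(nearest)


# Hypothetical 4 x 3 ratings matrix; None marks a missing entry.
ratings = [
    [5.0, 4.0, None],
    [5.0, 4.0, 2.0],
    [4.0, 5.0, 3.0],
    [1.0, 1.0, 5.0],
]
estimate = nn_impute(ratings, row=0, col=2, k=2)
print(estimate)
```

Here rows 1 and 2 are the two closest to row 0 on the shared columns, so the estimate is the average of their entries in the target column.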
-
Do Looks Matter? Exploring Functional and Aesthetic Design Preferences for a Robotic Guide Dog
Authors:
Aviv L. Cohav,
A. Xinran Gong,
J. Taery Kim,
Clint Zeagler,
Sehoon Ha,
Bruce N. Walker
Abstract:
Dog guides offer an effective mobility solution for blind or visually impaired (BVI) individuals, but conventional dog guides have limitations including the need for care, potential distractions, societal prejudice, high costs, and limited availability. To address these challenges, we seek to develop a robot dog guide capable of performing the tasks of a conventional dog guide, enhanced with additional features. In this work, we focus on design research to identify functional and aesthetic design concepts to implement into a quadrupedal robot. The aesthetic design remains relevant even for BVI users due to their sensitivity toward societal perceptions and the need for smooth integration into society. We collected data through interviews and surveys to answer specific design questions pertaining to the appearance, texture, features, and method of controlling and communicating with the robot. Our study identified essential and preferred features for a future robot dog guide, which are supported by relevant statistics aligning with each suggestion. These findings will inform the future development of user-centered designs to effectively meet the needs of BVI individuals.
Submitted 18 February, 2025;
originally announced March 2025.
-
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation
Authors:
Albert Gong,
Kamilė Stankevičiūtė,
Chao Wan,
Anmol Kabra,
Raphael Thesmar,
Johann Lee,
Julius Klenke,
Carla P. Gomes,
Kilian Q. Weinberger
Abstract:
High-quality benchmarks are essential for evaluating reasoning and retrieval capabilities of large language models (LLMs). However, curating datasets for this purpose is not a permanent solution as they are prone to data leakage and inflated performance results. To address these challenges, we propose PhantomWiki: a pipeline to generate unique, factually consistent document corpora with diverse question-answer pairs. Unlike prior work, PhantomWiki is neither a fixed dataset, nor is it based on any existing data. Instead, a new PhantomWiki instance is generated on demand for each evaluation. We vary the question difficulty and corpus size to disentangle reasoning and retrieval capabilities respectively, and find that PhantomWiki datasets are surprisingly challenging for frontier LLMs. Thus, we contribute a scalable and data leakage-resistant framework for disentangled evaluation of reasoning, retrieval, and tool-use abilities. Our code is available at https://github.com/kilian-group/phantom-wiki.
Submitted 9 June, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation
Authors:
Mingfu Liang,
Xi Liu,
Rong Jin,
Boyang Liu,
Qiuling Suo,
Qinghai Zhou,
Song Zhou,
Laming Chen,
Hua Zheng,
Zhiyuan Li,
Shali Jiang,
Jiyan Yang,
Xiaozhen Xia,
Fan Yang,
Yasmine Badr,
Ellie Wen,
Shuyu Xu,
Hansey Chen,
Zhengyu Zhang,
Jade Nie,
Chunzhi Yang,
Zhichen Zeng,
Weilin Zhang,
Xingliang Huang,
Qianru Li
, et al. (80 additional authors not shown)
Abstract:
Ads recommendation is a prominent service of online advertising systems and has been actively studied. Recent studies indicate that scaling up the recommendation model and using more advanced designs can bring significant performance improvements. However, as model scale grows, such studies diverge increasingly from industrial practice because they often neglect two fundamental challenges in industrial-scale applications. First, training and inference budgets are restricted for the model to be served; exceeding them may incur latency and impair user experience. Second, large volumes of data arrive in a streaming mode with dynamically shifting data distributions, as new users/ads join and existing users/ads leave the system. We propose the External Large Foundation Model (ExFM) framework to address these overlooked challenges. Specifically, we develop external distillation and a data augmentation system (DAS) to control the computational cost of training/inference while maintaining high performance. We design the teacher as a foundation model (FM) that can serve multiple students as vertical models (VMs) to amortize its building cost. We propose Auxiliary Head and Student Adapter to mitigate the data distribution gap between the FM and VMs caused by the streaming data issue. Comprehensive experiments on internal industrial-scale applications and public datasets demonstrate significant performance gains from ExFM.
Submitted 23 April, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
Low-Rank Thinning
Authors:
Annabelle Michael Carrell,
Albert Gong,
Abhishek Shetty,
Raaz Dwivedi,
Lester Mackey
Abstract:
The goal in thinning is to summarize a dataset using a small set of representative points. Remarkably, sub-Gaussian thinning algorithms like Kernel Halving and Compress can match the quality of uniform subsampling while substantially reducing the number of summary points. However, existing guarantees cover only a restricted range of distributions and kernel-based quality measures and suffer from pessimistic dimension dependence. To address these deficiencies, we introduce a new low-rank analysis of sub-Gaussian thinning that applies to any distribution and any kernel, guaranteeing high-quality compression whenever the kernel or data matrix is approximately low-rank. To demonstrate the broad applicability of the techniques, we design practical sub-Gaussian thinning approaches that improve upon the best known guarantees for approximating attention in transformers, accelerating stochastic gradient training through reordering, and distinguishing distributions in near-linear time.
Submitted 25 April, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Computation with quantum Reed-Muller codes and their mapping onto 2D atom arrays
Authors:
Anqi Gong,
Joseph M. Renes
Abstract:
We give a fault-tolerant construction for error correction and computation using two punctured quantum Reed-Muller (PQRM) codes. In particular, we consider the $[[127,1,15]]$ self-dual doubly-even code that has transversal Clifford gates (CNOT, H, S) and the triply-even $[[127,1,7]]$ code that has transversal T and CNOT gates. We show that code switching between these codes can be accomplished using Steane error correction. For fault-tolerant ancilla preparation we utilize the low-depth hypercube encoding circuit along with different code automorphism permutations in different ancilla blocks, while decoding is handled by the high-performance classical successive cancellation list decoder. In this way, every logical operation in this universal gate set is amenable to extended rectangle analysis. The CNOT exRec has a failure rate approaching $10^{-9}$ at $10^{-3}$ circuit-level depolarizing noise.
Furthermore, we map the PQRM codes to a 2D layout suitable for implementation in arrays of trapped atoms and try to reduce the circuit depth of parallel atom movements in state preparation. The resulting protocol is strictly fault-tolerant for the $[[127,1,7]]$ code and practically fault-tolerant for the $[[127,1,15]]$ code. Moreover, each patch requires a permutation consisting of $7$ sub-hypercube swaps only. These are swaps of rectangular grids in our 2D hypercube layout and can be naturally created with acousto-optic deflectors (AODs).
Lastly, we show for the family of $[[2^{2r},{2r\choose r},2^r]]$ QRM codes that the entire logical Clifford group can be achieved using only permutations, transversal gates, and fold-transversal gates.
Submitted 30 October, 2024;
originally announced October 2024.
-
Supervised Kernel Thinning
Authors:
Albert Gong,
Kyuseong Choi,
Raaz Dwivedi
Abstract:
The kernel thinning algorithm of Dwivedi & Mackey (2024) provides a better-than-i.i.d. compression of a generic set of points. By generating high-fidelity coresets of size significantly smaller than the input points, KT is known to speed up unsupervised tasks like Monte Carlo integration, uncertainty quantification, and non-parametric hypothesis testing, with minimal loss in statistical accuracy. In this work, we generalize the KT algorithm to speed up supervised learning problems involving kernel methods. Specifically, we combine two classical algorithms--Nadaraya-Watson (NW) regression or kernel smoothing, and kernel ridge regression (KRR)--with KT to provide a quadratic speed-up in both training and inference times. We show how distribution compression with KT in each setting reduces to constructing an appropriate kernel, and introduce the Kernel-Thinned NW and Kernel-Thinned KRR estimators. We prove that KT-based regression estimators enjoy significantly superior computational efficiency over the full-data estimators and improved statistical efficiency over i.i.d. subsampling of the training data. En route, we also provide a novel multiplicative error guarantee for compressing with KT. We validate our design choices with both simulations and real data experiments.
Submitted 15 January, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
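As a rough illustration of where the speed-up comes from, the sketch below implements a plain Nadaraya-Watson (NW) estimator in Python. It does not implement kernel thinning: the Gaussian kernel, the `bandwidth` parameter, and the function name `nw_predict` are assumptions, and the point is only that each prediction costs time proportional to the number of points passed in, so evaluating on a small coreset instead of the full training set reduces inference cost accordingly.

```python
import math


def nw_predict(x, xs, ys, bandwidth=1.0):
    """Nadaraya-Watson estimate at x: a kernel-weighted average of ys.

    Cost is linear in len(xs), so passing a compressed subset (however it
    was obtained) instead of the full data shrinks the per-query cost.
    """
    # Gaussian kernel weights (an assumed choice of kernel).
    weights = [math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2)) for xi in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


# Two training points placed symmetrically around the query x = 0.
prediction = nw_predict(0.0, xs=[-1.0, 1.0], ys=[0.0, 2.0])
print(prediction)
```

With equal weights on the two points, the prediction is simply the mean of their responses.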
-
Sum of Consecutive Terms of Pell and Related Sequences
Authors:
Navvye Anand,
Amit Kumar Basistha,
Kenny B. Davenport,
Alexander Gong,
Florian Luca,
Steven J. Miller,
Alexander Zhu
Abstract:
We study new identities related to the sums of adjacent terms in the Pell sequence, defined by $P_{n} := 2P_{n-1}+P_{n-2}$ for $ n\geq 2$ and $P_{0}=0, P_{1}=1$, and generalize these identities for many similar sequences. We prove that the sum of $N>1$ consecutive Pell numbers is a fixed integer multiple of another Pell number if and only if $4\mid N$. We consider the generalized Pell $(k,i)$-numbers defined by $p(n) :=\ 2p(n-1)+p(n-k-1) $ for $n\geq k+1$, with $p(0)=p(1)=\cdots =p(i)=0$ and $p(i+1)=\cdots = p(k)=1$ for $0\leq i\leq k-1$, and prove that the sum of $N=2k+2$ consecutive terms is a fixed integer multiple of another term in the sequence. We also prove that for the generalized Pell $(k,k-1)$-numbers such a relation does not exist when $N$ and $k$ are odd. We give analogous results for the Fibonacci and other related second-order recursive sequences.
Submitted 14 January, 2025; v1 submitted 13 July, 2024;
originally announced July 2024.
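The smallest case of the $4\mid N$ result can be checked numerically. Applying the recurrence twice gives $P_n + P_{n+1} + P_{n+2} + P_{n+3} = 4P_{n+2}$; this specific form is derived here from the definition rather than quoted from the paper, and the helper name `pell` is illustrative.

```python
def pell(count):
    """Return the first `count` Pell numbers: P_0 = 0, P_1 = 1, P_n = 2 P_{n-1} + P_{n-2}."""
    seq = [0, 1]
    while len(seq) < count:
        seq.append(2 * seq[-1] + seq[-2])
    return seq[:count]


P = pell(40)

# Sums of N = 4 consecutive Pell numbers starting at index n.
window_sums = [sum(P[n:n + 4]) for n in range(30)]

# Each such sum should equal 4 * P_{n+2}.
holds = all(window_sums[n] == 4 * P[n + 2] for n in range(30))
print(window_sums[:4], holds)
```

Python integers are arbitrary precision, so the check is exact for every window tested.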
-
A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation
Authors:
Aicheng Gong,
Kai Yang,
Jiafei Lyu,
Xiu Li
Abstract:
Task allocation is a key combinatorial optimization problem, crucial for modern applications such as multi-robot cooperation and resource scheduling. Decision makers must allocate entities to tasks reasonably across different scenarios. However, traditional methods assume static attributes and numbers of tasks and entities, often relying on dynamic programming and heuristic algorithms for solutions. In reality, task allocation resembles Markov decision processes, with dynamically changing task and entity attributes. Thus, algorithms must dynamically allocate tasks based on their states. To address this issue, we propose a two-stage task allocation algorithm based on similarity, utilizing reinforcement learning to learn allocation strategies. The proposed pre-assign strategy allows entities to preselect appropriate tasks, effectively avoiding local optima and thereby better finding the optimal allocation. We also introduce an attention mechanism and a hyperparameter network structure to adapt to the changing number and attributes of entities and tasks, enabling our network structure to generalize to new tasks. Experimental results across multiple environments demonstrate that our algorithm effectively addresses the challenges of dynamic task allocation in practical applications. Compared to heuristic algorithms like genetic algorithms, our reinforcement learning approach better solves dynamic allocation problems and achieves zero-shot generalization to new tasks with good performance. The code is available at https://github.com/yk7333/TaskAllocation.
Submitted 29 June, 2024;
originally announced July 2024.
-
Age-Gain-Dependent Random Access for Event-Driven Periodic Updating
Authors:
Yuqing Zhu,
Yiwen Zhu,
Aoyu Gong,
Yan Lin,
Yuan-Hsuan Lo,
Yijin Zhang
Abstract:
This paper is the first to consider utilizing the knowledge of age gains to reduce the network average age of information (AoI) in random access with event-driven periodic updating. Built on the form of slotted ALOHA, we require each device to determine its age gain threshold and transmission probability in an easily implementable decentralized manner, so that the unavoidable contention can be limited to devices with age gains as high as possible. For the basic case in which each device utilizes only the knowledge of its own age gain, we provide an analytical modeling approach based on multi-layer discrete-time Markov chains (DTMCs), where an external infinite-horizon DTMC manages the jumps between the beginnings of frames and an internal finite-horizon DTMC manages the evolution during an arbitrary frame. Such modeling allows optimal access parameters to be obtained offline. For the enhanced case in which each device utilizes its knowledge of the age gains of all devices, we require each device to adjust its access parameters to maximize the estimated network \textit{expected AoI reduction} (EAR) per slot, which captures the essence of improving the contribution of the throughput to the AoI performance. To estimate the network EAR, we require each device to use Bayes' rule to maintain an a posteriori joint probability distribution of the local age and age gain of an arbitrary device based on the channel observations. Numerical results validate our theoretical analysis and demonstrate the advantage of the proposed schemes over existing schemes in a wide range of network configurations.
Submitted 27 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
Toward Low-latency Iterative Decoding of QLDPC Codes Under Circuit-Level Noise
Authors:
Anqi Gong,
Sebastian Cammerer,
Joseph M. Renes
Abstract:
We introduce a sliding window decoder based on belief propagation (BP) with guided decimation for the purposes of decoding quantum low-density parity-check codes in the presence of circuit-level noise. Windowed decoding keeps the decoding complexity reasonable when, as is typically the case, repeated rounds of syndrome extraction are required to decode. Within each window, we employ several rounds of BP with decimation of the variable node that we expect to be the most likely to flip in each round. Furthermore, we employ ensemble decoding to keep both decimation options (guesses) open in a small number of chosen rounds. We term the resulting decoder BP with guided decimation guessing (GDG). Applied to bivariate bicycle codes, GDG achieves a logical error rate similar to that of BP with an additional OSD post-processing stage (BP+OSD) and combination-sweep of order 10. For a window size of three syndrome cycles, a multi-threaded CPU implementation of GDG achieves a worst-case decoding latency of 3 ms per window for the [[144,12,12]] code.
Submitted 27 March, 2024;
originally announced March 2024.
-
2D isotropic negative permeability in a Λ-type three-level atomic system
Authors:
Shuang-Ying Zhang,
Shun-Cai Zhao,
Ai-Ling Gong
Abstract:
An approach for two-dimensional (2D) negative permeability in a $Λ$-type three-level atomic system interacting with a probe magnetic field and the superposition of two orthogonal standing-wave fields is proposed. Through theoretical analysis and numerical simulation, two equal and tunable peak maxima of the negative magnetic response are observed in the x-y plane, and around the peak maxima the negative permeability is isotropic. A new avenue to 2D isotropic negative
Submitted 21 March, 2024; v1 submitted 17 March, 2024;
originally announced March 2024.
-
Effect of Spontaneously Generated Coherence and Detuning on 2D Atom Localization in Two Orthogonal Standing-Wave Fields
Authors:
Shun-Cai Zhao,
Qi-Xuan Wu,
Ai-Ling Gong
Abstract:
Two-dimensional (2D) atom localization via spontaneously generated coherence (SGC) and the detunings associated with the probe and standing-wave driving fields in a three-level V-type atomic system is investigated. In the gain process, two equal peak maxima of the position distribution in the plane, tunable via the detunings, are observed. In the absorption process, however, SGC yields one decreasing and one increasing peak maximum in quadrants I and III of the x-y plane. Our scheme thus offers better resolution and a novel route to 2D atom localization.
Submitted 17 March, 2024;
originally announced March 2024.
-
Algebraic analysis of electromagnetic chirality-induced negative refractive index in a four-level atomic system
Authors:
Shun-Cai Zhao,
Qi-Xuan Wu,
Ai-Ling Gong
Abstract:
This paper presents an algebraic analysis of the electromagnetic chirality-induced negative refractive index in a four-level atomic medium. By mathematically analyzing the argument of the complex refractive index for one circular polarization, we find that a negative refractive index without simultaneously negative permittivity and permeability can be obtained when the argument lies in the second quadrant of the Cartesian coordinate system, and that coupling the probe field to two equal transition frequencies in the atomic level structure is not required. This relaxes the stringent conditions for achieving a negative refractive index in quantum optics. As an application, our scheme may offer a novel approach to obtaining a negative refractive index through electromagnetically induced chirality.
Submitted 13 February, 2024;
originally announced February 2024.
-
Graph Neural Networks for Enhanced Decoding of Quantum LDPC Codes
Authors:
Anqi Gong,
Sebastian Cammerer,
Joseph M. Renes
Abstract:
In this work, we propose a fully differentiable iterative decoder for quantum low-density parity-check (LDPC) codes. The proposed algorithm is composed of classical belief propagation (BP) decoding stages and intermediate graph neural network (GNN) layers. Both component decoders are defined over the same sparse decoding graph, enabling seamless integration and scalability to large codes. The core idea is to use the GNN component between consecutive BP runs, so that the knowledge from the previous BP run, if stuck in a local minimum caused by trapping sets or short cycles in the decoding graph, can be leveraged to better initialize the next BP run. By doing so, the proposed decoder can learn to compensate for sub-optimal BP decoding graphs that result from the design constraints of quantum LDPC codes. Since the entire decoder remains differentiable, gradient descent-based training is possible. We compare the error rate performance of the proposed decoder against various post-processing methods such as random perturbation, enhanced feedback, augmentation, and ordered-statistics decoding (OSD) and show that a carefully designed training process lowers the error floor significantly. As a result, our proposed decoder outperforms the former three methods using significantly fewer post-processing attempts. The source code of our experiments is available online.
Submitted 6 November, 2023; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Age-of-Information Dependent Random Access for Periodic Updating
Authors:
Yuqing Zhu,
Yiwen Zhu,
Aoyu Gong,
Yan Lin,
Yijin Zhang
Abstract:
This paper considers an uplink Internet of Things system with synchronous periodic traffic, where multiple devices generate their status updates at the beginning of each global frame and attempt to send them to a common access point. To achieve a low network-wide age of information (AoI) in an easily implementable manner, we require each device to adopt an age-dependent random access protocol, i.e., to transmit with a certain probability only when its corresponding AoI reaches a certain threshold. We analyze the time-average expected AoI by a multi-layer Markov model where an external infinite-horizon Markov chain manages the jumps between the beginnings of frames, while two internal finite-horizon Markov chains manage the evolution during an arbitrary frame for different cases. Simulation results verify the accuracy of the modeling and the AoI advantage over age-independent schemes.
Submitted 6 July, 2023;
originally announced July 2023.
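The age-dependent access rule (transmit with a certain probability only once the AoI reaches a threshold) can be sketched as a toy slotted simulation. This is not the paper's Markov model: the collision channel in which a slot succeeds only when exactly one device transmits, the initial age of 1, and the names `simulate`, `threshold`, and `prob` are all assumptions made for illustration.

```python
import random


def simulate(num_devices, frame_len, num_frames, threshold, prob, seed=0):
    """Return the time-average AoI per device under a threshold-based
    random access rule on an assumed collision channel."""
    rng = random.Random(seed)
    aoi = [1] * num_devices  # age seen at the access point, per device
    total = 0
    for _ in range(num_frames):
        # Every device generates a fresh update at the start of the frame.
        pending = [True] * num_devices
        for slot in range(frame_len):
            # A device contends only if it still holds an update and its
            # AoI has reached the threshold; then it transmits w.p. `prob`.
            senders = [d for d in range(num_devices)
                       if pending[d] and aoi[d] >= threshold
                       and rng.random() < prob]
            for d in range(num_devices):
                aoi[d] += 1
            if len(senders) == 1:  # success only without a collision
                d = senders[0]
                aoi[d] = slot + 1  # age of the update generated at frame start
                pending[d] = False
            total += sum(aoi)
    return total / (num_devices * frame_len * num_frames)


# One device that always transmits delivers in the first slot of each frame.
single = simulate(num_devices=1, frame_len=4, num_frames=10, threshold=0, prob=1.0)
# Two devices that always transmit collide in every slot and never deliver.
collide = simulate(num_devices=2, frame_len=4, num_frames=10, threshold=0, prob=1.0)
print(single, collide)
```

Sweeping `threshold` and `prob` in such a simulation gives a quick, informal feel for the trade-off between contention and freshness that the abstract describes.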
-
Kinematic space for quantum extremal surface
Authors:
An Gong,
Chong-Bin Chen,
Fu-Wen Shu
Abstract:
This paper investigates the entanglement entropy inequality and explores the presentation of mutual information and conditional mutual information in kinematic space. Specifically, we examine the regions within kinematic space responsible for computing these physical quantities, enabling a more intuitive understanding of the entanglement entropy inequality. Building upon this, we employ the concept of double holography to analyze the properties of the entanglement inequality in any given region. By utilizing kinematic space, we calculate the contribution of the bulk to the holographic entanglement entropy in double holography. In conclusion, we establish that kinematic space substantiates a conjecture, namely that the entanglement entropy of an entire region can be expressed as a linear combination of the entanglement entropy of a single interval within the entangled region.
Submitted 25 May, 2023;
originally announced May 2023.
-
Deadline-Constrained Opportunistic Spectrum Access With Spectrum Handoff
Authors:
Zhaolong Xue,
Aoyu Gong,
Yuan-Hsun Lo,
Sirui Tian,
Yijin Zhang
Abstract:
This paper considers designing an optimal policy for deadline-constrained access in cognitive radio networks, where a secondary user needs to complete a packet transmission over the vacant spectrum within a delivery deadline. To minimize the total access cost, it is desirable to design an optimal opportunistic access policy by utilizing channel dynamics and sensing outcomes. We take non-negligible switching overheads, a state-dependent overtime penalty, and practical switching operations into consideration in the Markov decision process formulation of such an access problem under wide-band sensing. Moreover, we establish the existence of monotone optimal decision rules to reduce the complexity of computing an optimal policy. Simulation results verify our theoretical studies and the cost advantage over other policies.
Submitted 21 May, 2023;
originally announced May 2023.
-
How to out-perform default random forest regression: choosing hyperparameters for applications in large-sample hydrology
Authors:
Divya K. Bilolikar,
Aishwarya More,
Aella Gong,
Joseph Janssen
Abstract:
Predictions are a central part of water resources research. Historically, physically-based models have been preferred; however, they have largely failed at modeling hydrological processes at a catchment scale and there are some important prediction problems that cannot be modeled physically. As such, machine learning (ML) models have been seen as a valid alternative in recent years. In spite of their availability, well-optimized state-of-the-art ML strategies are not being widely used in water resources research. This is because using state-of-the-art ML models and optimizing hyperparameters requires expert mathematical and statistical knowledge. Further, some analyses require many model trainings, so sometimes even expert statisticians cannot properly optimize hyperparameters. To leverage data and use it effectively to drive scientific advances in the field, it is essential to make ML models accessible to subject matter experts by improving automated machine learning resources. ML models such as XGBoost have been recently shown to outperform random forest (RF) models which are traditionally used in water resources research. In this study, based on over 150 water-related datasets, we extensively compare XGBoost and RF. This study provides water scientists with access to quick user-friendly RF and XGBoost model optimization.
Submitted 11 May, 2023;
originally announced May 2023.
-
Improved Logical Error Rate via List Decoding of Quantum Polar Codes
Authors:
Anqi Gong,
Joseph M. Renes
Abstract:
The successive cancellation list decoder (SCL) is an efficient decoder for classical polar codes with low decoding error, approximating the maximum likelihood decoder (MLD) for small list sizes. Here we adapt the SCL to the task of decoding quantum polar codes and show that it inherits the high performance and low complexity of the classical case, and can approximate the quantum MLD for certain channels. We apply SCL decoding to a novel version of quantum polar codes based on the polarization weight (PW) method, which entirely avoids the need for small amounts of entanglement assistance apparent in previous quantum polar code constructions. When used to find the precise error pattern, the quantum SCL decoder (SCL-E) shows competitive performance with surface codes of similar size and low-density parity check codes of similar size and rate. The SCL decoder may instead be used to approximate the probability of each equivalence class of errors, and then choose the most likely class. We benchmark this class-oriented decoder (SCL-C) against the SCL-E decoder and find a noticeable improvement in the logical error rate. This improvement stems from the fact that the contributions from just the low-weight errors give a reasonable approximation to the error class probabilities. Both SCL-E and SCL-C maintain the complexity $O(LN \log N)$ of SCL for code size $N$ and list size $L$. We also show that the list decoder can be used to gain insight into the weight distribution of the codes and how this impacts the effect of degenerate errors.
Submitted 10 April, 2023;
originally announced April 2023.
-
State Advantage Weighting for Offline RL
Authors:
Jiafei Lyu,
Aicheng Gong,
Le Wan,
Zongqing Lu,
Xiu Li
Abstract:
We present state advantage weighting for offline reinforcement learning (RL). In contrast to action advantage $A(s,a)$ that we commonly adopt in QSA learning, we leverage state advantage $A(s,s^\prime)$ and QSS learning for offline RL, hence decoupling the action from values. We expect the agent can get to the high-reward state and the action is determined by how the agent can get to that corresponding state. Experiments on D4RL datasets show that our proposed method can achieve remarkable performance against the common baselines. Furthermore, our method shows good generalization capability when transferring from offline to online.
Submitted 8 November, 2022; v1 submitted 9 October, 2022;
originally announced October 2022.
-
Optimizing Age of Information in Wireless Uplink Networks with Partial Observations
Authors:
Jingwei Liu,
Rui Zhang,
Aoyu Gong,
He Chen
Abstract:
We consider a wireless uplink network consisting of multiple end devices and an access point (AP). Each device monitors a physical process with stochastic arrival of status updates and sends these updates to the AP over a shared channel. The AP aims to schedule the transmissions of these devices to optimize the network-wide information freshness, quantified by the Age of Information (AoI) metric. Due to the stochastic arrival of the status updates at the devices, the AP only has partial observations of system times of the latest status updates at the devices when making scheduling decisions. We formulate such a decision-making problem as a belief Markov Decision Process (belief-MDP). The belief-MDP in its original form is difficult to solve as the dimension of its states can go to infinity and its belief space is uncountable. By leveraging the properties of the status update arrival (i.e., Bernoulli) processes, we manage to simplify the feasible states of the belief-MDP to two-dimensional vectors. Built on that, we devise a low-complexity scheduling policy. We derive upper bounds for the AoI performance of the low-complexity policy and analyze the performance guarantee by comparing its performance with a universal lower bound. Numerical results validate our analyses.
Submitted 26 June, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Bounds for Treatment Effects in the Presence of Anticipatory Behavior
Authors:
Aibo Gong
Abstract:
In program evaluations, units can often anticipate the implementation of a new policy before it occurs. Such anticipatory behavior can lead to units' outcomes becoming dependent on their future treatment assignments. In this paper, I employ a potential-outcomes framework to analyze the treatment effect with anticipation. I start with a classical difference-in-differences model with two time periods and provide identified sets with easy-to-implement estimation and inference strategies for causal parameters. Empirical applications and generalizations are provided. I illustrate my results by analyzing the effect of an early retirement incentive program for teachers, which the target units were likely to anticipate, on student achievement. The empirical results show the result can be overestimated by up to 30\% in the worst case and demonstrate the potential pitfalls of failing to consider anticipation in policy evaluation.
Submitted 1 December, 2022; v1 submitted 12 November, 2021;
originally announced November 2021.
-
Dynamic Control for Random Access in Deadline-Constrained Broadcasting
Authors:
Aoyu Gong,
Lei Deng,
Fang Liu,
Yijin Zhang
Abstract:
This paper considers random access in deadline-constrained broadcasting with frame-synchronized traffic. To enhance the maximum achievable timely delivery ratio (TDR), we define a dynamic control scheme that allows each active node to determine the transmission probability with certainty based on the current delivery urgency and the knowledge of current contention intensity. For an idealized environment where the contention intensity is completely known, we develop an analytical framework based on the theory of Markov Decision Process (MDP), which leads to an optimal scheme by applying backward induction. For a realistic environment where the contention intensity is incompletely known, we develop a framework using Partially Observable Markov Decision Process (POMDP), which can in theory be solved. We show that for both environments, there exists an optimal scheme that is optimal over all types of policies. To overcome the infeasibility in obtaining an optimal or near-optimal scheme from the POMDP framework, we investigate the behaviors of the optimal scheme for two extreme cases in the MDP framework, and leverage intuition gained from these behaviors to propose a heuristic scheme for the realistic environment with TDR close to the maximum achievable TDR in the idealized environment. In addition, we propose an approximation on the knowledge of contention intensity to further simplify this heuristic scheme. Numerical results with respect to a wide range of configurations are provided to validate our study.
Submitted 6 August, 2021;
originally announced August 2021.
-
Evidence for freezing of charge degrees of freedom across a critical point in CeCoIn$_5$
Authors:
Nikola Maksimovic,
Tessa Cookmeyer,
Jan Rusz,
Vikram Nagarajan,
Amanda Gong,
Fanghui Wan,
Stefano Faubel,
Ian M. Hayes,
Sooyoung Jang,
Yochai Werman,
Peter M. Oppeneer,
Ehud Altman,
James G. Analytis
Abstract:
The presence of a quantum critical point separating two distinct zero-temperature phases is thought to underlie the `strange' metal state of many high-temperature superconductors. The nature of this quantum critical point, as well as a description of the resulting strange metal, are central open problems in condensed matter physics. In large part, the controversy stems from the lack of a clear broken symmetry to characterize the critical phase transition, and this challenge is no clearer than in the example of the unconventional superconductor CeCoIn$_5$. Through Hall effect and Fermi surface measurements of CeCoIn$_5$, in comparison to ab initio calculations, we find evidence for a critical point that connects two Fermi surfaces with different volumes without apparent symmetry-breaking, indicating the presence of a transition that involves an abrupt localization of one sector of the charge degrees of freedom. We present a model for the anomalous electrical Hall resistivity of this material based on the conductivity of valence charge fluctuations.
Submitted 25 November, 2020;
originally announced November 2020.
-
Age-of-Information-based Scheduling in Multiuser Uplinks with Stochastic Arrivals: A POMDP Approach
Authors:
Aoyu Gong,
Tong Zhang,
He Chen,
Yijin Zhang
Abstract:
In this paper, we consider a multiuser uplink status update system, where a monitor aims to timely collect randomly generated status updates from multiple end nodes through a shared wireless channel. We adopt the recently proposed metric, termed age of information (AoI), to quantify the information timeliness and freshness. Due to the random generation of the status updates at the end node side, the monitor only grasps a partial knowledge of the status update arrivals. Under such a practical scenario, we aim to address a fundamental multiuser scheduling problem: how to schedule the end nodes to minimize the network-wide AoI? To solve this problem, we formulate it as a partially observable Markov decision process (POMDP), and develop a dynamic programming (DP) algorithm to obtain the optimal scheduling policy. By noting that the optimal policy is computationally prohibitive, we further design a low-complexity myopic policy that only minimizes the one-step expected reward. Simulation results show that the performance of the myopic policy can approach that of the optimal policy, and is better than that of the baseline policy.
Submitted 29 May, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Virtual CNN Branching: Efficient Feature Ensemble for Person Re-Identification
Authors:
Albert Gong,
Qiang Qiu,
Guillermo Sapiro
Abstract:
In this paper we introduce an ensemble method for convolutional neural networks (CNNs), called "virtual branching," which can be implemented with nearly no additional parameters and computation on top of standard CNNs. We propose our method in the context of person re-identification (re-ID). Our CNN model consists of shared bottom layers, followed by "virtual" branches, where neurons from a block of regular convolutional and fully-connected layers are partitioned into multiple sets. Each virtual branch is trained with different data to specialize in different aspects, e.g., a specific body region or pose orientation. In this way, robust ensemble representations are obtained against human body misalignment, deformations, or variations in viewing angles, at nearly no additional cost. The proposed method achieves competitive performance on multiple person re-ID benchmark datasets, including Market-1501, CUHK03, and DukeMTMC-reID.
Submitted 15 March, 2018;
originally announced March 2018.
-
Performance of the LHCb Vertex Locator
Authors:
LHCb VELO Group,
R. Aaij,
A. Affolder,
K. Akiba,
M. Alexander,
S. Ali,
R. B. Appleby,
M. Artuso,
A. Bates,
A. Bay,
O. Behrendt,
J. Benton,
M. van Beuzekom,
P. M. Bjørnstad,
G. Bogdanova,
S. Borghi,
A. Borgia,
T. J. V. Bowcock,
J. van den Brand,
H. Brown,
J. Buytaert,
O. Callot,
J. Carroll,
G. Casse,
P. Collins,
et al. (79 additional authors not shown)
Abstract:
The Vertex Locator (VELO) is a silicon microstrip detector that surrounds the proton-proton interaction region in the LHCb experiment. The performance of the detector during the first years of its physics operation is reviewed. The system is operated in vacuum, uses a bi-phase CO2 cooling system, and the sensors are moved to 7 mm from the LHC beam for physics data taking. The performance and stability of these characteristic features of the detector are described, and details of the material budget are given. The calibration of the timing and the data processing algorithms that are implemented in FPGAs are described. The system performance is fully characterised. The sensors have a signal to noise ratio of approximately 20 and a best hit resolution of 4 microns is achieved at the optimal track angle. The typical detector occupancy for minimum bias events in standard operating conditions in 2011 is around 0.5%, and the detector has less than 1% of faulty strips. The proximity of the detector to the beam means that the inner regions of the n+-on-n sensors have undergone space-charge sign inversion due to radiation damage. The VELO performance parameters that drive the experiment's physics sensitivity are also given. The track finding efficiency of the VELO is typically above 98% and the modules have been aligned to a precision of 1 micron for translations in the plane transverse to the beam. A primary vertex resolution of 13 microns in the transverse plane and 71 microns along the beam axis is achieved for vertices with 25 tracks. An impact parameter resolution of less than 35 microns is achieved for particles with transverse momentum greater than 1 GeV/c.
Submitted 10 September, 2014; v1 submitted 30 May, 2014;
originally announced May 2014.