-
Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size
Authors:
Parisa Boodaghi Malidarreh,
Jillur Rahman Saurav,
Thuong Le Hoai Pham,
Amir Hajighasemi,
Anahita Samadi,
Saurabh Shrinivas Maydeo,
Mohammad Sadegh Nasr,
Jacob M. Luber
Abstract:
Vector Quantization (VQ) techniques face significant challenges in codebook utilization, limiting reconstruction fidelity in image modeling. We introduce a Dual Codebook mechanism that effectively addresses this limitation by partitioning the representation into complementary global and local components. The global codebook employs a lightweight transformer for concurrent updates of all code vectors, while the local codebook maintains precise feature representation through deterministic selection. This complementary approach is trained from scratch without requiring pre-trained knowledge. Experimental evaluation across multiple standard benchmark datasets demonstrates state-of-the-art reconstruction quality while using a compact codebook of size 512, half the size of previous methods that require pre-training. Our approach achieves significant FID improvements across diverse image domains, particularly excelling in scene and face reconstruction tasks. These results establish Dual Codebook VQ as an efficient paradigm for high-fidelity image reconstruction with significantly reduced computational requirements.
Submitted 13 March, 2025;
originally announced March 2025.
-
Peptide Sequencing Via Protein Language Models
Authors:
Thuong Le Hoai Pham,
Jillur Rahman Saurav,
Aisosa A. Omere,
Calvin J. Heyl,
Mohammad Sadegh Nasr,
Cody Tyler Reynolds,
Jai Prakash Yadav Veerla,
Helen H Shang,
Justyn Jaworski,
Alison Ravenscraft,
Joseph Anthony Buonomo,
Jacob M. Luber
Abstract:
We introduce a protein language model for determining the complete sequence of a peptide based on measurement of a limited set of amino acids. To date, protein sequencing relies on mass spectrometry, with some novel Edman degradation-based platforms able to sequence non-native peptides. Current protein sequencing techniques face limitations in accurately identifying all amino acids, hindering comprehensive proteome analysis. Our method simulates partial sequencing data by selectively masking amino acids that are experimentally difficult to identify in protein sequences from the UniRef database. This targeted masking mimics real-world sequencing limitations. We then modify and fine-tune a ProtBert-derived transformer-based model for a new downstream task, predicting these masked residues, which provides an approximation of the complete sequence. Evaluating on three bacterial Escherichia species, we achieve per-amino-acid accuracy up to 90.5% when only four amino acids ([KCYM]) are known. Structural assessment using AlphaFold and TM-score validates the biological relevance of our predictions. The model also demonstrates potential for evolutionary analysis through cross-species performance. This integration of simulated experimental constraints with computational predictions offers a promising avenue for enhancing protein sequence analysis, potentially accelerating advancements in proteomics and structural biology by providing a probabilistic reconstruction of the complete protein sequence from limited experimental data.
Submitted 1 August, 2024;
originally announced August 2024.
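The targeted masking described in this abstract can be illustrated with a minimal sketch. It assumes that only the residues K, C, Y and M are observed and that every other residue is hidden. The placeholder symbol `X`, the function names, and the choice to score accuracy over the masked positions only are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of targeted residue masking; names and the mask symbol are hypothetical.
KNOWN_RESIDUES = frozenset("KCYM")  # residues assumed to be experimentally identifiable
MASK_SYMBOL = "X"                   # hypothetical placeholder for an unidentified residue


def mask_sequence(seq, known=KNOWN_RESIDUES, mask=MASK_SYMBOL):
    """Keep residues in `known`; replace every other residue with `mask`."""
    return "".join(res if res in known else mask for res in seq.upper())


def masked_accuracy(true_seq, pred_seq, known=KNOWN_RESIDUES):
    """Fraction of originally masked positions that the prediction recovers.

    Scoring only the masked positions is an assumption of this sketch.
    """
    if len(true_seq) != len(pred_seq):
        raise ValueError("sequences must have the same length")
    pairs = [(t, p) for t, p in zip(true_seq, pred_seq) if t not in known]
    if not pairs:
        return 1.0  # nothing was masked, so nothing can be wrong
    return sum(t == p for t, p in pairs) / len(pairs)


if __name__ == "__main__":
    peptide = "MKTAYIAKQR"
    print(mask_sequence(peptide))                  # MKXXYXXKXX
    print(masked_accuracy(peptide, "MKTAYLAKQR"))  # 5 of 6 masked residues recovered
```

A model fine-tuned on pairs of masked and complete sequences would replace the hand-written prediction in the usage example.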
-
Computationally Predicted Electronic Properties and Energetics of Native Defects in Cubic Boron Nitride
Authors:
Ngoc Linh Nguyen,
Hung The Dang,
Tien Lam Pham,
Thi Minh Hoa Nghiem
Abstract:
In this study, we employ a first-principles approach to conduct a comprehensive investigation of the properties of nine common native point defects in cubic boron nitride. This analysis combines standard semi-local and dielectric hybrid density-exchange-correlation functional calculations, encompassing vacancies, interstitials, antisites, and their complexes. Our findings elucidate the influence of these defects on the structural and electronic characteristics of cubic boron nitride, such as local structures, formation energy, magnetism, and the energies of defect states within the band gap. Notably, we accurately simulate the photoluminescent spectra of cubic boron nitride induced by these defects, demonstrating excellent agreement with experimental observations. This outcome indicates that the prominent peaks in the photoluminescent spectrum at 2.5 and 2.8 eV can be attributed to the nitrogen to boron antisite (N$_{\rm B}$) and boron interstitial (B$_{\rm i}$) defects, respectively. Additionally, we investigate the energetic stability of defects under various charge states, providing valuable references for benchmarking purposes.
Submitted 13 February, 2024;
originally announced February 2024.
-
Testing macroscopic local realism using cat-states and Bell inequalities in time
Authors:
M. Thenabadu,
G-L. Cheng,
T. L. H. Pham,
L. V. Drummond,
L. Rosales-Zárate,
M. D. Reid
Abstract:
We show how one may test macroscopic local realism where, unlike conventional Bell tests, all relevant measurements need only distinguish between two macroscopically distinct states of the system being measured. Here, measurements give macroscopically distinguishable outcomes for a system observable and do not resolve microscopic properties (of order $\hbar$). Macroscopic local realism assumes: (1) macroscopic realism (the system prior to measurement is in a state which will lead to just one of the macroscopically distinguishable outcomes) and (2) macroscopic locality (a measurement on a system at one location cannot affect the macroscopic outcome of the measurement on a system at another location, if the measurement events are spacelike separated). To obtain a quantifiable test, we define $M$-scopic local realism where the outcomes are separated by an amount $\sim M$. We first show for $N$ up to $20$ that $N$-scopic Bell violations are predicted for entangled superpositions of $N$ bosons (at each of two sites). Secondly, we show violation of $M$-scopic local realism for entangled superpositions of coherent states of amplitude $\alpha$, for arbitrarily large $M=\alpha$. In both cases, the systems evolve dynamically according to a local nonlinear interaction. The first uses nonlinear beam splitters realised through nonlinear Josephson interactions; the second is based on nonlinear Kerr interactions. To achieve the Bell violations, the traditional choice between two spin measurement settings is replaced by a choice between different times of evolution at each site.
Submitted 22 April, 2020; v1 submitted 11 June, 2019;
originally announced June 2019.
-
Important descriptors and descriptor groups of Curie temperatures of rare-earth transition-metal binary alloys
Authors:
Hieu Chi Dam,
Viet Cuong Nguyen,
Tien Lam Pham,
Anh Tuan Nguyen,
Kiyoyuki Terakura,
Takashi Miyake,
Hiori Kino
Abstract:
We analyze the Curie temperatures of rare-earth transition-metal binary alloys with a machine learning method. To select important descriptors and descriptor groups, we introduce a newly developed subgroup relevance analysis and adopt hierarchical clustering in the representation. We execute an exhaustive search and illustrate that our approach indeed leads to the successful selection of important descriptors and descriptor groups. It helps us to choose the combination of descriptors and to understand the meaning of the selected combination.
Submitted 15 October, 2018; v1 submitted 12 September, 2018;
originally announced September 2018.
-
Machine learning reveals orbital interaction in crystalline materials
Authors:
Tien Lam Pham,
Hiori Kino,
Kiyoyuki Terakura,
Takashi Miyake,
Ichigaku Takigawa,
Koji Tsuda,
Hieu Chi Dam
Abstract:
We propose a novel representation of crystalline materials named orbital-field matrix (OFM) based on the distribution of valence shell electrons. We demonstrate that this new representation can be highly useful in mining material data. Our experiment shows that the formation energies of crystalline materials, the atomization energies of molecular materials, and the local magnetic moments of the constituent atoms in transition metal--rare-earth metal bimetal alloys can be predicted with high accuracy using the OFM. Knowledge regarding the role of coordination numbers of transition-metal and rare-earth metal elements in determining the local magnetic moment of transition metal sites can be acquired directly from decision tree regression analyses using the OFM.
Submitted 3 May, 2017; v1 submitted 2 May, 2017;
originally announced May 2017.
-
A regression-based feature selection study of the Curie temperature of transition-metal rare-earth compounds: prediction and understanding
Authors:
Hieu Chi Dam,
Viet Cuong Nguyen,
Tien Lam Pham,
Anh Tuan Nguyen,
Hiori Kino,
Kiyoyuki Terakura,
Takashi Miyake
Abstract:
The Curie temperature ($T_C$) of binary alloy compounds consisting of 3$d$ transition-metal and 4$f$ rare-earth elements is analyzed by a machine learning technique. We first demonstrate that nonlinear regression can accurately reproduce $T_C$ of the compounds. The prediction accuracy for $T_C$ is maximized when five to ten descriptors are selected, with the rare-earth concentration being the most relevant. We then discuss an attempt to utilize a regression-based model selection technique to learn the relation between the descriptors and the actuation mechanism of the corresponding physical phenomenon, i.e., $T_C$ in the present case.
Submitted 2 May, 2017; v1 submitted 2 May, 2017;
originally announced May 2017.