-
The generic Markov CoHA is not spherically generated
Authors:
Ben Davison
Abstract:
Let $Q$ be the Markov quiver, and let $W$ be an infinitely mutable potential for $Q$. We calculate some low degree refined BPS invariants for the resulting Jacobi algebra, and use them to show that the critical cohomological Hall algebra $\mathcal{H}_{Q,W}$ is not necessarily spherically generated, and is not independent of the choice of infinitely mutable potential $W$. This leads to a counterexample to a conjecture of Gaiotto, Grygoryev and Li \cite[§2.1]{GGL}, but also suggestions for how to modify it. In the case of generic cubic $W$, we discuss a way to modify the conjecture, by excluding the non-spherical part via the decomposition of $\mathcal{H}_{Q,W}$ according to the characters of a discrete symmetry group.
Submitted 7 February, 2025;
originally announced February 2025.
-
Cohomology of symmetric stacks
Authors:
Chenjing Bu,
Ben Davison,
Andrés Ibáñez Núñez,
Tasuki Kinjo,
Tudor Pădurariu
Abstract:
We construct decompositions of:
(1) the cohomology of smooth stacks,
(2) the Borel--Moore homology of $0$-shifted symplectic stacks, and
(3) the vanishing cycle cohomology of $(-1)$-shifted symplectic stacks,
assuming a good moduli space exists and the tangent space has a pointwise orthogonal structure. These conditions are satisfied by many stacks of interest, including moduli stacks of semistable $G$-bundles and (twisted) $G$-Higgs bundles on curves, $G$-character stacks of oriented closed 2-manifolds and various 3-manifolds, and moduli stacks of semistable coherent sheaves on Calabi--Yau threefolds and K3 surfaces with generic polarization. As a special case, we prove a PBW-type theorem for cohomological Hall algebras of $3$-Calabi--Yau categories with commutative orientation data, a strong form of the cohomological integrality conjecture for such categories. We define the BPS cohomology as the primary summand of the decomposition. When the stack is smooth, the BPS cohomology coincides with the intersection cohomology of the good moduli space, generalizing a theorem of Meinhardt--Reineke. Using the BPS cohomology for singular spaces, we propose a formulation of the topological mirror symmetry conjecture for the stack of $G$-Higgs bundles generalizing the work of Hausel and Thaddeus for type A groups, and a version of Langlands duality for character stacks of compact oriented 3-manifolds, following Ben-Zvi--Gunningham--Jordan--Safronov.
Submitted 31 May, 2025; v1 submitted 6 February, 2025;
originally announced February 2025.
-
Degree two Gopakumar-Vafa invariants of local curves
Authors:
Ben Davison,
Naoki Koseki
Abstract:
We investigate the Gopakumar-Vafa (GV) theory of local curves, namely, the total spaces of rank two vector bundles with canonical determinant on smooth projective curves. Under a certain genericity condition on the rank two bundles, we propose a general mechanism to compute the degree two GV invariants of local curves. In particular, we determine all the degree two GV invariants when the base curve has genus two. Combined with previous work by Bryan and Pandharipande, we obtain the GV/GW correspondence in this case. When the base curve has genus greater than two, we calculate GV invariants for some extremal genera, providing evidence for the GV/GW conjecture for curves of higher genus.
Submitted 20 April, 2025; v1 submitted 21 August, 2024;
originally announced August 2024.
-
ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement
Authors:
Eashan Adhikarla,
Kai Zhang,
John Nicholson,
Brian D. Davison
Abstract:
Low-light image enhancement remains a challenging task in computer vision, with existing state-of-the-art models often limited by hardware constraints and computational inefficiencies, particularly in handling high-resolution images. Recent foundation models, such as transformers and diffusion models, despite their efficacy in various domains, are limited in use on edge devices due to their computational complexity and slow inference times. We introduce ExpoMamba, a novel architecture that integrates components of the frequency state space within a modified U-Net, offering a blend of efficiency and effectiveness. This model is specifically optimized to address mixed exposure challenges, a common issue in low-light image enhancement, while ensuring computational efficiency. Our experiments demonstrate that ExpoMamba enhances low-light images up to 2-3x faster than traditional models with an inference time of 36.6 ms and achieves a PSNR improvement of approximately 15-20% over competing models, making it highly suitable for real-time image processing applications.
Submitted 18 August, 2024;
originally announced August 2024.
-
Unified-EGformer: Exposure Guided Lightweight Transformer for Mixed-Exposure Image Enhancement
Authors:
Eashan Adhikarla,
Kai Zhang,
Rosaura G. VidalMata,
Manjushree Aithal,
Nikhil Ambha Madhusudhana,
John Nicholson,
Lichao Sun,
Brian D. Davison
Abstract:
Despite recent strides made by AI in image processing, the issue of mixed exposure, pivotal in many real-world scenarios like surveillance and photography, remains inadequately addressed. Traditional image enhancement techniques and current transformer models are limited by their primary focus on either overexposure or underexposure. To bridge this gap, we introduce the Unified-Exposure Guided Transformer (Unified-EGformer). Our proposed solution is built upon advanced transformer architectures, equipped with local pixel-level refinement and global refinement blocks for color correction and image-wide adjustments. We employ a guided attention mechanism to precisely identify exposure-compromised regions, ensuring its adaptability across various real-world conditions. U-EGformer, with a lightweight design featuring a memory footprint (peak memory) of only $\sim$1134 MB (0.1 million parameters) and an inference time of 95 ms (9.61x faster than the average), is a viable choice for real-time applications such as surveillance and autonomous navigation. Additionally, our model is highly generalizable, requiring minimal fine-tuning to handle multiple tasks and datasets with a single architecture.
Submitted 18 July, 2024;
originally announced July 2024.
-
Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras
Authors:
Jun Yu,
Yutong Dai,
Xiaokang Liu,
Jin Huang,
Yishan Shen,
Ke Zhang,
Rong Zhou,
Eashan Adhikarla,
Wenxuan Ye,
Yixin Liu,
Zhaoming Kong,
Kai Zhang,
Yilong Yin,
Vinod Namboodiri,
Brian D. Davison,
Jason H. Moore,
Yong Chen
Abstract:
Multi-task learning (MTL) is a learning paradigm that effectively leverages both task-specific and shared information to address multiple related tasks simultaneously. In contrast to single-task learning (STL), MTL offers a suite of benefits that enhance both the training process and the inference efficiency. MTL's key advantages encompass streamlined model architecture, performance enhancement, and cross-domain generalizability. Over the past twenty years, MTL has become widely recognized as a flexible and effective approach in various fields, including computer vision (CV), natural language processing (NLP), recommendation systems, disease prognosis and diagnosis, and robotics. This survey provides a comprehensive overview of the evolution of MTL, encompassing the technical aspects of cutting-edge methods from traditional approaches to deep learning and the latest trend of pretrained foundation models. Our survey methodically categorizes MTL techniques into five key areas: regularization, relationship learning, feature propagation, optimization, and pre-training. This categorization not only chronologically outlines the development of MTL but also dives into various specialized strategies within each category. Furthermore, the survey reveals how MTL evolves from handling a fixed set of tasks to embracing a more flexible approach free from task or modality constraints. It explores the concepts of task-promptable and task-agnostic training, along with the capacity for zero-shot learning (ZSL), which unleashes the untapped potential of this historically coveted learning paradigm. Overall, we hope this survey provides the research community with a comprehensive overview of the advancements in MTL from its inception in 1997 to the present in 2023. We address present challenges and look ahead to future possibilities, shedding light on the opportunities and potential avenues for MTL research in a broad manner. This project is publicly available at https://github.com/junfish/Awesome-Multitask-Learning.
Submitted 29 April, 2024;
originally announced April 2024.
-
Okounkov's conjecture via BPS Lie algebras
Authors:
Tommaso Maria Botta,
Ben Davison
Abstract:
Let $Q$ be an arbitrary finite quiver. We use nonabelian stable envelopes to relate representations of the Maulik-Okounkov Lie algebra $\mathfrak{g}^{MO}_Q$ to representations of the BPS Lie algebra associated to the tripled quiver $\tilde Q$ with its canonical potential. We use this comparison to provide an isomorphism between the Maulik-Okounkov Lie algebra and the BPS Lie algebra. Via this isomorphism we prove Okounkov's conjecture, equating the graded dimensions of the Lie algebra $\mathfrak{g}^{MO}_Q$ with the coefficients of Kac polynomials. Via general results regarding cohomological Hall algebras in dimensions two and three we furthermore give a complete description of $\mathfrak{g}^{MO}_Q$ as a generalised Kac-Moody Lie algebra with Cartan datum given by intersection cohomology of singular Nakajima quiver varieties, and prove a conjecture of Maulik and Okounkov, stating that their Lie algebra is obtained from a Lie algebra defined over the rationals, by extension of scalars. Finally, we explain how our results suggest the correct definition of critical stable envelopes in vanishing cycle cohomology.
Submitted 9 December, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Robust Computer Vision in an Ever-Changing World: A Survey of Techniques for Tackling Distribution Shifts
Authors:
Eashan Adhikarla,
Kai Zhang,
Jun Yu,
Lichao Sun,
John Nicholson,
Brian D. Davison
Abstract:
AI applications are becoming increasingly visible to the general public. There is a notable gap between the theoretical assumptions researchers make about computer vision models and the reality those models face when deployed in the real world. One of the critical reasons for this gap is a challenging problem known as distribution shift. Distribution shifts tend to vary with the complexity of the data, dataset size, and application type. In our paper, we discuss the identification of such a prominent gap, exploring the concept of distribution shift and its critical significance. We provide an in-depth overview of various types of distribution shifts, elucidate their distinctions, and explore techniques within the realm of the data-centric domain employed to address them. Distribution shifts can occur during every phase of the machine learning pipeline, from the data collection stage to the stage of training a machine learning model to the stage of final model deployment. This raises concerns about the overall robustness of the machine learning techniques for computer vision applications that are deployed publicly for consumers. Several components fall under the umbrella of data-centric methods: deep learning models, each tailored to specific types of data and tasks; architectural pipelines, where variations in data preprocessing and feature extraction can impact robustness; data augmentation strategies (e.g., geometric, synthetic, and learning-based), which enhance model generalization; and training mechanisms (e.g., transfer learning, zero-shot). Each of these components forms an integral part of the neural network we analyze, contributing uniquely to strengthening model robustness against distribution shifts. We compare and contrast numerous AI models that are built for mitigating shifts in hidden stratification and spurious correlations, ...
Submitted 3 December, 2023;
originally announced December 2023.
-
Deep Feature Registration for Unsupervised Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
While unsupervised domain adaptation has been explored to leverage the knowledge from a labeled source domain to an unlabeled target domain, existing methods focus on the distribution alignment between two domains. However, how to better align source and target features is not well addressed. In this paper, we propose a deep feature registration (DFR) model to generate registered features that maintain domain invariant features and simultaneously minimize the domain-dissimilarity of registered features and target features via histogram matching. We further employ a pseudo label refinement process, which considers both probabilistic soft selection and center-based hard selection to improve the quality of pseudo labels in the target domain. Extensive experiments on multiple UDA benchmarks demonstrate the effectiveness of our DFR model, resulting in new state-of-the-art performance.
Submitted 24 October, 2023;
originally announced October 2023.
-
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
Authors:
Kai Zhang,
Rong Zhou,
Eashan Adhikarla,
Zhiling Yan,
Yixin Liu,
Jun Yu,
Zhengliang Liu,
Xun Chen,
Brian D. Davison,
Hui Ren,
Jing Huang,
Chen Chen,
Yuyin Zhou,
Sunyang Fu,
Wei Liu,
Tianming Liu,
Xiang Li,
Yong Chen,
Lifang He,
James Zou,
Quanzheng Li,
Hongfang Liu,
Lichao Sun
Abstract:
Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners, and patients. Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model, designed as a generalist capable of performing various biomedical tasks. BiomedGPT achieved state-of-the-art results in 16 out of 25 experiments while maintaining a computing-friendly model scale. We also conducted human evaluations to assess the capabilities of BiomedGPT in radiology visual question answering, report generation, and summarization. BiomedGPT exhibits robust prediction ability with a low error rate of 3.8% in question answering, satisfactory performance with an error rate of 8.3% in writing complex radiology reports, and competitive summarization ability with a nearly equivalent preference score to human experts. Our method demonstrates that effective training with diverse data can lead to more practical biomedical AI for improving diagnosis and workflow efficiency.
Submitted 11 August, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Unconfounded Propensity Estimation for Unbiased Ranking
Authors:
Dan Luo,
Lixin Zou,
Qingyao Ai,
Zhiyu Chen,
Chenliang Li,
Dawei Yin,
Brian D. Davison
Abstract:
The goal of unbiased learning to rank (ULTR) is to leverage implicit user feedback for optimizing learning-to-rank systems. Among existing solutions, automatic ULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice. Despite their theoretical soundness, the effectiveness is usually justified under a weak logging policy, where the ranking model can barely rank documents according to their relevance to the query. However, when the logging policy is strong, e.g., an industry-deployed ranking policy, the reported effectiveness cannot be reproduced. In this paper, we first investigate ULTR from a causal perspective and uncover a negative result: existing ULTR algorithms fail to address the issue of propensity overestimation caused by the query-document relevance confounder. Then, we propose a new learning objective based on backdoor adjustment and highlight its differences from conventional propensity models, which reveal the prevalence of propensity overestimation. On top of that, we introduce a novel propensity model called Logging-Policy-aware Propensity (LPP) model and its distinctive two-step optimization strategy, which allows for the joint learning of LPP and ranking models within the automatic ULTR framework, and actualize the unconfounded propensity estimation for ULTR. Extensive experiments on two benchmarks demonstrate the effectiveness and generalizability of the proposed method.
Submitted 8 July, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
BPS algebras and generalised Kac-Moody algebras from 2-Calabi-Yau categories
Authors:
Ben Davison,
Lucien Hennecart,
Sebastian Schlegel Mejia
Abstract:
We determine the structure of the BPS algebra of $2$-Calabi-Yau Abelian categories whose stack of objects admits a good moduli space. We prove that this algebra is isomorphic to the positive part of the enveloping algebra of a generalised Kac-Moody Lie algebra generated by the intersection cohomology of certain connected components (corresponding to roots) of the good moduli space. Some major examples include the BPS algebras of (1) the category of semistable coherent sheaves of given slope on a K3 surface or, more generally, quasiprojective symplectic surface, (2) semistable Higgs bundles on smooth projective curves, (3) preprojective algebras of quivers, (4) multiplicative preprojective algebras and (5) fundamental groups of (quiver) Riemann surfaces. We define the BPS Lie algebras of $2$-Calabi-Yau categories and prove that they coincide with the ones obtained by dimensional reduction from the critical cohomological Hall algebra in the case in which the 2-Calabi-Yau category is the category of representations of a preprojective algebra. Consequences include: (1) a proof in full generality of the Bozec-Schiffmann positivity conjecture for absolutely cuspidal polynomials, a strengthening of the Kac positivity conjecture; (2) a proof of the cohomological integrality conjecture for the category of semistable coherent sheaves on local K3 surfaces; (3) a description of the cohomology (in all degrees) of Nakajima quiver varieties as direct sums of irreducible lowest weight representations over the BPS Lie algebra.
Submitted 14 February, 2024; v1 submitted 22 March, 2023;
originally announced March 2023.
-
BPS Lie algebras for totally negative 2-Calabi-Yau categories and nonabelian Hodge theory for stacks
Authors:
Ben Davison,
Lucien Hennecart,
Sebastian Schlegel Mejia
Abstract:
We define and study a sheaf-theoretic cohomological Hall algebra for suitably geometric Abelian categories $\mathcal{A}$ of homological dimension at most two, and a sheaf-theoretic BPS algebra under the conditions that $\mathcal{A}$ is 2-Calabi-Yau and has a good moduli space. We show that the BPS algebra for the preprojective algebra $\Pi_Q$ of a totally negative quiver is the free algebra generated by the intersection cohomology of the closure of the locus parameterising simple $\Pi_Q$-modules in the coarse moduli space.
We define and study the BPS Lie algebra of arbitrary 2-Calabi-Yau categories $\mathcal{A}$ for which the Euler form is negative on all pairs of non-zero objects, which recovers the BPS algebra as its universal enveloping algebra for such "totally negative" 2CY categories. We show that for totally negative 2CY categories the BPS algebra is freely generated by intersection complexes of certain coarse moduli spaces as above, and the Borel-Moore homology of the stack of objects in such $\mathcal{A}$ satisfies a Yangian-type PBW theorem for the BPS Lie algebra. In this way we prove the cohomological integrality theorem for these categories.
We use our results to prove that for $C$ a smooth projective curve, and for $r$ and $d$ not necessarily coprime, there is a nonabelian Hodge isomorphism between the Borel-Moore homologies of the stack of rank $r$ and degree $d$ Higgs bundles, and the appropriate stack of twisted representations of the fundamental group of $C$. In addition we prove the Bozec-Schiffmann positivity conjecture for totally negative quivers; we prove that their polynomials counting cuspidal functions in the constructible Hall algebra for $Q$ have positive coefficients, strengthening the positivity theorem for the Kac polynomials of such quivers.
Submitted 14 February, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Affine BPS algebras, W algebras, and the cohomological Hall algebra of $\mathbb{A}^2$
Authors:
Ben Davison
Abstract:
We introduce affinizations and deformations of the BPS Lie algebra associated to a tripled quiver with potential, and use them to precisely determine the $T$-equivariant cohomological Hall algebra $\mathcal{H}_{\mathbb{A}^2}^T$ of compactly supported coherent sheaves on $\mathbb{A}^2$, acted on by a torus $T$. In particular we show that this algebra is spherically generated for all $T$.
Submitted 19 April, 2025; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Model-based Unbiased Learning to Rank
Authors:
Dan Luo,
Lixin Zou,
Qingyao Ai,
Zhiyu Chen,
Dawei Yin,
Brian D. Davison
Abstract:
Unbiased Learning to Rank (ULTR), which learns to rank documents from biased user feedback data, is a well-known challenge in information retrieval. Existing methods in unbiased learning to rank typically rely on click modeling or inverse propensity weighting (IPW). Unfortunately, search engines face a severely long-tailed query distribution, which neither click modeling nor IPW handles well. Click modeling suffers from a data sparsity problem, since the same query-document pair appears only a limited number of times on tail queries; IPW suffers from a high variance problem, since it is highly sensitive to small propensity score values. Therefore, a general debiasing framework that works well under tail queries is badly needed. To address this problem, we propose a model-based unbiased learning-to-rank framework. Specifically, we develop a general context-aware user simulator to generate pseudo clicks for unobserved ranked lists to train rankers, which addresses the data sparsity problem. In addition, considering the discrepancy between pseudo clicks and actual clicks, we take the observation of a ranked list as the treatment variable and further incorporate inverse propensity weighting with pseudo labels in a doubly robust way. The derived bias and variance indicate that the proposed model-based method is more robust than existing methods. Finally, extensive experiments on benchmark datasets, including simulated datasets and real click logs, demonstrate that the proposed model-based method consistently outperforms state-of-the-art methods in various scenarios. The code is available at https://github.com/rowedenny/MULTR.
Submitted 7 February, 2023; v1 submitted 24 July, 2022;
originally announced July 2022.
-
StruBERT: Structure-aware BERT for Table Search and Matching
Authors:
Mohamed Trabelsi,
Zhiyu Chen,
Shuo Zhang,
Brian D. Davison,
Jeff Heflin
Abstract:
A large amount of information is stored in data tables. Users can search for data tables using a keyword-based query. A table is composed primarily of data values that are organized in rows and columns providing implicit structural information. A table is usually accompanied by secondary information such as the caption, page title, etc., that form the textual information. Understanding the connection between the textual and structural information is an important yet neglected aspect in table retrieval as previous methods treat each source of information independently. In addition, users can search for data tables that are similar to an existing table, and this setting can be seen as a content-based table retrieval. In this paper, we propose StruBERT, a structure-aware BERT model that fuses the textual and structural information of a data table to produce context-aware representations for both textual and tabular content of a data table. StruBERT features are integrated in a new end-to-end neural ranking model to solve three table-related downstream tasks: keyword- and content-based table retrieval, and table similarity. We evaluate our approach using three datasets, and we demonstrate substantial improvements in terms of retrieval and classification metrics over state-of-the-art methods.
Submitted 27 March, 2022;
originally announced March 2022.
-
Memory Defense: More Robust Classification via a Memory-Masking Autoencoder
Authors:
Eashan Adhikarla,
Dan Luo,
Brian D. Davison
Abstract:
Many deep neural networks are susceptible to minute perturbations of images that have been carefully crafted to cause misclassification. Ideally, a robust classifier would be immune to small variations in input images, and a number of defensive approaches have been created as a result. One method would be to discern a latent representation which could ignore small changes to the input. However, typical autoencoders easily mingle inter-class latent representations when there are strong similarities between classes, making it harder for a decoder to accurately project the image back to the original high-dimensional space. We propose a novel framework, Memory Defense, an augmented classifier with a memory-masking autoencoder to counter this challenge. By masking other classes, the autoencoder learns class-specific independent latent representations. We test the model's robustness against four widely used attacks. Experiments on the Fashion-MNIST and CIFAR-10 datasets demonstrate the superiority of our model. Our source code is available on GitHub: https://github.com/eashanadhikarla/MemDefense
Submitted 5 February, 2022;
originally announced February 2022.
-
Nonabelian Hodge theory for stacks and a stacky P=W conjecture
Authors:
Ben Davison
Abstract:
We introduce a version of the P=W conjecture relating the Borel-Moore homology of the stack of representations of the fundamental group of a genus $g$ Riemann surface with the Borel-Moore homology of the stack of degree zero semistable Higgs bundles on a smooth projective complex curve of genus $g$. In order to state the conjecture we propose a construction of a canonical isomorphism between these Borel-Moore homology groups. We relate the stacky P=W conjecture with the original P=W conjecture concerning the cohomology of smooth moduli spaces of twisted objects, and the PI=WI conjecture concerning the intersection cohomology groups of singular moduli spaces of untwisted objects. In genus zero and one, we prove the conjectures that we introduce in this paper.
Submitted 4 April, 2024; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Deep Least Squares Alignment for Unsupervised Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Unsupervised domain adaptation leverages rich information from a labeled source domain to model an unlabeled target domain. Existing methods attempt to align the cross-domain distributions. However, the statistical representations of the alignment of the two domains are not well addressed. In this paper, we propose deep least squares alignment (DLSA) to estimate the distribution of the two domains in a latent space by parameterizing a linear model. We further develop marginal and conditional adaptation loss to reduce the domain discrepancy by minimizing the angle between fitting lines and intercept differences and further learning domain invariant features. Extensive experiments demonstrate that the proposed DLSA model is effective in aligning domain distributions and outperforms state-of-the-art methods.
Submitted 3 November, 2021;
originally announced November 2021.
-
A boson-fermion correspondence in cohomological Donaldson-Thomas theory
Authors:
Ben Davison
Abstract:
We introduce and study a fermionization procedure for the cohomological Hall algebra $\mathcal{H}_{Π_Q}$ of representations of a preprojective algebra, that selectively switches the cohomological parity of the BPS Lie algebra from even to odd. We do so by determining the cohomological Donaldson--Thomas invariants of central extensions of preprojective algebras studied in the work of Etingof and Rains, via deformed dimensional reduction.
Via the same techniques, we determine the Borel-Moore homology of the stack of representations of the $μ$-deformed preprojective algebra introduced by Crawley-Boevey and Holland, for all dimension vectors. This provides a common generalisation of the results of Crawley-Boevey and Van den Bergh on the cohomology of smooth moduli schemes of representations of deformed preprojective algebras, and my earlier results on the Borel-Moore homology of the stack of representations of the undeformed preprojective algebra.
Submitted 16 February, 2022; v1 submitted 20 September, 2021;
originally announced September 2021.
-
Automatic Head Overcoat Thickness Measure with NASNet-Large-Decoder Net
Authors:
Youshan Zhang,
Brian D. Davison,
Vivien W. Talghader,
Zhiyu Chen,
Zhiyong Xiao,
Gary J. Kunkel
Abstract:
Transmission electron microscopy (TEM) is one of the primary tools to show microstructural characterization of materials as well as film thickness. However, manual determination of film thickness from TEM images is time-consuming as well as subjective, especially when the films in question are very thin and the need for measurement precision is very high. Such is the case for head overcoat (HOC) thickness measurements in the magnetic hard disk drive industry. It is therefore necessary to develop software to automatically measure HOC thickness. In this paper, for the first time, we propose a HOC layer segmentation method using NASNet-Large as an encoder and then followed by a decoder architecture, which is one of the most commonly used architectures in deep learning for image segmentation. To further improve segmentation results, we are the first to propose a post-processing layer to remove irrelevant portions in the segmentation result. To measure the thickness of the segmented HOC layer, we propose a regressive convolutional neural network (RCNN) model as well as orthogonal thickness calculation methods. Experimental results demonstrate a higher dice score for our model which has lower mean squared error and outperforms current state-of-the-art manual measurement.
Submitted 22 June, 2021;
originally announced June 2021.
-
Enhanced Separable Disentanglement for Unsupervised Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Domain adaptation aims to mitigate the domain gap when transferring knowledge from an existing labeled domain to a new domain. However, existing disentanglement-based methods do not fully consider separation between domain-invariant and domain-specific features, which means the domain-invariant features are not discriminative. The reconstructed features are also not sufficiently used during training. In this paper, we propose a novel enhanced separable disentanglement (ESD) model. We first employ a disentangler to distill domain-invariant and domain-specific features. Then, we apply feature separation enhancement processes to minimize contamination between domain-invariant and domain-specific features. Finally, our model reconstructs complete feature vectors, which are used for further disentanglement during the training phase. Extensive experiments on three benchmark datasets show that our model outperforms state-of-the-art methods, especially on challenging cross-domain tasks.
Submitted 22 June, 2021;
originally announced June 2021.
-
Purity and 2-Calabi-Yau categories
Authors:
Ben Davison
Abstract:
For various 2-Calabi-Yau categories $\mathscr{C}$ for which the stack of objects $\mathfrak{M}$ has a good moduli space $p\colon\mathfrak{M}\rightarrow \mathcal{M}$, we establish purity of the mixed Hodge module complex $p_{!}\underline{\mathbb{Q}}_{\mathfrak{M}}$. We do this by using formality in 2CY categories, along with étale neighbourhood theorems for stacks, to prove that the morphism $p$ is modelled étale-locally by the semisimplification morphism from the stack of modules of a preprojective algebra. Then via the integrality theorem in cohomological Donaldson-Thomas theory we prove purity of $p_{!}\underline{\mathbb{Q}}_{\mathfrak{M}}$. It follows that the Beilinson-Bernstein-Deligne-Gabber decomposition theorem for the constant sheaf holds for the morphism $p$, despite the possibly very singular and stacky nature of $\mathfrak{M}$. We use this to define cuspidal cohomology for $\mathfrak{M}$, which is conjecturally a complete space of generators for the BPS algebra associated to $\mathscr{C}$.
We prove purity of the Borel-Moore homology of the moduli stack $\mathfrak{M}$, provided its good moduli space $\mathcal{M}$ is projective, or admits a suitable contracting $\mathbb{C}^*$-action. In particular, when $\mathfrak{M}$ is the moduli stack of Gieseker-semistable sheaves on a K3 surface, this proves a conjecture of Halpern-Leistner. We use these results to moreover prove purity for several stacks of coherent sheaves that do not admit a good moduli space.
Without the usual assumption that $r$ and $d$ are coprime, we prove that the Borel-Moore homology of the stack of semistable degree $d$ rank $r$ Higgs sheaves is pure and carries a perverse filtration with respect to the Hitchin base, generalising the usual perverse filtration for the Hitchin system to the case of singular stacks of Higgs sheaves.
Submitted 1 April, 2024; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Correlated Adversarial Joint Discrepancy Adaptation Network
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Domain adaptation aims to mitigate the domain shift problem when transferring knowledge from one domain into another similar but different domain. However, most existing works rely on extracting marginal features without considering class labels. Moreover, some methods name their model as so-called unsupervised domain adaptation while tuning the parameters using the target domain label. To address these issues, we propose a novel approach called correlated adversarial joint discrepancy adaptation network (CAJNet), which minimizes the joint discrepancy of two domains and achieves competitive performance with tuning parameters using the correlated label. By training the joint features, we can align the marginal and conditional distributions between the two domains. In addition, we introduce a probability-based top-$\mathcal{K}$ correlated label ($\mathcal{K}$-label), which is a powerful indicator of the target domain and effective metric to tune parameters to aid predictions. Extensive experiments on benchmark datasets demonstrate significant improvements in classification accuracy over the state of the art.
Submitted 18 May, 2021;
originally announced May 2021.
-
WTR: A Test Collection for Web Table Retrieval
Authors:
Zhiyu Chen,
Shuo Zhang,
Brian D. Davison
Abstract:
We describe the development, characteristics and availability of a test collection for the task of Web table retrieval, which uses a large-scale Web Table Corpora extracted from the Common Crawl. Since a Web table usually has rich context information such as the page title and surrounding paragraphs, we not only provide relevance judgments of query-table pairs, but also the relevance judgments of query-table context pairs with respect to a query, which are ignored by previous test collections. To facilitate future research with this benchmark, we provide details about how the dataset is pre-processed and also baseline results from both traditional and recently proposed table retrieval methods. Our experimental results show that proper usage of context labels can benefit previous table retrieval methods.
Submitted 5 May, 2021;
originally announced May 2021.
-
Deep Spherical Manifold Gaussian Kernel for Unsupervised Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Unsupervised Domain adaptation is an effective method in addressing the domain shift issue when transferring knowledge from an existing richly labeled domain to a new domain. Existing manifold-based methods either are based on traditional models or largely rely on Grassmannian manifold via minimizing differences of single covariance matrices of two domains. In addition, existing pseudo-labeling algorithms inadequately consider the quality of pseudo labels in aligning the conditional distribution between two domains. In this work, a deep spherical manifold Gaussian kernel (DSGK) framework is proposed to map the source and target subspaces into a spherical manifold and reduce the discrepancy between them by embedding both extracted features and a Gaussian kernel. To align the conditional distributions, we further develop an easy-to-hard pseudo label refinement process to improve the quality of the pseudo labels and then reduce categorical spherical manifold Gaussian kernel geodesic loss. Extensive experimental results show that DSGK outperforms state-of-the-art methods, especially on challenging cross-domain learning tasks.
Submitted 5 May, 2021;
originally announced May 2021.
-
Efficient Pre-trained Features and Recurrent Pseudo-Labeling in Unsupervised Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Domain adaptation (DA) mitigates the domain shift problem when transferring knowledge from one annotated domain to another similar but different unlabeled domain. However, existing models often utilize one of the ImageNet models as the backbone without exploring others, and fine-tuning or retraining the backbone ImageNet model is also time-consuming. Moreover, pseudo-labeling has been used to improve the performance in the target domain, while how to generate confident pseudo labels and explicitly align domain distributions has not been well addressed. In this paper, we show how to efficiently opt for the best pre-trained features from seventeen well-known ImageNet models in unsupervised DA problems. In addition, we propose a recurrent pseudo-labeling model using the best pre-trained features (termed PRPL) to improve classification performance. To show the effectiveness of PRPL, we evaluate it on three benchmark datasets, Office+Caltech-10, Office-31, and Office-Home. Extensive experiments show that our model reduces computation time and boosts the mean accuracy to 98.1%, 92.4%, and 81.2%, respectively, substantially outperforming the state of the art.
Submitted 1 May, 2021; v1 submitted 27 April, 2021;
originally announced April 2021.
-
Adversarial Regression Learning for Bone Age Estimation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Estimation of bone age from hand radiographs is essential to determine skeletal age in diagnosing endocrine disorders and depicting the growth status of children. However, existing automatic methods only apply their models to test images without considering the discrepancy between training samples and test samples, which will lead to a lower generalization ability. In this paper, we propose an adversarial regression learning network (ARLNet) for bone age estimation. Specifically, we first extract bone features from a fine-tuned Inception V3 neural network and propose regression percentage loss for training. To reduce the discrepancy between training and test data, we then propose adversarial regression loss and feature reconstruction loss to guarantee the transition from training data to test data and vice versa, preserving invariant features from both training and test data. Experimental results show that the proposed model outperforms state-of-the-art methods.
Submitted 10 March, 2021;
originally announced March 2021.
-
Neural ranking models for document retrieval
Authors:
Mohamed Trabelsi,
Zhiyu Chen,
Brian D. Davison,
Jeff Heflin
Abstract:
Ranking models are the main components of information retrieval systems. Several approaches to ranking are based on traditional machine learning algorithms using a set of hand-crafted features. Recently, researchers have leveraged deep learning models in information retrieval. These models are trained end-to-end to extract features from the raw data for ranking tasks, so that they overcome the limitations of hand-crafted features. A variety of deep learning models have been proposed, and each model presents a set of neural network components to extract features that are used for ranking. In this paper, we compare the proposed models in the literature along different dimensions in order to understand the major contributions and limitations of each model. In our discussion of the literature, we analyze the promising neural components, and propose future research directions. We also show the analogy between document retrieval and other retrieval tasks where the items to be ranked are structured documents, answers, images and videos.
Submitted 1 November, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
Adversarial Consistent Learning on Partial Domain Adaptation of PlantCLEF 2020 Challenge
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Domain adaptation is one of the most crucial techniques to mitigate the domain shift problem, which exists when transferring knowledge from an abundant labeled source domain to a target domain with few or no labels. Partial domain adaptation addresses the scenario when target categories are only a subset of source categories. In this paper, to enable the efficient representation of cross-domain plant images, we first extract deep features from pre-trained models and then develop adversarial consistent learning ($ACL$) in a unified deep architecture for partial domain adaptation. It consists of source domain classification loss, adversarial learning loss, and feature consistency loss. Adversarial learning loss can maintain domain-invariant features between the source and target domains. Moreover, feature consistency loss can preserve the fine-grained feature transition between two domains. We also find the shared categories of two domains via down-weighting the irrelevant categories in the source domain. Experimental results demonstrate that training features from NASNetLarge model with proposed $ACL$ architecture yields promising results on the PlantCLEF 2020 Challenge.
Submitted 19 September, 2020;
originally announced September 2020.
-
BPS Lie algebras and the less perverse filtration on the preprojective CoHA
Authors:
Ben Davison
Abstract:
The affinization morphism for the stack $\mathfrak{M}(Π_Q)$ of representations of a preprojective algebra $Π_Q$ is a local model for the morphism from the stack of objects in a general 2-Calabi-Yau category to the good moduli space. We show that the derived direct image of the dualizing complex along this morphism is pure, and admits a decomposition in the sense of the Beilinson-Bernstein-Deligne-Gabber decomposition theorem.
We introduce a new perverse filtration on the Borel-Moore homology of $\mathfrak{M}(Π_Q)$, using this decomposition. We show that the zeroth piece of the resulting filtration on the cohomological Hall algebra built out of the Borel-Moore homology of $\mathfrak{M}(Π_Q)$ is isomorphic to the universal enveloping algebra of an associated BPS Lie algebra $\mathfrak{g}_{Π_Q}$. This Lie algebra is defined via the Kontsevich-Soibelman theory of critical cohomological Hall algebras for 3-Calabi-Yau categories. We then lift this Lie algebra to a Lie algebra object in the category of perverse sheaves on the coarse moduli space of $Π_Q$-modules, and use this algebra structure to prove results about the summands appearing in the above decomposition theorem. In particular, we prove that the intersection cohomology of singular spaces of semistable $Π_Q$-modules provides "cuspidal cohomology" - a conjecturally complete subspace of canonical generators for $\mathfrak{g}_{Π_Q}$.
Submitted 22 April, 2024; v1 submitted 7 July, 2020;
originally announced July 2020.
-
Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification
Authors:
Hui Ye,
Zhiyu Chen,
Da-Han Wang,
Brian D. Davison
Abstract:
Extreme multi-label text classification (XMTC) is a task for tagging a given text with the most relevant labels from an extremely large label set. We propose a novel deep learning method called APLC-XLNet. Our approach fine-tunes the recently released generalized autoregressive pretrained model (XLNet) to learn a dense representation for the input text. We propose Adaptive Probabilistic Label Clusters (APLC) to approximate the cross entropy loss by exploiting the unbalanced label distribution to form clusters that explicitly reduce the computational time. Our experiments, carried out on five benchmark datasets, show that our approach has achieved new state-of-the-art results on four benchmark datasets. Our source code is available publicly at https://github.com/huiyegit/APLC_XLNet.
Submitted 14 August, 2020; v1 submitted 5 July, 2020;
originally announced July 2020.
-
Table Search Using a Deep Contextualized Language Model
Authors:
Zhiyu Chen,
Mohamed Trabelsi,
Jeff Heflin,
Yinan Xu,
Brian D. Davison
Abstract:
Pretrained contextualized language models such as BERT have achieved impressive results on various natural language processing benchmarks. Benefiting from multiple pretraining tasks and large scale training corpora, pretrained models can capture complex syntactic word relations. In this paper, we use the deep contextualized language model BERT for the task of ad hoc table retrieval. We investigate how to encode table content considering the table structure and input length limit of BERT. We also propose an approach that incorporates features from prior literature on table retrieval and jointly trains them with BERT. In experiments on public datasets, we show that our best approach can outperform the previous state-of-the-art method and BERT baselines by a large margin under different evaluation metrics.
Submitted 26 May, 2020; v1 submitted 19 May, 2020;
originally announced May 2020.
-
Impact of ImageNet Model Selection on Domain Adaptation
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Deep neural networks are widely used in image classification problems. However, little work addresses how features from different deep neural networks affect the domain adaptation problem. Existing methods often extract deep features from one ImageNet model, without exploring other neural networks. In this paper, we investigate how different ImageNet models affect transfer accuracy on domain adaptation problems. We extract features from sixteen distinct pre-trained ImageNet models and examine the performance of twelve benchmarking methods when using the features. Extensive experimental results show that a higher accuracy ImageNet model produces better features, and leads to higher accuracy on domain adaptation problems (with a correlation coefficient of up to 0.95). We also examine the architecture of each neural network to find the best layer for feature extraction. Together, performance from our features exceeds that of the state-of-the-art in three benchmark datasets.
Submitted 6 February, 2020;
originally announced February 2020.
-
Leveraging Schema Labels to Enhance Dataset Search
Authors:
Zhiyu Chen,
Haiyan Jia,
Jeff Heflin,
Brian D. Davison
Abstract:
A search engine's ability to retrieve desirable datasets is important for data sharing and reuse. Existing dataset search engines typically rely on matching queries to dataset descriptions. However, a user may not have enough prior knowledge to write a query using terms that match the description text. We propose a novel schema label generation model which generates possible schema labels based on dataset table content. We incorporate the generated schema labels into a mixed ranking model which not only considers the relevance between the query and dataset metadata but also the similarity between the query and generated schema labels. To evaluate our method on real-world datasets, we create a new benchmark specifically for the dataset retrieval task. Experiments show that our approach can effectively improve the precision and NDCG scores of the dataset retrieval task compared with baseline methods. We also test on a collection of Wikipedia tables to show that the features generated from schema labels can improve the unsupervised and supervised web table retrieval task as well.
Submitted 27 January, 2020;
originally announced January 2020.
-
Deformed dimensional reduction
Authors:
Ben Davison,
Tudor Pădurariu
Abstract:
Since its first use by Behrend, Bryan, and Szendrői in the computation of motivic Donaldson-Thomas (DT) invariants of $\mathbb{A}_{\mathbb{C}}^3$, dimensional reduction has proved to be an important tool in motivic and cohomological DT theory. Inspired by a conjecture of Cazzaniga, Morrison, Pym, and Szendrői on motivic DT invariants, work of Dobrovolska, Ginzburg, and Travkin on exponential sums, and work of Orlov and Hirano on equivalences of categories of singularities, we generalize the dimensional reduction theorem in motivic and cohomological DT theory and use it to prove versions of the Cazzaniga-Morrison-Pym-Szendrői conjecture in these settings.
Submitted 9 January, 2020;
originally announced January 2020.
-
Strong positivity for quantum theta bases of quantum cluster algebras
Authors:
Ben Davison,
Travis Mandel
Abstract:
We construct "quantum theta bases," extending the set of quantum cluster monomials, for various versions of skew-symmetric quantum cluster algebras. These bases consist precisely of the indecomposable universally positive elements of the algebras they generate, and the structure constants for their multiplication are Laurent polynomials in the quantum parameter with non-negative integer coefficients, proving the quantum strong cluster positivity conjecture for these algebras. The classical limits recover the theta bases considered by Gross-Hacking-Keel-Kontsevich. Our approach combines the scattering diagram techniques used in loc. cit. with the Donaldson-Thomas theory of quivers.
Submitted 15 June, 2021; v1 submitted 28 October, 2019;
originally announced October 2019.
-
The local motivic DT/PT correspondence
Authors:
Ben Davison,
Andrea T. Ricolfi
Abstract:
We show that the Quot scheme $Q_L^n = \textrm{Quot}_{\mathbb A^3}(\mathscr I_L,n)$ parameterising length $n$ quotients of the ideal sheaf of a line in $\mathbb{A}^3$ is a global critical locus, and calculate the resulting motivic partition function (varying $n$), in the ring of relative motives over the configuration space of points in $\mathbb{A}^3$. As in the work of Behrend-Bryan-Szendrői this enables us to define a virtual motive for the Quot scheme of $n$ points of the ideal sheaf $\mathscr I_C\subset \mathscr O_Y$, where $C\subset Y$ is a smooth curve embedded in a smooth 3-fold $Y$, and we compute the associated motivic partition function. The result fits into a motivic wall-crossing type formula, refining the relation between Behrend's virtual Euler characteristic of $\textrm{Quot}_Y(\mathscr I_C,n)$ and of the symmetric product $\textrm{Sym}^nC$. Our "relative" analysis leads to results and conjectures regarding the pushforward of the sheaf of vanishing cycles along the Hilbert-Chow map $Q_L^n \rightarrow \textrm{Sym}^n(\mathbb{A}^3)$, and connections with cohomological Hall algebra representations.
Submitted 4 May, 2021; v1 submitted 29 May, 2019;
originally announced May 2019.
-
Modified Distribution Alignment for Domain Adaptation with Pre-trained Inception ResNet
Authors:
Youshan Zhang,
Brian D. Davison
Abstract:
Deep neural networks have been widely used in computer vision. There are several well-trained deep neural networks for the ImageNet classification challenge, which has played a significant role in image recognition. However, little work has explored pre-trained neural networks for image recognition in domain adaptation. In this paper, we are the first to extract better-represented features from a pre-trained Inception ResNet model for domain adaptation. We then present a modified distribution alignment method for classification using the extracted features. We test our model using three benchmark datasets (Office+Caltech-10, Office-31, and Office-Home). Extensive experiments demonstrate significant improvements (4.8%, 5.5%, and 10%) in classification accuracy over the state-of-the-art.
Submitted 18 April, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Refined invariants of finite-dimensional Jacobi algebras
Authors:
Ben Davison
Abstract:
We define and study refined Gopakumar-Vafa invariants of contractible curves in complex algebraic 3-folds, alongside the cohomological Donaldson--Thomas theory of finite-dimensional Jacobi algebras. These Gopakumar-Vafa invariants can be constructed in one of two ways: as cohomological BPS invariants of contraction algebras controlling the deformation theory of these curves, as defined by Donovan and Wemyss, or by feeding the moduli spaces that Katz used to define genus zero Gopakumar-Vafa invariants into the machinery developed by Joyce et al. The conjecture that the two definitions give isomorphic results is a special case of a kind of categorified version of the strong rationality conjecture due to Pandharipande and Thomas, which we discuss and propose a means of proving. We prove the positivity of the cohomological/refined BPS invariants of all finite-dimensional Jacobi algebras. This result supports this strengthening of the strong rationality conjecture, as well as the conjecture of Brown and Wemyss stating that all finite-dimensional Jacobi algebras for appropriate symmetric quivers are isomorphic to contraction algebras.
Submitted 10 October, 2023; v1 submitted 2 March, 2019;
originally announced March 2019.
-
Enumerating coloured partitions in 2 and 3 dimensions
Authors:
Ben Davison,
Jared Ongaro,
Balazs Szendroi
Abstract:
We study generating functions of ordinary and plane partitions coloured by the action of a finite subgroup of the corresponding special linear group. After reviewing known results for the case of ordinary partitions, we formulate a conjecture concerning a factorisation property of the generating function of coloured plane partitions that can be thought of as an orbifold analogue of a conjecture of Maulik et al., now a theorem, in three-dimensional Donaldson-Thomas theory. We study natural quantisations of the generating functions arising from geometry, discuss a quantised version of our conjecture, and prove a positivity result for the quantised coloured plane partition function under a geometric assumption.
Submitted 5 June, 2019; v1 submitted 30 November, 2018;
originally announced November 2018.
-
IP Geolocation through Reverse DNS
Authors:
Ovidiu Dan,
Vaibhav Parikh,
Brian D. Davison
Abstract:
IP Geolocation databases are widely used in online services to map end user IP addresses to their geographical locations. However, they use proprietary geolocation methods and in some cases they have poor accuracy. We propose a systematic approach to use publicly accessible reverse DNS hostnames for geolocating IP addresses. Our method is designed to be combined with other geolocation data sources. We cast the task as a machine learning problem where for a given hostname, we generate and rank a list of potential location candidates. We evaluate our approach against three state of the art academic baselines and two state of the art commercial IP geolocation databases. We show that our work significantly outperforms the academic baselines, and is complementary and competitive with commercial databases. To aid reproducibility, we open source our entire approach.
Submitted 10 November, 2018;
originally announced November 2018.
-
The integrality conjecture and the cohomology of preprojective stacks
Authors:
Ben Davison
Abstract:
We study the Borel-Moore homology of stacks of representations of preprojective algebras $Π_Q$, via the study of the DT theory of the undeformed 3-Calabi-Yau completion $Π_Q[x]$. Via a result on the supports of the BPS sheaves for $Π_Q[x]$-mod, we prove purity of the BPS cohomology for the stack of $Π_Q[x]$-modules, and define BPS sheaves for stacks of $Π_Q$-modules. These are mixed Hodge modules on the coarse moduli space of $Π_Q$-modules that control the Borel-Moore homology and geometric representation theory associated to these stacks. We show that the hypercohomology of these objects is pure, and thus that the Borel-Moore homology of stacks of $Π_Q$-modules is also pure.
We transport the cohomological wall-crossing and integrality theorems from DT theory to the category of $Π_Q$-modules. Among these and other applications, we use our results to prove positivity of a number of "restricted" Kac polynomials, determine the critical cohomology of $\mathrm{Hilb}_n(\mathbb{A}^3)$, and the Borel-Moore homology of genus one character stacks, as well as various applications to the cohomological Hall algebras associated to Borel-Moore homology of stacks of preprojective algebras, including the PBW theorem, and torsion-freeness.
Submitted 25 March, 2022; v1 submitted 5 February, 2016;
originally announced February 2016.
-
Positivity for quantum cluster algebras
Authors:
Ben Davison
Abstract:
Building on work by Kontsevich, Soibelman, Nagao and Efimov, we prove the positivity of quantum cluster coefficients for all skew-symmetric quantum cluster algebras, via a proof of a conjecture first suggested by Kontsevich on the purity of mixed Hodge structures arising in the theory of cluster mutation of spherical collections in 3-Calabi-Yau categories. The result implies positivity, as well as the stronger Lefschetz property conjectured by Efimov, and also the classical positivity conjecture of Fomin and Zelevinsky, recently proved by Lee and Schiffler. Closely related to these results is a categorified "no exotics" type theorem for cohomological Donaldson-Thomas invariants, which we discuss and prove in the appendix.
Submitted 4 October, 2017; v1 submitted 28 January, 2016;
originally announced January 2016.
-
Cohomological Donaldson-Thomas theory of a quiver with potential and quantum enveloping algebras
Authors:
Ben Davison,
Sven Meinhardt
Abstract:
This paper concerns the cohomological aspects of Donaldson-Thomas theory for Jacobi algebras and the associated cohomological Hall algebra, introduced by Kontsevich and Soibelman. We prove the Hodge-theoretic categorification of the integrality conjecture and the wall crossing formula, and furthermore realise the isomorphism in both of these theorems as Poincaré-Birkhoff-Witt isomorphisms for the associated cohomological Hall algebra. We do this by defining a perverse filtration on the cohomological Hall algebra, a result of the "hidden properness" of the semisimplification map from the moduli stack of semistable representations of the Jacobi algebra to the coarse moduli space of polystable representations. This enables us to construct a degeneration of the cohomological Hall algebra, for generic stability condition and fixed slope, to a free supercommutative algebra generated by a mixed Hodge structure categorifying the BPS invariants. As a corollary of this construction we furthermore obtain a Lie algebra structure on this mixed Hodge structure - the Lie algebra of BPS invariants - for which the entire cohomological Hall algebra can be seen as the positive part of a Yangian-type quantum group.
Submitted 6 March, 2020; v1 submitted 11 January, 2016;
originally announced January 2016.
-
Donaldson-Thomas theory for categories of homological dimension one with potential
Authors:
Ben Davison,
Sven Meinhardt
Abstract:
The aim of the paper is twofold. Firstly, we give an axiomatic presentation of Donaldson-Thomas theory for categories of homological dimension at most one with potential. In particular, we provide rigorous proofs of all standard results concerning the integration map, wall-crossing, PT-DT correspondence, etc. following Kontsevich and Soibelman. We also show the equivalence of their approach and the one given by Joyce and Song. Secondly, we relate Donaldson-Thomas functions for such a category with arbitrary potential to those with zero potential under some mild conditions. As a result of this, we obtain a geometric interpretation of Donaldson-Thomas functions in all known realizations, i.e. mixed Hodge modules, perverse sheaves and constructible functions.
Submitted 30 December, 2015;
originally announced December 2015.
-
Cohomological Hall algebras and character varieties
Authors:
Ben Davison
Abstract:
In this paper we investigate the relationship between twisted and untwisted character varieties via a specific instance of the Cohomological Hall algebra for moduli of objects in 3-Calabi-Yau categories introduced by Kontsevich and Soibelman. In terms of Donaldson--Thomas theory, this relationship is completely understood via the calculations of Hausel and Villegas of the E polynomials of twisted character varieties and untwisted character stacks. We present a conjectural lift of this relationship to the cohomological Hall algebra setting.
Submitted 14 May, 2016; v1 submitted 1 April, 2015;
originally announced April 2015.
-
The critical CoHA of a quiver with potential
Authors:
Ben Davison
Abstract:
Pursuing the similarity between the Kontsevich--Soibelman construction of the cohomological Hall algebra of BPS states and Lusztig's construction of canonical bases for quantum enveloping algebras, and the similarity between the integrality conjecture for motivic Donaldson--Thomas invariants and the PBW theorem for quantum enveloping algebras, we build a coproduct on the cohomological Hall algebra associated to a quiver with potential. We also prove a cohomological dimensional reduction theorem, further linking a special class of cohomological Hall algebras with Yangians, and explaining how to connect the study of character varieties with the study of cohomological Hall algebras.
Submitted 27 October, 2016; v1 submitted 27 November, 2013;
originally announced November 2013.
-
Purity of critical cohomology and Kac's conjecture
Authors:
Ben Davison
Abstract:
We provide a new proof of the Kac positivity conjecture for an arbitrary quiver $Q$. The ingredients are the cohomological integrality theorem in Donaldson-Thomas theory, dimensional reduction, and an easy purity result. These facts imply the purity of the cohomological Donaldson-Thomas invariants for a quiver with potential $(\tilde{Q},W)$ associated to $Q$, which in turn implies positivity of the Kac polynomials for $Q$.
Submitted 10 March, 2017; v1 submitted 27 November, 2013;
originally announced November 2013.
-
Purity for graded potentials and quantum cluster positivity
Authors:
Ben Davison,
Davesh Maulik,
Joerg Schuermann,
Balazs Szendroi
Abstract:
Consider a smooth quasiprojective variety X equipped with a C*-action, and a regular function f: X -> C which is C*-equivariant with respect to a positive weight action on the base. We prove the purity of the mixed Hodge structure and the hard Lefschetz theorem on the cohomology of the vanishing cycle complex of f on proper components of the critical locus of f, generalizing a result of Steenbrink for isolated quasi-homogeneous singularities. Building on work of Kontsevich-Soibelman, Nagao and Efimov, we use this result to prove the quantum positivity conjecture for cluster mutations for all quivers admitting a positively graded nondegenerate potential. We deduce quantum positivity for all quivers of rank at most 4; quivers with nondegenerate potential admitting a cut; and quivers with potential associated to triangulations of surfaces with marked points and nonempty boundary.
Submitted 8 December, 2014; v1 submitted 12 July, 2013;
originally announced July 2013.