Showing 51–100 of 361 results for author: Shi, K

  1. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, :, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  2. arXiv:2410.20312  [pdf, other]

    cs.LG stat.ML

    Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model

    Authors: Jing Zhang, Linjiajie Fang, Kexin Shi, Wenjia Wang, Bing-Yi Jing

    Abstract: "Distribution shift" is the main obstacle to the success of offline reinforcement learning. A learning policy may take actions beyond the behavior policy's knowledge, referred to as Out-of-Distribution (OOD) actions. The Q-values for these OOD actions can be easily overestimated. As a result, the learning policy is biased by using incorrect Q-value estimates. One common approach to avoid Q-value…

    Submitted 12 January, 2025; v1 submitted 26 October, 2024; originally announced October 2024.

    Comments: NeurIPS 2024

  3. arXiv:2410.19872  [pdf, other]

    cs.CV

    Radar and Camera Fusion for Object Detection and Tracking: A Comprehensive Survey

    Authors: Kun Shi, Shibo He, Zhenyu Shi, Anjun Chen, Zehui Xiong, Jiming Chen, Jun Luo

    Abstract: Multi-modal fusion is imperative to the implementation of reliable object detection and tracking in complex environments. Exploiting the synergy of heterogeneous modal information endows perception systems the ability to achieve more comprehensive, robust, and accurate performance. As a nucleus concern in wireless-vision collaboration, radar-camera fusion has prompted prospective research directio…

    Submitted 24 October, 2024; originally announced October 2024.

  4. arXiv:2410.13271  [pdf, other]

    cs.CV cs.LG

    Inductive Gradient Adjustment For Spectral Bias In Implicit Neural Representations

    Authors: Kexuan Shi, Hai Chen, Leheng Zhang, Shuhang Gu

    Abstract: Implicit Neural Representations (INRs), as a versatile representation paradigm, have achieved success in various computer vision tasks. Due to the spectral bias of the vanilla multi-layer perceptrons (MLPs), existing methods focus on designing MLPs with sophisticated architectures or repurposing training techniques for highly accurate INRs. In this paper, we delve into the linear dynamics model of…

    Submitted 25 May, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: Accepted to ICML 2025. Code available at https://github.com/LabShuHangGU/IGA-INR

  5. arXiv:2410.13202  [pdf, other]

    cond-mat.mes-hall physics.app-ph

    Anatomy of Thermally Interplayed Spin-Orbit Torque Driven Antiferromagnetic Switching

    Authors: Wenlong Cai, Zanhong Chen, Yuzhang Shi, Daoqian Zhu, Guang Yang, Ao Du, Shiyang Lu, Kaihua Cao, Hongxi Liu, Kewen Shi, Weisheng Zhao

    Abstract: Current-induced antiferromagnetic (AFM) switching remains critical in spintronics, yet the interplay between thermal effects and spin torques still lacks clear clarification. Here we experimentally investigate the thermally interplayed spin-orbit torque induced AFM switching in magnetic tunnel junctions via pulse-width dependent reversal and time-resolved measurements. By introducing the Langevin…

    Submitted 17 October, 2024; originally announced October 2024.

  6. arXiv:2410.08282  [pdf, other]

    cs.RO cs.AI cs.CV cs.GR

    FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction

    Authors: Irving Fang, Kairui Shi, Xujin He, Siqi Tan, Yifan Wang, Hanwen Zhao, Hung-Jui Huang, Wenzhen Yuan, Chen Feng, Jing Zhang

    Abstract: Humans effortlessly integrate common-sense knowledge with sensory input from vision and touch to understand their surroundings. Emulating this capability, we introduce FusionSense, a novel 3D reconstruction framework that enables robots to fuse priors from foundation models with highly sparse observations from vision and tactile sensors. FusionSense addresses three key challenges: (i) How can robo…

    Submitted 10 October, 2024; originally announced October 2024.

    ACM Class: I.4.5; I.4.8

  7. arXiv:2410.07069  [pdf, other]

    cs.CL cs.AI cs.LG

    ReIFE: Re-evaluating Instruction-Following Evaluation

    Authors: Yixin Liu, Kejian Shi, Alexander R. Fabbri, Yilun Zhao, Peifeng Wang, Chien-Sheng Wu, Shafiq Joty, Arman Cohan

    Abstract: The automatic evaluation of instruction following typically involves using large language models (LLMs) to assess response quality. However, there is a lack of comprehensive evaluation of these LLM-based evaluators across two dimensions: the base LLMs and the evaluation protocols. Therefore, we present a thorough meta-evaluation of instruction following, including 25 base LLMs and 15 recently prop…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: GitHub Repo: https://github.com/yale-nlp/ReIFE, Evaluation Result Collection: https://huggingface.co/datasets/yale-nlp/ReIFE

  8. arXiv:2410.06764  [pdf, other]

    cs.DS math.OC

    An Optimal Algorithm for the Stacker Crane Problem on Fixed Topologies

    Authors: Yike Chen, Ke Shi, Chao Xu

    Abstract: The Stacker Crane Problem (SCP) is a variant of the Traveling Salesman Problem. In SCP, pairs of pickup and delivery points are designated on a graph, and a crane must visit these points to move objects from each pickup location to its respective delivery point. The goal is to minimize the total distance traveled. SCP is known to be NP-hard, even on tree structures. The only positive results, in t…

    Submitted 9 October, 2024; originally announced October 2024.

  9. Implications for galaxy property estimation revealed by CO luminosity-FWHM relations in local star-forming galaxies

    Authors: Yi-Han Wu, Jun-Feng Wang, Xiao-Hu Li, Xue-Jian Jiang, Chao-Wei Tsai, Jing-Wen Wu, Kun-Peng Shi, Lin Zhu, Wen-Yu Zhong

    Abstract: This study explores a relationship between the CO luminosity-full width at half-maximum linewidth linear relation (i.e. the CO LFR) and mean galaxy property of the local star-forming galaxy sample in the xCOLDGASS data base, via a mathematical statement. The whole data base galaxies were separated into two subsamples based on their stellar masses and redshifts, being a help to examine the dependen…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: 10 pages, 3 tables, 2 figures; we sincerely appreciate the suggestion of the referee and the acceptance of the MNRAS. Y-HW is greatly grateful to all the co-authors for their works on his articles

  10. arXiv:2409.04851  [pdf, other]

    cs.CV

    AdaptiveFusion: Adaptive Multi-Modal Multi-View Fusion for 3D Human Body Reconstruction

    Authors: Anjun Chen, Xiangyu Wang, Zhi Xu, Kun Shi, Yan Qin, Yuchi Huo, Jiming Chen, Qi Ye

    Abstract: Recent advancements in sensor technology and deep learning have led to significant progress in 3D human body reconstruction. However, most existing approaches rely on data from a specific sensor, which can be unreliable due to the inherent limitations of individual sensing modalities. Additionally, existing multi-modal fusion methods generally require customized designs based on the specific senso…

    Submitted 13 March, 2025; v1 submitted 7 September, 2024; originally announced September 2024.

    Comments: TMM 2025, Project Page: https://chen3110.github.io/adaptivefusion/index.html

  11. arXiv:2409.03635  [pdf, ps, other]

    quant-ph cs.CR

    On the Relativistic Zero Knowledge Quantum Proofs of Knowledge

    Authors: Kaiyan Shi, Kaushik Chakraborty, Wen Yu Kon, Omar Amer, Marco Pistoia, Charles Lim

    Abstract: We initiate the study of relativistic zero-knowledge quantum proof of knowledge systems with classical communication, formally defining a number of useful concepts and constructing appropriate knowledge extractors for all the existing protocols in the relativistic setting which satisfy a weaker variant of the special soundness property due to Unruh (EUROCRYPT 2012). We show that there exists quant…

    Submitted 17 December, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

    Comments: 38 pages

  12. arXiv:2408.04820  [pdf, other]

    cs.SE cs.AI cs.CL cs.HC cs.LG

    Natural Language Outlines for Code: Literate Programming in the LLM Era

    Authors: Kensen Shi, Deniz Altınbüken, Saswat Anand, Mihai Christodorescu, Katja Grünwedel, Alexa Koenings, Sai Naidu, Anurag Pathak, Marc Rasi, Fredde Ribeiro, Brandon Ruffin, Siddhant Sanyam, Maxim Tabachnyk, Sara Toth, Roy Tu, Tobias Welp, Pengcheng Yin, Manzil Zaheer, Satish Chandra, Charles Sutton

    Abstract: We propose using natural language outlines as a novel modality and interaction surface for providing AI assistance to developers throughout the software development process. An NL outline for a code function comprises multiple statements written in concise prose, which partition the code and summarize its main ideas in the style of literate programming. Crucially, we find that modern LLMs can gene…

    Submitted 17 April, 2025; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted to FSE'25 Industry Track

  13. arXiv:2408.04201  [pdf, ps, other]

    math-ph hep-th nlin.SI

    Exact solution of a quantum integrable system associated with the $G_2$ exceptional Lie algebra

    Authors: Guang-Liang Li, Junpeng Cao, Pei Sun, Wen-Li Yang, Kangjie Shi, Yupeng Wang

    Abstract: A quantum integrable spin chain model associated with the $G_2$ exceptional Lie algebra is studied. By using the fusion technique, the closed recursive relations among the fused transfer matrices are obtained. These identities allow us to derive the exact energy spectrum and Bethe ansatz equations of the system based on polynomial analysis. The present method provides a unified treatment to invest…

    Submitted 16 December, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: Some numerical checks for small site numbers are added; 36 pages

    Journal ref: Nucl. Phys. B 1010 (2025), 116777

  14. arXiv:2407.16396  [pdf, other]

    cs.CV

    Learning Unsigned Distance Functions from Multi-view Images with Volume Rendering Priors

    Authors: Wenyuan Zhang, Kanle Shi, Yu-Shen Liu, Zhizhong Han

    Abstract: Unsigned distance functions (UDFs) have been a vital representation for open surfaces. With different differentiable renderers, current methods are able to train neural networks to infer a UDF by minimizing the rendering errors on the UDF to the multi-view ground truth. However, these differentiable renderers are mainly handcrafted, which makes them either biased on ray-surface intersections, or s…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024. Project page: https://wen-yuan-zhang.github.io/VolumeRenderingPriors/

  15. arXiv:2407.11789  [pdf, other]

    cs.CL cs.AI cs.CY

    Large Language Models as Misleading Assistants in Conversation

    Authors: Betty Li Hou, Kejian Shi, Jason Phang, James Aung, Steven Adler, Rosie Campbell

    Abstract: Large Language Models (LLMs) are able to provide assistance on a wide range of information-seeking tasks. However, model outputs may be misleading, whether unintentionally or in cases of intentional deception. We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. We compare outcomes of (1) w…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Next Generation of AI Safety Workshop, 41st International Conference on Machine Learning (ICML 2024)

  16. arXiv:2406.19545  [pdf, other]

    cs.CL cs.AI

    Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations

    Authors: Ritam Dutt, Zhen Wu, Kelly Shi, Divyanshu Sheth, Prakhar Gupta, Carolyn Penstein Rose

    Abstract: We present a generalizable classification approach that leverages Large Language Models (LLMs) to facilitate the detection of implicitly encoded social meaning in conversations. We design a multi-faceted prompt to extract a textual explanation of the reasoning that connects visible cues to underlying social meanings. These extracted explanations or rationales serve as augmentations to the conversa…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: To appear at The Proceedings of the Association for Computational Linguistics, 2024

  17. arXiv:2406.16263  [pdf, ps, other]

    eess.SY math.OC

    Discrete-time Integral Resonant Control of Negative Imaginary Systems: Application to a High-speed Nanopositioner

    Authors: Kanghong Shi, Erfan Khodabakhshi, Prosanto Biswas, Ian R. Petersen, S. O. Reza Moheimani

    Abstract: We propose a discrete-time integral resonant control (IRC) approach for negative imaginary (NI) systems, which overcomes several limitations of continuous-time IRC. We show that a discrete-time IRC has a step-advanced negative imaginary property. A zero-order hold-sampled NI system can be asymptotically stabilized using a discrete-time IRC with suitable parameters. A hardware experiment is conduct…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 10 pages, 10 figures

  18. arXiv:2406.13179  [pdf, other]

    cs.SD cs.AI cs.NE eess.AS

    Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting

    Authors: Shuai Wang, Dehao Zhang, Kexin Shi, Yuchen Wang, Wenjie Wei, Jibin Wu, Malu Zhang

    Abstract: Thanks to Deep Neural Networks (DNNs), the accuracy of Keyword Spotting (KWS) has made substantial progress. However, as KWS systems are usually implemented on edge devices, energy efficiency becomes a critical requirement besides performance. Here, we take advantage of spiking neural networks' energy efficiency and propose an end-to-end lightweight KWS model. The model consists of two innovative…

    Submitted 18 June, 2024; originally announced June 2024.

  19. arXiv:2406.07835  [pdf, other]

    cs.CL cs.AI

    SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    Authors: David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan

    Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…

    Submitted 19 August, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS Datasets and Benchmarks 2024

  20. arXiv:2406.01643  [pdf, other]

    eess.SY

    Unified Control of Voltage, Frequency and Angle in Electrical Power Systems: A Passivity and Negative-Imaginary based Approach

    Authors: Yijun Chen, Kanghong Shi, Ian R. Petersen, Elizabeth L. Ratnam

    Abstract: This paper proposes a unified methodology for voltage regulation, frequency synchronization, and rotor angle control in power transmission systems considering a one-axis generator model with time-varying voltages. First, we formulate an output consensus problem with a passivity and negative-imaginary (NI) based control framework. We establish output consensus results for both networked passive sys…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 8 pages, 7 figures, the 63rd IEEE Conference on Decision and Control. arXiv admin note: text overlap with arXiv:2406.01206

  21. arXiv:2406.01206  [pdf, other]

    eess.SY

    On the Stability of Networked Nonlinear Negative Imaginary Systems with Applications to Electrical Power Systems

    Authors: Yijun Chen, Kanghong Shi, Ian R. Petersen, Elizabeth L. Ratnam

    Abstract: In the transition to achieving net zero emissions, it has been suggested that a substantial expansion of electric power grids will be necessary to support emerging renewable energy zones. In this paper, we propose employing battery-based feedback control and nonlinear negative imaginary (NI) systems theory to reduce the need for such expansion. By formulating a novel Luré-Postnikov-like Lyapunov f…

    Submitted 11 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 8 pages, 2 figures, 26th International Symposium on Mathematical Theory of Networks and Systems

  22. arXiv:2405.20215  [pdf, other]

    cs.CL

    TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models

    Authors: Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li

    Abstract: Mainstream approaches to aligning large language models (LLMs) heavily rely on human preference data, particularly when models require periodic updates. The standard process for iterative alignment of LLMs involves collecting new human feedback for each update. However, the data collection process is costly and challenging to scale. To address this issue, we introduce the "TS-Align" framework, whi…

    Submitted 29 September, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: EMNLP-2024 Findings

  23. arXiv:2405.19299  [pdf, other]

    cs.CL

    Expert-Guided Extinction of Toxic Tokens for Debiased Generation

    Authors: Xueyao Sun, Kaize Shi, Haoran Tang, Guandong Xu, Qing Li

    Abstract: Large language models (LLMs) can elicit social bias during generations, especially when inference with toxic prompts. Controlling the sensitive attributes in generation encounters challenges in data distribution, generalizability, and efficiency. Specifically, fine-tuning and retrieval demand extensive unbiased corpus, while direct prompting requires meticulously curated instructions for correctin…

    Submitted 29 May, 2024; originally announced May 2024.

  24. arXiv:2405.18067  [pdf, ps, other]

    math.SG

    Ekeland-Hofer-Zehnder capacities of lagrangian products with special forms

    Authors: Kun Shi

    Abstract: In this paper, we give some estimations for Ekeland-Hofer-Zehnder capacities of lagrangian products with special forms through combinatorial formulas. Based on these estimations, we give some interesting corollaries.

    Submitted 5 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 6 pages

    MSC Class: 53D05; 53C23 (primary); 70H05; 57R17 (secondary)

  25. arXiv:2405.17659  [pdf, other]

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh…

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  26. arXiv:2405.12996  [pdf, ps, other]

    eess.IV

    Dose-aware Diffusion Model for 3D PET Image Denoising: Multi-institutional Validation with Reader Study and Real Low-dose Data

    Authors: Huidong Xie, Weijie Gan, Reimund Bayerlein, Bo Zhou, Ming-Kai Chen, Michal Kulon, Annemarie Boustani, Kuan-Yin Ko, Der-Shiun Wang, Benjamin A. Spencer, Wei Ji, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, Yinchi Zhou, Hui Liu, Liang Guo, Hongyu An, Ulugbek S. Kamilov, Hanzhong Wang, Biao Li, Axel Rominger, Kuangyu Shi, Ge Wang, et al. (2 additional authors not shown)

    Abstract: Reducing scan times, radiation dose, and enhancing image quality for lower-performance scanners, are critical in low-dose PET imaging. Deep learning techniques have been investigated for PET image denoising. However, existing models have often resulted in compromised image quality when achieving low-count/low-dose PET and have limited generalizability to different image noise-levels, acquisition p…

    Submitted 16 June, 2025; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 18 Pages, 16 Figures, 5 Tables. Paper under review. First-place Freek J. Beekman Young Investigator Award at SNMMI 2024. Code available after paper publication. arXiv admin note: substantial text overlap with arXiv:2311.04248

  27. arXiv:2405.03085  [pdf, other]

    cs.CL

    Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation

    Authors: Kaize Shi, Xueyao Sun, Qing Li, Guandong Xu

    Abstract: Large Language Models (LLMs) have made significant strides in information acquisition. However, their overreliance on potentially flawed parametric knowledge leads to hallucinations and inaccuracies, particularly when handling long-tail, domain-specific queries. Retrieval Augmented Generation (RAG) addresses this limitation by incorporating external, non-parametric knowledge. Nevertheless, the ret…

    Submitted 5 May, 2024; originally announced May 2024.

  28. arXiv:2405.01109  [pdf, other]

    math.NA cs.LG math.AP

    Hypergraph $p$-Laplacian regularization on point clouds for data interpolation

    Authors: Kehan Shi, Martin Burger

    Abstract: As a generalization of graphs, hypergraphs are widely used to model higher-order relations in data. This paper explores the benefit of the hypergraph structure for the interpolation of point cloud data that contain no explicit structural information. We define the $\varepsilon_n$-ball hypergraph and the $k_n$-nearest neighbor hypergraph on a point cloud and study the $p$-Laplacian regularization o…

    Submitted 17 March, 2025; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 34 pages

    MSC Class: 49J55; 35J20; 65N12

  29. arXiv:2404.19689  [pdf, ps, other]

    math.AP cs.LG math.NA

    Continuum limit of $p$-biharmonic equations on graphs

    Authors: Kehan Shi, Martin Burger

    Abstract: This paper studies the $p$-biharmonic equation on graphs, which arises in point cloud processing and can be interpreted as a natural extension of the graph $p$-Laplacian from the perspective of hypergraph. The asymptotic behavior of the solution is investigated when the random geometric graph is considered and the number of data points goes to infinity. We show that the continuum limit is an appro…

    Submitted 25 April, 2025; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 21 pages

    MSC Class: 35R02; 35J30; 65N12

  30. arXiv:2404.17994  [pdf]

    eess.IV

    LeqMod: Adaptable Lesion-Quantification-Consistent Modulation for Deep Learning Low-Count PET Image Denoising

    Authors: Menghua Xia, Huidong Xie, Qiong Liu, Bo Zhou, Hanzhong Wang, Biao Li, Axel Rominger, Quanzheng Li, Ramsey D. Badawi, Kuangyu Shi, Georges El Fakhri, Chi Liu

    Abstract: Deep learning-based positron emission tomography (PET) image denoising offers the potential to reduce radiation exposure and scanning time by transforming low-count images into high-count equivalents. However, existing methods typically blur crucial details, leading to inaccurate lesion quantification. This paper proposes a lesion-perceived and quantification-consistent modulation (LeqMod) strateg…

    Submitted 4 March, 2025; v1 submitted 27 April, 2024; originally announced April 2024.

  31. arXiv:2404.14662  [pdf, other]

    cs.LG cs.CL cs.PL cs.SE

    NExT: Teaching Large Language Models to Reason about Code Execution

    Authors: Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin

    Abstract: A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (aka. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, thus may lack a semantic understanding of h…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 35 pages

  32. arXiv:2404.06851  [pdf, other]

    cs.CV

    UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

    Authors: Junsheng Zhou, Weiqi Zhang, Baorui Ma, Kanle Shi, Yu-Shen Liu, Zhizhong Han

    Abstract: Diffusion models have shown remarkable results for image generation, editing and inpainting. Recent works explore diffusion models for 3D shape generation with neural implicit functions, i.e., signed distance function and occupancy function. However, they are limited to shapes with closed surfaces, which prevents them from generating diverse 3D real-world contents containing open surfaces. In this…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: To appear at CVPR2024. Project page: https://weiqi-zhang.github.io/UDiFF

  33. arXiv:2404.05528  [pdf, other]

    physics.app-ph

    NAND-like SOT-MRAM-based Approximate Storage for Error-Tolerant Applications

    Authors: Min Wang, Zhengyi Hou, Chenyi Wang, Zhengjie Yan, Shixing Li, Ao Du, Wenlong Cai, Jinhao Li, Hongchao Zhang, Kaihua Cao, Kewen Shi, Bi Wang, Yuanfu Zhao, Qingyi Xiang, Zhaohao Wang, Weisheng Zhao

    Abstract: We demonstrate approximate storage based on NAND-like spin-orbit torque (SOT) MRAM, through "device-modeling-architecture" explorations. We experimentally achieve down to 1E-5 level selectivity. Selectivity and low-power solutions are established by numerical calculation workflow. System-level power consumption is evaluated in the 512 KB last-level cache according to 5 quality levels. Error-tolera…

    Submitted 8 April, 2024; originally announced April 2024.

  34. arXiv:2403.19276  [pdf, ps, other]

    cs.IR

    Enhanced Bayesian Personalized Ranking for Robust Hard Negative Sampling in Recommender Systems

    Authors: Kexin Shi, Jing Zhang, Linjiajie Fang, Wenjia Wang, Bingyi Jing

    Abstract: In implicit collaborative filtering, hard negative mining techniques are developed to accelerate and enhance the recommendation model learning. However, the inadvertent selection of false negatives remains a major concern in hard negative sampling, as these false negatives can provide incorrect information and mislead the model learning. To date, only a small number of studies have been committed…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 9 pages

  35. arXiv:2403.16046  [pdf, ps, other]

    eess.SY math.OC

    Digital control of negative imaginary systems: a discrete-time hybrid integrator-gain system approach

    Authors: Kanghong Shi, Ian R. Petersen

    Abstract: A hybrid integrator-gain system (HIGS) is a control element that switches between an integrator and a gain, which overcomes some inherent limitations of linear controllers. In this paper, we consider using discrete-time HIGS controllers for the digital control of negative imaginary (NI) systems. We show that the discrete-time HIGS themselves are step-advanced negative imaginary systems. For a mini…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: To appear in the 2024 European Control Conference. 7 pages, 3 figures

  36. arXiv:2403.15140  [pdf, ps, other]

    eess.SY math.OC

    Hybrid integrator-gain system based integral resonant controllers for negative imaginary systems

    Authors: Kanghong Shi, Ian R. Petersen

    Abstract: We introduce a hybrid control system called a hybrid integrator-gain system (HIGS) based integral resonant controller (IRC) to stabilize negative imaginary (NI) systems. A HIGS-based IRC has a similar structure to an IRC, with the integrator replaced by a HIGS. We show that a HIGS-based IRC is an NI system. Also, for a SISO NI system with a minimal realization, we show there exists a HIGS-based IR…

    Submitted 9 September, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 9 pages, 9 figures. The 63rd IEEE Conference on Decision and Control (CDC 2024)

  37. arXiv:2403.05769  [pdf]

    physics.optics cond-mat.mes-hall

    High-rectification near-field radiative thermal diode using Weyl semimetals

    Authors: Yang Hu, Haotuo Liu, Bing Yang, Kezhang Shi, Mauro Antezza, Xiaohu Wu, Yasong Sun

    Abstract: Thermal diodes, which allow heat transfer in a preferential direction while being blocked in a reverse direction, have numerous applications in thermal management, information processing, energy harvesting, etc. Typical materials of thermal diodes in previous works include phase-change and magneto-optical materials. However, such thermal diodes highly depend on specific working temperatures or ext…

    Submitted 8 March, 2024; originally announced March 2024.

    Journal ref: Phys. Rev. Materials 7, 035201 (2023)

  38. arXiv:2403.03346  [pdf, other]

    cs.CV

    Enhancing Vision-Language Pre-training with Rich Supervisions

    Authors: Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

    Abstract: We propose Strongly Supervised pre-training with ScreenShots (S4) - a novel pre-training paradigm for Vision-Language Models using data from large-scale web screenshot rendering. Using web screenshots unlocks a treasure trove of visual and textual cues that are not present in using image-text pairs. In S4, we leverage the inherent tree-structured hierarchy of HTML elements and the spatial localiza…

    Submitted 12 March, 2025; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  39. arXiv:2403.02249  [pdf, other]

    cs.CV cs.AI

    Non-autoregressive Sequence-to-Sequence Vision-Language Models

    Authors: Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

    Abstract: Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions. We propose a parallel decoding sequence-to-sequence vision-language model, trained with a Query-CTC loss, that marginalizes over multiple inference paths in the decoder. This allows us to model the joint distributi…

    Submitted 12 March, 2025; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  40. arXiv:2402.15134  [pdf, other]

    cs.LG cs.AI

    Deep Coupling Network For Multivariate Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Hui He, Kaize Shi, Liang Hu, Ning An, Zhendong Niu

    Abstract: Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between…

    Submitted 23 February, 2024; originally announced February 2024.

  41. arXiv:2402.12692  [pdf, other]

    cs.CL

    FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning

    Authors: Xiao Li, Bolin Zhu, Kaiwen Shi, Sichen Liu, Yin Zhu, Yiwei Liu, Gong Cheng

    Abstract: The application of formulas (e.g., physics formulas) is a fundamental ability of humans when solving numerical reasoning problems. Existing numerical reasoning datasets seldom explicitly indicate the formulas employed in reasoning, as their questions rely on implicit commonsense mathematical knowledge. In contrast, in this paper, we introduce FormulaReasoning, a new dataset specifically designed f…

    Submitted 18 May, 2025; v1 submitted 19 February, 2024; originally announced February 2024.

  42. arXiv:2402.11558  [pdf, other]

    cs.LG

    A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation

    Authors: Yakun Chen, Kaize Shi, Zhangkai Wu, Juan Chen, Xianzhi Wang, Julian McAuley, Guandong Xu, Shui Yu

    Abstract: Spatiotemporal data analysis is pivotal across various domains, such as transportation, meteorology, and healthcare. The data collected in real-world scenarios are often incomplete due to device malfunctions and network errors. Spatiotemporal imputation aims to predict missing values by exploiting the spatial and temporal dependencies in the observed data. Traditional imputation approaches based o…

    Submitted 22 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  43. arXiv:2402.08073  [pdf, other]

    cs.LG cs.PL cs.SE

    Grounding Data Science Code Generation with Input-Output Specifications

    Authors: Yeming Wen, Pengcheng Yin, Kensen Shi, Henryk Michalewski, Swarat Chaudhuri, Alex Polozov

    Abstract: Large language models (LLMs) have recently demonstrated a remarkable ability to generate code from natural language (NL) prompts. However, in the real world, NL is often too ambiguous to capture the true intent behind programming problems, requiring additional input-output (I/O) specifications. Unfortunately, LLMs can have difficulty aligning their outputs with both the NL prompt and the I/O speci…

    Submitted 14 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  44. arXiv:2401.12606  [pdf]

    cond-mat.soft

    The Young-Laplace equation for a solid-liquid interface

    Authors: P. Montero de Hijes, K. Shi, E. G. Noya, E. E. Santiso, K. E. Gubbins, E. Sanz, C. Vega

    Abstract: The application of the Young-Laplace equation to a solid-liquid interface is considered. Computer simulations show that the pressure inside a solid cluster of hard spheres is smaller than the external pressure of the liquid (both for small and large clusters). That would suggest a negative value for the interfacial free energy. We show that in a Gibbsian description of the thermodynamics of a curv…

    Submitted 23 January, 2024; originally announced January 2024.

    Journal ref: J. Chem. Phys. 153, 191102 (2020)

  45. arXiv:2401.08224  [pdf, other]

    stat.ME cs.CR cs.LG

    Privacy Preserving Adaptive Experiment Design

    Authors: Jiachun Li, Kaining Shi, David Simchi-Levi

    Abstract: Adaptive experiment is widely adopted to estimate conditional average treatment effect (CATE) in clinical trials and many other scenarios. While the primary goal in experiment is to maximize estimation accuracy, due to the imperative of social welfare, it's also crucial to provide treatment with superior outcomes to patients, which is measured by regret in contextual bandit framework. These two ob…

    Submitted 5 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Add a table

  46. arXiv:2401.07402  [pdf, other]

    cs.CV

    Improved Implicit Neural Representation with Fourier Reparameterized Training

    Authors: Kexuan Shi, Xingyu Zhou, Shuhang Gu

    Abstract: Implicit Neural Representation (INR) as a mighty representation paradigm has achieved success in various computer vision tasks recently. Due to the low-frequency bias issue of vanilla multi-layer perceptron (MLP), existing methods have investigated advanced techniques, such as positional encoding and periodic activation function, to improve the accuracy of INR. In this paper, we connect the networ…

    Submitted 4 July, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Comments: CVPR 2024

  47. arXiv:2401.06827  [pdf, other]

    cs.CV cs.AI cs.CL

    APLe: Token-Wise Adaptive for Multi-Modal Prompt Learning

    Authors: Guiming Cao, Kaize Shi, Hong Fu, Huaiwen Zhang, Guandong Xu

    Abstract: Pre-trained Vision-Language (V-L) models set the benchmark for generalization to downstream tasks among the noteworthy contenders. Many characteristics of the V-L model have been explored in existing research including the challenge of the sensitivity to text input and the tuning process across multi-modal prompts. With the advanced utilization of the V-L model like CLIP, recent approaches deploy…

    Submitted 23 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 7 pages, 3 figures

  48. arXiv:2401.01250  [pdf, other]

    cond-mat.quant-gas quant-ph

    Floquet topological phases with large winding number

    Authors: Kaiye Shi, Xiang Zhang, Wei Zhang

    Abstract: Recently, anomalous Floquet topological phases without static counterparts have been observed in different systems, where periodically driven models are realized to support a winding number of 1 and a pair of edge modes in each quasienergy gap. Here, we focus on cold atomic gases in optical lattices and propose a novel driving scheme that breaks rotation symmetry but maintains inversion symmetry o…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 7 pages, 5 figures

    Journal ref: Phys. Rev. A 109, 013324 (2024)

  49. arXiv:2401.00999  [pdf, ps, other]

    cond-mat.supr-con

    Possible Meissner effect near room temperature in copper-substituted lead apatite

    Authors: Hongyang Wang, Yao Yao, Ke Shi, Yijing Zhao, Hao Wu, Zhixing Wu, Zhihui Geng, Shufeng Ye, Ning Chen

    Abstract: With copper-substituted lead apatite below room temperature, we observe diamagnetic dc magnetization under magnetic field of 25 Oe with remarkable bifurcation between zero-field-cooling and field-cooling measurements, and under 200 Oe it changes to be paramagnetism. A glassy memory effect is found during cooling. Typical hysteresis loops for superconductors are detected below 250 K, along with an…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures

  50. arXiv:2401.00646  [pdf, ps, other]

    cond-mat.str-el cond-mat.mtrl-sci

    High magnetic field phase diagram and weak FM breaking in (Ni0.93Co0.07)3V2O8

    Authors: Jiating Wu, Minjie Zhang, Ke Shi, Huxin Yin, Yuyan Han, Lansheng Ling, Wei Tong, Chuanying Xi, Li Pi, Zhaosheng Wang

    Abstract: We present magnetostriction and thermal expansion measurements on multiferroic (Ni0.93Co0.07)3V2O8. The high field phase diagrams up to 33 T along the a, b and c directions are built. For H//a, as the magnetic field increases, two intermediate phases appear between the incommensurate phase and the paramagnetic phase at about 7 K, and then a magnetically induced phase appears above the paramagnetic…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 7 pages, 4 figures

    Journal ref: Phys. Rev. B 108, 214108 (2023)