Showing 1–25 of 25 results for author: Qiao, W

  1. arXiv:2505.19874  [pdf, ps, other]

    cs.CV cs.AI cs.MM

    StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation

    Authors: Yi Wu, Lingting Zhu, Shengju Qian, Lei Liu, Wandi Qiao, Lequan Yu, Bin Li

    Abstract: In the current research landscape, multimodal autoregressive (AR) models have shown exceptional capabilities across various domains, including visual understanding and generation. However, complex tasks such as style-aligned text-to-image generation present significant challenges, particularly in data acquisition. In analogy to instruction-following tuning for image editing of AR models, style-ali…

    Submitted 26 May, 2025; originally announced May 2025.

  2. arXiv:2503.10125  [pdf, other]

    cs.CV cs.MM

    Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation

    Authors: Yi Wu, Lingting Zhu, Lei Liu, Wandi Qiao, Ziqiang Li, Lequan Yu, Bin Li

    Abstract: Multimodal autoregressive (AR) models, based on next-token prediction and transformer architecture, have demonstrated remarkable capabilities in various multimodal tasks including text-to-image (T2I) generation. Despite their strong performance in general T2I tasks, our research reveals that these models initially struggle with subject-driven image generation compared to dominant diffusion models.…

    Submitted 13 March, 2025; originally announced March 2025.

  3. arXiv:2503.03644  [pdf, other]

    cs.CV

    DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms

    Authors: Xiaojun Bi, Shuo Li, Junyao Xing, Ziyue Wang, Fuwen Luo, Weizheng Qiao, Lu Han, Ziwei Sun, Peng Li, Yang Liu

    Abstract: Dongba pictographs are the only pictographic script still in use in the world. Its pictorial ideographic features carry rich cultural and contextual information. However, due to the lack of relevant datasets, research on semantic understanding of Dongba hieroglyphs has progressed slowly. To this end, we constructed DongbaMIE - the first dataset focusing on multimodal information extractio…

    Submitted 22 May, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

    Comments: Our dataset can be obtained from: https://github.com/thinklis/DongbaMIE

  4. arXiv:2502.06521  [pdf, other]

    cs.CR

    Sentient: Multi-Scenario Behavioral Intent Analysis for Advanced Persistent Threat Detection

    Authors: Wenhao Yan, Ning An, Wei Qiao, Weiheng Wu, Bo Jiang, Yuling Liu, Zhigang Lu, Junrong Liu

    Abstract: Advanced Persistent Threats (APTs) are challenging to detect due to their complexity and stealth. To mitigate such attacks, many approaches utilize provenance graphs to model entities and their dependencies, detecting the covert and persistent nature of APTs. However, existing methods face several challenges: 1) Environmental noise hinders precise detection; 2) Reliance on hard-to-obtain labeled d…

    Submitted 10 February, 2025; originally announced February 2025.

  5. arXiv:2412.16215  [pdf, other]

    cs.CV cs.AI cs.IR

    Zero-Shot Image Moderation in Google Ads with LLM-Assisted Textual Descriptions and Cross-modal Co-embeddings

    Authors: Enming Luo, Wei Qiao, Katie Warren, Jingxiang Li, Eric Xiao, Krishna Viswanathan, Yuan Wang, Yintao Liu, Jimin Li, Ariel Fuxman

    Abstract: We present a scalable and agile approach for ads image content moderation at Google, addressing the challenges of moderating massive volumes of ads with diverse content and evolving policies. The proposed method utilizes human-curated textual descriptions and cross-modal text-image co-embeddings to enable zero-shot classification of policy violating ads images, bypassing the need for extensive sup…

    Submitted 17 December, 2024; originally announced December 2024.

  6. arXiv:2412.12492  [pdf, other]

    cs.CV

    DuSSS: Dual Semantic Similarity-Supervised Vision-Language Model for Semi-Supervised Medical Image Segmentation

    Authors: Qingtao Pan, Wenhao Qiao, Jingjiao Lou, Bing Ji, Shuo Li

    Abstract: Semi-supervised medical image segmentation (SSMIS) uses consistency learning to regularize model training, which alleviates the burden of pixel-wise manual annotations. However, it often suffers from error supervision from low-quality pseudo labels. Vision-Language Model (VLM) has great potential to enhance pseudo labels by introducing text prompt guided multimodal supervision information. It neve…

    Submitted 16 December, 2024; originally announced December 2024.

  7. arXiv:2411.18794  [pdf, other]

    stat.ML cs.LG

    Graph Max Shift: A Hill-Climbing Method for Graph Clustering

    Authors: Ery Arias-Castro, Elizabeth Coda, Wanli Qiao

    Abstract: We present a method for graph clustering that is analogous to gradient ascent methods previously proposed for clustering points in space. We show that, when applied to a random geometric graph with data iid from some density with Morse regularity, the method is asymptotically consistent. Here, consistency is understood with respect to a density-level clustering defined by the partition of the su…

    Submitted 27 November, 2024; originally announced November 2024.

  8. arXiv:2411.02775  [pdf, other]

    cs.CR

    Winemaking: Extracting Essential Insights for Efficient Threat Detection in Audit Logs

    Authors: Weiheng Wu, Wei Qiao, Wenhao Yan, Bo Jiang, Yuling Liu, Baoxu Liu, Zhigang Lu, JunRong Liu

    Abstract: Advanced Persistent Threats (APTs) are continuously evolving, leveraging their stealthiness and persistence to put increasing pressure on current provenance-based Intrusion Detection Systems (IDS). This evolution exposes several critical issues: (1) The dense interaction between malicious and benign nodes within provenance graphs introduces neighbor noise, hindering effective detection; (2) The co…

    Submitted 21 November, 2024; v1 submitted 4 November, 2024; originally announced November 2024.

    Comments: 8 pages body, 11 pages total (without authors)

  9. arXiv:2410.17910  [pdf, other]

    cs.CR

    Slot: Provenance-Driven APT Detection through Graph Reinforcement Learning

    Authors: Wei Qiao, Yebo Feng, Teng Li, Zhuo Ma, Yulong Shen, JianFeng Ma, Yang Liu

    Abstract: Advanced Persistent Threats (APTs) represent sophisticated cyberattacks characterized by their ability to remain undetected within the victim system for extended periods, aiming to exfiltrate sensitive data or disrupt operations. Existing detection approaches often struggle to effectively identify these complex threats, construct the attack chain for defense facilitation, or resist adversarial att…

    Submitted 11 January, 2025; v1 submitted 23 October, 2024; originally announced October 2024.

  10. arXiv:2409.15343  [pdf, other]

    cs.IR

    Advertiser Content Understanding via LLMs for Google Ads Safety

    Authors: Joseph Wallace, Tushar Dogra, Wei Qiao, Yuan Wang

    Abstract: Ads Content Safety at Google requires classifying billions of ads for Google Ads content policies. Consistent and accurate policy enforcement is important for advertiser experience and user safety and it is a challenging problem, so there is a lot of value for improving it for advertisers and users. Inconsistent policy enforcement causes increased policy friction and poor experience with good adve…

    Submitted 9 September, 2024; originally announced September 2024.

  11. arXiv:2406.03873  [pdf, other]

    cs.LG cs.AI cs.CV

    Quantum Implicit Neural Representations

    Authors: Jiaming Zhao, Wenbo Qiao, Peng Zhang, Hui Gao

    Abstract: Implicit neural representations have emerged as a powerful paradigm to represent signals such as images and sounds. This approach aims to utilize neural networks to parameterize the implicit function of the signal. However, when representing implicit functions, traditional neural networks such as ReLU-based multilayer perceptrons face challenges in accurately modeling high-frequency components of…

    Submitted 1 September, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: This paper was accepted by ICML 2024

  12. arXiv:2405.17976  [pdf]

    cs.AI cs.CL

    Yuan 2.0-M32: Mixture of Experts with Attention Router

    Authors: Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao, Houbo He, Zeru Zhang, Zeyu Sun, Junxiong Mao, Chong Shen

    Abstract: Yuan 2.0-M32, with a similar base architecture as Yuan-2.0 2B, uses a mixture-of-experts architecture with 32 experts of which 2 experts are active. A new router network, Attention Router, is proposed and adopted for a more efficient selection of experts, which improves the accuracy compared to the model with classical router network. Yuan 2.0-M32 is trained with 2000B tokens from scratch, and the…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages, 3 figures, 7 tables

  13. arXiv:2402.14590  [pdf, other]

    cs.IR cs.CL cs.LG

    Scaling Up LLM Reviews for Google Ads Content Moderation

    Authors: Wei Qiao, Tushar Dogra, Otilia Stretcu, Yu-Han Lyu, Tiantian Fang, Dongjin Kwon, Chun-Ta Lu, Enming Luo, Yuan Wang, Chih-Chun Chia, Ariel Fuxman, Fangzhou Wang, Ranjay Krishna, Mehmet Tek

    Abstract: Large language models (LLMs) are powerful tools for content moderation, but their inference costs and latency make them prohibitive for casual use on large datasets, such as the Google Ads repository. This study proposes a method for scaling up LLM reviews for content moderation in Google Ads. First, we use heuristics to select candidates via filtering and duplicate removal, and create clusters of…

    Submitted 7 February, 2024; originally announced February 2024.

  14. arXiv:2310.20380  [pdf, other]

    cs.LG

    Dropout Strategy in Reinforcement Learning: Limiting the Surrogate Objective Variance in Policy Optimization Methods

    Authors: Zhengpeng Xie, Changdong Yu, Weizheng Qiao

    Abstract: Policy-based reinforcement learning algorithms are widely used in various fields. Among them, mainstream policy optimization algorithms such as TRPO and PPO introduce importance sampling into policy iteration, which allows the reuse of historical data. However, this can also lead to a high variance of the surrogate objective and indirectly affects the stability and convergence of the algorithm. In…

    Submitted 3 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

  15. arXiv:2307.10004  [pdf]

    cs.AI

    6G Network Business Support System

    Authors: Ye Ouyang, Yaqin Zhang, Peng Wang, Yunxin Liu, Wen Qiao, Jun Zhu, Yang Liu, Feng Zhang, Shuling Wang, Xidong Wang

    Abstract: 6G is the next-generation intelligent and integrated digital information infrastructure, characterized by ubiquitous interconnection, native intelligence, multi-dimensional perception, global coverage, green and low-carbon, native network security, etc. 6G will realize the transition from serving people and people-things communication to supporting the efficient connection of intelligent agents, a…

    Submitted 19 July, 2023; originally announced July 2023.

  16. arXiv:2306.06934  [pdf]

    cs.CV

    Scale-Rotation-Equivariant Lie Group Convolution Neural Networks (Lie Group-CNNs)

    Authors: Wei-Dong Qiao, Yang Xu, Hui Li

    Abstract: The weight-sharing mechanism of convolutional kernels ensures translation-equivariance of convolution neural networks (CNNs). Recently, rotation-equivariance has been investigated. However, research on scale-equivariance or simultaneous scale-rotation-equivariance is insufficient. This study proposes a Lie group-CNN, which can keep scale-rotation-equivariance for image classification tasks. The Li…

    Submitted 12 June, 2023; originally announced June 2023.

  17. arXiv:2301.12993  [pdf, other]

    cs.CV cs.LG

    Benchmarking Robustness to Adversarial Image Obfuscations

    Authors: Florian Stimberg, Ayan Chakrabarti, Chun-Ta Lu, Hussein Hazimeh, Otilia Stretcu, Wei Qiao, Yintao Liu, Merve Kaya, Cyrus Rashtchian, Ariel Fuxman, Mehmet Tek, Sven Gowal

    Abstract: Automated content filtering and moderation is an important tool that allows online platforms to build thriving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violates platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy violating images…

    Submitted 29 November, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    ACM Class: I.2.10; I.4.0

  18. arXiv:2209.02951  [pdf, other]

    cs.AR cs.PL

    Democratizing Domain-Specific Computing

    Authors: Yuze Chi, Weikang Qiao, Atefeh Sohrabizadeh, Jie Wang, Jason Cong

    Abstract: In the past few years, domain-specific accelerators (DSAs), such as Google's Tensor Processing Units, have been shown to offer significant performance and energy efficiency over general-purpose CPUs. An important question is whether typical software developers can design and implement their own customized DSAs, with affordability and efficiency, to accelerate their applications. This article presents o…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: To be published in CACM'22

  19. arXiv:2209.02663  [pdf, other]

    cs.AR cs.DC cs.PF cs.PL

    TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design

    Authors: Licheng Guo, Yuze Chi, Jason Lau, Linghao Song, Xingyu Tian, Moazin Khatti, Weikang Qiao, Jie Wang, Ecenur Ustun, Zhenman Fang, Zhiru Zhang, Jason Cong

    Abstract: In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of convenient APIs that allow users to easily express flexible and complex inter-task communication structures. Second, TAPA adopts a coarse-grained floorplanning…

    Submitted 6 September, 2022; originally announced September 2022.

    Journal ref: ACM Transactions on Reconfigurable Technology and Systems (2023), Volume 16, Issue 4, Article No. 63, Pages 1-31

  20. arXiv:2208.14540  [pdf, ps, other]

    math.ST cs.LG math.MG

    Embedding Functional Data: Multidimensional Scaling and Manifold Learning

    Authors: Ery Arias-Castro, Wanli Qiao

    Abstract: We adapt concepts, methodology, and theory originally developed in the areas of multidimensional scaling and dimensionality reduction for multivariate data to the functional setting. We focus on classical scaling and Isomap -- prototypical methods that have played important roles in these areas -- and showcase their use in the context of functional data analysis. In the process, we highlight the cr…

    Submitted 30 August, 2022; originally announced August 2022.

  21. arXiv:2205.07991  [pdf, other]

    cs.AR cs.DC

    TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-based FPGAs

    Authors: Weikang Qiao, Licheng Guo, Zhenman Fang, Mau-Chung Frank Chang, Jason Cong

    Abstract: The emergence of high-bandwidth memory (HBM) brings new opportunities to boost the performance of sorting acceleration on FPGAs, which was conventionally bounded by the available off-chip memory bandwidth. However, it is nontrivial for designers to fully utilize this immense bandwidth. First, the existing sorter designs cannot be directly scaled at the increasing rate of available off-chip bandwid…

    Submitted 16 May, 2022; originally announced May 2022.

  22. arXiv:2111.10298  [pdf, other]

    math.ST cs.LG

    An Asymptotic Equivalence between the Mean-Shift Algorithm and the Cluster Tree

    Authors: Ery Arias-Castro, Wanli Qiao

    Abstract: Two important nonparametric approaches to clustering emerged in the 1970's: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hostetler. In a recent paper, we argue the thesis that these two approaches are fundamentally the same by showing that the gradient flow provides a way to move along the cluster tre…

    Submitted 19 November, 2021; originally announced November 2021.

  23. arXiv:2104.12314  [pdf, other]

    stat.ML cs.LG math.ST

    Algorithms for ridge estimation with convergence guarantees

    Authors: Wanli Qiao, Wolfgang Polonik

    Abstract: The extraction of filamentary structure from a point cloud is discussed. The filaments are modeled as ridge lines or higher dimensional ridges of an underlying density. We propose two novel algorithms, and provide theoretical guarantees for their convergences, by which we mean that the algorithms can asymptotically recover the full ridge set. We consider the new algorithms as alternatives to the S…

    Submitted 31 December, 2024; v1 submitted 25 April, 2021; originally announced April 2021.

    Comments: 50 pages, 11 figures

    MSC Class: 62G05

  24. arXiv:2104.10103  [pdf, other]

    stat.ML cs.LG stat.CO

    Space Partitioning and Regression Mode Seeking via a Mean-Shift-Inspired Algorithm

    Authors: Wanli Qiao, Amarda Shehu

    Abstract: The mean shift (MS) algorithm is a nonparametric method used to cluster sample points and find the local modes of kernel density estimates, using an idea based on iterative gradient ascent. In this paper we develop a mean-shift-inspired algorithm to estimate the modes of regression functions and partition the sample points in the input space. We prove convergence of the sequences generated by the…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 44 pages, 4 figures

    MSC Class: 62G08

  25. arXiv:1404.7719  [pdf, other]

    cs.AI

    An argumentation system for reasoning with conflict-minimal paraconsistent ALC

    Authors: Wenzhao Qiao, Nico Roos

    Abstract: The semantic web is an open and distributed environment in which it is hard to guarantee consistency of knowledge and information. Under the standard two-valued semantics everything is entailed if knowledge and information is inconsistent. The semantics of the paraconsistent logic LP offers a solution. However, if the available knowledge and information is consistent, the set of conclusions entail…

    Submitted 30 April, 2014; originally announced April 2014.

    Journal ref: Proceedings of the 15th International Workshop on Non-Monotonic Reasoning (NMR 2014)