
Showing 1–42 of 42 results for author: Diao, Y

Searching in archive cs.
  1. arXiv:2505.08614

    cs.CV

    WaveGuard: Robust Deepfake Detection and Source Tracing via Dual-Tree Complex Wavelet and Graph Neural Networks

    Authors: Ziyuan He, Zhiqing Guo, Liejun Wang, Gaobo Yang, Yunfeng Diao, Dan Ma

    Abstract: Deepfake technology poses increasing risks such as privacy invasion and identity theft. To address these threats, we propose WaveGuard, a proactive watermarking framework that enhances robustness and imperceptibility via frequency-domain embedding and graph-based structural consistency. Specifically, we embed watermarks into high-frequency sub-bands using Dual-Tree Complex Wavelet Transform (DT-CW…

    Submitted 13 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: 11 pages, 5 figures, 4 tables

  2. arXiv:2504.11259

    cs.DB

    The Cambridge Report on Database Research

    Authors: Anastasia Ailamaki, Samuel Madden, Daniel Abadi, Gustavo Alonso, Sihem Amer-Yahia, Magdalena Balazinska, Philip A. Bernstein, Peter Boncz, Michael Cafarella, Surajit Chaudhuri, Susan Davidson, David DeWitt, Yanlei Diao, Xin Luna Dong, Michael Franklin, Juliana Freire, Johannes Gehrke, Alon Halevy, Joseph M. Hellerstein, Mark D. Hill, Stratos Idreos, Yannis Ioannidis, Christoph Koch, Donald Kossmann, Tim Kraska, et al. (21 additional authors not shown)

    Abstract: On October 19 and 20, 2023, the authors of this report convened in Cambridge, MA, to discuss the state of the database research field, its recent accomplishments and ongoing challenges, and future directions for research and community engagement. This gathering continues a long standing tradition in the database community, dating back to the late 1980s, in which researchers meet roughly every five…

    Submitted 15 April, 2025; originally announced April 2025.

  3. arXiv:2504.04818

    cs.CV

    SUEDE:Shared Unified Experts for Physical-Digital Face Attack Detection Enhancement

    Authors: Zuying Xie, Changtao Miao, Ajian Liu, Jiabao Guo, Feng Li, Dan Guo, Yunfeng Diao

    Abstract: Face recognition systems are vulnerable to physical attacks (e.g., printed photos) and digital threats (e.g., DeepFake), which are currently being studied as independent visual tasks, such as Face Anti-Spoofing and Forgery Detection. The inherent differences among various attack types present significant challenges in identifying a common feature space, making it difficult to develop a unified fra…

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: Accepted in ICME 2025

  4. arXiv:2504.04470

    cs.CV

    Domain Generalization for Face Anti-spoofing via Content-aware Composite Prompt Engineering

    Authors: Jiabao Guo, Ajian Liu, Yunfeng Diao, Jin Zhang, Hui Ma, Bo Zhao, Richang Hong, Meng Wang

    Abstract: The challenge of Domain Generalization (DG) in Face Anti-Spoofing (FAS) is the significant interference of domain-specific signals on subtle spoofing clues. Recently, some CLIP-based algorithms have been developed to alleviate this interference by adjusting the weights of visual classifiers. However, our analysis of this class-wise prompt engineering suffers from two shortcomings for DG FAS: (1) T…

    Submitted 6 April, 2025; originally announced April 2025.

  5. arXiv:2503.23060

    cs.LG

    Unsupervised Anomaly Detection in Multivariate Time Series across Heterogeneous Domains

    Authors: Vincent Jacob, Yanlei Diao

    Abstract: The widespread adoption of digital services, along with the scale and complexity at which they operate, has made incidents in IT operations increasingly more likely, diverse, and impactful. This has led to the rapid development of a central aspect of "Artificial Intelligence for IT Operations" (AIOps), focusing on detecting anomalies in vast amounts of multivariate time series data generated by se…

    Submitted 29 March, 2025; originally announced March 2025.

  6. arXiv:2503.08661

    cs.IT cs.CV eess.IV

    Task-Oriented Co-Design of Communication, Computing, and Control for Edge-Enabled Industrial Cyber-Physical Systems

    Authors: Yufeng Diao, Yichi Zhang, Daniele De Martini, Philip Guodong Zhao, Emma Liying Li

    Abstract: This paper proposes a task-oriented co-design framework that integrates communication, computing, and control to address the key challenges of bandwidth limitations, noise interference, and latency in mission-critical industrial Cyber-Physical Systems (CPS). To improve communication efficiency and robustness, we design a task-oriented Joint Source-Channel Coding (JSCC) using Information Bottleneck…

    Submitted 11 March, 2025; originally announced March 2025.

    Comments: This paper has been accepted for publication in IEEE Journal on Selected Areas in Communications (JSAC), with publication expected in 2025

  7. Leveraging Large Language Models For Optimized Item Categorization using UNSPSC Taxonomy

    Authors: Anmolika Singh, Yuhang Diao

    Abstract: Effective item categorization is vital for businesses, enabling the transformation of unstructured datasets into organized categories that streamline inventory management. Despite its importance, item categorization remains highly subjective and lacks a uniform standard across industries and businesses. The United Nations Standard Products and Services Code (UNSPSC) provides a standardized system…

    Submitted 27 December, 2024; originally announced March 2025.

    Comments: 10 Pages, International Conference on NLP, AI, Computer Science & Engineering (NLAICSE 2024), December 2024, ISBN : 978-1-923107-45-8

    Journal ref: International Journal on Cybernetics & Informatics. 13. (2024)

  8. arXiv:2502.15472

    cs.IT cs.CV eess.IV

    Aligning Task- and Reconstruction-Oriented Communications for Edge Intelligence

    Authors: Yufeng Diao, Yichi Zhang, Changyang She, Philip Guodong Zhao, Emma Liying Li

    Abstract: Existing communication systems aim to reconstruct the information at the receiver side, and are known as reconstruction-oriented communications. This approach often falls short in meeting the real-time, task-specific demands of modern AI-driven applications such as autonomous driving and semantic segmentation. As a new design principle, task-oriented communications have been developed. However, it…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: Accepted for publication in IEEE Journal on Selected Areas in Communications (JSAC)

  9. arXiv:2502.04377

    cs.CV cs.AI

    MapFusion: A Novel BEV Feature Fusion Network for Multi-modal Map Construction

    Authors: Xiaoshuai Hao, Yunfeng Diao, Mengchuan Wei, Yifan Yang, Peng Hao, Rong Yin, Hui Zhang, Weiming Li, Shu Zhao, Yu Liu

    Abstract: Map construction task plays a vital role in providing precise and comprehensive static environmental information essential for autonomous driving systems. Primary sensors include cameras and LiDAR, with configurations varying between camera-only, LiDAR-only, or camera-LiDAR fusion, based on cost-performance considerations. While fusion-based methods typically perform best, existing approaches ofte…

    Submitted 5 February, 2025; originally announced February 2025.

  10. arXiv:2412.16483

    cs.LG physics.chem-ph q-bio.BM

    MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

    Authors: Jingjing Hu, Dan Guo, Zhan Si, Deguang Liu, Yunfeng Diao, Jing Zhang, Jinxing Zhou, Meng Wang

    Abstract: Molecular representation learning plays a crucial role in various downstream tasks, such as molecular property prediction and drug design. To accurately represent molecules, Graph Neural Networks (GNNs) and Graph Transformers (GTs) have shown potential in the realm of self-supervised pretraining. However, existing approaches often overlook the relationship between molecular structure and electroni…

    Submitted 5 February, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

    Comments: Accepted by AAAI2025

  11. arXiv:2412.07229

    cs.LG cs.CV

    Moderating the Generalization of Score-based Generative Model

    Authors: Wan Jiang, He Wang, Xin Zhang, Dan Guo, Zhaoxin Fan, Yunfeng Diao, Richang Hong

    Abstract: Score-based Generative Models (SGMs) have demonstrated remarkable generalization abilities, e.g. generating unseen, but natural data. However, the greater the generalization power, the more likely the unintended generalization, and the more dangerous the abuse. Research on moderated generalization in SGMs remains limited. To fill this gap, we first examine the current 'gold standard' in Machine Un…

    Submitted 10 December, 2024; originally announced December 2024.

  12. arXiv:2410.17986

    cs.LG cs.AI cs.CR

    Federated Transformer: Multi-Party Vertical Federated Learning on Practical Fuzzily Linked Data

    Authors: Zhaomin Wu, Junyi Hou, Yiqun Diao, Bingsheng He

    Abstract: Federated Learning (FL) is an evolving paradigm that enables multiple parties to collaboratively train models without sharing raw data. Among its variants, Vertical Federated Learning (VFL) is particularly relevant in real-world, cross-organizational collaborations, where distinct features of a shared instance group are contributed by different parties. In these scenarios, parties are often linked…

    Submitted 23 October, 2024; originally announced October 2024.

    Journal ref: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

  13. arXiv:2410.02082

    cs.LG q-bio.QM

    FARM: Functional Group-Aware Representations for Small Molecules

    Authors: Thao Nguyen, Kuan-Hao Huang, Ge Liu, Martin D. Burke, Ying Diao, Heng Ji

    Abstract: We introduce Functional Group-Aware Representations for Small Molecules (FARM), a novel foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs. The key innovation of FARM lies in its functional group-aware tokenization, which directly incorporates functional group information into the representations. This strategic reduction in tokenization granularity…

    Submitted 6 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

    Comments: Preprint

  14. arXiv:2409.06712

    cs.CY

    A Meta-analysis of College Students' Intention to Use Generative Artificial Intelligence

    Authors: Yifei Diao, Ziyi Li, Jiateng Zhou, Wei Gao, Xin Gong

    Abstract: It is of critical importance to analyse the factors influencing college students' intention to use generative artificial intelligence (GenAI) to understand and predict learners' learning behaviours and academic outcomes. Nevertheless, a lack of congruity has been shown in extant research results. This study, therefore, conducted a meta-analysis of 27 empirical studies under an integrated theoretic…

    Submitted 25 August, 2024; originally announced September 2024.

  15. arXiv:2409.02483

    cs.CV cs.AI

    TASAR: Transfer-based Attack on Skeletal Action Recognition

    Authors: Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Ajian Liu, Xiaoshuai Hao, Xingxing Wei, Meng Wang, He Wang

    Abstract: Skeletal sequence data, as a widely employed representation of human actions, are crucial in Human Activity Recognition (HAR). Recently, adversarial attacks have been proposed in this area, which exposes potential security concerns, and more importantly provides a good tool for model robustness test. Within this research, transfer-based attack is an important tool as it mimics the real-world scena…

    Submitted 12 February, 2025; v1 submitted 4 September, 2024; originally announced September 2024.

    Comments: Accepted in ICLR 2025

  16. arXiv:2407.20836

    cs.CV cs.CR

    Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

    Authors: Yunfeng Diao, Naixin Zhai, Changtao Miao, Zitong Yu, Xingxing Wei, Xun Yang, Meng Wang

    Abstract: Recent advancements in image synthesis, particularly with the advent of GAN and Diffusion models, have amplified public concerns regarding the dissemination of disinformation. To address such concerns, numerous AI-generated Image (AIGI) Detectors have been proposed and achieved promising performance in identifying fake images. However, there still lacks a systematic understanding of the adversaria…

    Submitted 10 March, 2025; v1 submitted 30 July, 2024; originally announced July 2024.

  17. arXiv:2407.08572   

    cs.CV

    Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space

    Authors: Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Xun Yang, Meng Wang, He Wang

    Abstract: Skeletal motion plays a pivotal role in human activity recognition (HAR). Recently, attack methods have been proposed to identify the universal vulnerability of skeleton-based HAR (S-HAR). However, the research of adversarial transferability on S-HAR is largely missing. More importantly, existing attacks all struggle in transfer across unknown S-HAR models. We observed that the key reason is that t…

    Submitted 5 September, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: We have submitted a new version of our work at arXiv:2409.02483. This version, arXiv:2407.08572, is no longer valid. Any update for this work will be conducted in arXiv:2409.02483

  18. arXiv:2405.14203

    cs.LG cs.AI physics.chem-ph

    GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices

    Authors: Thao Nguyen, Tiara Torres-Flores, Changhyun Hwang, Carl Edwards, Ying Diao, Heng Ji

    Abstract: This paper presents a novel approach for predicting Power Conversion Efficiency (PCE) of Organic Photovoltaic (OPV) devices, called GLaD: synergizing molecular Graphs and Language Descriptors for enhanced PCE prediction. Due to the lack of high-quality experimental data, we collect a dataset consisting of 500 pairs of OPV donor and acceptor molecules along with their corresponding PCE values, whic…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: In progress

  19. A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning

    Authors: Chenghao Lyu, Qi Fan, Philippe Guyard, Yanlei Diao

    Abstract: As Spark becomes a common big data analytics platform, its growing complexity makes automatic tuning of numerous parameters critical for performance. Our work on Spark parameter tuning is particularly motivated by two recent trends: Spark's Adaptive Query Execution (AQE) based on runtime statistics, and the increasingly popular Spark cloud deployments that make cost-performance reasoning crucial f…

    Submitted 18 July, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Journal ref: PVLDB, 17(11): 3565-3579, 2024

  20. arXiv:2312.06290

    cs.LG

    Exploiting Label Skews in Federated Learning with Model Concatenation

    Authors: Yiqun Diao, Qinbin Li, Bingsheng He

    Abstract: Federated Learning (FL) has emerged as a promising solution to perform deep learning on different data owners without exchanging raw data. However, non-IID data has been a key challenge in FL, which could significantly degrade the accuracy of the final model. Among different non-IID types, label skews have been challenging and common in image classification and other tasks. Instead of averaging th…

    Submitted 16 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

  21. arXiv:2309.05622

    cs.RO eess.SY

    Task-Oriented Cross-System Design for Timely and Accurate Modeling in the Metaverse

    Authors: Zhen Meng, Kan Chen, Yufeng Diao, Changyang She, Guodong Zhao, Muhammad Ali Imran, Branka Vucetic

    Abstract: In this paper, we establish a task-oriented cross-system design framework to minimize the required packet rate for timely and accurate modeling of a real-world robotic arm in the Metaverse, where sensing, communication, prediction, control, and rendering are considered. To optimize a scheduling policy and prediction horizons, we design a Constraint Proximal Policy Optimization (C-PPO) algorithm by…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: This paper is accepted by IEEE Journal on Selected Areas in Communications, JSAC-SI-HCM 2024

  22. arXiv:2308.15059

    cs.LG cs.DB

    OEBench: Investigating Open Environment Challenges in Real-World Relational Data Streams

    Authors: Yiqun Diao, Yutong Yang, Qinbin Li, Bingsheng He, Mian Lu

    Abstract: How to get insights from relational data streams in a timely manner is a hot research topic. Data streams can present unique challenges, such as distribution drifts, outliers, emerging classes, and changing features, which have recently been described as open environment challenges for machine learning. While existing studies have been done on incremental learning for data streams, their evaluatio…

    Submitted 15 December, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

  23. arXiv:2306.16979

    cs.CV cs.CR

    Post-train Black-box Defense via Bayesian Boundary Correction

    Authors: He Wang, Yunfeng Diao

    Abstract: Classifiers based on deep neural networks are susceptible to adversarial attack, where the widely existing vulnerability has invoked the research in defending them from potential threats. Given a vulnerable classifier, existing defense methods are mostly white-box and often require re-training the victim under modified loss functions/training regimes. While the model/data/training specifics of the…

    Submitted 11 June, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2203.04713

  24. arXiv:2305.09241

    cs.LG cs.CR cs.CV

    Unlearnable Examples Give a False Sense of Security: Piercing through Unexploitable Data with Learnable Examples

    Authors: Wan Jiang, Yunfeng Diao, He Wang, Jianxin Sun, Meng Wang, Richang Hong

    Abstract: Safeguarding data from unauthorized exploitation is vital for privacy and security, especially in recent rampant research in security breach such as adversarial/membership attacks. To this end, unlearnable examples (UEs) have been recently proposed as a compelling protection, by adding imperceptible perturbation to data so that models trained on them cannot classify them accurately on ori…

    Submitted 3 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted in MM 2023

  25. arXiv:2211.11312

    cs.CV

    Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack

    Authors: Yunfeng Diao, He Wang, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg, Meng Wang

    Abstract: Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars, where safety and lives are at stake. Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks. However, the proposed attacks require full knowledge of the attacked classifier, which is overly restrictive. In this paper,…

    Submitted 6 May, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted in Pattern Recognition. arXiv admin note: substantial text overlap with arXiv:2103.05266

  26. Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing

    Authors: Chenghao Lyu, Qi Fan, Fei Song, Arnab Sinha, Yanlei Diao, Wei Chen, Li Ma, Yihui Feng, Yaliang Li, Kai Zeng, Jingren Zhou

    Abstract: Big data processing at the production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, placement of parallel instances on machines, and resource allocation to each instance), requires mul…

    Submitted 9 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Journal ref: PVLDB, 15(11): 3098-3111, 2022

  27. arXiv:2203.04713

    cs.CV

    Defending Black-box Skeleton-based Human Activity Classifiers

    Authors: He Wang, Yunfeng Diao, Zichang Tan, Guodong Guo

    Abstract: Skeletal motions have been heavily relied upon for human activity recognition (HAR). Recently, a universal vulnerability of skeleton-based HAR has been identified across a variety of classifiers and data, calling for mitigation. To this end, we propose the first black-box defense method for skeleton-based HAR to our best knowledge. Our method is featured by full Bayesian treatments of the clean d…

    Submitted 2 December, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted in AAAI 2023

  28. arXiv:2203.00595

    physics.med-ph cs.LG physics.bio-ph q-bio.QM

    Parameter estimation for WMTI-Watson model of white matter using encoder-decoder recurrent neural network

    Authors: Yujian Diao, Ileana Ozana Jelescu

    Abstract: Biophysical modelling of the diffusion MRI signal provides estimates of specific microstructural tissue properties. Although nonlinear optimization such as non-linear least squares (NLLS) is the most widespread method for model estimation, it suffers from local minima and high computational cost. Deep Learning approaches are steadily replacing NL fitting, but come with the limitation that the mode…

    Submitted 2 March, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Journal ref: Magn Reson Med. 2022;1-14

  29. arXiv:2107.08709

    cs.AR

    ZIPPER: Exploiting Tile- and Operator-level Parallelism for General and Scalable Graph Neural Network Acceleration

    Authors: Zhihui Zhang, Jingwen Leng, Shuwen Lu, Youshan Miao, Yijia Diao, Minyi Guo, Chao Li, Yuhao Zhu

    Abstract: Graph neural networks (GNNs) start to gain momentum after showing significant performance improvement in a variety of domains including molecular science, recommendation, and transportation. Turning such performance improvement of GNNs into practical applications relies on effective and efficient execution, especially for inference. However, neither CPU nor GPU can meet these needs if considering…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: 11 pages

  30. Efficient Exploration of Interesting Aggregates in RDF Graphs

    Authors: Yanlei Diao, Paweł Guzewicz, Ioana Manolescu, Mirjana Mazuran

    Abstract: As large Open Data are increasingly shared as RDF graphs today, there is a growing demand to help users discover the most interesting facets of a graph, which are often hard to grasp without automatic tools. We consider the problem of automatically identifying the k most interesting aggregate queries that can be evaluated on an RDF graph, given an integer k and a user-specified interestingness fun…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: Accepted for publication in proceedings of the 2021 International Conference on Management of Data (SIGMOD '21), June 20--25, 2021, Virtual Event, China

  31. arXiv:2103.05266

    cs.CV cs.AI

    BASAR:Black-box Attack on Skeletal Action Recognition

    Authors: Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang

    Abstract: Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. The robustness of skeleton-based activity recognizers has been questioned recently, which shows that they are vulnerable to adversarial attacks when full knowledge of the recognizer is accessible to the attacker. However, this white-box requirement is overly restrictive in mos…

    Submitted 25 July, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Accepted in CVPR 2021

  32. arXiv:2102.02079

    cs.LG cs.DC

    Federated Learning on Non-IID Data Silos: An Experimental Study

    Authors: Qinbin Li, Yiqun Diao, Quan Chen, Bingsheng He

    Abstract: Due to the increasing privacy concerns and data regulations, training data have been increasingly fragmented, forming distributed databases of multiple "data silos" (e.g., within different organizations and countries). To develop effective machine learning services, it is necessary to exploit data from such distributed databases without exchanging the raw data. Recently, federated learning (FL) ha…

    Submitted 28 October, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

  33. arXiv:2101.08167

    cs.DC cs.DB cs.LG

    Neural-based Modeling for Performance Tuning of Spark Data Analytics

    Authors: Khaled Zaouk, Fei Song, Chenghao Lyu, Yanlei Diao

    Abstract: Cloud data analytics has become an integral part of enterprise business operations for data-driven insight discovery. Performance modeling of cloud data analytics is crucial for performance tuning and other critical operations in the cloud. Traditional modeling techniques fail to adapt to the high degree of diversity in workloads and system behaviors in this domain. In this paper, we bring recent…

    Submitted 20 January, 2021; originally announced January 2021.

  34. arXiv:2010.05073

    cs.LG cs.DB

    Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series

    Authors: Vincent Jacob, Fei Song, Arnaud Stiegler, Bijan Rad, Yanlei Diao, Nesime Tatbul

    Abstract: Access to high-quality data repositories and benchmarks has been instrumental in advancing the state of the art in many experimental research domains. While advanced analytics tasks over time series data have been gaining lots of attention, lack of such community resources severely limits scientific progress. In this paper, we present Exathlon, the first comprehensive public benchmark for explain…

    Submitted 5 September, 2021; v1 submitted 10 October, 2020; originally announced October 2020.

  35. arXiv:2008.05079

    cs.CV

    BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks

    Authors: Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu

    Abstract: 3D hand estimation has been a long-standing research topic in computer vision. A recent trend aims not only to estimate the 3D hand joint locations but also to recover the mesh model. However, achieving those goals from a single RGB image remains challenging. In this paper, we introduce an end-to-end learnable model, BiHand, which consists of three cascaded stages, namely 2D seeding stage, 3D lift…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: To appear on BMVC2020

  36. arXiv:2005.03314

    cs.DB cs.DC

    Boosting Cloud Data Analytics using Multi-Objective Optimization

    Authors: Fei Song, Khaled Zaouk, Chenghao Lyu, Arnab Sinha, Qi Fan, Yanlei Diao, Prashant Shenoy

    Abstract: Data analytics in the cloud has become an integral part of enterprise businesses. Big data analytics systems, however, still lack the ability to take user performance goals and budgetary constraints for a task, collectively referred to as task objectives, and automatically configure an analytic job to achieve these objectives. This paper presents a data analytics optimizer that can automatically d…

    Submitted 7 May, 2020; originally announced May 2020.

  37. arXiv:1911.01220

    cs.LG eess.IV physics.med-ph stat.ML

    Learning-based estimation of dielectric properties and tissue density in head models for personalized radio-frequency dosimetry

    Authors: Essam A. Rashed, Yinliang Diao, Akimasa Hirata

    Abstract: Radio-frequency dosimetry is an important process in human safety and for compliance of related products. Recently, computational human models generated from medical images have often been used for such assessment, especially to consider the inter-variability of subjects. However, the common procedure to develop personalized models is time consuming because it involves excessive segmentation of se…

    Submitted 4 February, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

    Comments: 18 pages, 10 figures, 4 tables

    Journal ref: Physics in Medicine and Biology 65, pp. 065001, 2020

  38. arXiv:1510.08897

    cs.DB cs.IR

    AIDE: An Automated Sample-based Approach for Interactive Data Exploration

    Authors: Kyriaki Dimitriadou, Olga Papaemmanouil, Yanlei Diao

    Abstract: In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from complex datasets found in many big data applications such as scientific and healthcare applications as well as for reducing the human effort of data exploration. Towa…

    Submitted 29 October, 2015; originally announced October 2015.

    Comments: 14 pages

  39. arXiv:1301.2648

    cs.IT cs.DC cs.NI

    A New Distributed Localization Method for Sensor Networks

    Authors: Yingfei Diao, Zhiyun Lin, Minyue Fu, Huanshui Zhang

    Abstract: This paper studies the problem of determining the sensor locations in a large sensor network using relative distance (range) measurements only. Our work follows from a seminal paper by Khan et al. [1] where a distributed algorithm, known as DILOC, for sensor localization is given using the barycentric coordinate. A main limitation of the DILOC algorithm is that all sensor nodes must be inside the…

    Submitted 12 January, 2013; originally announced January 2013.

  40. arXiv:1103.4410

    cs.DB

    Distributed Inference and Query Processing for RFID Tracking and Monitoring

    Authors: Zhao Cao, Charles Sutton, Yanlei Diao, Prashant Shenoy

    Abstract: In this paper, we present the design of a scalable, distributed stream processing system for RFID tracking and monitoring. Since RFID data lacks containment and location information that is key to query processing, we propose to combine location and containment inference with stream query processing in a single architecture, with inference as an enabling mechanism for high-level query processing.…

    Submitted 22 March, 2011; originally announced March 2011.

    Comments: VLDB2011

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 4, No. 5, pp. 326-337 (2011)

  41. arXiv:0909.1777

    cs.DB

    Capturing Data Uncertainty in High-Volume Stream Processing

    Authors: Yanlei Diao, Boduo Li, Anna Liu, Liping Peng, Charles Sutton, Thanh Tran, Michael Zink

    Abstract: We present the design and development of a data stream system that captures data uncertainty from data collection to query processing to final result generation. Our system focuses on data that is naturally modeled as continuous random variables. For such data, our system employs an approach grounded in probability and statistical theory to capture data uncertainty and integrates this approach i…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: CIDR 2009

  42. arXiv:cs/0612128

    cs.DB

    SASE: Complex Event Processing over Streams

    Authors: Daniel Gyllstrom, Eugene Wu, Hee-Jin Chae, Yanlei Diao, Patrick Stahlberg, Gordon Anderson

    Abstract: RFID technology is gaining adoption on an increasing scale for tracking and monitoring purposes. Wide deployments of RFID devices will soon generate an unprecedented volume of data. Emerging applications require the RFID data to be filtered and correlated for complex pattern detection and transformed to events that provide meaningful, actionable information to end applications. In this work, we…

    Submitted 22 December, 2006; originally announced December 2006.

    Comments: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, USA