Showing 1–17 of 17 results for author: Ling, N

  1. arXiv:2506.14114  [pdf]

    cs.LG

    Evaluating Loss Functions for Graph Neural Networks: Towards Pretraining and Generalization

    Authors: Khushnood Abbas, Ruizhe Hou, Zhou Wengang, Dong Shi, Niu Ling, Satyaki Nan, Alireza Abbasi

    Abstract: Graph Neural Networks (GNNs) became useful for learning on non-Euclidean data. However, their best performance depends on choosing the right model architecture and the training objective, also called the loss function. Researchers have studied these parts separately, but a large-scale evaluation has not looked at how GNN models and many loss functions work together across different tasks. To fix t…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: ACM single column 633 pages

  2. arXiv:2504.16953  [pdf, other]

    eess.IV

    TVC: Tokenized Video Compression with Ultra-Low Bitrate

    Authors: Lebin Zhou, Cihan Ruan, Nam Ling, Wei Wang, Wei Jiang

    Abstract: Tokenized visual representations have shown great promise in image compression, yet their extension to video remains underexplored due to the challenges posed by complex temporal dynamics and stringent bitrate constraints. In this paper, we propose Tokenized Video Compression (TVC), the first token-based dual-stream video compression framework designed to operate effectively at ultra-low bitrates.…

    Submitted 12 May, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

  3. arXiv:2502.14190  [pdf, ps, other]

    cs.CV eess.IV

    Stereo Image Coding for Machines with Joint Visual Feature Compression

    Authors: Dengchao Jin, Jianjun Lei, Bo Peng, Zhaoqing Pan, Nam Ling, Qingming Huang

    Abstract: 2D image coding for machines (ICM) has achieved great success in coding efficiency, while less effort has been devoted to stereo image fields. To promote the efficiency of stereo image compression (SIC) and intelligent analysis, the stereo image coding for machines (SICM) is formulated and explored in this paper. More specifically, a machine vision-oriented stereo feature compression network (MVSF…

    Submitted 19 February, 2025; originally announced February 2025.

  4. arXiv:2412.18695  [pdf, other]

    cs.RO cs.DC cs.LG

    TimelyLLM: Segmented LLM Serving System for Time-sensitive Robotic Applications

    Authors: Neiwen Ling, Guojun Chen, Lin Zhong

    Abstract: Large Language Models (LLMs) such as GPT-4 and Llama3 can already comprehend complex commands and process diverse tasks. This advancement facilitates their application in controlling drones and robots for various tasks. However, existing LLM serving systems typically employ a first-come, first-served (FCFS) batching mechanism, which fails to address the time-sensitive requirements of robotic appli…

    Submitted 24 December, 2024; originally announced December 2024.

  5. arXiv:2411.16336  [pdf, other]

    eess.IV cs.CV

    WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing

    Authors: Kai Han, Jin Wang, Yunhui Shi, Hanqin Cai, Nam Ling, Baocai Yin

    Abstract: Deep unfolding networks have gained increasing attention in the field of compressed sensing (CS) owing to their theoretical interpretability and superior reconstruction performance. However, most existing deep unfolding methods often face the following issues: 1) they learn directly from single-channel images, leading to a simple feature representation that does not fully capture complex features;…

    Submitted 25 November, 2024; originally announced November 2024.

    Comments: 20 pages, accepted by ACM Transactions on Multimedia Computing Communications and Applications (TOMM)

    Journal ref: ACM Transactions on Multimedia Computing Communications and Applications, 21(1): 33.1-33.22, 2024

  6. arXiv:2408.16866  [pdf, other]

    cs.CV

    GameIR: A Large-Scale Synthesized Ground-Truth Dataset for Image Restoration over Gaming Content

    Authors: Lebin Zhou, Kun Han, Nam Ling, Wei Wang, Wei Jiang

    Abstract: Image restoration methods like super-resolution and image synthesis have been successfully used in commercial cloud gaming products like NVIDIA's DLSS. However, restoration over gaming content is not well studied by the general public. The discrepancy is mainly caused by the lack of ground-truth gaming training data that match the test cases. Due to the unique characteristics of gaming content, th…

    Submitted 29 August, 2024; originally announced August 2024.

  7. arXiv:2405.09125  [pdf, other]

    cs.CV cs.AI

    HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition

    Authors: Honghui Chen, Yuhang Qiu, Jiabao Wang, Pingping Chen, Nam Ling

    Abstract: Internal Language Model (LM)-based methods use permutation language modeling (PLM) to solve the error correction caused by conditional independence in external LM-based methods. However, random permutations of human interference cause fit oscillations in the model training, and Iterative Refinement (IR) operation to improve multimodal information decoupling also introduces additional overhead. To…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures

    MSC Class: 68T01 ACM Class: I.2.10

  8. arXiv:2404.13786  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG

    Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

    Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

    Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components ca…

    Submitted 21 April, 2024; originally announced April 2024.

  9. arXiv:2312.14950  [pdf, other]

    cs.RO cs.AI cs.HC

    TypeFly: Flying Drones with Large Language Model

    Authors: Guojun Chen, Xiaojing Yu, Neiwen Ling, Lin Zhong

    Abstract: Recent advancements in robot control using large language models (LLMs) have demonstrated significant potential, primarily due to LLMs' capabilities to understand natural language commands and generate executable plans in various languages. However, in real-time and interactive applications involving mobile robots, particularly drones, the sequential token generation process inherent to LLMs intro…

    Submitted 26 September, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

  10. arXiv:2311.10986  [pdf, other]

    cs.LG

    EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge

    Authors: Bufang Yang, Lixing He, Neiwen Ling, Zhenyu Yan, Guoliang Xing, Xian Shuai, Xiaozhe Ren, Xin Jiang

    Abstract: Deep Learning (DL) models have been widely deployed on IoT devices with the help of advancements in DL algorithms and chips. However, the limited resources of edge devices make these on-device DL models hard to be generalizable to diverse environments and tasks. Although the recently emerged foundation models (FMs) show impressive generalization power, how to effectively leverage the rich knowledg…

    Submitted 22 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Accepted to the 21st ACM Conference on Embedded Networked Sensor Systems (SenSys 2023)

  11. arXiv:2309.04806  [pdf, other]

    cs.CV cs.AI

    Timely Fusion of Surround Radar/Lidar for Object Detection in Autonomous Driving Systems

    Authors: Wenjing Xie, Tao Hu, Neiwen Ling, Guoliang Xing, Chun Jason Xue, Nan Guan

    Abstract: Fusing Radar and Lidar sensor data can fully utilize their complementary advantages and provide more accurate reconstruction of the surrounding for autonomous driving systems. Surround Radar/Lidar can provide 360-degree view sampling with the minimal cost, which are promising sensing hardware solutions for autonomous driving systems. However, due to the intrinsic physical constraints, the rotating…

    Submitted 27 May, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

  12. arXiv:2307.04339  [pdf, other]

    cs.DC cs.AI

    Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU

    Authors: Zhihe Zhao, Neiwen Ling, Nan Guan, Guoliang Xing

    Abstract: Many applications, such as autonomous driving and augmented reality, require the concurrent running of multiple deep neural networks (DNNs) that pose different levels of real-time performance requirements. However, coordinating multiple DNN tasks with varying levels of criticality on edge GPUs remains an area of limited study. Unlike server-level GPUs, edge GPUs are resource-limited and lack hardwa…

    Submitted 10 July, 2023; originally announced July 2023.

  13. arXiv:2201.05752  [pdf, other]

    cs.LG cs.PL

    Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization

    Authors: Zhihe Zhao, Xian Shuai, Yang Bai, Neiwen Ling, Nan Guan, Zhenyu Yan, Guoliang Xing

    Abstract: Achieving efficient execution of machine learning models has attracted significant attention recently. To generate tensor programs efficiently, a key component of DNN compilers is the cost model that can predict the performance of each configuration on specific devices. However, due to the rapid emergence of hardware platforms, it is increasingly labor-intensive to train domain-specific predictors…

    Submitted 14 January, 2022; originally announced January 2022.

  14. arXiv:2110.15569  [pdf, other]

    cs.CV

    Novel View Synthesis from a Single Image via Unsupervised learning

    Authors: Bingzheng Liu, Jianjun Lei, Bo Peng, Chuanbo Yu, Wanqing Li, Nam Ling

    Abstract: View synthesis aims to generate novel views from one or more given source views. Although existing methods have achieved promising performance, they usually require paired views of different poses to learn a pixel transformation. This paper proposes an unsupervised network to learn such a pixel transformation from a single source viewpoint. In particular, the network consists of a token transforma…

    Submitted 29 October, 2021; originally announced October 2021.

    Comments: 9 pages, submitted to TCSVT

  15. arXiv:2008.06940  [pdf, other]

    cs.LG cs.SI

    TempNodeEmb: Temporal Node Embedding considering temporal edge influence matrix

    Authors: Khushnood Abbas, Alireza Abbasi, Dong Shi, Niu Ling, Mingsheng Shang, Chen Liong, Bolun Chen

    Abstract: Understanding the evolutionary patterns of real-world evolving complex systems such as human interactions, transport networks, biological interactions, and computer networks has important implications in our daily lives. Predicting future links among the nodes in such networks reveals an important aspect of the evolution of temporal networks. To analyse networks, they are mapped to adjacency matri…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: IEEE double column 6 pages

  16. arXiv:2002.12521  [pdf, other]

    eess.IV cs.LG cs.MM

    Improved Image Coding Autoencoder With Deep Learning

    Authors: Licheng Xiao, Hairong Wang, Nam Ling

    Abstract: In this paper, we build autoencoder based pipelines for extreme end-to-end image compression based on Ballé's approach, which is the state-of-the-art open source implementation in image compression using deep learning. We deepened the network by adding one more hidden layer before each strided convolutional layer with exactly the same number of down-samplings and up-samplings. Our approach outperf…

    Submitted 27 February, 2020; originally announced February 2020.

  17. arXiv:1811.06679  [pdf, other]

    cs.CV

    HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images

    Authors: Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Nam Ling

    Abstract: Co-saliency detection aims to discover common and salient objects in an image group containing more than two relevant images. Moreover, depth information has been demonstrated to be effective for many computer vision tasks. In this paper, we propose a novel co-saliency detection method for RGBD images based on hierarchical sparsity reconstruction and energy function refinement. With the assistance…

    Submitted 16 November, 2018; originally announced November 2018.

    Comments: 11 pages, 5 figures, Accepted by IEEE Transactions on Multimedia, https://rmcong.github.io/