
Showing 1–15 of 15 results for author: Teng, H

Searching in archive cs.
  1. arXiv:2505.13211  [pdf, ps, other]

    cs.CV cs.AI

    MAGI-1: Autoregressive Video Generation at Scale

    Authors: Sand. ai, Hansi Teng, Hongyu Jia, Lei Sun, Lingzhi Li, Maolin Li, Mingqiu Tang, Shuai Han, Tianning Zhang, W. Q. Zhang, Weifeng Luo, Xiaoyang Kang, Yuchen Sun, Yue Cao, Yunpeng Huang, Yutong Lin, Yuxin Fang, Zewei Tao, Zheng Zhang, Zhongshu Wang, Zixun Liu, Dai Shi, Guoli Su, Hanwen Sun, Hong Pan, et al. (14 additional authors not shown)

    Abstract: We present MAGI-1, a world model that generates videos by autoregressively predicting a sequence of video chunks, defined as fixed-length segments of consecutive frames. Trained to denoise per-chunk noise that increases monotonically over time, MAGI-1 enables causal temporal modeling and naturally supports streaming generation. It achieves strong performance on image-to-video (I2V) tasks condition…

    Submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2412.02899  [pdf, other]

    cs.RO

    Adaptive LiDAR Odometry and Mapping for Autonomous Agricultural Mobile Robots in Unmanned Farms

    Authors: Hanzhe Teng, Yipeng Wang, Dimitrios Chatziparaschis, Konstantinos Karydis

    Abstract: Unmanned and intelligent agricultural systems are crucial for enhancing agricultural efficiency and for helping mitigate the effect of labor shortage. However, unlike urban environments, agricultural fields impose distinct and unique challenges on autonomous robotic systems, such as the unstructured and dynamic nature of the environment, the rough and uneven terrain, and the resulting non-smooth r…

    Submitted 3 December, 2024; originally announced December 2024.

  3. arXiv:2409.15278  [pdf, other]

    cs.CV

    PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

    Authors: Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Huan Teng, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li

    Abstract: This paper presents a versatile image-to-image visual assistant, PixWizard, designed for image generation, manipulation, and translation based on free-form language instructions. To this end, we tackle a variety of vision tasks into a unified image-text-to-image generation framework and curate an Omni Pixel-to-Pixel Instruction-Tuning Dataset. By constructing detailed instruction templates in natu…

    Submitted 27 February, 2025; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: Code is released at https://github.com/AFeng-x/PixWizard

  4. arXiv:2405.06181  [pdf, other]

    cs.CV cs.RO

    Residual-NeRF: Learning Residual NeRFs for Transparent Object Manipulation

    Authors: Bardienus P. Duisterhof, Yuemin Mao, Si Heng Teng, Jeffrey Ichnowski

    Abstract: Transparent objects are ubiquitous in industry, pharmaceuticals, and households. Grasping and manipulating these objects is a significant challenge for robots. Existing methods have difficulty reconstructing complete depth maps for challenging transparent objects, leaving holes in the depth reconstruction. Recent work has shown neural radiance fields (NeRFs) work well for depth perception in scene…

    Submitted 9 May, 2024; originally announced May 2024.

  5. arXiv:2404.02516  [pdf, other]

    cs.RO

    On-the-Go Tree Detection and Geometric Traits Estimation with Ground Mobile Robots in Fruit Tree Groves

    Authors: Dimitrios Chatziparaschis, Hanzhe Teng, Yipeng Wang, Pamodya Peiris, Elia Scudiero, Konstantinos Karydis

    Abstract: By-tree information gathering is an essential task in precision agriculture achieved by ground mobile sensors, but it can be time- and labor-intensive. In this paper we present an algorithmic framework to perform real-time and on-the-go detection of trees and key geometric characteristics (namely, width and height) with wheeled mobile robots in the field. Our method is based on the fusion of 2D do…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures

  6. arXiv:2402.00362  [pdf]

    physics.ao-ph cs.AI

    Climate Trends of Tropical Cyclone Intensity and Energy Extremes Revealed by Deep Learning

    Authors: Buo-Fu Chen, Boyo Chen, Chun-Min Hsiao, Hsu-Feng Teng, Cheng-Shang Lee, Hung-Chi Kuo

    Abstract: Anthropogenic influences have been linked to tropical cyclone (TC) poleward migration, TC extreme precipitation, and an increased proportion of major hurricanes [1, 2, 3, 4]. Understanding past TC trends and variability is critical for projecting future TC impacts on human society considering the changing climate [5]. However, past trends of TC structure/energy remain uncertain due to limited obse…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 41 pages

  7. arXiv:2309.15332  [pdf, other]

    cs.RO cs.CV

    Multimodal Dataset for Localization, Mapping and Crop Monitoring in Citrus Tree Farms

    Authors: Hanzhe Teng, Yipeng Wang, Xiaoao Song, Konstantinos Karydis

    Abstract: In this work we introduce the CitrusFarm dataset, a comprehensive multimodal sensory dataset collected by a wheeled mobile robot operating in agricultural fields. The dataset offers stereo RGB images with depth information, as well as monochrome, near-infrared and thermal images, presenting diverse spectral responses crucial for agricultural research. Furthermore, it provides a range of navigation…

    Submitted 28 September, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to the 18th International Symposium on Visual Computing (ISVC 2023)

  8. arXiv:2306.10543  [pdf, other]

    cs.CL

    UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning

    Authors: Kang Zhao, Wei Liu, Jian Luan, Minglei Gao, Li Qian, Hanlin Teng, Bin Wang

    Abstract: Open-domain long-term memory conversation can establish long-term intimacy with humans, and the key is the ability to understand and memorize long-term dialogue history information. Existing works integrate multiple models for modelling through a pipeline, which ignores the coupling between different stages. In this paper, we propose a Unified framework for Long-term Memory Conversations (UniMC),…

    Submitted 18 June, 2023; originally announced June 2023.

  9. arXiv:2210.02511  [pdf, other]

    cs.CV cs.RO

    TartanCalib: Iterative Wide-Angle Lens Calibration using Adaptive SubPixel Refinement of AprilTags

    Authors: Bardienus P Duisterhof, Yaoyu Hu, Si Heng Teng, Michael Kaess, Sebastian Scherer

    Abstract: Wide-angle cameras are uniquely positioned for mobile robots, by virtue of the rich information they provide in a small, light, and cost-effective form factor. An accurate calibration of the intrinsics and extrinsics is a critical pre-requisite for using the edge of a wide-angle lens for depth perception and odometry. Calibrating wide-angle lenses with current state-of-the-art techniques yields po…

    Submitted 5 October, 2022; originally announced October 2022.

  10. arXiv:2210.01298  [pdf, other]

    cs.CV cs.RO

    Centroid Distance Keypoint Detector for Colored Point Clouds

    Authors: Hanzhe Teng, Dimitrios Chatziparaschis, Xinyue Kan, Amit K. Roy-Chowdhury, Konstantinos Karydis

    Abstract: Keypoint detection serves as the basis for many computer vision and robotics applications. Despite the fact that colored point clouds can be readily obtained, most existing keypoint detectors extract only geometry-salient keypoints, which can impede the overall performance of systems that intend to (or have the potential to) leverage color information. To promote advances in such systems, we propo…

    Submitted 15 June, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023; copyright will be transferred to IEEE upon publication

  11. arXiv:2205.03124  [pdf, other]

    cs.CV

    A High-Accuracy Unsupervised Person Re-identification Method Using Auxiliary Information Mined from Datasets

    Authors: Hehan Teng, Tao He, Yuchen Guo, Guiguang Ding

    Abstract: Supervised person re-identification methods rely heavily on high-quality cross-camera training labels. This significantly hinders the deployment of re-ID models in real-world applications. Unsupervised person re-ID methods can reduce the cost of data annotation, but their performance is still far lower than the supervised ones. In this paper, we make full use of the auxiliary information mined…

    Submitted 6 May, 2022; originally announced May 2022.

  12. arXiv:2204.00891  [pdf, other]

    cs.CV

    A Free Lunch to Person Re-identification: Learning from Automatically Generated Noisy Tracklets

    Authors: Hehan Teng, Tao He, Yuchen Guo, Zhenhua Guo, Guiguang Ding

    Abstract: A series of unsupervised video-based re-identification (re-ID) methods have been proposed to solve the problem of high labor cost required to annotate re-ID datasets. But their performance is still far lower than the supervised counterparts. In the meantime, clean datasets without noise are used in these methods, which is not realistic. In this paper, we propose to tackle this problem by learning…

    Submitted 2 April, 2022; originally announced April 2022.

  13. arXiv:2008.12920  [pdf, other]

    cs.RO

    Development and Testing of a Novel Automated Insect Capture Module for Sample Collection and Transfer

    Authors: Keran Ye, Gustavo J. Correa, Tom Guda, Hanzhe Teng, Anandasankar Ray, Konstantinos Karydis

    Abstract: There exists an urgent need for efficient tools in disease surveillance to help model and predict the spread of disease. The transmission of insect-borne diseases poses a serious concern to public health officials and the medical and research community at large. In the modeling of this spread, we face bottlenecks in (1) the frequency at which we are able to sample insect vectors in environments th…

    Submitted 29 August, 2020; originally announced August 2020.

    Comments: Accepted to IEEE International Conference on Automation Science and Engineering (CASE) 2020

  14. arXiv:2006.16460  [pdf, other]

    cs.RO

    Online Exploration and Coverage Planning in Unknown Obstacle-Cluttered Environments

    Authors: Xinyue Kan, Hanzhe Teng, Konstantinos Karydis

    Abstract: Online coverage planning can be useful in applications like field monitoring and search and rescue. Without prior information of the environment, achieving resolution-complete coverage considering the non-holonomic mobility constraints in commonly-used vehicles (e.g., wheeled robots) remains a challenge. In this paper, we propose a hierarchical, hex-decomposition-based coverage planning algorithm…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: To be published in IEEE Robotics and Automation Letters (RA-L) 2020

  15. arXiv:1904.10781  [pdf, other]

    cs.CV

    Informative sample generation using class aware generative adversarial networks for classification of chest Xrays

    Authors: Behzad Bozorgtabar, Dwarikanath Mahapatra, Hendrik von Teng, Alexander Pollinger, Lukas Ebner, Jean-Phillipe Thiran, Mauricio Reyes

    Abstract: Training robust deep learning (DL) systems for disease detection from medical images is challenging due to limited images covering different disease types and severity. The problem is especially acute, where there is a severe class imbalance. We propose an active learning (AL) framework to select most informative samples for training our model using a Bayesian neural network. Informative samples a…

    Submitted 30 April, 2019; v1 submitted 24 April, 2019; originally announced April 2019.