
Showing 1–16 of 16 results for author: Truong, A

Searching in archive cs.
  1. arXiv:2505.19264  [pdf, other]

    cs.CV

    Improving Novel view synthesis of 360$^\circ$ Scenes in Extremely Sparse Views by Jointly Training Hemisphere Sampled Synthetic Images

    Authors: Guangan Chen, Anh Minh Truong, Hanhe Lin, Michiel Vlaminck, Wilfried Philips, Hiep Luong

    Abstract: Novel view synthesis in 360$^\circ$ scenes from extremely sparse input views is essential for applications like virtual reality and augmented reality. This paper presents a novel framework for novel view synthesis in extremely sparse-view cases. As typical structure-from-motion methods are unable to estimate camera poses in extremely sparse-view cases, we apply DUSt3R to estimate camera poses and…

    Submitted 25 May, 2025; originally announced May 2025.

    Journal ref: The IEEE International Conference on Image Processing (ICIP) 2025

  2. arXiv:2505.10312  [pdf, ps, other]

    cs.HC cs.CV

    SOS: A Shuffle Order Strategy for Data Augmentation in Industrial Human Activity Recognition

    Authors: Anh Tuan Ha, Hoang Khang Phan, Thai Minh Tien Ngo, Anh Phan Truong, Nhat Tan Le

    Abstract: In the realm of Human Activity Recognition (HAR), obtaining high quality and variance data is still a persistent challenge due to high costs and the inherent variability of real-world activities. This study introduces a generation dataset by deep learning approaches (Attention Autoencoder and conditional Generative Adversarial Networks). Another problem that data heterogeneity is a critical challe…

    Submitted 15 May, 2025; originally announced May 2025.

  3. arXiv:2504.15933  [pdf, other]

    cs.GR cs.LG

    Low-Rank Adaptation of Neural Fields

    Authors: Anh Truong, Ahmed H. Mahmoud, Mina Konaković Luković, Justin Solomon

    Abstract: Processing visual data often involves small adjustments or sequences of changes, such as in image filtering, surface smoothing, and video storage. While established graphics techniques like normal mapping and video compression exploit redundancy to encode such small changes efficiently, the problem of encoding small changes to neural fields (NF) -- neural network parameterizations of visual or phy…

    Submitted 22 April, 2025; originally announced April 2025.

  4. arXiv:2504.05106  [pdf, other]

    cs.HC cs.AI cs.LG

    SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation

    Authors: Stephen Brade, Sam Anderson, Rithesh Kumar, Zeyu Jin, Anh Truong

    Abstract: Novice content creators often invest significant time recording expressive speech for social media videos. While recent advancements in text-to-speech (TTS) technology can generate highly realistic speech in various languages and accents, many struggle with unintuitive or overly granular TTS interfaces. We propose simplifying TTS generation by allowing users to specify high-level context alongside…

    Submitted 7 April, 2025; originally announced April 2025.

  5. VideoMix: Aggregating How-To Videos for Task-Oriented Learning

    Authors: Saelyne Yang, Anh Truong, Juho Kim, Dingzeyu Li

    Abstract: Tutorial videos are a valuable resource for people looking to learn new tasks. People often learn these skills by viewing multiple tutorial videos to get an overall understanding of a task by looking at different approaches to achieve the task. However, navigating through multiple videos can be time-consuming and mentally demanding as these videos are scattered and not easy to skim. We propose Vid…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: In Proceedings of the 30th International Conference on Intelligent User Interfaces (IUI '25) 2025

  6. arXiv:2503.04103  [pdf, other]

    cs.HC

    Compositional Structures as Substrates for Human-AI Co-creation Environment: A Design Approach and A Case Study

    Authors: Yining Cao, Yiyi Huang, Anh Truong, Hijung Valentina Shin, Haijun Xia

    Abstract: It has been increasingly recognized that effective human-AI co-creation requires more than prompts and results, but an environment with empowering structures that facilitate exploration, planning, iteration, as well as control and inspection of AI generation. Yet, a concrete design approach to such an environment has not been established. Our literature analysis highlights that compositional struc…

    Submitted 6 March, 2025; originally announced March 2025.

  7. arXiv:2503.03702  [pdf, other]

    cs.CL

    Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models

    Authors: Jiyue Jiang, Alfred Kar Yin Truong, Yanyu Chen, Qinghang Bao, Sheng Wang, Pengan Chen, Jiuming Wang, Lingpeng Kong, Yu Li, Chuan Wu

    Abstract: High-quality data resources play a crucial role in learning large language models (LLMs), particularly for low-resource languages like Cantonese. Despite having more than 85 million native speakers, Cantonese is still considered a low-resource language in the field of natural language processing (NLP) due to factors such as the dominance of Mandarin, lack of cohesion within the Cantonese-speaking…

    Submitted 5 March, 2025; originally announced March 2025.

  8. arXiv:2502.08807  [pdf, other]

    cs.AR cs.LG

    InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs

    Authors: Zifan He, Anderson Truong, Yingqi Cao, Jason Cong

    Abstract: The rise of deep neural networks (DNNs) has driven an increased demand for computing power and memory. Modern DNNs exhibit high data volume variation (HDV) across tasks, which poses challenges for FPGA acceleration: conventional accelerators rely on fixed execution patterns (dataflow or sequential) that can lead to pipeline stalls or necessitate frequent off-chip memory accesses. To address these…

    Submitted 4 April, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

    Comments: FCCM 2025

  9. arXiv:2403.08049  [pdf, other]

    cs.HC cs.AI cs.LG

    TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks

    Authors: Yuexi Chen, Vlad I. Morariu, Anh Truong, Zhicheng Liu

    Abstract: Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos. However, manually creating such tutorials is tedious, and existing automated solutions are often restricted to a particular domain. While AI models hold promise, it is unclear how to effectively harness their powers, given the multi-mod…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: CHI 2024, supplementary materials: https://hdi.cs.umd.edu/papers/TutoAI_CHI24_Supp.pdf

  10. arXiv:2311.05867  [pdf, other]

    cs.HC

    PodReels: Human-AI Co-Creation of Video Podcast Teasers

    Authors: Sitong Wang, Zheng Ning, Anh Truong, Mira Dontcheva, Dingzeyu Li, Lydia B. Chilton

    Abstract: Video podcast teasers are short videos that can be shared on social media platforms to capture interest in the full episodes of a video podcast. These teasers enable long-form podcasters to reach new audiences and gain new followers. However, creating a compelling teaser from an hour-long episode is challenging. Selecting interesting clips requires significant mental effort; editing the chosen cli…

    Submitted 9 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  11. arXiv:2202.00997  [pdf, other]

    eess.IV cs.AI cs.CV

    Gradient Variance Loss for Structure-Enhanced Image Super-Resolution

    Authors: Lusine Abrahamyan, Anh Minh Truong, Wilfried Philips, Nikos Deligiannis

    Abstract: Recent success in the field of single image super-resolution (SISR) is achieved by optimizing deep convolutional neural networks (CNNs) in the image space with the L1 or L2 loss. However, when trained with these loss functions, models usually fail to recover sharp edges present in the high-resolution (HR) images for the reason that the model tends to give a statistical average of potential HR solu…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: ICASSP 2022

  12. arXiv:2012.09597  [pdf, other]

    cs.LG

    Sensitive Data Detection with High-Throughput Neural Network Models for Financial Institutions

    Authors: Anh Truong, Austin Walters, Jeremy Goodsitt

    Abstract: Named Entity Recognition has been extensively investigated in many fields. However, the application of sensitive entity detection for production systems in financial institutions has not been well explored due to the lack of publicly available, labeled datasets. In this paper, we use internal and synthetic datasets to evaluate various methods of detecting NPI (Nonpublic Personally Identifiable) in…

    Submitted 17 December, 2020; originally announced December 2020.

  13. arXiv:2008.13315  [pdf, other]

    cs.RO eess.SP

    Benchmarking Metric Ground Navigation

    Authors: Daniel Perille, Abigail Truong, Xuesu Xiao, Peter Stone

    Abstract: Metric ground navigation addresses the problem of autonomously moving a robot from one point to another in an obstacle-occupied planar environment in a collision-free manner. It is one of the most fundamental capabilities of intelligent mobile robots. This paper presents a standardized testbed with a set of environments and metrics to benchmark difficulty of different scenarios and performance of…

    Submitted 2 November, 2020; v1 submitted 30 August, 2020; originally announced August 2020.

  14. arXiv:1910.02993  [pdf, other]

    cs.DB cs.CL cs.CV cs.IR

    Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels

    Authors: Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian

    Abstract: Many real-world video analysis applications require the ability to identify domain-specific events in video, such as interviews and commercials in TV news broadcasts, or action sequences in film. Unfortunately, pre-trained models to detect all the events of interest in video may not exist, and training new models from scratch can be costly and labor-intensive. In this paper, we explore the utility…

    Submitted 7 October, 2019; originally announced October 2019.

  15. Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

    Authors: Anh Truong, Austin Walters, Jeremy Goodsitt, Keegan Hines, C. Bayan Bruss, Reza Farivar

    Abstract: There has been considerable growth and interest in industrial applications of machine learning (ML) in recent years. ML engineers, as a consequence, are in high demand across the industry, yet improving the efficiency of ML engineers remains a fundamental challenge. Automated machine learning (AutoML) has emerged as a way to save time and effort on repetitive tasks in ML pipelines, such as data pr…

    Submitted 3 September, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

  16. arXiv:1705.00703  [pdf, other]

    cs.CV

    Submodular Trajectory Optimization for Aerial 3D Scanning

    Authors: Mike Roberts, Debadeepta Dey, Anh Truong, Sudipta Sinha, Shital Shah, Ashish Kapoor, Pat Hanrahan, Neel Joshi

    Abstract: Drones equipped with cameras are emerging as a powerful tool for large-scale aerial 3D scanning, but existing automatic flight planners do not exploit all available information about the scene, and can therefore produce inaccurate and incomplete 3D models. We present an automatic method to generate drone trajectories, such that the imagery acquired during the flight will later produce a high-fidel…

    Submitted 4 August, 2017; v1 submitted 1 May, 2017; originally announced May 2017.

    Comments: Accepted for publication at the International Conference on Computer Vision (ICCV) 2017; Supplementary video: http://www.youtube.com/watch?v=89fFmfVZSO8