Showing 1–7 of 7 results for author: Chua, T

Searching in archive eess.
  1. arXiv:2503.23377 [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

    Authors: Kai Liu, Wei Li, Lai Chen, Shengqiong Wu, Yanhao Zheng, Jiayi Ji, Fan Zhou, Rongxin Jiang, Jiebo Luo, Hao Fei, Tat-Seng Chua

    Abstract: This paper introduces JavisDiT, a novel Joint Audio-Video Diffusion Transformer designed for synchronized audio-video generation (JAVG). Built upon the powerful Diffusion Transformer (DiT) architecture, JavisDiT is able to generate high-quality audio and video content simultaneously from open-ended user prompts. To ensure optimal synchronization, we introduce a fine-grained spatio-temporal alignme…

    Submitted 30 March, 2025; originally announced March 2025.

    Comments: Work in progress. Homepage: https://javisdit.github.io/

  2. arXiv:2406.14333 [pdf, other]

    cs.IR cs.SD eess.AS

    LARP: Language Audio Relational Pre-training for Cold-Start Playlist Continuation

    Authors: Rebecca Salganik, Xiaohao Liu, Yunshan Ma, Jian Kang, Tat-Seng Chua

    Abstract: As online music consumption increasingly shifts towards playlist-based listening, the task of playlist continuation, in which an algorithm suggests songs to extend a playlist in a personalized and musically cohesive manner, has become vital to the success of music streaming. Currently, many existing playlist continuation approaches rely on collaborative filtering methods to perform recommendation.…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2312.05871 [pdf, other]

    cs.DC eess.SY math.OC

    Optimization for the Metaverse over Mobile Edge Computing with Play to Earn

    Authors: Chang Liu, Terence Jie Chua, Jun Zhao

    Abstract: The concept of the Metaverse has garnered growing interest from both academic and industry circles. The decentralization of both the integrity and security of digital items has spurred the popularity of play-to-earn (P2E) games, where players are entitled to earn and own digital assets which they may trade for physical-world currencies. However, these computationally-intensive games are hardly pla…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: This work appears as a full paper in IEEE Conference on Computer Communications (INFOCOM) 2024

  4. arXiv:2210.04689 [pdf, other]

    cs.LG cs.AI cs.NI cs.SI eess.SP

    Time Minimization in Hierarchical Federated Learning

    Authors: Chang Liu, Terence Jie Chua, Jun Zhao

    Abstract: Federated Learning is a modern decentralized machine learning technique where user equipments perform machine learning tasks locally and then upload the model parameters to a central server. In this paper, we consider a 3-layer hierarchical federated learning system which involves model parameter exchanges between the cloud and edge servers, and the edge servers and user equipment. In a hierarchic…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: This paper appears in the Proceedings of 2022 ACM/IEEE Symposium on Edge Computing (SEC). Please feel free to contact us for questions or remarks

  5. arXiv:2209.13425 [pdf, other]

    cs.NI cs.LG eess.SP

    Resource Allocation for Mobile Metaverse with the Internet of Vehicles over 6G Wireless Communications: A Deep Reinforcement Learning Approach

    Authors: Terence Jie Chua, Wenhan Yu, Jun Zhao

    Abstract: Improving the interactivity and interconnectivity between people is one of the highlights of the Metaverse. The Metaverse relies on a core approach, digital twinning, which is a means to replicate physical world objects, people, actions and scenes onto the virtual world. Being able to access scenes and information associated with the physical world, in the Metaverse in real-time and under mobility…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: This paper appears in the Proceedings of 8th IEEE World Forum on the Internet of Things (WFIoT) 2022. Please feel free to contact us for questions or remarks

  6. arXiv:2207.06057 [pdf, other]

    cs.SD cs.MM eess.AS

    Subband-based Generative Adversarial Network for Non-parallel Many-to-many Voice Conversion

    Authors: Jian Ma, Zhedong Zheng, Hao Fei, Feng Zheng, Tat-seng Chua, Yi Yang

    Abstract: Voice conversion is to generate a new speech with the source content and a target voice style. In this paper, we focus on one general setting, i.e., non-parallel many-to-many voice conversion, which is close to the real-world scenario. As the name implies, non-parallel many-to-many voice conversion does not require the paired source and reference speeches and can be applied to arbitrary voice tran…

    Submitted 27 July, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  7. arXiv:2010.08091 [pdf, other]

    cs.SD cs.MM eess.AS

    PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music

    Authors: Hongru Liang, Wenqiang Lei, Paul Yaozhu Chan, Zhenglu Yang, Maosong Sun, Tat-Seng Chua

    Abstract: Definitive embeddings remain a fundamental challenge of computational musicology for symbolic music in deep learning today. Analogous to natural language, music can be modeled as a sequence of tokens. This motivates the majority of existing solutions to explore the utilization of word embedding models to build music embeddings. However, music differs from natural languages in two key aspects: (1)…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: ACM Multimedia 2020 -- best paper