Showing 1–2 of 2 results for author: Amin, M A

  1. arXiv:2410.15017  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    DM-Codec: Distilling Multimodal Representations for Speech Tokenization

    Authors: Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, A K M Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali

    Abstract: Recent advancements in speech-language models have yielded significant improvements in speech tokenization and synthesis. However, effectively mapping the complex, multidimensional attributes of speech into discrete tokens remains challenging. This process demands acoustic, semantic, and contextual information for precise speech representations. Existing speech representations generally fall into…

    Submitted 19 October, 2024; originally announced October 2024.

  2. arXiv:2008.10736  [pdf]

    cs.CV cs.LG eess.IV

    LULC Segmentation of RGB Satellite Image Using FCN-8

    Authors: Abu Bakar Siddik Nayem, Anis Sarker, Ovi Paul, Amin Ali, Md. Ashraful Amin, AKM Mahbubur Rahman

    Abstract: This work presents use of Fully Convolutional Network (FCN-8) for semantic segmentation of high-resolution RGB earth surface satellite images into land use land cover (LULC) categories. Specifically, we propose a non-overlapping grid-based approach to train a Fully Convolutional Network (FCN-8) with vgg-16 weights to segment satellite images into four (forest, built-up, farmland and water) classe…

    Submitted 24 August, 2020; originally announced August 2020.

    Comments: Accepted paper at 3rd SLAAI-International Conference on Artificial Intelligence; 13 pages, 7 figures, 3 tables