-
Figurative-cum-Commonsense Knowledge Infusion for Multimodal Mental Health Meme Classification
Authors:
Abdullah Mazhar,
Zuhair hasan shaik,
Aseem Srivastava,
Polly Ruhnke,
Lavanya Vaddavalli,
Sri Keshav Katragadda,
Shweta Yadav,
Md Shad Akhtar
Abstract:
The expression of mental health symptoms through non-traditional means, such as memes, has gained remarkable attention over the past few years, with users often highlighting their mental health struggles through figurative intricacies within memes. While humans rely on commonsense knowledge to interpret these complex expressions, current Multimodal Language Models (MLMs) struggle to capture these figurative aspects inherent in memes. To address this gap, we introduce a novel dataset, AxiOM, derived from the GAD anxiety questionnaire, which categorizes memes into six fine-grained anxiety symptoms. Next, we propose a commonsense and domain-enriched framework, M3H, to enhance MLMs' ability to interpret figurative language and commonsense knowledge. The overarching goal remains to first understand and then classify the mental health symptoms expressed in memes. We benchmark M3H against 6 competitive baselines (with 20 variations), demonstrating improvements in both quantitative and qualitative metrics, including a detailed human evaluation. We observe clear improvements of 4.20% and 4.66% on the weighted-F1 metric. To assess generalizability, we perform extensive experiments on a public dataset, RESTORE, for depressive symptom identification, and present an extensive ablation study that highlights the contribution of each module on both datasets. Our findings reveal limitations in existing models and the advantage of employing commonsense to enhance figurative understanding.
Submitted 25 January, 2025;
originally announced January 2025.
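The abstract above reports its gains on the weighted-F1 metric. As a reminder of what that metric measures, here is a minimal sketch in plain Python, assuming the standard definition (per-class F1 averaged with weights proportional to each class's support). The function name `weighted_f1` and the toy labels are illustrative only and are not taken from the paper or its datasets.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (standard definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its share of examples
    return score


# Toy example with hypothetical class labels "a" and "b".
toy_true = ["a", "a", "b", "b"]
toy_pred = ["a", "b", "b", "b"]
print(round(weighted_f1(toy_true, toy_pred), 4))
```

Because each class is weighted by its support, frequent symptom classes influence the score more than rare ones, which is worth keeping in mind when comparing results across imbalanced label sets.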
-
Triple Coding Empowered FDMA-CDMA Mode High Security CAOS Camera
Authors:
Nabeel A. Riza,
Mohsin A. Mazhar
Abstract:
For the first time, the hybrid triple-coding-empowered Frequency Division Multiple Access (FDMA)-Code Division Multiple Access (CDMA) mode of the CAOS (i.e., Coded Access Optical Sensor) camera is demonstrated. Compared to the independent FDMA and CDMA modes, the FDMA-CDMA mode has a novel high-security space-time-frequency triple signal encoding design for robust, faster, linear irradiance extraction at a moderately High Dynamic Range (HDR). Specifically, this hybrid mode simultaneously combines the linear HDR strength of the FDMA-mode Fast Fourier Transform (FFT) Digital Signal Processing (DSP)-based spectrum analysis with the high Signal-to-Noise Ratio (SNR) provided by the simultaneous photodetection of many CAOS pixels in the CDMA mode. In particular, the demonstrated FDMA-CDMA mode with P FDMA channels provides P times faster camera operation than the equivalent linear HDR Frequency Modulation (FM)-CDMA mode. The active FDMA-CDMA mode CAOS camera operation is also demonstrated using P = 3 LED light sources, each with its unique optical spectral content and driven by its own independent FDMA frequency. This active CAOS mode, matched to the spectral signature of the illuminated target, allows simultaneous capture of P images without a tunable optical filter operated over P time-multiplexed slots.
Submitted 28 June, 2021;
originally announced July 2021.
-
FDMA-CDMA Mode CAOS Camera Demonstration using UV to NIR Full Spectrum
Authors:
Nabeel A. Riza,
Mohsin A. Mazhar
Abstract:
For the first time, the hybrid Frequency Division Multiple Access (FDMA)-Code Division Multiple Access (CDMA) mode of the CAOS (i.e., Coded Access Optical Sensor) camera is demonstrated. The FDMA-CDMA mode is a time-frequency double signal encoding design for robust and faster linear High Dynamic Range (HDR) image irradiance extraction. Specifically, it simultaneously combines the strength of the FDMA-mode linear HDR Fast Fourier Transform (FFT) Digital Signal Processing (DSP)-based spectrum analysis with the high Signal-to-Noise Ratio (SNR) photo-detection of many simultaneous CAOS pixels provided by the CDMA mode. The FDMA-CDMA mode with P FDMA channels provides faster camera operation than the linear HDR Frequency Modulation (FM)-CDMA mode. Visible band imaging experiments using a Digital Micromirror Device (DMD)-based CAOS camera demonstrate high-quality image recovery of a calibrated 64 dB, 6-patch HDR target using the FDMA-CDMA mode with P = 4 channels, versus the CDMA and FM-CDMA CAOS modes, which limit dynamic range and speed, respectively. The simultaneous dual image capture capability of the FDMA-CDMA mode is also demonstrated for the first time over the full Ultraviolet (UV) to Near Infrared (NIR) spectrum, 350 to 1800 nm, using Silicon (Si) and Germanium (Ge) point photo-detectors.
Submitted 6 January, 2021;
originally announced January 2021.
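The abstract above relies on FDMA-mode spectrum analysis: several channels share one photo-detector signal, each at its own frequency, and a Fourier transform separates them. The sketch below illustrates only that frequency-division idea, in plain Python. It is a toy model and not the authors' processing chain: it assumes ideal noise-free cosine carriers placed exactly on DFT bins, uses a direct DFT in place of an FFT, and omits the CDMA coding entirely. All function names and values are hypothetical.

```python
import cmath
import math


def fdma_signal(amplitudes, bins, n_samples):
    """Sum of cosine carriers, one per channel, each on its own DFT bin."""
    return [
        sum(a * math.cos(2 * math.pi * k * n / n_samples)
            for a, k in zip(amplitudes, bins))
        for n in range(n_samples)
    ]


def dft_bin(signal, k):
    """Single DFT coefficient X[k] (direct evaluation, stands in for an FFT)."""
    n_samples = len(signal)
    return sum(s * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, s in enumerate(signal))


def recover_amplitudes(signal, bins):
    """For a real cosine on bin k (0 < k < N/2), |X[k]| equals a * N / 2."""
    n_samples = len(signal)
    return [2 * abs(dft_bin(signal, k)) / n_samples for k in bins]


# Hypothetical P = 3 channels with widely different strengths.
channel_amplitudes = [1.0, 0.01, 30.0]
channel_bins = [5, 9, 14]
detector = fdma_signal(channel_amplitudes, channel_bins, n_samples=64)
print([round(a, 6) for a in recover_amplitudes(detector, channel_bins)])
```

Each channel's amplitude is read off independently from its own frequency bin, which is why a weak channel can be recovered alongside a much stronger one in this idealized setting.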
-
CAOS Spectral Imager Design and Advanced High Dynamic Range FDMA-TDMA CAOS Mode
Authors:
Mohsin A. Mazhar,
Nabeel A. Riza
Abstract:
In the first part of the paper, a CAOS line camera is introduced for spectral imaging of one-dimensional (1-D) or line targets. The proposed spectral camera uses both a diffraction grating and a cylindrical lens optics system to provide line imaging along the line pixels direction of the image axis and Fourier transforming operations in the orthogonal direction to provide optical spectrum analysis of the line pixels. The imager incorporates the Digital Micromirror Device (DMD)-based Coded Access Optical Sensor (CAOS) structure. The design includes a line-by-line scan option to enable two-dimensional (2-D) spectral imaging. For the first time, line-style spectral imaging is demonstrated using a 2850 K color temperature white light target illumination source along with visible band color bandpass filters and a moving mechanical pinhole to simulate a line target whose individual pixels along 1-D have unique spectral content. A ~412 nm to ~732 nm input target spectrum is measured using a 38 by 52 CAOS pixels spatial sampling grid, providing a test image line of 38 pixels with each pixel providing a designed spectral resolution of ~6.2 nm. The spectral image is generated using the robust Code Division Multiple Access (CDMA) mode of the camera. The second part of the paper demonstrates for the first time the High Dynamic Range (HDR) operation of the Frequency Division Multiple Access (FDMA)-Time Division Multiple Access (TDMA) mode of the CAOS camera. The FDMA-TDMA mode also features HDR recovery like the Frequency Modulation (FM)-TDMA mode, although at a much faster imaging rate and a higher Signal-to-Noise Ratio (SNR), as more than one CAOS pixel is extracted at a time.
Submitted 20 February, 2021; v1 submitted 11 December, 2020;
originally announced December 2020.
-
96 dB Linear High Dynamic Range CAOS Spectrometer Demonstration
Authors:
Mohsin A. Mazhar,
Nabeel A. Riza
Abstract:
For the first time, a CAOS (i.e., Coded Access Optical Sensor) spectrometer is demonstrated. The implemented design uses a reflective diffraction grating and a Digital Micromirror Device (DMD) operating in time-frequency CAOS modes, in combination with a large area point photo-detector, to enable highly programmable linear High Dynamic Range (HDR) spectrometry. Experiments are conducted with a 2850 K color temperature light bulb source and visible band color bandpass and high-pass filters as well as neutral density (ND) attenuation filters. A ~369 nm to ~715 nm input light source spectrum is measured with a designed ~1 nm spectral resolution. Using the optical filters and different CAOS modes, namely, Code Division Multiple Access (CDMA), Frequency Modulation (FM)-CDMA, and FM-Time Division Multiple Access (TDMA) modes, progressively improving spectrometer linear dynamic ranges of 28 dB, 50 dB, and 96.2 dB, respectively, are measured. Applications for the linear HDR CAOS spectrometer include materials inspection in biomedicine, foods, forensics, and pharmaceuticals.
Submitted 9 November, 2020;
originally announced November 2020.
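To give a feel for the dynamic range figures quoted in the abstract above (28 dB, 50 dB, and 96.2 dB), the sketch below converts between decibels and a maximum-to-minimum signal ratio. It assumes the 20·log10 convention commonly used for camera and photo-detector dynamic range; the abstract itself does not state which convention is used, so treat the resulting ratios as illustrative. The function names are hypothetical.

```python
import math


def ratio_to_db(max_signal, min_signal):
    """Dynamic range in dB, ASSUMING the 20*log10(max/min) convention."""
    if max_signal <= 0 or min_signal <= 0:
        raise ValueError("signals must be positive")
    return 20.0 * math.log10(max_signal / min_signal)


def db_to_ratio(db):
    """Inverse of ratio_to_db under the same assumed convention."""
    return 10.0 ** (db / 20.0)


# Dynamic ranges quoted in the abstract for the three CAOS modes.
for mode, db in [("CDMA", 28.0), ("FM-CDMA", 50.0), ("FM-TDMA", 96.2)]:
    print(f"{mode}: {db} dB -> max/min ratio of about {db_to_ratio(db):.3g}")
```

Under this convention every additional 20 dB corresponds to a tenfold increase in the ratio between the strongest and weakest measurable signal.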
-
First Demonstration of the Active-Mode CAOS Camera
Authors:
Nabeel A. Riza,
Mohsin A. Mazhar
Abstract:
For the first time, the active-mode CAOS (i.e., Coded Access Optical Sensor) camera is demonstrated. The demonstrated design uses a hybrid approach to both optical device engagement and time-frequency CAOS mode operations. Specifically, time-frequency modulation of both the target illumination light source and the Digital Micromirror Device (DMD) combine to deliver the Frequency Modulation (FM)-Code Division Multiple Access (CDMA) mode of the CAOS camera. Using a 39.6 klux white light 32 kHz FM LED source combined with a 1 kHz bit rate, 4096-bit Walsh sequence CDMA code applied via the DMD, 58 x 70 CAOS pixels imaging with a linear Dynamic Range (DR) near 60 dB is achieved for a 36-patch calibrated high DR white light target. Applications for the active-mode CAOS camera are numerous and include indoor full spectrum food inspection, where a linear DR camera can play an important role in accurate measurements.
Submitted 20 May, 2020;
originally announced May 2020.
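The abstract above mentions Walsh sequence CDMA codes applied to CAOS pixels. The following sketch shows the general principle of code-division multiplexing with orthogonal Walsh codes in plain Python: each pixel's irradiance is multiplied by its own code, all pixels sum onto a single detector, and correlation with each code recovers each pixel. It is a simplified illustration under stated assumptions, not the authors' implementation: it uses bipolar (+1/-1) codes, very short sequences, and no noise or FM carrier. All names are hypothetical.

```python
def walsh_codes(n):
    """Rows of an n x n Hadamard matrix (Sylvester construction, n = 2^m)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def encode(irradiances, codes):
    """Single-detector signal: per time bit, the code-weighted pixel sum."""
    n_bits = len(codes[0])
    return [sum(i * c[t] for i, c in zip(irradiances, codes))
            for t in range(n_bits)]


def decode(signal, codes):
    """Correlate the detector signal with each code; orthogonality isolates pixels."""
    n_bits = len(signal)
    return [sum(s * b for s, b in zip(signal, code)) / n_bits for code in codes]


# Three hypothetical pixels, skipping the all-ones code at index 0.
pixel_codes = walsh_codes(8)[1:4]
pixel_irradiances = [5.0, 0.25, 120.0]
detector_signal = encode(pixel_irradiances, pixel_codes)
print(decode(detector_signal, pixel_codes))
```

Because distinct Walsh codes have zero cross-correlation, each pixel's value is recovered without interference from the others in this noise-free model.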
-
Content-Based Image Retrieval Based on Late Fusion of Binary and Local Descriptors
Authors:
Nouman Ali,
Danish Ali Mazhar,
Zeshan Iqbal,
Rehan Ashraf,
Jawad Ahmed,
Farrukh Zeeshan Khan
Abstract:
One of the challenges in Content-Based Image Retrieval (CBIR) is to reduce the semantic gap between low-level features and high-level semantic concepts. In CBIR, images are represented in a feature space, and retrieval performance depends on the type of feature representation selected. Late fusion, also known as visual words integration, is applied to enhance the performance of image retrieval. Recent advances in image retrieval have shifted the focus of research towards binary descriptors, as they are reported to be computationally efficient. In this paper, we investigate the late fusion of Fast Retina Keypoint (FREAK) and Scale Invariant Feature Transform (SIFT). The late fusion of binary and local descriptors is selected because, among binary descriptors, FREAK has shown good results in classification-based problems, while SIFT is robust to translation, scaling, rotation, and small distortions. The late fusion of FREAK and SIFT integrates the strengths of both feature descriptors for effective image retrieval. Experimental results and comparisons show that the proposed late fusion enhances the performance of image retrieval.
Submitted 24 March, 2017;
originally announced March 2017.
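The abstract above describes late fusion as visual words integration of a binary descriptor (FREAK) and a local descriptor (SIFT). One common reading of that idea is sketched below in plain Python: each descriptor type is quantized against its own visual vocabulary, and the two resulting histograms are concatenated into a single image signature. This is an assumption about the general technique, not the paper's exact pipeline. Descriptor extraction is out of scope, binary descriptors are modeled as small integers compared by Hamming distance, and all names, vocabularies, and values are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def sq_euclidean(a, b):
    """Squared Euclidean distance between two float descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bovw_histogram(descriptors, vocabulary, dist):
    """Normalized bag-of-visual-words histogram via nearest-word assignment."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        word = min(range(len(vocabulary)), key=lambda k: dist(d, vocabulary[k]))
        hist[word] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def late_fusion(binary_descs, binary_vocab, float_descs, float_vocab):
    """Concatenate the per-descriptor-type histograms into one signature."""
    return (bovw_histogram(binary_descs, binary_vocab, hamming)
            + bovw_histogram(float_descs, float_vocab, sq_euclidean))


# Toy stand-ins for FREAK-like (binary) and SIFT-like (float) descriptors.
signature = late_fusion(
    [0b0001, 0b1110, 0b0111], [0b0000, 0b1111],
    [(0.1, 0.2), (0.9, 0.8)], [(0.0, 0.0), (1.0, 1.0)],
)
print(signature)
```

Keeping the two vocabularies separate lets each descriptor type use its natural distance measure before the histograms are combined.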