-
Inter-Feature-Map Differential Coding of Surveillance Video
Authors:
Kei Iino,
Miho Takahashi,
Hiroshi Watanabe,
Ichiro Morinaga,
Shohei Enomoto,
Xu Shi,
Akira Sakamoto,
Takeharu Eda
Abstract:
In Collaborative Intelligence, a deep neural network (DNN) is partitioned and deployed at the edge and the cloud for bandwidth saving and system optimization. When a model input is an image, it has been confirmed that the intermediate feature map, the output from the edge, can be smaller than the input data size. However, its effectiveness has not been reported when the input is a video. In this study, we propose a method to compress the feature map of surveillance videos by applying inter-feature-map differential coding (IFMDC). IFMDC shows a compression ratio comparable to, or better than, that of HEVC applied to the input video when the accuracy reduction is small. Our method is especially effective for videos that are sensitive to image quality degradation when HEVC is applied.
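The abstract does not spell out how IFMDC is implemented, so the following is only a minimal sketch of the general idea of differential coding between consecutive feature maps. The function names, the uniform quantizer, and the `step` parameter are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def encode_differential(feature_maps, step=0.1):
    """Hypothetical encoder: quantize the residual between each feature map
    and the previous *reconstructed* map (closed loop, so errors do not drift)."""
    symbols = []
    prev = np.zeros_like(feature_maps[0], dtype=np.float64)
    for fmap in feature_maps:
        q = np.round((fmap - prev) / step).astype(np.int32)  # integer residual symbols
        symbols.append(q)
        prev = prev + q * step  # mirror what the decoder will reconstruct
    return symbols

def decode_differential(symbols, step=0.1):
    """Hypothetical decoder: accumulate the dequantized residuals."""
    recon = []
    prev = np.zeros(symbols[0].shape, dtype=np.float64)
    for q in symbols:
        prev = prev + q * step
        recon.append(prev.copy())
    return recon
```

For a largely static surveillance scene, most residual symbols after the first map are expected to be zero or near zero, which is what would make them cheap to entropy-code. The entropy coder itself is omitted here.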
Submitted 1 November, 2024;
originally announced November 2024.
-
Linear Least Squares Estimation of Fiber-Longitudinal Optical Power Profile
Authors:
Takeo Sasai,
Minami Takahashi,
Masanori Nakamura,
Etsushi Yamazaki,
Yoshiaki Kisaka
Abstract:
This paper presents a linear least squares method for fiber-longitudinal power profile estimation (PPE), which estimates an optical signal power distribution throughout a fiber-optic link at a coherent receiver. The method finds the global optimum in least-squares estimation of longitudinal power profiles, thus closely matching true optical power profiles and locating loss anomalies in a link with high spatial resolution. Experimental results show that the method achieves accurate PPE with an RMS error from OTDR of 0.18 dB. Consequently, it successfully identifies a loss anomaly as small as 0.77 dB, demonstrating the potential of a coherent receiver in locating even splice and connector losses. The method is also evaluated under a WDM condition with optimal system fiber launch power, highlighting its feasibility for use in operations. Furthermore, a fundamental limit for stable estimation and spatial resolution of least-squares-based PPE is quantitatively discussed in relation to the ill-posedness of PPE by evaluating the condition number of a nonlinear perturbation matrix.
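As a generic illustration of linear least squares and of why the condition number matters for stable estimation, the sketch below solves an overdetermined linear system with NumPy. The matrix `G` and the vector `y` are hypothetical placeholders. The paper's actual nonlinear perturbation matrix is not described in the abstract and is not reproduced here.

```python
import numpy as np

def estimate_profile(G, y):
    """Solve min_x ||G x - y||_2 and report the 2-norm condition number of G.

    G (hypothetical): maps an unknown profile x to the observed samples y.
    A large condition number indicates an ill-posed problem in which
    noise in y is strongly amplified in the estimate.
    """
    x_hat, *_ = np.linalg.lstsq(G, y, rcond=None)
    return x_hat, np.linalg.cond(G)
```

With noise-free data and a full-column-rank `G`, the estimate recovers the true vector up to floating-point error. As the columns of `G` become nearly collinear, the condition number grows and the estimate becomes increasingly sensitive to noise.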
Submitted 8 November, 2023; v1 submitted 7 October, 2023;
originally announced October 2023.
-
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection
Authors:
Yuichiro Koyama,
Kazuhide Shigemi,
Masafumi Takahashi,
Kazuki Shimada,
Naoya Takahashi,
Emiru Tsunoo,
Shusuke Takahashi,
Yuki Mitsufuji
Abstract:
Recording and annotating real sound events for a sound event localization and detection (SELD) task is time-consuming, and data augmentation techniques are often favored when the amount of data is limited. However, how to augment the spatial information in a dataset, including unlabeled directional interference events, remains an open research question. Furthermore, directional interference events make it difficult to accurately extract spatial characteristics from target sound events. To address this problem, we propose an impulse response simulation framework (IRS) that augments spatial characteristics using simulated room impulse responses (RIR). RIRs corresponding to a microphone array assumed to be placed in various rooms are accurately simulated, and the source signals of the target sound events are extracted from a mixture. The simulated RIRs are then convolved with the extracted source signals to obtain an augmented multi-channel training dataset. Evaluation results obtained using the TAU-NIGENS Spatial Sound Events 2021 dataset show that the IRS contributes to improving the overall SELD performance. Additionally, we conducted an ablation study to discuss the contribution and need for each component within the IRS.
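The convolution step described above can be sketched as follows. This is only an illustration of convolving one source signal with one RIR per microphone channel. The function name and array layout are assumptions, and the RIR simulation and source extraction stages of the IRS are not shown.

```python
import numpy as np

def spatialize(source, rirs):
    """Convolve a mono source with one impulse response per channel.

    source: shape (T,)   extracted source signal (hypothetical input)
    rirs:   shape (C, L) one simulated RIR per microphone channel
    returns shape (C, T + L - 1) multi-channel signal
    """
    return np.stack([np.convolve(source, h) for h in rirs])
```

For example, `spatialize(np.array([1.0, 2.0, 3.0]), np.array([[1.0, 0.0], [0.0, 1.0]]))` returns the source unchanged on the first channel and delayed by one sample on the second. For long signals an FFT-based convolution would usually be faster than `np.convolve`.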
Submitted 28 April, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection
Authors:
Kazuki Shimada,
Naoya Takahashi,
Yuichiro Koyama,
Shusuke Takahashi,
Emiru Tsunoo,
Masafumi Takahashi,
Yuki Mitsufuji
Abstract:
This report describes our systems submitted to the DCASE2021 challenge task 3: sound event localization and detection (SELD) with directional interference. Our previous system based on activity-coupled Cartesian direction of arrival (ACCDOA) representation enables us to solve a SELD task with a single target. This ACCDOA-based system with efficient network architecture called RD3Net and data augmentation techniques outperformed state-of-the-art SELD systems in terms of localization and location-dependent detection. Using the ACCDOA-based system as a base, we perform model ensembles by averaging outputs of several systems trained with different conditions such as input features, training folds, and model architectures. We also use the event independent network v2 (EINV2)-based system to increase the diversity of the model ensembles. To generalize the models, we further propose impulse response simulation (IRS), which generates simulated multi-channel signals by convolving simulated room impulse responses (RIRs) with source signals extracted from the original dataset. Our systems significantly improved over the baseline system on the development dataset.
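Output averaging across systems can be sketched in a few lines. The array shapes and the function name are assumptions for illustration. The report's actual output format and any post-processing are not described in the abstract.

```python
import numpy as np

def average_outputs(outputs):
    """Average same-shaped prediction arrays from several trained systems.

    outputs: list of arrays, e.g. each of shape (frames, 3, classes) for
    Cartesian direction vectors (shape is a hypothetical example).
    """
    stacked = np.stack(outputs, axis=0)  # (num_systems, ...)
    return stacked.mean(axis=0)
```

Averaging only makes sense when every system emits predictions in the same representation and on the same time grid, so systems with a different output format would need to be converted first.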
Submitted 20 June, 2021;
originally announced June 2021.
-
Learning Global and Local Features of Normal Brain Anatomy for Unsupervised Abnormality Detection
Authors:
Kazuma Kobayashi,
Ryuichiro Hataya,
Yusuke Kurose,
Amina Bolatkan,
Mototaka Miyake,
Hirokazu Watanabe,
Masamichi Takahashi,
Jun Itami,
Tatsuya Harada,
Ryuji Hamamoto
Abstract:
In real-world clinical practice, overlooking unanticipated findings can result in serious consequences. However, supervised learning, which is the foundation for the current success of deep learning, only encourages models to identify abnormalities that are defined in datasets in advance. Therefore, abnormality detection must be implemented in medical images that are not limited to a specific disease category. In this study, we demonstrate an unsupervised learning framework for pixel-wise abnormality detection in brain magnetic resonance imaging captured from a patient population with metastatic brain tumor. Our concept is as follows: If an image reconstruction network can faithfully reproduce the global features of normal anatomy, then the abnormal lesions in unseen images can be identified based on the local difference from those reconstructed as normal by a discriminative network. Both networks are trained on a dataset comprising only normal images without labels. In addition, we devise a metric to evaluate the anatomical fidelity of the reconstructed images and confirm that the overall detection performance is improved when the image reconstruction network achieves a higher score. For evaluation, clinically significant abnormalities are comprehensively segmented. The results show that the area under the receiver operating characteristics curve values for metastatic brain tumors, extracranial metastatic tumors, postoperative cavities, and structural changes are 0.78, 0.61, 0.91, and 0.60, respectively.
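For reference, the area under the ROC curve for pixel-wise anomaly scores can be computed from ranks, as in the sketch below. This is a generic rank-based (Mann-Whitney) formulation and not the authors' evaluation code. The function name and inputs are assumptions.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC. scores: anomaly scores; labels: 1 = abnormal pixel."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    # 1-based ranks; tied scores share the mean rank of their group
    _, inv, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    avg_rank = last_rank - (counts - 1) / 2.0
    ranks = avg_rank[inv]
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```

A value of 0.5 corresponds to chance-level ranking of abnormal versus normal pixels, and 1.0 to perfect separation.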
Submitted 8 May, 2021; v1 submitted 26 May, 2020;
originally announced May 2020.