Showing 1–7 of 7 results for author: Çakır, E

  1. arXiv:2506.19929

    cs.LG

    A Comparative Analysis of Reinforcement Learning and Conventional Deep Learning Approaches for Bearing Fault Diagnosis

    Authors: Efe Çakır, Patrick Dumond

    Abstract: Bearing faults in rotating machinery can lead to significant operational disruptions and maintenance costs. Modern methods for bearing fault diagnosis rely heavily on vibration analysis and machine learning techniques, which often require extensive labeled data and may not adapt well to dynamic environments. This study explores the feasibility of reinforcement learning (RL), specifically Deep Q-Ne…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 5 pages, 5 figures. To appear in the Proceedings of the Canadian Society for Mechanical Engineering (CSME) Congress 2025

    ACM Class: I.2.6

  2. arXiv:2106.14787

    eess.AS

    Mobile Microphone Array Speech Detection and Localization in Diverse Everyday Environments

    Authors: Pasi Pertilä, Emre Cakir, Aapo Hakala, Eemi Fagerlund, Tuomas Virtanen, Archontis Politis, Antti Eronen

    Abstract: Joint sound event localization and detection (SELD) is an integral part of developing context awareness into communication interfaces of mobile robots, smartphones, and home assistants. For example, an automatic audio focus for video capture on a mobile phone requires robust detection of relevant acoustic events around the device and their direction. Existing SELD approaches have been evaluated us…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: To be published in the proceedings of the 29th European Signal Processing Conference, EUSIPCO 2021

  3. arXiv:2007.04660

    cs.SD cs.LG cs.MM eess.AS

    Multi-task Regularization Based on Infrequent Classes for Audio Captioning

    Authors: Emre Çakır, Konstantinos Drossos, Tuomas Virtanen

    Abstract: Audio captioning is a multi-modal task, focusing on using natural language for describing the contents of general audio. Most audio captioning methods are based on deep neural networks, employing an encoder-decoder scheme and a dataset with audio clips and corresponding natural language descriptions (i.e. captions). A significant challenge for audio captioning is the distribution of words in the c…

    Submitted 9 July, 2020; originally announced July 2020.

  4. arXiv:1808.05777

    eess.AS cs.LG cs.SD

    Unsupervised adversarial domain adaptation for acoustic scene classification

    Authors: Shayan Gharib, Konstantinos Drossos, Emre Çakir, Dmitriy Serdyuk, Tuomas Virtanen

    Abstract: A general problem in acoustic scene classification task is the mismatched conditions between training and testing data, which significantly reduces the performance of the developed methods on classification accuracy. As a countermeasure, we present the first method of unsupervised adversarial domain adaptation for acoustic scene classification. We employ a model pre-trained on data from one set of…

    Submitted 17 August, 2018; originally announced August 2018.

  5. arXiv:1805.03647

    cs.SD cs.LG eess.AS stat.ML

    End-to-End Polyphonic Sound Event Detection Using Convolutional Recurrent Neural Networks with Learned Time-Frequency Representation Input

    Authors: Emre Çakır, Tuomas Virtanen

    Abstract: Sound event detection systems typically consist of two stages: extracting hand-crafted features from the raw audio waveform, and learning a mapping between these features and the target sound events using a classifier. Recently, the focus of sound event detection research has been mostly shifted to the latter stage using standard features such as mel spectrogram as the input for classifiers such a…

    Submitted 9 May, 2018; originally announced May 2018.

    Comments: Accepted to IJCNN 2018

  6. arXiv:1706.02047

    cs.SD cs.LG

    Stacked Convolutional and Recurrent Neural Networks for Bird Audio Detection

    Authors: Sharath Adavanne, Konstantinos Drossos, Emre Çakır, Tuomas Virtanen

    Abstract: This paper studies the detection of bird calls in audio segments using stacked convolutional and recurrent neural networks. Data augmentation by blocks mixing and domain adaptation using a novel method of test mixing are proposed and evaluated in regard to making the method robust to unseen data. The contributions of two kinds of acoustic features (dominant frequency and log mel-band energy) and t…

    Submitted 7 June, 2017; originally announced June 2017.

    Comments: Accepted for European Signal Processing Conference 2017

  7. Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

    Authors: Emre Çakır, Giambattista Parascandolo, Toni Heittola, Heikki Huttunen, Tuomas Virtanen

    Abstract: Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNN) are able to extract higher level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer term temporal context in the audio signals. CNNs an…

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: Accepted for IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Sound Scene and Event Analysis