-
Towards Input-Convex Neural Network Modeling for Battery Optimization in Power Systems
Authors:
Arash Omidi,
Tanmay Mishra,
Mads R. Almassalkhi
Abstract:
Battery energy storage systems (BESS) play an increasingly vital role in integrating renewable generation into power grids due to their ability to dynamically balance supply. Grid-tied batteries typically employ power converters, where part-load efficiencies vary non-linearly. While this non-linearity can be modeled with high accuracy, it poses challenges for optimization, particularly in ensuring computational tractability. In this paper, we consider a non-linear BESS formulation based on the Energy Reservoir Model (ERM). A data-driven approach is introduced with the input-convex neural network (ICNN) to approximate the nonlinear efficiency with a convex function. The epigraph of the convex function is used to engender a convex program for battery ERM optimization. This relaxed ICNN method is applied to two battery optimization use-cases: PV smoothing and revenue maximization, and it is compared with three other ERM formulations (nonlinear, linear, and mixed-integer). Specifically, ICNN-based methods appear to be promising for future battery optimization with desirable feasibility and optimality outcomes across both use-cases.
Submitted 11 October, 2024;
originally announced October 2024.
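As a rough illustration of the input-convex idea named in the abstract above, the following is a minimal sketch of a one-hidden-layer ICNN on a scalar input. It is not the authors' model: the layer sizes, the random weights, and the names `make_icnn` and `icnn` are assumptions made for this example. The only structural point it shows is that keeping the weights on the hidden activations non-negative makes the output convex in the input.

```python
import random


def relu(v):
    """Rectified linear unit, a convex and non-decreasing activation."""
    return v if v > 0.0 else 0.0


def make_icnn(hidden=8, seed=0):
    """Build hypothetical random parameters for a tiny scalar-input ICNN."""
    rng = random.Random(seed)
    return {
        "w0": [rng.uniform(-1, 1) for _ in range(hidden)],  # input -> hidden, any sign
        "b0": [rng.uniform(-1, 1) for _ in range(hidden)],
        # hidden -> output weights are forced non-negative to preserve convexity
        "wz": [abs(rng.uniform(-1, 1)) for _ in range(hidden)],
        "wx": rng.uniform(-1, 1),  # direct input -> output passthrough, any sign
        "b1": rng.uniform(-1, 1),
    }


def icnn(params, x):
    """Evaluate the network: non-negative sum of ReLU(affine) terms plus an affine term."""
    z = [relu(w * x + b) for w, b in zip(params["w0"], params["b0"])]
    hidden_part = sum(wz * zi for wz, zi in zip(params["wz"], z))
    return hidden_part + params["wx"] * x + params["b1"]


params = make_icnn()
a, b = -2.0, 3.0
# Midpoint convexity check: f((a+b)/2) <= (f(a)+f(b))/2
midpoint_ok = icnn(params, (a + b) / 2) <= (icnn(params, a) + icnn(params, b)) / 2 + 1e-9
```

Because the learned function is convex, a constraint of the form `t >= icnn(params, x)` describes its epigraph, which is a convex set; this is the kind of relaxation the abstract refers to.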
-
Towards EMG-to-Speech with a Necklace Form Factor
Authors:
Peter Wu,
Ryan Kaveh,
Raghav Nautiyal,
Christine Zhang,
Albert Guo,
Anvitha Kachinthaya,
Tavish Mishra,
Bohan Yu,
Alan W Black,
Rikky Muller,
Gopala Krishna Anumanchipalli
Abstract:
Electrodes for decoding speech from electromyography (EMG) are typically placed on the face, requiring adhesives that are inconvenient and skin-irritating if used regularly. We explore a different device form factor, where dry electrodes are placed around the neck instead. 11-word, multi-speaker voiced EMG classifiers trained on data recorded with this device achieve 92.7% accuracy. Ablation studies reveal the importance of having more than two electrodes on the neck, and phonological analyses reveal similar classification confusions between neck-only and neck-and-face form factors. Finally, speech-EMG correlation experiments demonstrate a linear relationship between many EMG spectrogram frequency bins and self-supervised speech representation dimensions.
Submitted 31 July, 2024;
originally announced July 2024.
-
A Graphical Approach For Brain Haemorrhage Segmentation
Authors:
Ninad Mehendale,
Pragya Gupta,
Nishant Rajadhyaksha,
Ansh Dagha,
Mihir Hundiwala,
Aditi Paretkar,
Sakshi Chavan,
Tanmay Mishra
Abstract:
Haemorrhaging of the brain is the leading cause of death in people between the ages of 15 and 24 and the third leading cause of death in people older than that. Computed tomography (CT) is an imaging modality used to diagnose neurological emergencies, including stroke and traumatic brain injury. Recent advances in deep learning and image processing have utilised different modalities, such as CT scans, to help automate the detection and segmentation of brain haemorrhages. In this paper, we propose a novel implementation of an architecture that combines traditional Convolutional Neural Networks (CNN) with Graph Neural Networks (GNN) to produce a holistic model for brain haemorrhage segmentation. GNNs work on the principle of neighbourhood aggregation, thus providing a reliable estimate of global structures present in images. GNNs also work with few layers and therefore require fewer parameters. With our implementation, we achieved a Dice coefficient of around 0.81 with limited data.
Submitted 14 February, 2022;
originally announced February 2022.
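For readers unfamiliar with the Dice coefficient reported in the abstract above, the following is a minimal sketch of how it is commonly computed for binary segmentation masks. The function name `dice_coefficient` and the flat-list mask representation are assumptions for this example, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for flat 0/1 masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks, flattened row by row.
predicted = [1, 1, 0, 0, 1, 0]
ground_truth = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted, ground_truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they share no foreground pixels.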
-
Transfer Learning From Sound Representations For Anger Detection in Speech
Authors:
Mohamed Ezzeldin A. ElShaer,
Scott Wisdom,
Taniya Mishra
Abstract:
In this work, we train fully convolutional networks to detect anger in speech. Since training these deep architectures requires large amounts of data and the size of emotion datasets is relatively small, we use transfer learning. However, unlike previous approaches that use speech or emotion-based tasks for the source model, we instead use SoundNet, a fully convolutional neural network trained multimodally on a massive video dataset to classify audio, with ground-truth labels provided by vision-based classifiers. As a result of transfer learning from SoundNet, our trained anger detection model improves performance and generalizes well on a variety of acted, elicited, and natural emotional speech datasets. We also test the cross-lingual effectiveness of our model by evaluating our English-trained model on Mandarin Chinese speech emotion data. Furthermore, our proposed system has low latency suitable for real-time applications, only requiring 1.2 seconds of audio to make a reliable classification.
Submitted 6 February, 2019;
originally announced February 2019.