Search | arXiv e-print repository

Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

Authors: Subash Khanal, Srikumar Sastry, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs

Abstract: We present Sat2Sound, a multimodal representation learning framework for soundscape mapping, designed to predict the distribution of sounds at any location on Earth. Existing methods for this task rely on satellite image and paired geotagged audio samples, which often fail to capture the diversity of sound sources at a given location. To address this limitation, we enhance existing datasets by leveraging a Vision-Language Model (VLM) to generate semantically rich soundscape descriptions for locations depicted in satellite images. Our approach incorporates contrastive learning across audio, audio captions, satellite images, and satellite image captions. We hypothesize that there is a fixed set of soundscape concepts shared across modalities. To this end, we learn a shared codebook of soundscape concepts and represent each sample as a weighted average of these concepts. Sat2Sound achieves state-of-the-art performance in cross-modal retrieval between satellite image and audio on two datasets: GeoSound and SoundingEarth. Additionally, building on Sat2Sound's ability to retrieve detailed soundscape captions, we introduce a novel application: location-based soundscape synthesis, which enables immersive acoustic experiences. Our code and models will be publicly available.

Submitted 19 May, 2025; originally announced May 2025.

arXiv:2502.19781 [pdf, other]

RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings

Authors: Aayush Dhakal, Srikumar Sastry, Subash Khanal, Adeel Ahmad, Eric Xing, Nathan Jacobs

Abstract: The choice of representation for geographic location significantly impacts the accuracy of models for a broad range of geospatial tasks, including fine-grained species classification, population density estimation, and biome classification. Recent works like SatCLIP and GeoCLIP learn such representations by contrastively aligning geolocation with co-located images. While these methods work exceptionally well, in this paper, we posit that the current training strategies fail to fully capture the important visual features. We provide an information-theoretic perspective on why the resulting embeddings from these methods discard crucial visual information that is important for many downstream tasks. To solve this problem, we propose a novel retrieval-augmented strategy called RANGE. We build our method on the intuition that the visual features of a location can be estimated by combining the visual features from multiple similar-looking locations. We evaluate our method across a wide variety of tasks. Our results show that RANGE outperforms the existing state-of-the-art models with significant margins in most tasks. We show gains of up to 13.1% on classification tasks and 0.145 $R^2$ on regression tasks. All our code and models will be made available at: https://github.com/mvrl/RANGE.

Submitted 3 April, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

Comments: Accepted to CVPR 2025

arXiv:2502.06837 [pdf]

Comparison of CNN-based deep learning architectures for unsteady CFD acceleration on small datasets

Authors: Sangam Khanal, Shilaj Baral, Joongoo Jeon

Abstract: CFD acceleration for virtual nuclear reactors or digital twin technology is a primary goal in the nuclear industry. This study compares advanced convolutional neural network (CNN) architectures for accelerating unsteady computational fluid dynamics (CFD) simulations using small datasets based on a challenging natural convection flow dataset. Advanced architectures such as autoencoders, UNet, and ConvLSTM-UNet were evaluated under identical conditions to determine their predictive accuracy and robustness in autoregressive time-series predictions. ConvLSTM-UNet consistently outperformed other models, particularly in difference value calculation, achieving lower maximum errors and stable residuals. However, error accumulation remains a challenge, limiting reliable predictions to approximately 10 timesteps. This highlights the need for enhanced strategies to improve long-term prediction stability. The novelty of this work lies in its fair comparison of state-of-the-art CNN models within the RePIT framework, demonstrating their potential for accelerating CFD simulations while identifying limitations under small data conditions. Future research will focus on exploring alternative models, such as graph neural networks and implicit neural representations. These efforts aim to develop a robust hybrid approach for long-term unsteady CFD acceleration, contributing to practical applications in virtual nuclear reactors.

Submitted 5 February, 2025; originally announced February 2025.

Comments: 9 figures, 3 tables

arXiv:2412.05996 [pdf]

Paddy Disease Detection and Classification Using Computer Vision Techniques: A Mobile Application to Detect Paddy Disease

Authors: Bimarsha Khanal, Paras Poudel, Anish Chapagai, Bijan Regmi, Sitaram Pokhrel, Salik Ram Khanal

Abstract: Plant diseases significantly impact our food supply, causing problems for farmers, economies reliant on agriculture, and global food security. Accurate and timely plant disease diagnosis is crucial for effective treatment and minimizing yield losses. Despite advancements in agricultural technology, a precise and early diagnosis remains a challenge, especially in underdeveloped regions where agriculture is crucial and agricultural experts are scarce. However, adopting Deep Learning applications can assist in accurately identifying diseases without needing plant pathologists. In this study, we evaluate the effectiveness of various computer vision models for detecting paddy diseases and propose the best deep learning-based disease detection system. Both classification and detection are tested and evaluated using the Paddy Doctor dataset, which contains over 20,000 annotated images of paddy leaves for disease diagnosis. For detection, a YOLOv8-based model was used, and CNN models and the Vision Transformer were used for disease classification. An average mAP50 of 69% was achieved for detection tasks, and the Vision Transformer classification accuracy was 99.38%. It was found that detection models are effective at identifying multiple diseases simultaneously with less computing power, whereas classification models, though computationally expensive, exhibit better performance for classifying single diseases. Additionally, a mobile application was developed to enable farmers to identify paddy diseases instantly. Experiments with the app showed encouraging results in utilizing the trained models for both disease classification and treatment guidance.

Submitted 8 December, 2024; originally announced December 2024.

Comments: 21 pages, 12 figures and 2 tables

arXiv:2411.00683 [pdf, other]

TaxaBind: A Unified Embedding Space for Ecological Applications

Authors: Srikumar Sastry, Subash Khanal, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs

Abstract: We present TaxaBind, a unified embedding space for characterizing any species of interest. TaxaBind is a multimodal embedding space across six modalities: ground-level images of species, geographic location, satellite image, text, audio, and environmental features, useful for solving ecological problems. To learn this joint embedding space, we leverage ground-level images of species as a binding modality. We propose multimodal patching, a technique for effectively distilling the knowledge from various modalities into the binding modality. We construct two large datasets for pretraining: iSatNat with species images and satellite images, and iSoundNat with species images and audio. Additionally, we introduce TaxaBench-8k, a diverse multimodal dataset with six paired modalities for evaluating deep learning models on ecological tasks. Experiments with TaxaBind demonstrate its strong zero-shot and emergent capabilities on a range of tasks including species classification, cross-modal retrieval, and audio classification. The datasets and models are made available at https://github.com/mvrl/TaxaBind.

Submitted 1 November, 2024; originally announced November 2024.

Comments: Accepted to WACV 2025

arXiv:2408.07050 [pdf, other]

PSM: Learning Probabilistic Embeddings for Multi-scale Zero-Shot Soundscape Mapping

Authors: Subash Khanal, Eric Xing, Srikumar Sastry, Aayush Dhakal, Zhexiao Xiong, Adeel Ahmad, Nathan Jacobs

Abstract: A soundscape is defined by the acoustic environment a person perceives at a location. In this work, we propose a framework for mapping soundscapes across the Earth. Since soundscapes involve sound distributions that span varying spatial scales, we represent locations with multi-scale satellite imagery and learn a joint representation among this imagery, audio, and text. To capture the inherent uncertainty in the soundscape of a location, we design the representation space to be probabilistic. We also fuse ubiquitous metadata (including geolocation, time, and data source) to enable learning of spatially and temporally dynamic representations of soundscapes. We demonstrate the utility of our framework by creating large-scale soundscape maps integrating both audio and text with temporal control. To facilitate future research on this task, we also introduce a large-scale dataset, GeoSound, containing over 300k geotagged audio samples paired with both low- and high-resolution satellite imagery. We demonstrate that our method outperforms the existing state-of-the-art on both GeoSound and the existing SoundingEarth dataset. Our dataset and code are available at https://github.com/mvrl/PSM.

Submitted 13 August, 2024; originally announced August 2024.

Comments: Accepted at ACM MM 2024

arXiv:2407.09672 [pdf, ps, other]

Mixed-View Panorama Synthesis using Geospatially Guided Diffusion

Authors: Zhexiao Xiong, Xin Xing, Scott Workman, Subash Khanal, Nathan Jacobs

Abstract: We introduce the task of mixed-view panorama synthesis, where the goal is to synthesize a novel panorama given a small set of input panoramas and a satellite image of the area. This contrasts with previous work which only uses input panoramas (same-view synthesis), or an input satellite image (cross-view synthesis). We argue that the mixed-view setting is the most natural to support panorama synthesis for arbitrary locations worldwide. A critical challenge is that the spatial coverage of panoramas is uneven, with few panoramas available in many regions of the world. We introduce an approach that utilizes diffusion-based modeling and an attention-based architecture for extracting information from all available input imagery. Experimental results demonstrate the effectiveness of our proposed method. In particular, our model can handle scenarios when the available panoramas are sparse or far from the location of the panorama we are attempting to synthesize. The project page is available at https://mixed-view.github.io

Submitted 1 June, 2025; v1 submitted 12 July, 2024; originally announced July 2024.

Comments: Accepted by Transactions on Machine Learning Research (TMLR). Project page: https://mixed-view.github.io

arXiv:2407.07098 [pdf, other]

Concurrent Geometry, Control, and Layout Optimization of Wave Energy Converter Farms in Probabilistic Irregular Waves using Surrogate Modeling

Authors: Saeed Azad, Daniel R. Herber, Suraj Khanal, Gaofeng Jia

Abstract: A promising direction towards improving the performance of wave energy converter (WEC) farms is to leverage a system-level integrated approach known as control co-design (CCD). A WEC farm CCD problem may entail decision variables associated with the geometric attributes, control parameters, and layout of the farm. However, solving the resulting optimization problem, which requires the estimation of hydrodynamic coefficients through numerical methods such as multiple scattering (MS), is computationally prohibitive. To mitigate this computational bottleneck, we construct data-driven surrogate models (SMs) using artificial neural networks in combination with concepts from many-body expansion. The resulting SMs, developed using an active learning strategy known as query by committee, are validated through a variety of methods to ensure acceptable performance in estimating the hydrodynamic coefficients, (energy-related) objective function, and decision variables. To rectify inherent errors in SMs, a hybrid optimization strategy is devised. It involves solving an optimization problem with a genetic algorithm and SMs to generate a starting point that will be used with a gradient-based optimizer and MS. The effectiveness of the proposed approach is demonstrated by solving a series of optimization problems with increasing levels of integration. For a layout optimization study, the framework offers a 91-fold increase in computational efficiency compared to MS. Previously unexplored investigations of much further complexity are also performed, leading to a concurrent geometry, control, and layout optimization of WEC devices in probabilistic irregular waves. The scalability of the method is evaluated by increasing the farm size to include 25 devices. The results indicate promising directions toward a practical framework for integrated WEC farm design with more tractable computational demands.

Submitted 17 June, 2024; originally announced July 2024.

Comments: 22 pages and 19 figures

arXiv:2405.15717 [pdf, other]

Integrated Design for Wave Energy Converter Farms: Assessing Plant, Control, Layout, and Site Selection Coupling in the Presence of Irregular Waves

Authors: Saeed Azad, Suraj Khanal, Daniel R. Herber, Gaofeng Jia

Abstract: A promising direction towards reducing the levelized cost of energy for wave energy converter (WEC) farms is to improve their performance. WEC design studies generally focus on a single design domain (e.g., geometry, control, or layout) to improve the farm's performance under simplifying assumptions, such as regular waves. This strategy, however, has resulted in design recommendations that are impractical or limited in scope because WEC farms are complex systems that exhibit strong coupling among geometry, control, and layout domains. In addition, the location of the candidate site, which has a large impact on the performance of the farm, is often overlooked. Motivated by some of the limitations observed in WEC literature, this study uses an integrated design framework, based on simultaneous control co-design (CCD) principles, to discuss the impact of site selection and wave type on WEC farm design. Interactions among plant, control, and layout are also investigated and discussed using a wide range of simulations and optimization studies. All of the studies were conducted using frequency-domain heaving cylinder WEC devices within a farm with a linear reactive controller in the presence of irregular probabilistic waves. The results provide high-level guidelines to help the WEC design community move toward an integrated design perspective.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 12 pages and 7 figures

arXiv:2405.10476 [pdf, other]

Analysis, Modeling and Design of Personalized Digital Learning Environment

Authors: Sanjaya Khanal, Shiva Raj Pokhrel

Abstract: This research analyzes, models and develops a novel Digital Learning Environment (DLE) fortified by the innovative Private Learning Intelligence (PLI) framework. The proposed PLI framework leverages federated machine learning (FL) techniques to autonomously construct and continuously refine personalized learning models for individual learners, ensuring robust privacy protection. Our approach is pivotal in advancing DLE capabilities, empowering learners to actively participate in personalized real-time learning experiences. The integration of PLI within a DLE also streamlines instructional design and development demands for personalized teaching/learning. We seek ways to establish a foundation for the seamless integration of FL into learning systems, offering a transformative approach to personalized learning in digital environments. Our implementation details and code are made public.

Submitted 16 May, 2024; originally announced May 2024.

Comments: IEEE Trans on Education, 2024

arXiv:2405.06794 [pdf, other]

Site-dependent Solutions of Wave Energy Converter Farms with Surrogate Models, Control Co-design, and Layout Optimization

Authors: Saeed Azad, Daniel R. Herber, Suraj Khanal, Gaofeng Jia

Abstract: Design of wave energy converter farms entails multiple domains that are coupled, and thus, their concurrent representation and consideration in early-stage design optimization has the potential to offer new insights and promising solutions with improved performance. Concurrent optimization of physical attributes (e.g., plant) and the control system design is often known as control co-design or CCD. To further improve performance, the layout of the farm must be carefully optimized in order to ensure that constructive effects from hydrodynamic interactions are leveraged, while destructive effects are avoided. The variations in the joint probability distribution of waves, stemming from distinct site locations, affect the farm's performance and can potentially influence decisions regarding optimal plant selection, control strategies, and layout configurations. Therefore, this paper undertakes a concurrent exploration of control co-design and layout optimization for a farm comprising five devices, modeled as heaving cylinders in the frequency domain, situated across four distinct site locations: Alaskan Coasts, East Coast, Pacific Islands, and West Coast. The challenge of efficiently and accurately estimating hydrodynamic coefficients within the optimization loop was mitigated through the application of surrogate modeling and many-body expansion principles. Results indicate the optimized solutions exhibit variations in plant, control, and layout for each candidate site, signifying the importance of system-level design with environmental considerations from the early stages of the design process.

Submitted 10 May, 2024; originally announced May 2024.

Comments: 9 pages, 9 figures

arXiv:2404.11720 [pdf, other]

GEOBIND: Binding Text, Image, and Audio through Satellite Images

Authors: Aayush Dhakal, Subash Khanal, Srikumar Sastry, Adeel Ahmad, Nathan Jacobs

Abstract: In remote sensing, we are interested in modeling various modalities for some geographic location. Several works have focused on learning the relationship between a location and type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the location. In this work, we present a deep-learning model, GeoBind, that can infer multiple modalities, specifically text, image, and audio, from satellite imagery of a location. To do this, we use satellite images as the binding element and contrastively align all other modalities to the satellite image data. Our training results in a joint embedding space with multiple types of data: satellite image, ground-level image, audio, and text. Furthermore, our approach does not require a single complex dataset that contains all the modalities mentioned above. Rather, it only requires multiple satellite-image paired datasets. While we only align three modalities in this paper, we present a general framework that can be used to create an embedding space with any number of modalities by using satellite images as the binding element. Our results show that, unlike traditional unimodal models, GeoBind is versatile and can reason about multiple modalities for a given satellite image input.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 2024 IEEE International Geoscience and Remote Sensing Symposium

arXiv:2404.06637 [pdf, other]

GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis

Authors: Srikumar Sastry, Subash Khanal, Aayush Dhakal, Nathan Jacobs

Abstract: We present GeoSynth, a model for synthesizing satellite images with global style and image-driven layout control. The global style control is via textual prompts or geographic location. These enable the specification of scene semantics or regional appearance respectively, and can be used together. We train our model on a large dataset of paired satellite imagery, with automatically generated captions, and OpenStreetMap data. We evaluate various combinations of control inputs, including different types of layout controls. Results demonstrate that our model can generate diverse, high-quality images and exhibits excellent zero-shot generalization. The code and model checkpoints are available at https://github.com/mvrl/GeoSynth.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2401.06524 [pdf, ps, other]

Domain Adaptation for Time series Transformers using One-step fine-tuning

Authors: Subina Khanal, Seshu Tirupathi, Giulio Zizzo, Ambrish Rawat, Torben Bach Pedersen

Abstract: The recent breakthrough of Transformers in deep learning has drawn significant attention of the time series community due to their ability to capture long-range dependencies. However, like other deep learning models, Transformers face limitations in time series prediction, including insufficient temporal understanding, generalization challenges, and data shift issues for the domains with limited data. Additionally, addressing the issue of catastrophic forgetting, where models forget previously learned information when exposed to new data, is another critical aspect that requires attention in enhancing the robustness of Transformers for time series tasks. To address these limitations, in this paper, we pre-train the time series Transformer model on a source domain with sufficient data and fine-tune it on the target domain with limited data. We introduce the One-step fine-tuning approach, adding some percentage of source domain data to the target domains, providing the model with diverse time series instances. We then fine-tune the pre-trained model using a gradual unfreezing technique. This helps enhance the model's performance in time series prediction for domains with limited data. Extensive experimental results on two real-world datasets show that our approach improves over the state-of-the-art baselines by 4.35% and 11.54% for indoor temperature and wind power prediction, respectively.

Submitted 12 January, 2024; originally announced January 2024.

Comments: Accepted at the Fourth Workshop of Artificial Intelligence for Time Series Analysis (AI4TS): Theory, Algorithms, and Applications, AAAI 2024, Vancouver, Canada

arXiv:2312.08334 [pdf, other]

LD-SDM: Language-Driven Hierarchical Species Distribution Modeling

Authors: Srikumar Sastry, Xin Xing, Aayush Dhakal, Subash Khanal, Adeel Ahmad, Nathan Jacobs

Abstract: We focus on the problem of species distribution modeling using global-scale presence-only data. Most previous studies have mapped the range of a given species using geographical and environmental features alone. To capture a stronger implicit relationship between species, we encode the taxonomic hierarchy of species using a large language model. This enables range mapping for any taxonomic rank and unseen species without additional supervision. Further, we propose a novel proximity-aware evaluation metric that enables evaluating species distribution models using any pixel-level representation of ground-truth species range map. The proposed metric penalizes the predictions of a model based on its proximity to the ground truth. We describe the effectiveness of our model by systematically evaluating on the task of species range prediction, zero-shot prediction and geo-feature regression against the state-of-the-art. Results show our model outperforms the strong baselines when trained with a variety of multi-label learning losses.

Submitted 13 December, 2023; originally announced December 2023.

Comments: 17 pages, 9 figures

arXiv:2311.10755 [pdf]

Robotic Pollination of Apples in Commercial Orchards

Authors: Ranjan Sapkota, Dawood Ahmed, Salik Ram Khanal, Uddhav Bhattarai, Changki Mo, Matthew D. Whiting, Manoj Karkee

Abstract: This research presents a novel, robotic pollination system designed for targeted pollination of apple flowers in modern fruiting wall orchards. Developed in response to the challenges of global colony collapse disorder, climate change, and the need for sustainable alternatives to traditional pollinators, the system utilizes a commercial manipulator, a vision system, and a spray nozzle for pollen application. Initial tests in April 2022 pollinated 56% of the target flower clusters with at least one fruit with a cycle time of 6.5 s. Significant improvements were made in 2023, with the system accurately detecting 91% of available flowers and pollinating 84% of target flowers with a reduced cycle time of 4.8 s. This system showed potential for precision artificial pollination that can also minimize the need for labor-intensive field operations such as flower and fruitlet thinning.

Submitted 3 February, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: 2 pages, 1 figure

arXiv:2311.09563 [pdf, other]

doi 10.1109/TEMPR.2024.3390760

Multi-Objective Transmission Expansion: An Offshore Wind Power Integration Case Study

Authors: Saroj Khanal, Christoph Graf, Zhirui Liang, Yury Dvorkin, Burçin Ünel

Abstract: Despite ambitious offshore wind targets in the U.S. and globally, offshore grid planning guidance remains notably scarce, contrasting with well-established frameworks for onshore grids. This gap, alongside the increasing penetration of offshore wind and other clean-energy resources in onshore grids, highlights the urgent need for a coordinated planning framework. Our paper describes a multi-objective, multistage generation, storage and transmission expansion planning model to facilitate efficient and resilient large-scale adoption of offshore wind power. Recognizing regulatory emphasis and, in some cases, requirements to consider externalities, this model explicitly accounts for negative externalities: greenhouse gas emissions and local emission-induced air pollution. Utilizing an 8-zone ISO-NE test system and a 9-zone PJM test system, we explore grid expansion sensitivities such as impacts of optimizing Points of Interconnection (POIs) versus fixed POIs, negative externalities, and consideration of extreme operational scenarios resulting from offshore wind integration. Our results indicate that accounting for negative externalities necessitates greater upfront investment in clean generation and storage (balanced by lower expected operational costs). Optimizing POIs could significantly reshape offshore topology or POIs, and lower total cost. Finally, accounting for extreme operational scenarios typically results in greater operational costs and sometimes may alter onshore line investment.

Submitted 21 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

arXiv:2310.19168 [pdf, other]

BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping

Authors: Srikumar Sastry, Subash Khanal, Aayush Dhakal, Di Huang, Nathan Jacobs

Abstract: We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world. Our framework unifies two SSL strategies: Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds. We separately train uni-modal and cross-modal ViT on a novel cross-view global bird species dataset containing ground-level imagery, metadata (location, time), and corresponding satellite imagery. We demonstrate that our models learn fine-grained and geographically conditioned features of birds, by evaluating on two downstream tasks: fine-grained visual classification (FGVC) and cross-modal retrieval. Pre-trained models learned using our framework achieve SotA performance on FGVC of iNAT-2021 birds and in transfer learning settings for CUB-200-2011 and NABirds datasets. Moreover, the impressive cross-modal retrieval performance of our model enables the creation of species distribution maps across any geographic region. The dataset and source code will be released at https://github.com/mvrl/BirdSAT.

Submitted 29 October, 2023; originally announced October 2023.

Comments: Accepted at WACV 2024

arXiv:2309.10667 [pdf, other]

Learning Tri-modal Embeddings for Zero-Shot Soundscape Mapping

Authors: Subash Khanal, Srikumar Sastry, Aayush Dhakal, Nathan Jacobs

Abstract: We focus on the task of soundscape mapping, which involves predicting the most probable sounds that could be perceived at a particular geographic location. We utilise recent state-of-the-art models to encode geotagged audio, a textual description of the audio, and an overhead image of its capture location using contrastive pre-training. The end result is a shared embedding space for the three modalities, which enables the construction of soundscape maps for any geographic region from textual or audio queries. Using the SoundingEarth dataset, we find that our approach significantly outperforms the existing SOTA, with an improvement of image-to-audio Recall@100 from 0.256 to 0.450. Our code is available at https://github.com/mvrl/geoclap.

Submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted at BMVC 2023

arXiv:2308.03998 [pdf, other]

Real-time Strawberry Detection Based on Improved YOLOv5s Architecture for Robotic Harvesting in open-field environment

Authors: Zixuan He, Salik Ram Khanal, Xin Zhang, Manoj Karkee, Qin Zhang

Abstract: This study proposed a YOLOv5-based custom object detection model to detect strawberries in an outdoor environment. The original architecture of the YOLOv5s was modified by replacing the C3 module with the C2f module in the backbone network, which provided a better feature gradient flow. Secondly, the Spatial Pyramid Pooling Fast in the final layer of the backbone network of YOLOv5s was combined with Cross Stage Partial Net to improve the generalization ability over the strawberry dataset in this study. The proposed architecture was named YOLOv5s-Straw. The RGB images dataset of the strawberry canopy with three maturity classes (immature, nearly mature, and mature) was collected in open-field environment and augmented through a series of operations including brightness reduction, brightness increase, and noise adding. To verify the superiority of the proposed method for strawberry detection in open-field environment, four competitive detection models (YOLOv3-tiny, YOLOv5s, YOLOv5s-C2f, and YOLOv8s) were trained and tested under the same computational environment and compared with YOLOv5s-Straw. The results showed that the highest mean average precision of 80.3% was achieved using the proposed architecture, whereas YOLOv3-tiny, YOLOv5s, YOLOv5s-C2f, and YOLOv8s achieved 73.4%, 77.8%, 79.8%, and 79.3%, respectively. Specifically, the average precision of YOLOv5s-Straw was 82.1% in the immature class, 73.5% in the nearly mature class, and 86.6% in the mature class, which were 2.3% and 3.7%, respectively, higher than that of the latest YOLOv8s. The model included 8.6*10^6 network parameters with an inference speed of 18 ms per image, while YOLOv8s had a slower inference speed of 21.0 ms and a heavier 11.1*10^6 parameters, which indicates that the proposed model is fast enough for real-time strawberry detection and localization for robotic picking.

Submitted 12 October, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: 20 pages; 15 figures

arXiv:2307.15904 [pdf, other]

Sat2Cap: Mapping Fine-Grained Textual Descriptions from Satellite Images

Authors: Aayush Dhakal, Adeel Ahmad, Subash Khanal, Srikumar Sastry, Hannah Kerner, Nathan Jacobs

Abstract: We propose a weakly supervised approach for creating maps using free-form textual descriptions. We refer to this work of creating textual maps as zero-shot mapping. Prior works have approached mapping tasks by developing models that predict a fixed set of attributes using overhead imagery. However, these models are very restrictive as they can only solve highly specific tasks for which they were trained. Mapping text, on the other hand, allows us to solve a large variety of mapping problems with minimal restrictions. To achieve this, we train a contrastive learning framework called Sat2Cap on a new large-scale dataset with 6.1M pairs of overhead and ground-level images. For a given location and overhead image, our model predicts the expected CLIP embeddings of the ground-level scenery. The predicted CLIP embeddings are then used to learn about the textual space associated with that location. Sat2Cap is also conditioned on date-time information, allowing it to model temporally varying concepts over a location. Our experimental results demonstrate that our models successfully capture ground-level concepts and allow large-scale mapping of fine-grained textual queries. Our approach does not require any text-labeled data, making the training easily scalable. The code, dataset, and models will be made publicly available.

Submitted 11 April, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

Comments: 16 pages

arXiv:2307.00112 [pdf]

Performance of ChatGPT on USMLE: Unlocking the Potential of Large Language Models for AI-Assisted Medical Education

Authors: Prabin Sharma, Kisan Thapa, Dikshya Thapa, Prastab Dhakal, Mala Deep Upadhaya, Santosh Adhikari, Salik Ram Khanal

Abstract: Artificial intelligence is gaining traction in more ways than ever before. The popularity of language models and AI-based businesses has soared since ChatGPT was made available to the general public via OpenAI. It is becoming increasingly common for people to use ChatGPT both professionally and personally. Considering the widespread use of ChatGPT and the reliance people place on it, this study determined how reliable ChatGPT can be for answering complex medical and clinical questions. Harvard University gross anatomy along with the United States Medical Licensing Examination (USMLE) questionnaire were used to accomplish the objective. The paper evaluated the obtained results using a 2-way ANOVA and posthoc analysis. Both showed systematic covariation between format and prompt. Furthermore, the physician adjudicators independently rated the outcome's accuracy, concordance, and insight. As a result of the analysis, ChatGPT-generated answers were found to be more context-oriented and represented a better model for deductive reasoning than regular Google search results. Furthermore, ChatGPT obtained 58.8% on logical questions and 60% on ethical questions. This means that the ChatGPT is approaching the passing range for logical questions and has crossed the threshold for ethical questions. The paper believes ChatGPT and other language learning models can be invaluable tools for e-learners; however, the study suggests that there is still room to improve their accuracy. In order to improve ChatGPT's performance in the future, further research is needed to better understand how it can answer different types of questions.

Submitted 27 July, 2023; v1 submitted 30 June, 2023; originally announced July 2023.

Comments: 12 pages, 4 figures, 4 tables

arXiv:2306.14300 [pdf]

Screening Autism Spectrum Disorder in childrens using Deep Learning Approach : Evaluating the classification model of YOLOv8 by comparing with other models

Authors: Subash Gautam, Prabin Sharma, Kisan Thapa, Mala Deep Upadhaya, Dikshya Thapa, Salik Ram Khanal, Vítor Manuel de Jesus Filipe

Abstract: Autism spectrum disorder (ASD) is a developmental condition that presents significant challenges in social interaction, communication, and behavior. Early intervention plays a pivotal role in enhancing cognitive abilities and reducing autistic symptoms in children with ASD. Numerous clinical studies have highlighted distinctive facial characteristics that distinguish ASD children from typically developing (TD) children. In this study, we propose a practical solution for ASD screening from facial images using the YoloV8 model. By employing YoloV8, a deep learning technique, on a Kaggle dataset, we achieved exceptional results. Our model achieved a remarkable 89.64% accuracy in classification and an F1-score of 0.89. Our findings provide support for the clinical observations regarding facial feature discrepancies between children with ASD. The high F1-score obtained demonstrates the potential of deep learning models in screening children with ASD. We conclude that the newest version of YoloV8, which is usually used for object detection, can be used for the classification problem of autistic and non-autistic images.

Submitted 25 June, 2023; originally announced June 2023.

Comments: 17 pages, 12 figures

arXiv:2305.05687 [pdf, other]

doi 10.3847/1538-4357/accc89

Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies

Authors: James Paul Mason, Alexandra Werth, Colin G. West, Allison A. Youngblood, Donald L. Woodraska, Courtney Peck, Kevin Lacjak, Florian G. Frick, Moutamen Gabir, Reema A. Alsinan, Thomas Jacobsen, Mohammad Alrubaie, Kayla M. Chizmar, Benjamin P. Lau, Lizbeth Montoya Dominguez, David Price, Dylan R. Butler, Connor J. Biron, Nikita Feoktistov, Kai Dewey, N. E. Loomis, Michal Bodzianowski, Connor Kuybus, Henry Dietrick, Aubrey M. Wolfe, et al. (977 additional authors not shown)

Abstract: Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 1,002 authors, 14 pages, 4 figures, 3 tables, published by The Astrophysical Journal on 2023-05-09, volume 948, page 71

arXiv:2304.09351 [pdf]

Machine Vision System for Early-stage Apple Flowers and Flower Clusters Detection for Precision Thinning and Pollination

Authors: Salik Ram Khanal, Ranjan Sapkota, Dawood Ahmed, Uddhav Bhattarai, Manoj Karkee

Abstract: Early-stage identification of fruit flowers that are in both opened and unopened condition in an orchard environment is significant information to perform crop load management operations such as flower thinning and pollination using automated and robotic platforms. These operations are important in tree-fruit agriculture to enhance fruit quality, manage crop load, and enhance the overall profit. The recent development in agricultural automation suggests that this can be done using robotics which includes machine vision technology. In this article, we proposed a vision system that detects early-stage flowers in an unstructured orchard environment using YOLOv5 object detection algorithm. For the robotics implementation, the position of a cluster of the flower blossom is important to navigate the robot and the end effector. The centroid of individual flowers (both open and unopen) was identified and associated with flower clusters via K-means clustering. The accuracy of the opened and unopened flower detection is achieved up to mAP of 81.9% in commercial orchard images.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2303.12974 [pdf]

Performance Analysis and Evaluation of Cloud Vision Emotion APIs

Authors: Salik Ram Khanal, Prabin Sharma, Hugo Fernandes, João Barroso, Vítor Manuel de Jesus Filipe

Abstract: Facial expression is a way of communication that can be used to interact with computers or other electronic devices and the recognition of emotion from faces is an emerging practice with application in many fields. There are many cloud-based vision application programming interfaces available that recognize emotion from facial images and video. In this article, the performances of two well-known APIs were compared using a public dataset of 980 images of facial emotions. For these experiments, a client program was developed which iterates over the image set, calls the cloud services, and caches the results of the emotion detection for each image. The performance was evaluated in each class of emotions using prediction accuracy. It has been found that the prediction accuracy for each emotion varies according to the cloud service being used. Similarly, each service provider presents a strong variation of performance according to the class being analyzed, as can be seen in more detail in this article.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 10 pages, 6 figures

arXiv:2206.14841 [pdf, other]

Causality for Inherently Explainable Transformers: CAT-XPLAIN

Authors: Subash Khanal, Benjamin Brodie, Xin Xing, Ai-Ling Lin, Nathan Jacobs

Abstract: There have been several post-hoc explanation approaches developed to explain pre-trained black-box neural networks. However, there is still a gap in research efforts toward designing neural networks that are inherently explainable. In this paper, we utilize a recently proposed instance-wise post-hoc causal explanation method to make an existing transformer architecture inherently explainable. Once trained, our model provides an explanation in the form of top-$k$ regions in the input space of the given instance contributing to its decision. We evaluate our method on binary classification tasks using three image datasets: MNIST, FMNIST, and CIFAR. Our results demonstrate that compared to the causality-based post-hoc explainer model, our inherently explainable model achieves better explainability results while eliminating the need of training a separate explainer model. Our code is available at https://github.com/mvrl/CAT-XPLAIN.

Submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted for spotlight presentation at the Explainable Artificial Intelligence for Computer Vision Workshop at CVPR 2022

arXiv:2111.06334 [pdf, other]

Identification of Fine-Grained Location Mentions in Crisis Tweets

Authors: Sarthak Khanal, Maria Traskowsky, Doina Caragea

Abstract: Identification of fine-grained location mentions in crisis tweets is central in transforming situational awareness information extracted from social media into actionable information. Most prior works have focused on identifying generic locations, without considering their specific types. To facilitate progress on the fine-grained location identification task, we assemble two tweet crisis datasets and manually annotate them with specific location types. The first dataset contains tweets from a mixed set of crisis events, while the second dataset contains tweets from the global COVID-19 pandemic. We investigate the performance of state-of-the-art deep learning models for sequence tagging on these datasets, in both in-domain and cross-domain settings.

Submitted 11 November, 2021; originally announced November 2021.

arXiv:2103.10080 [pdf]

doi 10.1088/2051-672X/abe71f

Comprehensive topography characterization of polycrystalline diamond coatings

Authors: Abhijeet Gujrati, Antoine Sanner, Subarna R. Khanal, Nicolaie Moldovan, Hongjun Zeng, Lars Pastewka, Tevis D. B. Jacobs

Abstract: The surface topography of diamond coatings strongly affects surface properties such as adhesion, friction, wear, and biocompatibility. However, the understanding of multi-scale topography, and its effect on properties, has been hindered by conventional measurement methods, which capture only a single length scale. Here, four different polycrystalline diamond coatings are characterized using transmission electron microscopy to assess the roughness down to the sub-nanometer scale. Then these measurements are combined, using the power spectral density (PSD), with conventional methods (stylus profilometry and atomic force microscopy) to characterize all scales of topography. The results demonstrate the critical importance of measuring topography across all length scales, especially because their PSDs cross over one another, such that a surface that is rougher at a larger scale may be smoother at a smaller scale and vice versa. Furthermore, these measurements reveal the connection between multi-scale topography and grain size, with characteristic scaling behavior at and slightly below the mean grain size, and self-affine fractal-like roughness at other length scales. At small (subgrain) scales, unpolished surfaces exhibit a common form of residual roughness that is self-affine in nature but difficult to detect with conventional methods. This approach of capturing topography from the atomic- to the macro-scale is termed comprehensive topography characterization, and all of the topography data from these surfaces has been made available for further analysis by experimentalists and theoreticians. Scientifically, this investigation has identified four characteristic regions of topography scaling in polycrystalline diamond materials.

Submitted 18 March, 2021; originally announced March 2021.

Comments: 13 pages, 6 figures

Journal ref: Surf. Topogr.: Metrol. Prop. 9, 014003 (2021)

arXiv:1912.08436 [pdf, other]

A Novel Optimal Modulation Strategy for Modular Multilevel Converter Based HVDC Systems

Authors: Saroj Khanal, Vahid R. Disfani

Abstract: Unlike conventional converters, modular multilevel converter (MMC) has a higher switching frequency -- which has direct implication on important parameters like converter loss and reliability -- mainly due to increased number of switching components. However, conventional switching techniques, where submodule sorting is just based on capacitor voltage balancing, are not able to achieve switching frequency reduction objective. A novel modulation algorithm for modular multilevel converters (MMCs) is proposed in this paper to reduce the switching frequency of MMC operation by defining a constrained multi-objective optimization model. The optimized switching algorithm incorporates all control objectives required for the proper operation of MMC and adds new constraints to limit the number of submodule switching events at each time step. Variation of severity of the constraints leads to a desired level of controllability in MMC switching algorithm to trade-off between capacitor voltage regulation and switching frequency reduction. Finally, performance of the proposed algorithm is validated against a seven-level back-to-back MMC-HVDC system under various operating conditions.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Accepted and presented to the 2019 IEEE 2nd International Conference on Renewable Energy and Power Engineering

arXiv:1912.08433 [pdf, other]

Reduced Switching-Frequency Modulation Design for Model Predictive Control Based Modular Multilevel Converters

Authors: Saroj Khanal, Vahid R. Disfani

Abstract: This paper proposes a novel switching algorithm for modular multilevel converters (MMCs) that significantly reduces the switching frequency while fulfilling all control objectives required for their proper operation. Unlike in the conventional capacitor voltage-balancing strategies, in addition to submodule (SM) capacitor voltages, the proposed algorithm considers previous switching statuses during sorting. The algorithm is applied to a seven-level back-to-back MMC-HVDC system and tested under various operating conditions. Significant reduction in the switching frequency with trivial impacts on submodule capacitor voltages are observed.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Accepted and presented to the 2019 IEEE 2nd International Conference on Renewable Energy and Power Engineering

arXiv:1909.12913 [pdf]

Student Engagement Detection Using Emotion Analysis, Eye Tracking and Head Movement with Machine Learning

Authors: Prabin Sharma, Shubham Joshi, Subash Gautam, Sneha Maharjan, Salik Ram Khanal, Manuel Cabral Reis, João Barroso, Vítor Manuel de Jesus Filipe

Abstract: With the increase of distance learning, in general, and e-learning, in particular, having a system capable of determining the engagement of students is of primordial importance, and one of the biggest challenges, both for teachers, researchers and policy makers. Here, we present a system to detect the engagement level of the students. It uses only information provided by the typical built-in web-camera present in a laptop computer, and was designed to work in real time. We combine information about the movements of the eyes and head, and facial emotions to produce a concentration index with three classes of engagement: "very engaged", "nominally engaged" and "not engaged at all". The system was tested in a typical e-learning scenario, and the results show that it correctly identifies each period of time where students were "very engaged", "nominally engaged" and "not engaged at all". Additionally, the results also show that the students with best scores also have higher concentration indexes.

Submitted 23 March, 2023; v1 submitted 18 September, 2019; originally announced September 2019.

Comments: 9 pages, 9 Figures, 2 tables

arXiv:1909.07460 [pdf, other]

doi 10.1093/mnras/stz2709

The chemical composition of HIP34407/HIP34426 and other twin-star comoving pairs

Authors: I. Ramirez, S. Khanal, S. J. Lichon, J. Chaname, M. Endl, J. Melendez, D. L. Lambert

Abstract: We conducted a high-precision elemental abundance analysis of the twin-star comoving pair HIP34407/HIP34426. With mean error of 0.013 dex in the differential abundances (D[X/H]), a significant difference was found: HIP34407 is more metal-rich than HIP34426. The elemental abundance differences correlate strongly with condensation temperature, with the lowest for the volatile elements like carbon around 0.05+/-0.02 dex, and the highest up to about 0.22+/-0.01 dex for the most refractory elements like aluminum. Dissimilar chemical composition for stars in twin-star comoving pairs are not uncommon, thus we compile previously-published results like ours and look for correlations between abundance differences and stellar parameters, finding no significant trends with average effective temperature, surface gravity, iron abundance, or their differences. Instead, we found a weak correlation between the absolute value of abundance difference and the projected distance between the stars in each pair that appears to be more important for elements which have a low absolute abundance. If confirmed, this correlation could be an important observational constraint for binary star system formation scenarios.

Submitted 16 September, 2019; originally announced September 2019.

Comments: MNRAS, in press

arXiv:1907.12491 [pdf]

doi 10.1073/pnas.1913126116

Linking energy loss in soft adhesion to surface roughness

Authors: Siddhesh Dalvi, Abhijeet Gujrati, Subarna R. Khanal, Lars Pastewka, Ali Dhinojwala, Tevis D. B. Jacobs

Abstract: A mechanistic understanding of adhesion in soft materials is critical in the fields of transportation (tires, gaskets, seals), biomaterials, micro-contact printing, and soft robotics. Measurements have long demonstrated that the apparent work of adhesion coming into contact is consistently lower than the intrinsic work of adhesion for the materials, and that there is adhesion hysteresis during separation, commonly explained by viscoelastic dissipation. Still lacking is a quantitative experimentally validated link between adhesion and measured topography. Here, we used in situ measurements of contact size to investigate the adhesion behavior of soft elastic polydimethylsiloxane (PDMS) hemispheres (modulus ranging from 0.7 to 10 MPa) on four different polycrystalline diamond substrates with topography characterized across eight orders of magnitude, including down to the Ångström-scale. The results show that the reduction in apparent work of adhesion is equal to the energy required to achieve conformal contact. Further, the energy loss during contact and removal is equal to the product of intrinsic work of adhesion and the true contact area. These findings provide a simple mechanism to quantitatively link the widely-observed adhesion hysteresis to roughness rather than viscoelastic dissipation.

Submitted 2 December, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

Comments: Proceedings of the National Academy of Sciences (2019)

arXiv:1604.04977 [pdf, other]

doi 10.1364/OPTICA.3.000734

Terahertz plasmonic laser radiating in an ultra-narrow beam

Authors: Chongzhao Wu, Sudeep Khanal, John L. Reno, Sushil Kumar

Abstract: Plasmonic lasers (spasers) generate coherent surface-plasmon-polaritons (SPPs) and could be realized at subwavelength dimensions in metallic cavities for applications in nanoscale optics. Plasmonic cavities are also utilized for terahertz quantum-cascade lasers (QCLs), which are the brightest available solid-state sources of terahertz radiation. A long standing challenge for spasers is their poor coupling to the far-field radiation. Unlike conventional lasers that could produce directional beams, spasers have highly divergent radiation patterns due to their subwavelength apertures. Here, we theoretically and experimentally demonstrate a new technique for implementing distributed-feedback (DFB) that is distinct from any other previously utilized DFB schemes for semiconductor lasers. The so-termed antenna-feedback scheme leads to single-mode operation in plasmonic lasers, couples the resonant SPP mode to a highly directional far-field radiation pattern, and integrates hybrid SPPs in surrounding medium into the operation of the DFB lasers. Experimentally, the antenna-feedback method, which does not require the phase matching to a well-defined effective index, is implemented for terahertz QCLs, and single-mode terahertz QCLs with beam divergence as small as 4 x 4 degree are demonstrated, which is the narrowest beam reported for any terahertz QCL to-date. Moreover, in contrast to negligible radiative-field in conventional photonic band-edge lasers, in which the periodicity follows the integer multiple of half-wavelength inside active medium, antenna-feedback breaks this integer-limit for the first time and enhances the radiative-field of lasing mode. The antenna-feedback scheme is generally applicable to any plasmonic laser with a Fabry-Perot cavity irrespective of its operating wavelength, and could bring plasmonic lasers closer to practical applications.

Submitted 23 July, 2016; v1 submitted 17 April, 2016; originally announced April 2016.

Journal ref: Optica 3, 734-740 (2016)

arXiv:1601.01731 [pdf, ps, other]

doi 10.3847/0004-637X/819/1/19

The Curious Case of Elemental Abundance Differences in the Dual Hot Jupiter Hosts WASP-94AB

Authors: Johanna K. Teske, Sandhya Khanal, Ivan Ramírez

Abstract: Binary stars provide an ideal laboratory for investigating the potential effects of planet formation on stellar composition. Assuming the stars formed in the same environment/from the same material, any compositional anomalies between binary components might indicate differences in how material was sequestered in planets, or accreted by the star in the process of planet formation. We present here a study of the elemental abundance differences between WASP-94AB, a pair of stars that each host a hot Jupiter exoplanet. The two stars are very similar in spectral type (F8 and F9), and their ~2700 AU separation suggests their protoplanetary disks were likely not influenced by stellar interactions, but WASP-94Ab's orbit -- misaligned with the host star spin axis and likely retrograde -- points towards a dynamically active formation mechanism, perhaps different than that of WASP-94Bb, which is not misaligned and has nearly circular orbit. Based on our high-quality spectra and strictly relative abundance analysis, we detect a depletion of volatiles (~-0.02 dex, on average) and enhancement of refractories (~0.01 dex) in WASP-94A relative to B (standard errors are ~0.005 dex). This is different than every other published case of binary host star abundances, in which either no significant abundance differences are reported, or there is some degree of enhancement in all elements, including volatiles. Several scenarios that may explain the abundance trend are discussed, but none can be definitively accepted or rejected. Additional high-contrast imaging observations to search for companions that may be dynamically affecting the system, as well as a larger sample of binary host star studies, are needed to better understand the curious abundance trends we observe in WASP-94AB.

Submitted 7 January, 2016; originally announced January 2016.

Comments: Accepted to ApJ 7 Jan 2016. 21 pages, 5 figures, 3 Tables (1 MRT)

arXiv:1508.07046 [pdf, other]

doi 10.1016/j.nima.2015.11.157

Performance of a Large-area GEM Detector Read Out with Wide Radial Zigzag Strips

Authors: Aiwu Zhang, Vallary Bhopatkar, Eric Hansen, Marcus Hohlmann, Shreeya Khanal, Michael Phipps, Elizabeth Starling, Jessie Twigger, Kimberly Walton

Abstract: A 1-meter-long trapezoidal Triple-GEM detector with wide readout strips was tested in hadron beams at the Fermilab Test Beam Facility in October 2013. The readout strips have a special zigzag geometry and run along the radial direction with an azimuthal pitch of 1.37 mrad to measure the azimuthal phi-coordinate of incident particles. The zigzag geometry of the readout reduces the required number of electronic channels by a factor of three compared to conventional straight readout strips while preserving good angular resolution. The average crosstalk between zigzag strips is measured to be an acceptable 5.5%. The detection efficiency of the detector is (98.4+-0.2)%. When the non-linearity of the zigzag-strip response is corrected with track information, the angular resolution is measured to be (193+-3) urad, which corresponds to 14% of the angular strip pitch. Multiple Coulomb scattering effects are fully taken into account in the data analysis with the help of a stand-alone Geant4 simulation that estimates interpolated track errors.

Submitted 1 September, 2015; v1 submitted 27 August, 2015; originally announced August 2015.

Comments: 30 pages, 28 figures, submitted to NIMA

arXiv:1506.01025 [pdf, other]

doi 10.1088/0004-637X/808/1/13

The dissimilar chemical composition of the planet-hosting stars of the XO-2 binary system

Authors: I. Ramirez, S. Khanal, P. Aleo, A. Sobotka, F. Liu, L. Casagrande, J. Melendez, D. Yong, D. L. Lambert, M. Asplund

Abstract: Using high-quality spectra of the twin stars in the XO-2 binary system, we have detected significant differences in the chemical composition of their photospheres. The differences correlate strongly with the elements' dust condensation temperature. In XO-2N, volatiles are enhanced by about 0.015 dex and refractories are overabundant by up to 0.090 dex. On average, our error bar in relative abundance is 0.012 dex. We present an early metal-depletion scenario in which the formation of the gas giant planets known to exist around these stars is responsible for a 0.015 dex offset in the abundances of all elements while 20 M_Earth of non-detected rocky objects that formed around XO-2S explain the additional refractory-element difference. An alternative explanation involves the late accretion of at least 20 M_Earth of planet-like material by XO-2N, allegedly as a result of the migration of the hot Jupiter detected around that star. Dust cleansing by a nearby hot star as well as age or Galactic birthplace effects can be ruled out as valid explanations for this phenomenon.

Submitted 2 June, 2015; originally announced June 2015.

Comments: ApJ, in press. Complete linelist (Table 3) available in the "Other formats -> Source" download

Showing 1–38 of 38 results for author: Khanal, S