Search | arXiv e-print repository

The Software Documentor Mindset

Authors: Deeksha M. Arya, Jin L. C. Guo, Martin P. Robillard

Abstract: Software technologies are used by programmers with diverse backgrounds. To fulfill programmers' need for information, enthusiasts contribute numerous learning resources that vary in style and content, which act as documentation for the corresponding technology. We interviewed 26 volunteer documentation contributors, i.e. documentors, to understand why and how they create such documentation. From a… ▽ More Software technologies are used by programmers with diverse backgrounds. To fulfill programmers' need for information, enthusiasts contribute numerous learning resources that vary in style and content, which act as documentation for the corresponding technology. We interviewed 26 volunteer documentation contributors, i.e. documentors, to understand why and how they create such documentation. From a qualitative analysis of our interviews, we identified a total of sixteen considerations that documentors have during the documentation contribution process, along three dimensions, namely motivations, topic selection techniques, and styling objectives. We grouped related considerations based on common underlying themes, to elicit five software documentor mindsets that occur during documentation contribution activities. We propose a structure of mindsets, and their associated considerations across the three dimensions, as a framework for reasoning about the documentation contribution process. This framework can inform information seeking as well as documentation creation tools about the context in which documentation was contributed. △ Less

Submitted 12 December, 2024; originally announced December 2024.

arXiv:2211.11362 [pdf, other]

Crowdsensing-based Road Damage Detection Challenge (CRDDC-2022)

Authors: Deeksha Arya, Hiroya Maeda, Sanjay Kumar Ghosh, Durga Toshniwal, Hiroshi Omata, Takehiro Kashiyama, Yoshihide Sekimoto

Abstract: This paper summarizes the Crowdsensing-based Road Damage Detection Challenge (CRDDC), a Big Data Cup organized as a part of the IEEE International Conference on Big Data'2022. The Big Data Cup challenges involve a released dataset and a well-defined problem with clear evaluation metrics. The challenges run on a data competition platform that maintains a real-time online evaluation system for the p… ▽ More This paper summarizes the Crowdsensing-based Road Damage Detection Challenge (CRDDC), a Big Data Cup organized as a part of the IEEE International Conference on Big Data'2022. The Big Data Cup challenges involve a released dataset and a well-defined problem with clear evaluation metrics. The challenges run on a data competition platform that maintains a real-time online evaluation system for the participants. In the presented case, the data constitute 47,420 road images collected from India, Japan, the Czech Republic, Norway, the United States, and China to propose methods for automatically detecting road damages in these countries. More than 60 teams from 19 countries registered for this competition. The submitted solutions were evaluated using five leaderboards based on performance for unseen test images from the aforementioned six countries. This paper encapsulates the top 11 solutions proposed by these teams. The best-performing model utilizes ensemble learning based on YOLO and Faster-RCNN series models to yield an F1 score of 76% for test data combined from all 6 countries. The paper concludes with a comparison of current and past challenges and provides direction for the future. △ Less

Submitted 21 November, 2022; originally announced November 2022.

Comments: 9 pages 2 figures 5 tables. arXiv admin note: text overlap with arXiv:2011.08740

ACM Class: E.0; J.0

arXiv:2209.14225 [pdf, other]

doi 10.1109/BigData55660.2022.10020642

Road Rutting Detection using Deep Learning on Images

Authors: Poonam Kumari Saha, Deeksha Arya, Ashutosh Kumar, Hiroya Maeda, Yoshihide Sekimoto

Abstract: Road rutting is a severe road distress that can cause premature failure of road incurring early and costly maintenance costs. Research on road damage detection using image processing techniques and deep learning are being actively conducted in the past few years. However, these researches are mostly focused on detection of cracks, potholes, and their variants. Very few research has been done on th… ▽ More Road rutting is a severe road distress that can cause premature failure of road incurring early and costly maintenance costs. Research on road damage detection using image processing techniques and deep learning are being actively conducted in the past few years. However, these researches are mostly focused on detection of cracks, potholes, and their variants. Very few research has been done on the detection of road rutting. This paper proposes a novel road rutting dataset comprising of 949 images and provides both object level and pixel level annotations. Object detection models and semantic segmentation models were deployed to detect road rutting on the proposed dataset, and quantitative and qualitative analysis of model predictions were done to evaluate model performance and identify challenges faced in the detection of road rutting using the proposed method. Object detection model YOLOX-s achieves mAP@IoU=0.5 of 61.6% and semantic segmentation model PSPNet (Resnet-50) achieves IoU of 54.69 and accuracy of 72.67, thus providing a benchmark accuracy for similar work in future. The proposed road rutting dataset and the results of our research study will help accelerate the research on detection of road rutting using deep learning. △ Less

Submitted 28 September, 2022; originally announced September 2022.

Comments: 9 pages, 7 figures

ACM Class: E.0; J.0

Journal ref: 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 2022, pp. 6507-6515

arXiv:2209.08538 [pdf]

RDD2022: A multi-national image dataset for automatic Road Damage Detection

Authors: Deeksha Arya, Hiroya Maeda, Sanjay Kumar Ghosh, Durga Toshniwal, Yoshihide Sekimoto

Abstract: The data article describes the Road Damage Dataset, RDD2022, which comprises 47,420 road images from six countries, Japan, India, the Czech Republic, Norway, the United States, and China. The images have been annotated with more than 55,000 instances of road damage. Four types of road damage, namely longitudinal cracks, transverse cracks, alligator cracks, and potholes, are captured in the dataset… ▽ More The data article describes the Road Damage Dataset, RDD2022, which comprises 47,420 road images from six countries, Japan, India, the Czech Republic, Norway, the United States, and China. The images have been annotated with more than 55,000 instances of road damage. Four types of road damage, namely longitudinal cracks, transverse cracks, alligator cracks, and potholes, are captured in the dataset. The annotated dataset is envisioned for developing deep learning-based methods to detect and classify road damage automatically. The dataset has been released as a part of the Crowd sensing-based Road Damage Detection Challenge (CRDDC2022). The challenge CRDDC2022 invites researchers from across the globe to propose solutions for automatic road damage detection in multiple countries. The municipalities and road agencies may utilize the RDD2022 dataset, and the models trained using RDD2022 for low-cost automatic monitoring of road conditions. Further, computer vision and machine learning researchers may use the dataset to benchmark the performance of different algorithms for other image-based applications of the same type (classification, object detection, etc.). △ Less

Submitted 18 September, 2022; originally announced September 2022.

Comments: 16 pages, 20 figures, IEEE BigData Cup - Crowdsensing-based Road damage detection challenge (CRDDC'2022)

ACM Class: E.0; J.0

arXiv:2111.00801 [pdf, other]

Livestock Monitoring with Transformer

Authors: Bhavesh Tangirala, Ishan Bhandari, Daniel Laszlo, Deepak K. Gupta, Rajat M. Thomas, Devanshu Arya

Abstract: Tracking the behaviour of livestock enables early detection and thus prevention of contagious diseases in modern animal farms. Apart from economic gains, this would reduce the amount of antibiotics used in livestock farming which otherwise enters the human diet exasperating the epidemic of antibiotic resistance - a leading cause of death. We could use standard video cameras, available in most mode… ▽ More Tracking the behaviour of livestock enables early detection and thus prevention of contagious diseases in modern animal farms. Apart from economic gains, this would reduce the amount of antibiotics used in livestock farming which otherwise enters the human diet exasperating the epidemic of antibiotic resistance - a leading cause of death. We could use standard video cameras, available in most modern farms, to monitor livestock. However, most computer vision algorithms perform poorly on this task, primarily because, (i) animals bred in farms look identical, lacking any obvious spatial signature, (ii) none of the existing trackers are robust for long duration, and (iii) real-world conditions such as changing illumination, frequent occlusion, varying camera angles, and sizes of the animals make it hard for models to generalize. Given these challenges, we develop an end-to-end behaviour monitoring system for group-housed pigs to perform simultaneous instance level segmentation, tracking, action recognition and re-identification (STAR) tasks. We present starformer, the first end-to-end multiple-object livestock monitoring framework that learns instance-level embeddings for grouped pigs through the use of transformer architecture. For benchmarking, we present Pigtrace, a carefully curated dataset comprising video sequences with instance level bounding box, segmentation, tracking and activity classification of pigs in real indoor farming environment. Using simultaneous optimization on STAR tasks we show that starformer outperforms popular baseline models trained for individual tasks. △ Less

Submitted 2 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: Accepted at BMVC 2021

arXiv:2109.10683 [pdf, other]

Adaptive Neural Message Passing for Inductive Learning on Hypergraphs

Authors: Devanshu Arya, Deepak K. Gupta, Stevan Rudinac, Marcel Worring

Abstract: Graphs are the most ubiquitous data structures for representing relational datasets and performing inferences in them. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations. This drawback is mitigated by hypergraphs, in which an edge can connect an arbitrary number of nodes. Most hypergraph learning approaches convert the hypergraph… ▽ More Graphs are the most ubiquitous data structures for representing relational datasets and performing inferences in them. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations. This drawback is mitigated by hypergraphs, in which an edge can connect an arbitrary number of nodes. Most hypergraph learning approaches convert the hypergraph structure to that of a graph and then deploy existing geometric deep learning methods. This transformation leads to information loss, and sub-optimal exploitation of the hypergraph's expressive power. We present HyperMSG, a novel hypergraph learning framework that uses a modular two-level neural message passing strategy to accurately and efficiently propagate information within each hyperedge and across the hyperedges. HyperMSG adapts to the data and task by learning an attention weight associated with each node's degree centrality. Such a mechanism quantifies both local and global importance of a node, capturing the structural properties of a hypergraph. HyperMSG is inductive, allowing inference on previously unseen nodes. Further, it is robust and outperforms state-of-the-art hypergraph learning methods on a wide range of tasks and datasets. Finally, we demonstrate the effectiveness of HyperMSG in learning multimodal relations through detailed experimentation on a challenging multimedia dataset. △ Less

Submitted 22 September, 2021; originally announced September 2021.

arXiv:2107.00100 [pdf]

FCMI: Feature Correlation based Missing Data Imputation

Authors: Prateek Mishra, Kumar Divya Mani, Prashant Johri, Dikhsa Arya

Abstract: Processed data are insightful, and crude data are obtuse. A serious threat to data reliability is missing values. Such data leads to inaccurate analysis and wrong predictions. We propose an efficient technique to impute the missing value in the dataset based on correlation called FCMI (Feature Correlation based Missing Data Imputation). We have considered the correlation of the attributes of the d… ▽ More Processed data are insightful, and crude data are obtuse. A serious threat to data reliability is missing values. Such data leads to inaccurate analysis and wrong predictions. We propose an efficient technique to impute the missing value in the dataset based on correlation called FCMI (Feature Correlation based Missing Data Imputation). We have considered the correlation of the attributes of the dataset, and that is our central idea. Our proposed algorithm picks the highly correlated attributes of the dataset and uses these attributes to build a regression model whose parameters are optimized such that the correlation of the dataset is maintained. Experiments conducted on both classification and regression datasets show that the proposed imputation technique outperforms existing imputation algorithms. △ Less

Submitted 26 June, 2021; originally announced July 2021.

arXiv:2104.04017 [pdf, other]

Improving Solar Cell Metallization Designs using Convolutional Neural Networks

Authors: Sumit Bhattacharya, Devanshu Arya, Debjani Bhowmick, Rajat Mani Thomas, Deepak Kumar Gupta

Abstract: Optimizing the design of solar cell metallizations is one of the ways to improve the performance of solar cells. Recently, it has been shown that Topology Optimization (TO) can be used to design complex metallization patterns for solar cells that lead to improved performance. TO generates unconventional design patterns that cannot be obtained with the traditional shape optimization methods. In thi… ▽ More Optimizing the design of solar cell metallizations is one of the ways to improve the performance of solar cells. Recently, it has been shown that Topology Optimization (TO) can be used to design complex metallization patterns for solar cells that lead to improved performance. TO generates unconventional design patterns that cannot be obtained with the traditional shape optimization methods. In this paper, we show that this design process can be improved further using a deep learning inspired strategy. We present SolarNet, a CNN-based reparameterization scheme that can be used to obtain improved metallization designs. SolarNet modifies the optimization domain such that rather than optimizing the electrode material distribution directly, the weights of a CNN model are optimized. The design generated by CNN is then evaluated using the physics equations, which in turn generates gradients for backpropagation. SolarNet is trained end-to-end involving backpropagation through the solar cell model as well as the CNN pipeline. Through application on solar cells of different shapes as well as different busbar geometries, we demonstrate that SolarNet improves the performance of solar cells compared to the traditional TO approach. △ Less

Submitted 8 April, 2021; originally announced April 2021.

Comments: Published as a workshop paper at ICLR 2021 SimDL Workshop

arXiv:2012.13078 [pdf, other]

Rotation Equivariant Siamese Networks for Tracking

Authors: Deepak K. Gupta, Devanshu Arya, Efstratios Gavves

Abstract: Rotation is among the long prevailing, yet still unresolved, hard challenges encountered in visual object tracking. The existing deep learning-based tracking algorithms use regular CNNs that are inherently translation equivariant, but not designed to tackle rotations. In this paper, we first demonstrate that in the presence of rotation instances in videos, the performance of existing trackers is s… ▽ More Rotation is among the long prevailing, yet still unresolved, hard challenges encountered in visual object tracking. The existing deep learning-based tracking algorithms use regular CNNs that are inherently translation equivariant, but not designed to tackle rotations. In this paper, we first demonstrate that in the presence of rotation instances in videos, the performance of existing trackers is severely affected. To circumvent the adverse effect of rotations, we present rotation-equivariant Siamese networks (RE-SiamNets), built through the use of group-equivariant convolutional layers comprising steerable filters. SiamNets allow estimating the change in orientation of the object in an unsupervised manner, thereby facilitating its use in relative 2D pose estimation as well. We further show that this change in orientation can be used to impose an additional motion constraint in Siamese tracking through imposing restriction on the change in orientation between two consecutive frames. For benchmarking, we present Rotation Tracking Benchmark (RTB), a dataset comprising a set of videos with rotation instances. Through experiments on two popular Siamese architectures, we show that RE-SiamNets handle the problem of rotation very well and out-perform their regular counterparts. Further, RE-SiamNets can accurately estimate the relative change in pose of the target in an unsupervised fashion, namely the in-plane rotation the target has sustained with respect to the reference frame. △ Less

Submitted 23 December, 2020; originally announced December 2020.

arXiv:2011.08740 [pdf]

doi 10.1109/BigData50022.2020.9377790

Global Road Damage Detection: State-of-the-art Solutions

Authors: Deeksha Arya, Hiroya Maeda, Sanjay Kumar Ghosh, Durga Toshniwal, Hiroshi Omata, Takehiro Kashiyama, Yoshihide Sekimoto

Abstract: This paper summarizes the Global Road Damage Detection Challenge (GRDDC), a Big Data Cup organized as a part of the IEEE International Conference on Big Data'2020. The Big Data Cup challenges involve a released dataset and a well-defined problem with clear evaluation metrics. The challenges run on a data competition platform that maintains a leaderboard for the participants. In the presented case,… ▽ More This paper summarizes the Global Road Damage Detection Challenge (GRDDC), a Big Data Cup organized as a part of the IEEE International Conference on Big Data'2020. The Big Data Cup challenges involve a released dataset and a well-defined problem with clear evaluation metrics. The challenges run on a data competition platform that maintains a leaderboard for the participants. In the presented case, the data constitute 26336 road images collected from India, Japan, and the Czech Republic to propose methods for automatically detecting road damages in these countries. In total, 121 teams from several countries registered for this competition. The submitted solutions were evaluated using two datasets test1 and test2, comprising 2,631 and 2,664 images. This paper encapsulates the top 12 solutions proposed by these teams. The best performing model utilizes YOLO-based ensemble learning to yield an F1 score of 0.67 on test1 and 0.66 on test2. The paper concludes with a review of the facets that worked well for the presented challenge and those that could be improved in future challenges. △ Less

Submitted 17 November, 2020; originally announced November 2020.

Comments: 11 Pages, 2 Figures, 3 Tables

arXiv:2010.04558 [pdf, other]

HyperSAGE: Generalizing Inductive Representation Learning on Hypergraphs

Authors: Devanshu Arya, Deepak K. Gupta, Stevan Rudinac, Marcel Worring

Abstract: Graphs are the most ubiquitous form of structured data representation used in machine learning. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations found in many real-world datasets. To model such complex relations, hypergraphs have proven to be a natural representation. Learning the node representations in a hypergraph is more co… ▽ More Graphs are the most ubiquitous form of structured data representation used in machine learning. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations found in many real-world datasets. To model such complex relations, hypergraphs have proven to be a natural representation. Learning the node representations in a hypergraph is more complex than in a graph as it involves information propagation at two levels: within every hyperedge and across the hyperedges. Most current approaches first transform a hypergraph structure to a graph for use in existing geometric deep learning algorithms. This transformation leads to information loss, and sub-optimal exploitation of the hypergraph's expressive power. We present HyperSAGE, a novel hypergraph learning framework that uses a two-level neural message passing strategy to accurately and efficiently propagate information through hypergraphs. The flexible design of HyperSAGE facilitates different ways of aggregating neighborhood information. Unlike the majority of related work which is transductive, our approach, inspired by the popular GraphSAGE method, is inductive. Thus, it can also be used on previously unseen nodes, facilitating deployment in problems such as evolving or partially observed hypergraphs. Through extensive experimentation, we show that HyperSAGE outperforms state-of-the-art hypergraph learning methods on representative benchmark datasets. We also demonstrate that the higher expressive power of HyperSAGE makes it more stable in learning node representations as compared to the alternatives. △ Less

Submitted 9 October, 2020; originally announced October 2020.

arXiv:2009.04787 [pdf, other]

Hard Occlusions in Visual Object Tracking

Authors: Thijs P. Kuipers, Devanshu Arya, Deepak K. Gupta

Abstract: Visual object tracking is among the hardest problems in computer vision, as trackers have to deal with many challenging circumstances such as illumination changes, fast motion, occlusion, among others. A tracker is assessed to be good or not based on its performance on the recent tracking datasets, e.g., VOT2019, and LaSOT. We argue that while the recent datasets contain large sets of annotated vi… ▽ More Visual object tracking is among the hardest problems in computer vision, as trackers have to deal with many challenging circumstances such as illumination changes, fast motion, occlusion, among others. A tracker is assessed to be good or not based on its performance on the recent tracking datasets, e.g., VOT2019, and LaSOT. We argue that while the recent datasets contain large sets of annotated videos that to some extent provide a large bandwidth for training data, the hard scenarios such as occlusion and in-plane rotation are still underrepresented. For trackers to be brought closer to the real-world scenarios and deployed in safety-critical devices, even the rarest hard scenarios must be properly addressed. In this paper, we particularly focus on hard occlusion cases and benchmark the performance of recent state-of-the-art trackers (SOTA) on them. We created a small-scale dataset containing different categories within hard occlusions, on which the selected trackers are evaluated. Results show that hard occlusions remain a very challenging problem for SOTA trackers. Furthermore, it is observed that tracker performance varies wildly between different categories of hard occlusions, where a top-performing tracker on one category performs significantly worse on a different category. The varying nature of tracker performance based on specific categories suggests that the common tracker rankings using averaged single performance scores are not adequate to gauge tracker performance in real-world scenarios. △ Less

Submitted 10 September, 2020; originally announced September 2020.

Comments: Accepted at ECCV 2020 Workshop RLQ-TOD

arXiv:2008.13101 [pdf, other]

Transfer Learning-based Road Damage Detection for Multiple Countries

Authors: Deeksha Arya, Hiroya Maeda, Sanjay Kumar Ghosh, Durga Toshniwal, Alexander Mraz, Takehiro Kashiyama, Yoshihide Sekimoto

Abstract: Many municipalities and road authorities seek to implement automated evaluation of road damage. However, they often lack technology, know-how, and funds to afford state-of-the-art equipment for data collection and analysis of road damages. Although some countries, like Japan, have developed less expensive and readily available Smartphone-based methods for automatic road condition monitoring, other… ▽ More Many municipalities and road authorities seek to implement automated evaluation of road damage. However, they often lack technology, know-how, and funds to afford state-of-the-art equipment for data collection and analysis of road damages. Although some countries, like Japan, have developed less expensive and readily available Smartphone-based methods for automatic road condition monitoring, other countries still struggle to find efficient solutions. This work makes the following contributions in this context. Firstly, it assesses the usability of the Japanese model for other countries. Secondly, it proposes a large-scale heterogeneous road damage dataset comprising 26620 images collected from multiple countries using smartphones. Thirdly, we propose generalized models capable of detecting and classifying road damages in more than one country. Lastly, we provide recommendations for readers, local agencies, and municipalities of other countries when one other country publishes its data and model for automatic road damage detection and classification. Our dataset is available at (https://github.com/sekilab/RoadDamageDetector/). △ Less

Submitted 30 August, 2020; originally announced August 2020.

Comments: 16 pages, 14 figures

arXiv:2008.07299 [pdf, other]

doi 10.1109/TVCG.2020.3030408

Visual Analytics for Temporal Hypergraph Model Exploration

Authors: Maximilian T. Fischer, Devanshu Arya, Dirk Streeb, Daniel Seebacher, Daniel A. Keim, Marcel Worring

Abstract: Many processes, from gene interaction in biology to computer networks to social media, can be modeled more precisely as temporal hypergraphs than by regular graphs. This is because hypergraphs generalize graphs by extending edges to connect any number of vertices, allowing complex relationships to be described more accurately and predict their behavior over time. However, the interactive explorati… ▽ More Many processes, from gene interaction in biology to computer networks to social media, can be modeled more precisely as temporal hypergraphs than by regular graphs. This is because hypergraphs generalize graphs by extending edges to connect any number of vertices, allowing complex relationships to be described more accurately and predict their behavior over time. However, the interactive exploration and seamless refinement of such hypergraph-based prediction models still pose a major challenge. We contribute Hyper-Matrix, a novel visual analytics technique that addresses this challenge through a tight coupling between machine-learning and interactive visualizations. In particular, the technique incorporates a geometric deep learning model as a blueprint for problem-specific models while integrating visualizations for graph-based and category-based data with a novel combination of interactions for an effective user-driven exploration of hypergraph models. To eliminate demanding context switches and ensure scalability, our matrix-based visualization provides drill-down capabilities across multiple levels of semantic zoom, from an overview of model predictions down to the content. We facilitate a focused analysis of relevant connections and groups based on interactive user-steering for filtering and search tasks, a dynamically modifiable partition hierarchy, various matrix reordering techniques, and interactive model feedback. We evaluate our technique in a case study and through formative evaluation with law enforcement experts using real-world internet forum communication data. The results show that our approach surpasses existing solutions in terms of scalability and applicability, enables the incorporation of domain knowledge, and allows for fast search-space traversal. With the technique, we pave the way for the visual analytics of temporal hypergraphs in a wide variety of domains. △ Less

Submitted 12 October, 2020; v1 submitted 17 August, 2020; originally announced August 2020.

Comments: 11 pages, 6 figures, IEEE VIS VAST 2020 - IEEE Transactions on Visualization and Computer Graphics

Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2020

arXiv:2006.12150 [pdf, other]

Generating Annotated High-Fidelity Images Containing Multiple Coherent Objects

Authors: Bryan G. Cardenas, Devanshu Arya, Deepak K. Gupta

Abstract: Recent developments related to generative models have made it possible to generate diverse high-fidelity images. In particular, layout-to-image generation models have gained significant attention due to their capability to generate realistic complex images containing distinct objects. These models are generally conditioned on either semantic layouts or textual descriptions. However, unlike natural… ▽ More Recent developments related to generative models have made it possible to generate diverse high-fidelity images. In particular, layout-to-image generation models have gained significant attention due to their capability to generate realistic complex images containing distinct objects. These models are generally conditioned on either semantic layouts or textual descriptions. However, unlike natural images, providing auxiliary information can be extremely hard in domains such as biomedical imaging and remote sensing. In this work, we propose a multi-object generation framework that can synthesize images with multiple objects without explicitly requiring their contextual information during the generation process. Based on a vector-quantized variational autoencoder (VQ-VAE) backbone, our model learns to preserve spatial coherency within an image as well as semantic coherency between the objects and the background through two powerful autoregressive priors: PixelSNAIL and LayoutPixelSNAIL. While the PixelSNAIL learns the distribution of the latent encodings of the VQ-VAE, the LayoutPixelSNAIL is used to specifically learn the semantic distribution of the objects. An implicit advantage of our approach is that the generated samples are accompanied by object-level annotations. We demonstrate how coherency and fidelity are preserved with our method through experiments on the Multi-MNIST and CLEVR datasets; thereby outperforming state-of-the-art multi-object generative methods. The efficacy of our approach is demonstrated through application on medical imaging datasets, where we show that augmenting the training set with generated samples using our approach improves the performance of existing models. △ Less

Submitted 15 July, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: 21 pages, 5 tables, 21 figures

arXiv:2006.08220 [pdf]

Implementation of Google Assistant & Amazon Alexa on Raspberry Pi

Authors: Shailesh D. Arya, Samir Patel

Abstract: This paper investigates the implementation of voice-enabled Google Assistant and Amazon Alexa on Raspberry Pi. Virtual Assistants are being a new trend in how we interact or do computations with physical devices. A voice-enabled system essentially means a system that processes voice as an input, decodes, or understands the meaning of that input and generates an appropriate voice output. In this pa… ▽ More This paper investigates the implementation of voice-enabled Google Assistant and Amazon Alexa on Raspberry Pi. Virtual Assistants are being a new trend in how we interact or do computations with physical devices. A voice-enabled system essentially means a system that processes voice as an input, decodes, or understands the meaning of that input and generates an appropriate voice output. In this paper, we are developing a smart speaker prototype that has the functionalities of both in the same Raspberry Pi. Users can invoke a virtual assistant by saying the hot words and can leverage the best services of both eco-systems. This paper also explains the complex architecture of Google Assistant and Amazon Alexa and the working of both assistants as well. Later, this system can be used to control the smart home IoT devices. △ Less

Submitted 15 June, 2020; originally announced June 2020.

Comments: 5 Pages, 5 Figures

arXiv:2001.06067 [pdf, other]

doi 10.1145/3313831.3376218

ArguLens: Anatomy of Community Opinions On Usability Issues Using Argumentation Models

Authors: Wenting Wang, Deeksha Arya, Nicole Novielli, Jinghui Cheng, Jin L. C. Guo

Abstract: In open-source software (OSS), the design of usability is often influenced by the discussions among community members on platforms such as issue tracking systems (ITSs). However, digesting the rich information embedded in issue discussions can be a major challenge due to the vast number and diversity of the comments. We propose and evaluate ArguLens, a conceptual framework and automated technique… ▽ More In open-source software (OSS), the design of usability is often influenced by the discussions among community members on platforms such as issue tracking systems (ITSs). However, digesting the rich information embedded in issue discussions can be a major challenge due to the vast number and diversity of the comments. We propose and evaluate ArguLens, a conceptual framework and automated technique leveraging an argumentation model to support effective understanding and consolidation of community opinions in ITSs. Through content analysis, we anatomized highly discussed usability issues from a large, active OSS project, into their argumentation components and standpoints. We then experimented with supervised machine learning techniques for automated argument extraction. Finally, through a study with experienced ITS users, we show that the information provided by ArguLens supported the digestion of usability-related opinions and facilitated the review of lengthy issues. ArguLens provides the direction of designing valuable tools for high-level reasoning and effective discussion about usability. △ Less

Submitted 16 January, 2020; originally announced January 2020.

Comments: 14 pages, 7 figures, ACM CHI Full Paper (2020). For codebook, data, and code for ArguLens, see https://github.com/HCDLab/ArguLens

arXiv:1909.09252 [pdf, other]

HyperLearn: A Distributed Approach for Representation Learning in Datasets With Many Modalities

Authors: Devanshu Arya, Stevan Rudinac, Marcel Worring

Abstract: Multimodal datasets contain an enormous amount of relational information, which grows exponentially with the introduction of new modalities. Learning representations in such a scenario is inherently complex due to the presence of multiple heterogeneous information channels. These channels can encode both (a) inter-relations between the items of different modalities and (b) intra-relations between… ▽ More Multimodal datasets contain an enormous amount of relational information, which grows exponentially with the introduction of new modalities. Learning representations in such a scenario is inherently complex due to the presence of multiple heterogeneous information channels. These channels can encode both (a) inter-relations between the items of different modalities and (b) intra-relations between the items of the same modality. Encoding multimedia items into a continuous low-dimensional semantic space such that both types of relations are captured and preserved is extremely challenging, especially if the goal is a unified end-to-end learning framework. The two key challenges that need to be addressed are: 1) the framework must be able to merge complex intra and inter relations without losing any valuable information and 2) the learning model should be invariant to the addition of new and potentially very different modalities. In this paper, we propose a flexible framework which can scale to data streams from many modalities. To that end we introduce a hypergraph-based model for data representation and deploy Graph Convolutional Networks to fuse relational information within and across modalities. Our approach provides an efficient solution for distributing otherwise extremely computationally expensive or even unfeasible training processes across multiple-GPUs, without any sacrifices in accuracy. Moreover, adding new modalities to our model requires only an additional GPU unit keeping the computational time unchanged, which brings representation learning to truly multimodal datasets. We demonstrate the feasibility of our approach in the experiments on multimedia datasets featuring second, third and fourth order relations. △ Less

Submitted 19 September, 2019; originally announced September 2019.

arXiv:1902.07093 [pdf, other]

Analysis and Detection of Information Types of Open Source Software Issue Discussions

Authors: Deeksha Arya, Wenting Wang, Jin L. C. Guo, Jinghui Cheng

Abstract: Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a… ▽ More Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a challenging task, especially when the discussions are lengthy and the number of issues in ITSs are vast. In this paper, we address this challenge by identifying the information types presented in OSS issue discussions. Through qualitative content analysis of 15 complex issue threads across three projects hosted on GitHub, we uncovered 16 information types and created a labeled corpus containing 4656 sentences. Our investigation of supervised, automated classification techniques indicated that, when prior knowledge about the issue is available, Random Forest can effectively detect most sentence types using conversational features such as the sentence length and its position. When classifying sentences from new issues, Logistic Regression can yield satisfactory performance using textual features for certain information types, while falling short on others. Our work represents a nontrivial first step towards tools and techniques for identifying and obtaining the rich information recorded in the ITSs to support various software engineering activities and to satisfy the diverse needs of OSS stakeholders. △ Less

Submitted 19 February, 2019; originally announced February 2019.

Comments: 41st ACM/IEEE International Conference on Software Engineering (ICSE2019)

arXiv:1602.04924 [pdf, other]

doi 10.1145/2806416.2806615

Personalized Federated Search at LinkedIn

Authors: Dhruv Arya, Viet Ha-Thuc, Shakti Sinha

Abstract: LinkedIn has grown to become a platform hosting diverse sources of information ranging from member profiles, jobs, professional groups, slideshows etc. Given the existence of multiple sources, when a member issues a query like "software engineer", the member could look for software engineer profiles, jobs or professional groups. To tackle this problem, we exploit a data-driven approach that extrac… ▽ More LinkedIn has grown to become a platform hosting diverse sources of information ranging from member profiles, jobs, professional groups, slideshows etc. Given the existence of multiple sources, when a member issues a query like "software engineer", the member could look for software engineer profiles, jobs or professional groups. To tackle this problem, we exploit a data-driven approach that extracts searcher intents from their profile data and recent activities at a large scale. The intents such as job seeking, hiring, content consuming are used to construct features to personalize federated search experience. We tested the approach on the LinkedIn homepage and A/B tests show significant improvements in member engagement. As of writing this paper, the approach powers all of federated search on LinkedIn homepage. △ Less

Submitted 16 February, 2016; originally announced February 2016.

Comments: in Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM 2015)

Showing 1–20 of 20 results for author: Arya, D