-
SC-Phi2: A Fine-tuned Small Language Model for StarCraft II Macromanagement Tasks
Authors:
Muhammad Junaid Khan,
Gita Sukthankar
Abstract:
This paper introduces SC-Phi2, a fine-tuned StarCraft II small language model for macromanagement tasks. Small language models, like Phi-2, Gemma, and DistilBERT, are streamlined versions of large language models (LLMs) with fewer parameters that require less power and memory to run. To teach Microsoft's Phi-2 model about StarCraft, we create a new SC2 text dataset with information about StarCraft races, roles, and actions and use it to fine-tune Phi-2 with self-supervised learning. We pair this language model with a Vision Transformer (ViT) from the pre-trained BLIP-2 (Bootstrapping Language Image Pre-training) model, fine-tuning it on the MSC replay dataset. This enables us to construct dynamic prompts that include visual game state information. Unlike the large models used in StarCraft LLMs such as GPT-3.5, Phi-2 is trained primarily on textbook data and contains little inherent knowledge of StarCraft II beyond what is provided by our training process. By using LoRA (Low-rank Adaptation) and quantization, our model can be trained on a single GPU. We demonstrate that our model performs well at macromanagement tasks such as build order and global state prediction with a small number of parameters.
Submitted 17 September, 2024;
originally announced September 2024.
-
Visual Episodic Memory-based Exploration
Authors:
Jack Vice,
Natalie Ruiz-Sanchez,
Pamela K. Douglas,
Gita Sukthankar
Abstract:
In humans, intrinsic motivation is an important mechanism for open-ended cognitive development; in robots, it has been shown to be valuable for exploration. An important aspect of human cognitive development is $\textit{episodic memory}$, which enables both the recollection of events from the past and the projection of a subjective future. This paper explores the use of visual episodic memory as a source of intrinsic motivation for robotic exploration problems. Using a convolutional recurrent neural network autoencoder, the agent learns an efficient representation for spatiotemporal features such that accurate sequence prediction can only happen once spatiotemporal features have been learned. Structural similarity between ground truth and autoencoder-generated images is used as an intrinsic motivation signal to guide exploration. Our proposed episodic memory model also implicitly accounts for the agent's actions, motivating the robot to seek new interactive experiences rather than just areas that are visually dissimilar. When guiding robotic exploration, our proposed method outperforms the Curiosity-driven Variational Autoencoder (CVAE) at finding dynamic anomalies.
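As a rough illustration of how a structural-similarity signal could drive exploration, the sketch below computes a single-window (global) SSIM between an observed frame and a predicted frame and turns it into a reward. Several things here are assumptions rather than details from the abstract: the `1 - SSIM` mapping, the function names `ssim` and `intrinsic_reward`, the use of flattened grayscale pixel lists in [0, 1], and the global rather than windowed SSIM. The stabilizing constants follow the usual SSIM convention.

```python
def ssim(x, y, dynamic_range=1.0):
    """Global (single-window) SSIM between two equal-length pixel lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Conventional stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def intrinsic_reward(observed, predicted):
    """Assumed mapping: poorly predicted frames yield a larger reward."""
    return 1.0 - ssim(observed, predicted)


observed = [0.0, 0.5, 1.0, 0.5]       # hypothetical flattened frame
good_prediction = list(observed)      # the memory model predicts it well
poor_prediction = [0.5, 0.5, 0.5, 0.5]  # the memory model misses the structure

familiar = intrinsic_reward(observed, good_prediction)
novel = intrinsic_reward(observed, poor_prediction)
```

Under this assumed mapping, a frame the model already reconstructs well gives almost no reward, while a poorly reconstructed frame gives a reward close to 1.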
Submitted 18 May, 2024;
originally announced May 2024.
-
Smart Sampling: Self-Attention and Bootstrapping for Improved Ensembled Q-Learning
Authors:
Muhammad Junaid Khan,
Syed Hammad Ahmed,
Gita Sukthankar
Abstract:
We present a novel method aimed at enhancing the sample efficiency of ensemble Q-learning. Our proposed approach integrates multi-head self-attention into the ensembled Q networks while bootstrapping the state-action pairs ingested by the ensemble. This not only results in performance improvements over the original REDQ (Chen et al. 2021) and its variant DroQ (Hiraoka et al. 2022), thereby enhancing Q predictions, but also effectively reduces both the average normalized bias and standard deviation of normalized bias within Q-function ensembles. Importantly, our method also performs well even in scenarios with a low update-to-data (UTD) ratio. Notably, the implementation of our proposed method is straightforward, requiring minimal modifications to the base model.
Submitted 13 May, 2024;
originally announced May 2024.
-
Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion
Authors:
Syed Hammad Ahmed,
Muhammad Junaid Khan,
Gita Sukthankar
Abstract:
Due to the rise in video content creation targeted towards children, there is a need for robust content moderation schemes for video hosting platforms. A video that is visually benign may include audio content that is inappropriate for young children while being impossible to detect with a unimodal content moderation system. Popular video hosting platforms for children such as YouTube Kids still publish videos which contain audio content that is not conducive to a child's healthy behavioral and physical development. A robust classification of malicious videos requires audio representations in addition to video features. However, recent content moderation approaches rarely employ multimodal architectures that explicitly consider non-speech audio cues. To address this, we present an efficient adaptation of CLIP (Contrastive Language-Image Pre-training) that can leverage contextual audio cues for enhanced content moderation. We incorporate 1) the audio modality and 2) prompt learning, while keeping the backbone modules of each modality frozen. We conduct our experiments on a multimodal version of the MOB (Malicious or Benign) dataset in supervised and few-shot settings.
Submitted 9 May, 2024;
originally announced May 2024.
-
The Potential of Vision-Language Models for Content Moderation of Children's Videos
Authors:
Syed Hammad Ahmed,
Shengnan Hu,
Gita Sukthankar
Abstract:
Natural language supervision has been shown to be effective for zero-shot learning in many computer vision tasks, such as object detection and activity recognition. However, generating informative prompts can be challenging for more subtle tasks, such as video content moderation, since there are many reasons why a video might be inappropriate beyond violence and obscenity. For example, scammers may attempt to create junk content that is similar to popular educational videos but with no meaningful information. This paper evaluates the performance of several CLIP variations for content moderation of children's cartoons in both the supervised and zero-shot settings. We show that our proposed model (Vanilla CLIP with Projection Layer) outperforms previous work conducted on the Malicious or Benign (MOB) benchmark for video content moderation. This paper presents an in-depth analysis of how context-specific language prompts affect content moderation performance. Our results indicate that it is important to include more context in content moderation prompts, particularly for cartoon videos, as they are not well represented in the CLIP training data.
Submitted 6 December, 2023;
originally announced December 2023.
-
LAMP: Leveraging Language Prompts for Multi-person Pose Estimation
Authors:
Shengnan Hu,
Ce Zheng,
Zixiang Zhou,
Chen Chen,
Gita Sukthankar
Abstract:
Human-centric visual understanding is an important desideratum for effective human-robot interaction. In order to navigate crowded public places, social robots must be able to interpret the activity of the surrounding humans. This paper addresses one key aspect of human-centric visual understanding, multi-person pose estimation. Achieving good performance on multi-person pose estimation in crowded scenes is difficult due to the challenges of occluded joints and instance separation. In order to tackle these challenges and overcome the limitations of image features in representing invisible body parts, we propose a novel prompt-based pose inference strategy called LAMP (Language Assisted Multi-person Pose estimation). By utilizing the text representations generated by a well-trained language model (CLIP), LAMP can facilitate the understanding of poses on the instance and joint levels, and learn more robust visual representations that are less susceptible to occlusion. This paper demonstrates that language-supervised training boosts the performance of single-stage multi-person pose estimation, and both instance-level and joint-level prompts are valuable for training. The code is available at https://github.com/shengnanh20/LAMP.
Submitted 26 July, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Malicious or Benign? Towards Effective Content Moderation for Children's Videos
Authors:
Syed Hammad Ahmed,
Muhammad Junaid Khan,
H. M. Umer Qaisar,
Gita Sukthankar
Abstract:
Online video platforms receive hundreds of hours of uploads every minute, making manual content moderation impossible. Unfortunately, the most vulnerable consumers of malicious video content are children aged 1-5, whose attention is easily captured by bursts of color and sound. Scammers attempting to monetize their content may craft malicious children's videos that are superficially similar to educational videos, but include scary and disgusting characters, violent motions, loud music, and disturbing noises. Prominent video hosting platforms like YouTube have taken measures to mitigate malicious content on their platform, but these videos often go undetected by current content moderation tools that are focused on removing pornographic or copyrighted content. This paper introduces our toolkit Malicious or Benign for promoting research on automated content moderation of children's videos. We present 1) a customizable annotation tool for videos, 2) a new dataset with difficult-to-detect test cases of malicious content, and 3) a benchmark suite of state-of-the-art video classification models.
Submitted 24 May, 2023;
originally announced May 2023.
-
Improving the Generalizability of Collaborative Dialogue Analysis with Multi-Feature Embeddings
Authors:
Ayesha Enayet,
Gita Sukthankar
Abstract:
Conflict prediction in communication is integral to the design of virtual agents that support successful teamwork by providing timely assistance. The aim of our research is to analyze discourse to predict collaboration success. Unfortunately, resource scarcity is a problem that teamwork researchers commonly face since it is hard to gather a large number of training examples. To alleviate this problem, this paper introduces a multi-feature embedding (MFeEmb) that improves the generalizability of conflict prediction models trained on dialogue sequences. MFeEmb leverages textual, structural, and semantic information from the dialogues by incorporating lexical, dialogue acts, and sentiment features. The use of dialogue acts and sentiment features reduces performance loss from natural distribution shifts caused mainly by changes in vocabulary.
This paper demonstrates the performance of MFeEmb on domain adaptation problems in which the model is trained on discourse from one task domain and applied to predict team performance in a different domain. The generalizability of MFeEmb is quantified using the similarity measure proposed by Bontonou et al. (2021). Our results show that MFeEmb serves as an excellent domain-agnostic representation for meta-pretraining a few-shot model on collaborative multiparty dialogues.
Submitted 9 February, 2023;
originally announced February 2023.
-
Improving Code Review with GitHub Issue Tracking
Authors:
Abduljaleel Al-Rubaye,
Gita Sukthankar
Abstract:
Software quality is an important problem for technology companies, since it substantially impacts the efficiency, usefulness, and maintainability of the final product; hence, code review is a must-do activity for software developers. During the code review process, senior engineers monitor other developers' work to spot possible problems and enforce coding standards. One of the most widely used open-source software platforms, GitHub, attracts millions of developers who use it to store their projects. This study aims to analyze code quality on GitHub from the standpoint of code reviews. We examined the code review process using GitHub's Issues Tracker, which allows team members to evaluate, discuss, and share their opinions on the proposed code before it is approved. Based on our analysis, we present a novel approach for improving the code review process by promoting regularity and community involvement.
Submitted 10 October, 2022;
originally announced October 2022.
-
Through a fair looking-glass: mitigating bias in image datasets
Authors:
Amirarsalan Rajabi,
Mehdi Yazdani-Jahromi,
Ozlem Ozmen Garibay,
Gita Sukthankar
Abstract:
With the recent growth in computer vision applications, the question of how fair and unbiased they are has yet to be explored. There is abundant evidence that the bias present in training data is reflected in the models, or even amplified. Many previous methods for image dataset de-biasing, including models based on augmenting datasets, are computationally expensive to implement. In this study, we present a fast and effective model to de-bias an image dataset by reconstructing images while minimizing the statistical dependence between intended variables. Our architecture includes a U-net to reconstruct images, combined with a pre-trained classifier which penalizes the statistical dependence between the target attribute and the protected attribute. We evaluate our proposed model on the CelebA dataset, compare the results with a state-of-the-art de-biasing method, and show that the model achieves a promising fairness-accuracy combination.
Submitted 18 September, 2022;
originally announced September 2022.
-
Transformer-based Value Function Decomposition for Cooperative Multi-agent Reinforcement Learning in StarCraft
Authors:
Muhammad Junaid Khan,
Syed Hammad Ahmed,
Gita Sukthankar
Abstract:
The StarCraft II Multi-Agent Challenge (SMAC) was created to be a challenging benchmark problem for cooperative multi-agent reinforcement learning (MARL). SMAC focuses exclusively on the problem of StarCraft micromanagement and assumes that each unit is controlled individually by a learning agent that acts independently and only possesses local information; centralized training is assumed to occur with decentralized execution (CTDE). To perform well in SMAC, MARL algorithms must handle the dual problems of multi-agent credit assignment and joint action evaluation.
This paper introduces a new architecture, TransMix, a transformer-based joint action-value mixing network that we show to be efficient and scalable compared to other state-of-the-art cooperative MARL solutions. TransMix leverages the ability of transformers to learn a richer mixing function for combining the agents' individual value functions. It achieves comparable performance to previous work on easy SMAC scenarios and outperforms other techniques on hard scenarios, as well as scenarios that are corrupted with Gaussian noise to simulate fog of war.
Submitted 15 August, 2022;
originally announced August 2022.
-
Predicting Team Performance with Spatial Temporal Graph Convolutional Networks
Authors:
Shengnan Hu,
Gita Sukthankar
Abstract:
This paper presents a new approach for predicting team performance from the behavioral traces of a set of agents. This spatiotemporal forecasting problem is very relevant to sports analytics challenges such as coaching and opponent modeling. We demonstrate that our proposed model, Spatial Temporal Graph Convolutional Networks (ST-GCN), outperforms other classification techniques at predicting game score from a short segment of player movement and game features. Our proposed architecture uses a graph convolutional network to capture the spatial relationships between team members and Gated Recurrent Units to analyze dynamic motion information. An ablative evaluation was performed to demonstrate the contributions of different aspects of our architecture.
Submitted 21 June, 2022;
originally announced June 2022.
-
Leveraging Evolutionary Algorithms for Feasible Hexapod Locomotion Across Uneven Terrain
Authors:
Jack Vice,
Gita Sukthankar,
Pamela K. Douglas
Abstract:
Optimizing gait stability for legged robots is a difficult problem. Even on level surfaces, effectively traversing different textures (e.g., carpet) rests on dynamically tuning parameters in multidimensional space. Inspired by biology, evolutionary algorithms (EA) remain an attractive solution for feasibly implementing robotic locomotion with both energetic economy and rapid parameter convergence. Here, we leveraged this class of algorithms to evolve a stable hexapod gait controller capable of traversing uneven terrain and obstacles. Gait parameters were evolved in a rigid body dynamics simulation on an 8 x 3 meter obstacle course composed of a random step field, linear obstacles, and inclined surfaces. Using a fitness function that jointly optimized locomotion velocity and stability, we found that multiple successful gait parameter evolutions yielded specialized functionality for each leg. Specific gait parameters were identified as critical to developing a rough terrain gait.
Submitted 29 March, 2022;
originally announced March 2022.
-
Leveraging Transformers for StarCraft Macromanagement Prediction
Authors:
Muhammad Junaid Khan,
Shah Hassan,
Gita Sukthankar
Abstract:
Inspired by the recent success of transformers in natural language processing and computer vision applications, we introduce a transformer-based neural architecture for two key StarCraft II (SC2) macromanagement tasks: global state and build order prediction. Unlike recurrent neural networks which suffer from a recency bias, transformers are able to capture patterns across very long time horizons, making them well suited for full game analysis. Our model utilizes the MSC (Macromanagement in StarCraft II) dataset and improves on the top performing gated recurrent unit (GRU) architecture in predicting global state and build order as measured by mean accuracy over multiple time horizons. We present ablation studies on our proposed architecture that support our design decisions. One key advantage of transformers is their ability to generalize well, and we demonstrate that our model achieves an even better accuracy when used in a transfer learning setting in which models trained on games with one racial matchup (e.g., Terran vs. Protoss) are transferred to a different one. We believe that transformers' ability to model long games, potential for parallelization, and generalization performance make them an excellent choice for StarCraft agents.
Submitted 11 October, 2021;
originally announced October 2021.
-
Do Bots Modify the Workflow of GitHub Teams?
Authors:
Samaneh Saadat,
Natalia Colmenares,
Gita Sukthankar
Abstract:
The ever-increasing complexity of modern software engineering projects makes the usage of automated assistants imperative. Bots can be used to complete repetitive tasks during development and testing, as well as to promote communication between team members through issue reporting and documentation. Although the ultimate aim of these automated assistants is to speed taskwork completion, their inclusion into GitHub repositories may affect teamwork as well. This paper studies the question of how bots modify the team workflow. We examined the event sequences of repositories with bots and without bots using a contrast motif discovery method to detect subsequences that are more prevalent in one set of event sequences than in the other. Our study reveals that teams with bots are more likely to intersperse comments throughout their coding activities, while not actually being more prolific commenters.
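To make the idea of contrasting two sets of event sequences concrete, here is a minimal sketch that is not the paper's method. It treats a motif as a contiguous length-k subsequence, measures the fraction of sequences in each set that contain it, and reports motifs whose prevalence differs by at least a threshold. The function names, the event labels, the `min_gap` threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def support(sequences, k):
    """Fraction of sequences containing each contiguous length-k subsequence."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each motif is counted once per sequence.
        counts.update({tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)})
    return {gram: n / len(sequences) for gram, n in counts.items()}


def contrast_motifs(set_a, set_b, k=2, min_gap=0.5):
    """Motifs whose support differs between the two sets by at least min_gap.

    Positive gaps mean the motif is more prevalent in set_a.
    """
    sup_a, sup_b = support(set_a, k), support(set_b, k)
    gaps = {g: sup_a.get(g, 0.0) - sup_b.get(g, 0.0)
            for g in set(sup_a) | set(sup_b)}
    selected = [(g, gap) for g, gap in gaps.items() if abs(gap) >= min_gap]
    return sorted(selected, key=lambda item: (-abs(item[1]), item[0]))


# Hypothetical event sequences for repositories with and without bots.
with_bots = [["push", "comment", "push", "comment"],
             ["push", "comment", "push"]]
without_bots = [["push", "push", "comment"],
                ["push", "push", "push"]]

motifs = contrast_motifs(with_bots, without_bots)
```

In this toy data, the motif `("comment", "push")` appears only in the bot repositories and `("push", "push")` only in the others, which mirrors the kind of interleaving pattern the abstract describes.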
Submitted 16 March, 2021;
originally announced March 2021.
-
Analyzing Team Performance with Embeddings from Multiparty Dialogues
Authors:
Ayesha Enayet,
Gita Sukthankar
Abstract:
Good communication is indubitably the foundation of effective teamwork. Over time teams develop their own communication styles and often exhibit entrainment, a conversational phenomenon in which humans synchronize their linguistic choices. This paper examines the problem of predicting team performance from embeddings learned from multiparty dialogues such that teams with similar conflict scores lie close to one another in vector space. Embeddings were extracted from three types of features: 1) dialogue acts, 2) sentiment polarity, and 3) syntactic entrainment. Although all of these features can be used to effectively predict team performance, their utility varies by the teamwork phase. We separate the dialogues of players playing a cooperative game into stages: 1) early (knowledge building), 2) middle (problem-solving), and 3) late (culmination). Unlike syntactic entrainment, both dialogue act and sentiment embeddings are effective for classifying team performance, even during the initial phase. This finding has potential ramifications for the development of conversational agents that facilitate teaming.
Submitted 23 January, 2021;
originally announced January 2021.
-
Leveraging the Variance of Return Sequences for Exploration Policy
Authors:
Zerong Xi,
Gita Sukthankar
Abstract:
This paper introduces a method for constructing an upper bound for exploration policy using either the weighted variance of return sequences or the weighted temporal difference (TD) error. We demonstrate that the variance of the return sequence for a specific state-action pair is an important information source that can be leveraged to guide exploration in reinforcement learning. The intuition is that fluctuation in the return sequence indicates greater uncertainty in the near future returns. This divergence occurs because of the cyclic nature of value-based reinforcement learning; the evolving value function begets policy improvements which in turn modify the value function. Although both variance and TD errors capture different aspects of this uncertainty, our analysis shows that both can be valuable to guide exploration. We propose a two-stream network architecture to estimate weighted variance/TD errors within DQN agents for our exploration method and show that it outperforms the baseline on a wide range of Atari games.
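The intuition that a fluctuating return sequence signals uncertainty can be sketched in a tabular setting. This sketch is not the paper's two-stream DQN architecture: it keeps an exponentially weighted mean and variance of the returns seen for each state-action pair and selects the action with the highest mean-plus-bonus upper bound. The exponential weighting scheme, the class and function names, the `decay` and `c` parameters, and the toy returns are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class WeightedReturnStats:
    """Exponentially weighted mean and variance of a return sequence."""

    def __init__(self, decay=0.9):
        self.decay = decay  # assumed weighting: recent returns count more
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, ret):
        if self.count == 0:
            self.mean = ret
        else:
            alpha = 1.0 - self.decay
            delta = ret - self.mean
            self.mean += alpha * delta
            # Standard incremental update for exponentially weighted variance.
            self.var = self.decay * (self.var + alpha * delta * delta)
        self.count += 1


def select_action(stats, state, actions, c=1.0):
    """Choose the action with the largest mean + c * sqrt(weighted variance)."""
    def upper_bound(action):
        s = stats[(state, action)]
        return s.mean + c * math.sqrt(s.var)
    return max(actions, key=upper_bound)


stats = defaultdict(WeightedReturnStats)
for ret in [1.0, 1.0, 1.0, 1.0]:       # stable returns: no variance bonus
    stats[("s0", "left")].update(ret)
for ret in [0.0, 2.0, 0.0, 2.0]:       # fluctuating returns: large bonus
    stats[("s0", "right")].update(ret)

explore_choice = select_action(stats, "s0", ["left", "right"], c=1.0)
greedy_choice = select_action(stats, "s0", ["left", "right"], c=0.0)
```

With the bonus enabled, the agent prefers the action whose returns fluctuate even though its weighted mean is lower; with `c=0.0` it falls back to the greedy choice.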
Submitted 17 November, 2020;
originally announced November 2020.
-
A Transfer Learning Approach for Dialogue Act Classification of GitHub Issue Comments
Authors:
Ayesha Enayet,
Gita Sukthankar
Abstract:
Social coding platforms, such as GitHub, serve as laboratories for studying collaborative problem solving in open source software development; a key feature is their ability to support issue reporting which is used by teams to discuss tasks and ideas. Analyzing the dialogue between team members, as expressed in issue comments, can yield important insights about the performance of virtual teams. This paper presents a transfer learning approach for performing dialogue act classification on issue comments. Since no large labeled corpus of GitHub issue comments exists, employing transfer learning enables us to leverage standard dialogue act datasets in combination with our own GitHub comment dataset. We compare the performance of several word and sentence level encoding models including Global Vectors for Word Representations (GloVe), Universal Sentence Encoder (USE), and Bidirectional Encoder Representations from Transformers (BERT). Being able to map the issue comments to dialogue acts is a useful stepping stone towards understanding cognitive team processes.
Submitted 9 November, 2020;
originally announced November 2020.
-
Scoring Popularity in GitHub
Authors:
Abduljaleel Al-Rubaye,
Gita Sukthankar
Abstract:
Popularity and engagement are the currencies of social media platforms, serving as powerful reinforcement mechanisms to keep users online. Social coding platforms such as GitHub serve a dual purpose: they are practical tools that facilitate asynchronous, distributed collaborations between software developers while also supporting passive social media style interactions. There are several mechanisms for "liking" content on GitHub: 1) forking repositories to copy their content, 2) watching repositories to be notified of updates, and 3) starring to express approval. This paper presents a study of popularity in GitHub and examines the relationship between these three quantitative measures of popularity. We introduce a weight-based popularity score (WTPS) that is extracted from the history line of other popularity indicators.
Submitted 9 November, 2020;
originally announced November 2020.
-
Analyzing the Productivity of GitHub Teams based on Formation Phase Activity
Authors:
Samaneh Saadat,
Olivia B. Newton,
Gita Sukthankar,
Stephen M. Fiore
Abstract:
Our goal is to understand the characteristics of high-performing teams on GitHub. Towards this end, we collect data from software repositories and evaluate teams by examining differences in productivity. Our study focuses on the team formation phase, the first six months after repository creation. To better understand team activity, we clustered repositories based on the proportion of their work activities and discovered three work styles in teams: toilers, communicators, and collaborators. Based on our results, we contend that early activities in software development repositories on GitHub establish coordination processes that enable effective collaborations over time.
Submitted 6 November, 2020;
originally announced November 2020.
-
Explaining Differences in Classes of Discrete Sequences
Authors:
Samaneh Saadat,
Gita Sukthankar
Abstract:
While there are many machine learning methods to classify and cluster sequences, they fail to explain which differences between groups of sequences make them distinguishable. Although in some cases having a black box model is sufficient, there is a need for increased explainability in research areas focused on human behaviors. For example, psychologists are less interested in having a model that predicts human behavior with high accuracy and more concerned with identifying differences between actions that lead to divergent human behavior. This paper presents techniques for understanding differences between classes of discrete sequences. Approaches introduced in this paper can be utilized to interpret black box machine learning models on sequences. The first approach compares k-gram representations of sequences using the silhouette score. The second method characterizes differences by analyzing the distance matrix of subsequences. As a case study, we trained black box supervised learning methods to classify sequences of GitHub teams and then utilized our sequence analysis techniques to measure and characterize differences between event sequences of teams with bots and teams without bots. In our second case study, we classified Minecraft event sequences to infer their high-level actions and analyzed differences between low-level event sequences of actions.
Submitted 6 November, 2020;
originally announced November 2020.
-
Deep Agent: Studying the Dynamics of Information Spread and Evolution in Social Networks
Authors:
Ivan Garibay,
Toktam A. Oghaz,
Niloofar Yousefi,
Ece C. Mutlu,
Madeline Schiappa,
Steven Scheinert,
Georgios C. Anagnostopoulos,
Christina Bouwens,
Stephen M. Fiore,
Alexander Mantzaris,
John T. Murphy,
William Rand,
Anastasia Salter,
Mel Stanfill,
Gita Sukthankar,
Nisha Baral,
Gabriel Fair,
Chathika Gunaratne,
Neda B. Hajiakhoond,
Jasser Jasser,
Chathura Jayalath,
Olivia Newton,
Samaneh Saadat,
Chathurani Senevirathna,
Rachel Winter,
et al. (1 additional author not shown)
Abstract:
This paper explains the design of a social network analysis framework, developed under DARPA's SocialSim program, with a novel architecture that models human emotional, cognitive, and social factors. Our framework is both theory- and data-driven and utilizes domain expertise. Our simulation effort helps in understanding how information flows and evolves in social media platforms. We focused on modeling three information domains (cryptocurrencies, cyber threats, and software vulnerabilities) for three interrelated social environments: GitHub, Reddit, and Twitter. We participated in the SocialSim DARPA Challenge in December 2018, in which our models were subjected to extensive performance evaluation for accuracy, generalizability, explainability, and experimental power. This paper reports the main concepts and models utilized in our social media modeling effort in developing a multi-resolution simulation at the user, community, population, and content levels.
Submitted 29 May, 2021; v1 submitted 25 March, 2020;
originally announced March 2020.
-
Selfie Drone Stick: A Natural Interface for Quadcopter Photography
Authors:
Saif Alabachi,
Gita Sukthankar,
Rahul Sukthankar
Abstract:
A physical selfie stick extends the user's reach, enabling the acquisition of personal photos that include more of the background scene. Similarly, a quadcopter can capture photos from vantage points unattainable by the user, but teleoperating a quadcopter to good viewpoints is a difficult task. This paper presents a natural interface for quadcopter photography, the SelfieDroneStick, which allows the user to guide the quadcopter to the optimal vantage point based on the phone's sensors. Users specify the composition of their desired long-range selfies using their smartphone, and the quadcopter autonomously flies to a sequence of vantage points from where the desired shots can be taken. The robot controller is trained from a combination of real-world images and simulated flight data. This paper describes two key innovations required to deploy deep reinforcement learning models on a real robot: 1) an abstract state representation for transferring learning from simulation to the hardware platform, and 2) reward shaping and staging paradigms for training the controller. Both of these improvements were found to be essential in learning a robot controller from simulation that transfers successfully to the real robot.
Submitted 18 August, 2020; v1 submitted 13 September, 2019;
originally announced September 2019.
-
Customizing Object Detectors for Indoor Robots
Authors:
Saif Alabachi,
Gita Sukthankar,
Rahul Sukthankar
Abstract:
Object detection models based on convolutional neural networks (CNNs) demonstrate impressive performance when trained on large-scale labeled datasets. While a generic object detector trained on such a dataset performs adequately in applications where the input data is similar to user photographs, the detector performs poorly on small objects, particularly ones with limited training data or imaged from uncommon viewpoints. Also, a specific room will have many objects that are missed by standard object detectors, frustrating a robot that continually operates in the same indoor environment.
This paper describes a system for rapidly creating customized object detectors. Data is collected from a quadcopter that is teleoperated with an interactive interface. Once an object is selected, the quadcopter autonomously photographs the object from multiple viewpoints to collect data to train DUNet (Dense Upscaled Network), our proposed model for learning customized object detectors from scratch given limited data. Our experiments compare the performance of learning models from scratch with DUNet vs. fine-tuning existing state-of-the-art object detectors, both on our indoor robotics domain and on standard datasets.
Submitted 27 February, 2019;
originally announced February 2019.
-
Network Semantic Segmentation with Application to GitHub
Authors:
Neda Hajiakhoond Bidoki,
Gita Sukthankar
Abstract:
In this paper we introduce the concept of network semantic segmentation for social network analysis. We consider the GitHub social coding network which has been a center of attention for both researchers and software developers. Network semantic segmentation describes the process of associating each user with a class label such as a topic of interest. We augment node attributes with network significant connections and then employ machine learning approaches to cluster the users. We compare the results with a network segmentation performed using community detection algorithms and one executed by clustering with node attributes. Results are compared in terms of community diversity within the semantic segments along with topic
Submitted 6 March, 2019; v1 submitted 13 February, 2019;
originally announced February 2019.
-
A Cross-Repository Model for Predicting Popularity in GitHub
Authors:
Neda Hajiakhoond Bidoki,
Gita Sukthankar,
Heather Keathley,
Ivan Garibay
Abstract:
Social coding platforms, such as GitHub, can serve as natural laboratories for studying the diffusion of innovation through tracking the pattern of code adoption by programmers. This paper focuses on the problem of predicting the popularity of software repositories over time; our aim is to forecast the time series of popularity-related events (code forks and watches). In particular, we are interested in cross-repository patterns: how do events on one repository affect other repositories? Our proposed LSTM (Long Short-Term Memory) recurrent neural network integrates events across multiple active repositories, outperforming a standard ARIMA (Auto-Regressive Integrated Moving Average) time series prediction based on the single repository. The ability of the LSTM to leverage cross-repository information gives it a significant edge over standard time series forecasting.
Submitted 13 February, 2019;
originally announced February 2019.
-
A Communication Protocol for Man-Machine Networks
Authors:
Neda Hajiakhoond,
Gita Sukthankar
Abstract:
One of the most challenging coordination problems in artificial intelligence is to achieve successful collaboration across large-scale heterogeneous systems that include Robots, Agents, and People (RAP). In the best case, these RAP systems are potentially capable of leveraging the strengths of the individual entities to achieve complex distributed tasks. However, without intelligent communication protocols, man-machine partnerships are likely to fail as the humans become overloaded with irrelevant information. This paper introduces a communication protocol for man-machine systems and demonstrates that its message routing performance approaches the central optimized solution in a simulated smart environment scenario.
Submitted 23 August, 2018;
originally announced August 2018.
-
Intelligently Assisting Human-Guided Quadcopter Photography
Authors:
Saif Alabachi,
Gita Sukthankar
Abstract:
Drones are a versatile platform for both amateur and professional photographers, enabling them to capture photos that are impossible to shoot with ground-based cameras. However, when guided by inexperienced pilots, they have a high incidence of collisions, crashes, and poorly framed photographs. This paper presents an intelligent user interface for photographing objects that is robust against navigation errors and reliably collects high-quality photographs. By retaining the human in the loop, our system is faster and more selective than purely autonomous UAVs that employ simple coverage algorithms. The intelligent user interface operates in multiple modes, allowing the user to either directly control the quadcopter or fly in a semi-autonomous mode around a target object in the environment. To evaluate the interface, users completed a data set collection task in which they were asked to photograph objects from multiple views. Our sketch-based control paradigm facilitated task completion, reduced crashes, and was favorably reviewed by the participants.
Submitted 20 June, 2018;
originally announced June 2018.
-
Investigating Negative Interactions in Multiplex Networks: A Mutual Information Approach
Authors:
Alireza Hajibagheri,
Gita Sukthankar
Abstract:
Many interesting real-world systems are represented as complex networks with multiple types of interactions and complicated dependency structures between layers. These interactions can be encoded as having a valence with positive links marking interactions such as trust and friendship and negative links denoting distrust or hostility. Extracting information from these negative interactions is challenging since standard topological metrics are often poor predictors of negative link formation, particularly across network layers. In this paper, we introduce a method based on mutual information which enables us to predict both negative and positive relationships. Our experiments show that SMLP (Signed Multiplex Link Prediction) can leverage negative relationship layers in multiplex networks to improve link prediction performance.
Submitted 16 August, 2018; v1 submitted 19 April, 2018;
originally announced April 2018.
-
Real-World Modeling of a Pathfinding Robot Using Robot Operating System (ROS)
Authors:
Sayyed Jaffar Ali Raza,
Nitish A. Gupta,
Nisarg Chitaliya,
Gita R. Sukthankar
Abstract:
This paper presents a practical approach towards implementing pathfinding algorithms on real-world, low-cost, non-commercial hardware platforms. While using robotics simulation platforms as a test-bed for our algorithms, we easily overlook real-world exogenous problems that are caused by external factors. Such problems involve robot wheel slips, asynchronous motors, abnormal sensory data, or unstable power sources. The real-world dynamics tend to be very painful even for executing simple algorithms like a Wavefront planner or A-star search. This paper addresses designing techniques that tend to be robust as well as reusable for any hardware platform, covering problems like controlling asynchronous drives, odometry offset issues, and handling abnormal sensory feedback. The algorithm implementation medium and hardware design tools have been kept general in order to present our work as a serving platform for future researchers and robotics enthusiasts working in the field of path planning robotics.
Submitted 27 February, 2018;
originally announced February 2018.
-
A Holistic Approach for Predicting Links in Coevolving Multilayer Networks
Authors:
Alireza Hajibagheri,
Gita Sukthankar,
Kiran Lakkaraju
Abstract:
Networks extracted from social media platforms frequently include multiple types of links that dynamically change over time; these links can be used to represent dyadic interactions such as economic transactions, communications, and shared activities. Organizing this data into a dynamic multiplex network, where each layer is composed of a single edge type linking the same underlying vertices, can reveal interesting cross-layer interaction patterns. In coevolving networks, links in one layer result in an increased probability of other types of links forming between the same node pair. Hence we believe that a holistic approach in which all the layers are simultaneously considered can outperform a factored approach in which link prediction is performed separately in each layer. This paper introduces a comprehensive framework, MLP (Multilayer Link Prediction), in which link existence likelihoods for the target layer are learned from the other network layers. These likelihoods are used to reweight the output of a single layer link prediction method that uses rank aggregation to combine a set of topological metrics. Our experiments show that our reweighting procedure outperforms other methods for fusing information across network layers.
Submitted 13 September, 2016;
originally announced September 2016.
-
Identifying Community Structures in Dynamic Networks
Authors:
Hamidreza Alvari,
Alireza Hajibagheri,
Gita Sukthankar,
Kiran Lakkaraju
Abstract:
Most real-world social networks are inherently dynamic, composed of communities that are constantly changing in membership. To track these evolving communities, we need dynamic community detection techniques. This article evaluates the performance of a set of game theoretic approaches for identifying communities in dynamic networks. Our method, D-GT (Dynamic Game Theoretic community detection), models each network node as a rational agent who periodically plays a community membership game with its neighbors. During game play, nodes seek to maximize their local utility by joining or leaving the communities of network neighbors. The community structure emerges after the game reaches a Nash equilibrium. Compared to the benchmark community detection methods, D-GT more accurately predicts the number of communities and finds community assignments with a higher normalized mutual information, while retaining a good modularity.
Submitted 11 September, 2016; v1 submitted 8 September, 2016;
originally announced September 2016.
-
Leveraging Network Dynamics for Improved Link Prediction
Authors:
Alireza Hajibagheri,
Gita Sukthankar,
Kiran Lakkaraju
Abstract:
The aim of link prediction is to forecast connections that are most likely to occur in the future, based on examples of previously observed links. A key insight is that, when doing link prediction, it is useful to explicitly model network dynamics, that is, how frequently links are created or destroyed. In this paper, we introduce a new supervised link prediction framework, RPM (Rate Prediction Model). In addition to network similarity measures, RPM uses the predicted rate of link modifications, modeled using time series data; it is implemented in Spark-ML and trained with the original link distribution, rather than a small balanced subset. We compare the use of this network dynamics model to directly creating time series of network similarity measures. Our experiments show that RPM, which leverages predicted rates, outperforms the use of network similarity measures, either individually or within a time series.
Submitted 8 April, 2016;
originally announced April 2016.
-
An Active Learning Approach for Jointly Estimating Worker Performance and Annotation Reliability with Crowdsourced Data
Authors:
Liyue Zhao,
Yu Zhang,
Gita Sukthankar
Abstract:
Crowdsourcing platforms offer a practical solution to the problem of affordably annotating large datasets for training supervised classifiers. Unfortunately, poor worker performance frequently threatens to compromise annotation reliability, and requesting multiple labels for every instance can lead to large cost increases without guaranteeing good results. Minimizing the required training samples using an active learning selection procedure reduces the labeling requirement but can jeopardize classifier training by focusing on erroneous annotations. This paper presents an active learning approach in which worker performance, task difficulty, and annotation reliability are jointly estimated and used to compute the risk function guiding the sample selection procedure. We demonstrate that the proposed approach, which employs active learning with Bayesian networks, significantly improves training accuracy and correctly ranks the expertise of unknown labelers in the presence of annotation noise.
Submitted 15 January, 2014;
originally announced January 2014.