-
Iterative Recommendations based on Monte Carlo Sampling and Trust Estimation in Multi-Stage Vehicular Traffic Routing Games
Authors:
Doris E. M. Brown,
Venkata Sriram Siddhardh Nadendla,
Sajal K. Das
Abstract:
The shortest-time route recommendations offered by modern navigation systems fuel selfish routing in urban vehicular traffic networks and are therefore one of the main reasons for the growth of congestion. In contrast, intelligent transportation systems (ITS) prefer to steer driver-vehicle systems (DVS) toward system-optimal route recommendations, which are primarily designed to mitigate network congestion. However, due to the misalignment in motives, drivers exhibit a lack of trust in the ITS. This paper models the interaction between a DVS and an ITS as a novel, multi-stage routing game where the DVS exhibits dynamics in its trust towards the recommendations of the ITS based on counterfactual and observed game outcomes. Specifically, the DVS and the ITS are modeled as a travel-time minimizer and a network-congestion minimizer, respectively, each having nonidentical prior beliefs about the network state. A novel approximate algorithm to compute the Bayesian Nash equilibrium, called ROSTER (Recommendation Outcome Sampling with Trust Estimation and Re-evaluation), is proposed based on Monte Carlo sampling with trust belief updating to determine the best-response route recommendations of the ITS at each stage of the game. Simulation results demonstrate that the trust prediction error of the proposed algorithm converges to zero as the number of multi-stage DVS-ITS interactions grows, and that the algorithm both mitigates congestion and reduces driver travel times when compared to alternative route recommendation strategies.
Submitted 14 April, 2025;
originally announced April 2025.
-
PUBLICSPEAK: Hearing the Public with a Probabilistic Framework in Local Government
Authors:
Tianliang Xu,
Eva Maxfield Brown,
Dustin Dwyer,
Sabina Tomkins
Abstract:
Local governments around the world are making consequential decisions on behalf of their constituents, and these constituents are responding with requests, advice, and assessments of their officials at public meetings. There are too many small meetings for traditional newsrooms to cover at scale. We propose PUBLICSPEAK, a probabilistic framework that uses meeting structure, domain knowledge, and linguistic information to discover public remarks in local government meetings. We then use our approach to inspect the issues raised by constituents in 7 cities across the United States. We evaluate our approach on a novel dataset of local government meetings and find that PUBLICSPEAK improves over the state of the art by 10% on average, and by up to 40%.
Submitted 14 March, 2025;
originally announced March 2025.
-
Measuring Software Innovation with Open Source Software Development Data
Authors:
Eva Maxfield Brown,
Cailean Osborne,
Peter Cihon,
Moritz Böhmecke-Schwafert,
Kevin Xu,
Mirko Boehm,
Knut Blind
Abstract:
This paper introduces a novel measure of software innovation based on open source software (OSS) development activity on GitHub. We examine the dependency growth and release complexity among $\sim$200,000 unique releases from 28,000 unique packages across the JavaScript, Python, and Ruby ecosystems over two years post-release. We find that major versions show differential, strong prediction of one-year lagged log change in dependencies. In addition, semantic versioning of OSS releases is correlated with their complexity and predicts downstream adoption. We conclude that major releases of OSS packages count as a unit of innovation complementary to scientific publications, patents, and standards, offering applications for policymakers, managers, and researchers.
Submitted 7 November, 2024;
originally announced November 2024.
-
Biomedical Open Source Software: Crucial Packages and Hidden Heroes
Authors:
Andrew Nesbitt,
Boris Veytsman,
Daniel Mietchen,
Eva Maxfield Brown,
James Howison,
João Felipe Pimentel,
Laurent Hébert-Dufresne,
Stephan Druskat
Abstract:
Despite the importance of scientific software for research, it is often not formally recognized and rewarded. This is especially true for foundation libraries, which are used by the software packages visible to users while remaining "hidden" themselves. Funders and other organizations need to understand the complex network of computer programs that modern research relies upon.
In this work we use the CZ Software Mentions Dataset to map the dependencies of the software used in biomedical papers and find the packages critical to the software ecosystems. We propose centrality metrics for the network of software dependencies, analyze three ecosystems (PyPI, CRAN, Bioconductor), and determine the packages with the highest centrality.
Submitted 19 May, 2025; v1 submitted 9 April, 2024;
originally announced April 2024.
-
TASR: A Novel Trust-Aware Stackelberg Routing Algorithm to Mitigate Traffic Congestion
Authors:
Doris E. M. Brown,
Venkata Sriram Siddhardh Nadendla,
Sajal K. Das
Abstract:
Stackelberg routing platforms (SRP) reduce congestion in one-shot traffic networks by proposing optimal route recommendations to selfish travelers. Traditionally, Stackelberg routing is cast as a partial control problem where a fraction of traveler flow complies with route recommendations, while the remainder responds as selfish travelers. In this paper, a novel Stackelberg routing framework is formulated where the agents exhibit probabilistic compliance by accepting the SRP's route recommendations with a trust probability. A greedy Trust-Aware Stackelberg Routing algorithm (in short, TASR) is proposed for the SRP to compute unique path recommendations to each traveler flow with a unique demand. Simulation experiments are designed with random travel demands with diverse trust values on real road networks such as the Sioux Falls, Chicago Sketch, and Sydney networks for both single-commodity and multi-commodity flows. The performance of TASR is compared with state-of-the-art Stackelberg routing methods in terms of traffic congestion and trust dynamics over repeated interaction between the SRP and the travelers. Results show that TASR improves network congestion without causing a significant reduction in trust towards the SRP, when compared to most well-known Stackelberg routing strategies.
Submitted 28 March, 2024;
originally announced March 2024.
-
Collecting Qualitative Data at Scale with Large Language Models: A Case Study
Authors:
Alejandro Cuevas,
Jennifer V. Scurrell,
Eva M. Brown,
Jason Entenmann,
Madeleine I. G. Daepp
Abstract:
Chatbots have shown promise as tools to scale qualitative data collection. Recent advances in Large Language Models (LLMs) could accelerate this process by allowing researchers to easily deploy sophisticated interviewing chatbots. We test this assumption by conducting a large-scale user study (n=399) evaluating 3 different chatbots, two of which are LLM-based and a baseline which employs hard-coded questions. We evaluate the results with respect to participant engagement and experience, established metrics of chatbot quality grounded in theories of effective communication, and a novel scale evaluating "richness" or the extent to which responses capture the complexity and specificity of the social context under study. We find that, while the chatbots were able to elicit high-quality responses based on established evaluation metrics, the responses rarely capture participants' specific motives or personalized examples, and thus perform poorly with respect to richness. We further find low inter-rater reliability between LLMs and humans in the assessment of both quality and richness metrics. Our study offers a cautionary tale for scaling and evaluating qualitative research with LLMs.
Submitted 3 December, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Soft-Search: Two Datasets to Study the Identification and Production of Research Software
Authors:
Eva Maxfield Brown,
Lindsey Schwartz,
Richard Lewei Huang,
Nicholas Weber
Abstract:
Software is an important tool for scholarly work, but software produced for research is in many cases not easily identifiable or discoverable. A potential first step in linking research and software is software identification. In this paper we present two datasets to study the identification and production of research software. The first dataset contains almost 1000 human-labeled annotations of software production from National Science Foundation (NSF) awarded research projects. We use this dataset to train models that predict software production. Our second dataset is created by applying the trained predictive models across the abstracts and project outcomes reports for all NSF-funded projects between 2010 and 2023. The result is an inferred dataset of software production for over 150,000 NSF awards. We release the Soft-Search dataset to aid in identifying and understanding research software production: https://github.com/si2-urssi/eager
Submitted 27 February, 2023;
originally announced February 2023.
-
Councils in Action: Automating the Curation of Municipal Governance Data for Research
Authors:
Eva Maxfield Brown,
Nicholas Weber
Abstract:
Large-scale comparative research into municipal governance is often prohibitively difficult due to a lack of high-quality data. However, recent advances in speech-to-text algorithms and natural language processing have made it possible to more easily collect and analyze data about municipal governments. In this paper, we introduce an open-source platform, the Council Data Project (CDP), to curate novel datasets for research into municipal governance. The contribution of this work is two-fold: 1. We demonstrate that CDP, as an infrastructure, can be used to assemble reliable comparative data on municipal governance; 2. We provide exploratory analysis of three municipalities to show how CDP data can be used to gain insight into how municipal governments perform over time. We conclude by describing future directions for research on and with CDP, such as the development of machine learning models for speaker annotation, outline generation, and named entity recognition for improved linked data.
Submitted 31 August, 2022; v1 submitted 19 April, 2022;
originally announced April 2022.
-
Framing Effects on Strategic Information Design under Receiver Distrust and Unknown State
Authors:
Doris E. M. Brown,
Venkata Sriram Siddhardh Nadendla
Abstract:
Strategic information design is a framework where a sender designs information strategically to steer its receiver's decision towards a desired choice. Traditionally, such frameworks have always assumed that the sender and the receiver comprehend the state of the choice environment, and that the receiver always trusts the sender's signal. This paper deviates from these assumptions and re-investigates strategic information design in the presence of a distrustful receiver and when both sender and receiver cannot observe or comprehend the environment state space. Specifically, we assume that both sender and receiver have access to non-identical beliefs about choice rewards (with the sender's belief being more accurate), but not the environment state that determines these rewards. Furthermore, given that the receiver does not trust the sender, we also assume that the receiver updates its prior in a non-Bayesian manner. We evaluate the Stackelberg equilibrium and investigate the effects of information framing (i.e., sending the complete signal or just the expected value of the signal) on the equilibrium. Furthermore, we also investigate trust dynamics at the receiver, under the assumption that the receiver minimizes regret in hindsight. Simulation results are presented to illustrate signaling effects and trust dynamics in strategic information design.
Submitted 21 July, 2021; v1 submitted 11 May, 2020;
originally announced May 2020.