-
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Authors:
Dujian Ding,
Ankur Mallick,
Shaokun Zhang,
Chi Wang,
Daniel Madrigal,
Mirian Del Carmen Hipolito Garcia,
Menglin Xia,
Laks V. S. Lakshmanan,
Qingyun Wu,
Victor Rühle
Abstract:
Large language models (LLMs) are powerful tools but are often expensive to deploy at scale. LLM query routing mitigates this by dynamically assigning queries to models of varying cost and quality to obtain a desired trade-off. Prior query routing approaches generate only one response from the selected model and a single response from a small (inexpensive) model was often not good enough to beat a…
▽ More
Large language models (LLMs) are powerful tools but are often expensive to deploy at scale. LLM query routing mitigates this by dynamically assigning queries to models of varying cost and quality to obtain a desired trade-off. Prior query routing approaches generate only one response from the selected model and a single response from a small (inexpensive) model was often not good enough to beat a response from a large (expensive) model due to which they end up overusing the large model and missing out on potential cost savings. However, it is well known that for small models, generating multiple responses and selecting the best can enhance quality while remaining cheaper than a single large-model response. We leverage this idea to propose BEST-Route, a novel routing framework that chooses a model and the number of responses to sample from it based on query difficulty and the quality thresholds. Experiments on real-world datasets demonstrate that our method reduces costs by up to 60% with less than 1% performance drop.
△ Less
Submitted 27 June, 2025;
originally announced June 2025.
-
BEACON: A Benchmark for Efficient and Accurate Counting of Subgraphs
Authors:
Mohammad Matin Najafi,
Xianju Zhu,
Chrysanthi Kosyfaki,
Laks V. S. Lakshmanan,
Reynold Cheng
Abstract:
Subgraph counting the task of determining the number of instances of a query pattern within a large graph lies at the heart of many critical applications, from analyzing financial networks and transportation systems to understanding biological interactions. Despite decades of work yielding efficient algorithmic (AL) solutions and, more recently, machine learning (ML) approaches, a clear comparativ…
▽ More
Subgraph counting the task of determining the number of instances of a query pattern within a large graph lies at the heart of many critical applications, from analyzing financial networks and transportation systems to understanding biological interactions. Despite decades of work yielding efficient algorithmic (AL) solutions and, more recently, machine learning (ML) approaches, a clear comparative understanding is elusive. This gap stems from the absence of a unified evaluation framework, standardized datasets, and accessible ground truths, all of which hinder systematic analysis and fair benchmarking. To overcome these barriers, we introduce BEACON: a comprehensive benchmark designed to rigorously evaluate both AL and ML-based subgraph counting methods. BEACON provides a standardized dataset with verified ground truths, an integrated evaluation environment, and a public leaderboard, enabling reproducible and transparent comparisons across diverse approaches. Our extensive experiments reveal that while AL methods excel in efficiently counting subgraphs on very large graphs, they struggle with complex patterns (e.g., those exceeding six nodes). In contrast, ML methods are capable of handling larger patterns but demand massive graph data inputs and often yield suboptimal accuracy on small, dense graphs. These insights not only highlight the unique strengths and limitations of each approach but also pave the way for future advancements in subgraph counting techniques. Overall, BEACON represents a significant step towards unifying and accelerating research in subgraph counting, encouraging innovative solutions and fostering a deeper understanding of the trade-offs between algorithmic and machine learning paradigms.
△ Less
Submitted 15 April, 2025;
originally announced April 2025.
-
Finding Locally Densest Subgraphs: Convex Programming with Edge and Triangle Density
Authors:
Yi Yang,
Chenhao Ma,
Reynold Cheng,
Laks V. S. Lakshmanan,
Xiaolin Han
Abstract:
Finding the densest subgraph (DS) from a graph is a fundamental problem in graph databases. The DS obtained, which reveals closely related entities, has been found to be useful in various application domains such as e-commerce, social science, and biology. However, in a big graph that contains billions of edges, it is desirable to find more than one subgraph cluster that is not necessarily the den…
▽ More
Finding the densest subgraph (DS) from a graph is a fundamental problem in graph databases. The DS obtained, which reveals closely related entities, has been found to be useful in various application domains such as e-commerce, social science, and biology. However, in a big graph that contains billions of edges, it is desirable to find more than one subgraph cluster that is not necessarily the densest, yet they reveal closely related vertices. In this paper, we study the locally densest subgraph (LDS), a recently proposed variant of DS. An LDS is a subgraph which is the densest among the ``local neighbors''. Given a graph $G$, a number of LDSs can be returned, which reflect different dense regions of $G$ and thus give more information than DS. The existing LDS solution suffers from low efficiency. We thus develop a convex-programming-based solution that enables powerful pruning. We also extend our algorithm to triangle-based density to solve LTDS problem. Based on current algorithms, we propose a unified framework for the LDS and LTDS problems. Extensive experiments on thirteen real large graph datasets show that our proposed algorithm is up to four orders of magnitude faster than the state-of-the-art.
△ Less
Submitted 15 April, 2025;
originally announced April 2025.
-
TRACE: Intra-visit Clinical Event Nowcasting via Effective Patient Trajectory Encoding
Authors:
Yuyang Liang,
Yankai Chen,
Yixiang Fang,
Laks V. S. Lakshmanan,
Chenhao Ma
Abstract:
Electronic Health Records (EHR) have become a valuable resource for a wide range of predictive tasks in healthcare. However, existing approaches have largely focused on inter-visit event predictions, overlooking the importance of intra-visit nowcasting, which provides prompt clinical insights during an ongoing patient visit. To address this gap, we introduce the task of laboratory measurement pred…
▽ More
Electronic Health Records (EHR) have become a valuable resource for a wide range of predictive tasks in healthcare. However, existing approaches have largely focused on inter-visit event predictions, overlooking the importance of intra-visit nowcasting, which provides prompt clinical insights during an ongoing patient visit. To address this gap, we introduce the task of laboratory measurement prediction within a hospital visit. We study the laboratory data that, however, remained underexplored in previous work. We propose TRACE, a Transformer-based model designed for clinical event nowcasting by encoding patient trajectories. TRACE effectively handles long sequences and captures temporal dependencies through a novel timestamp embedding that integrates decay properties and periodic patterns of data. Additionally, we introduce a smoothed mask for denoising, improving the robustness of the model. Experiments on two large-scale electronic health record datasets demonstrate that the proposed model significantly outperforms previous methods, highlighting its potential for improving patient care through more accurate laboratory measurement nowcasting. The code is available at https://github.com/Amehi/TRACE.
△ Less
Submitted 29 March, 2025;
originally announced March 2025.
-
On Aggregation Queries over Predicted Nearest Neighbors
Authors:
Carrie Wang,
Sihem Amer-Yahia,
Laks V. S. Lakshmanan,
Reynold Cheng
Abstract:
We introduce Aggregation Queries over Nearest Neighbors (AQNNs), a novel type of aggregation queries over the predicted neighborhood of a designated object. AQNNs are prevalent in modern applications where, for instance, a medical professional may want to compute "the average systolic blood pressure of patients whose predicted condition is similar to a given insomnia patient". Since prediction typ…
▽ More
We introduce Aggregation Queries over Nearest Neighbors (AQNNs), a novel type of aggregation queries over the predicted neighborhood of a designated object. AQNNs are prevalent in modern applications where, for instance, a medical professional may want to compute "the average systolic blood pressure of patients whose predicted condition is similar to a given insomnia patient". Since prediction typically involves an expensive deep learning model or a human expert, we formulate query processing as the problem of returning an approximate aggregate by combining an expensive oracle and a cheaper model (e.g, a simple ML model) to compute the predictions. We design the Sampler with Precision-Recall in Target (SPRinT) framework for answering AQNNs. SPRinT consists of sampling, nearest neighbor refinement, and aggregation, and is tailored for various aggregation functions. It enjoys provable theoretical guarantees, including bounds on sample size and on error in approximate aggregates. Our extensive experiments on medical, e-commerce, and video datasets demonstrate that SPRinT consistently achieves the lowest aggregation error with minimal computation cost compared to its baselines. Scalability results show that SPRinT's execution time and aggregation error remain stable as the dataset size increases, confirming its suitability for large-scale applications.
△ Less
Submitted 25 February, 2025;
originally announced February 2025.
-
Hyperparametric Robust and Dynamic Influence Maximization
Authors:
Arkaprava Saha,
Bogdan Cautis,
Xiaokui Xiao,
Laks V. S. Lakshmanan
Abstract:
We study the problem of robust influence maximization in dynamic diffusion networks. In line with recent works, we consider the scenario where the network can undergo insertion and removal of nodes and edges, in discrete time steps, and the influence weights are determined by the features of the corresponding nodes and a global hyperparameter. Given this, our goal is to find, at every time step, t…
▽ More
We study the problem of robust influence maximization in dynamic diffusion networks. In line with recent works, we consider the scenario where the network can undergo insertion and removal of nodes and edges, in discrete time steps, and the influence weights are determined by the features of the corresponding nodes and a global hyperparameter. Given this, our goal is to find, at every time step, the seed set maximizing the worst-case influence spread across all possible values of the hyperparameter. We propose an approximate solution using multiplicative weight updates and a greedy algorithm, with provable quality guarantees. Our experiments validate the effectiveness and efficiency of the proposed methods.
△ Less
Submitted 16 December, 2024;
originally announced December 2024.
-
Cross-Modal Consistency in Multimodal Large Language Models
Authors:
Xiang Zhang,
Senyu Li,
Ning Shi,
Bradley Hauer,
Zijun Wu,
Grzegorz Kondrak,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
Recent developments in multimodal methodologies have marked the beginning of an exciting era for models adept at processing diverse data types, encompassing text, audio, and visual content. Models like GPT-4V, which merge computer vision with advanced language processing, exhibit extraordinary proficiency in handling intricate tasks that require a simultaneous understanding of both textual and vis…
▽ More
Recent developments in multimodal methodologies have marked the beginning of an exciting era for models adept at processing diverse data types, encompassing text, audio, and visual content. Models like GPT-4V, which merge computer vision with advanced language processing, exhibit extraordinary proficiency in handling intricate tasks that require a simultaneous understanding of both textual and visual information. Prior research efforts have meticulously evaluated the efficacy of these Vision Large Language Models (VLLMs) in various domains, including object detection, image captioning, and other related fields. However, existing analyses have often suffered from limitations, primarily centering on the isolated evaluation of each modality's performance while neglecting to explore their intricate cross-modal interactions. Specifically, the question of whether these models achieve the same level of accuracy when confronted with identical task instances across different modalities remains unanswered. In this study, we take the initiative to delve into the interaction and comparison among these modalities of interest by introducing a novel concept termed cross-modal consistency. Furthermore, we propose a quantitative evaluation framework founded on this concept. Our experimental findings, drawn from a curated collection of parallel vision-language datasets developed by us, unveil a pronounced inconsistency between the vision and language modalities within GPT-4V, despite its portrayal as a unified multimodal model. Our research yields insights into the appropriate utilization of such models and hints at potential avenues for enhancing their design.
△ Less
Submitted 14 November, 2024;
originally announced November 2024.
-
Efficient and Effective Algorithms for A Family of Influence Maximization Problems with A Matroid Constraint
Authors:
Yiqian Huang,
Shiqi Zhang,
Laks V. S. Lakshmanan,
Wenqing Lin,
Xiaokui Xiao,
Bo Tang
Abstract:
Influence maximization (IM) is a classic problem that aims to identify a small group of critical individuals, known as seeds, who can influence the largest number of users in a social network through word-of-mouth. This problem finds important applications including viral marketing, infection detection, and misinformation containment. The conventional IM problem is typically studied with the overs…
▽ More
Influence maximization (IM) is a classic problem that aims to identify a small group of critical individuals, known as seeds, who can influence the largest number of users in a social network through word-of-mouth. This problem finds important applications including viral marketing, infection detection, and misinformation containment. The conventional IM problem is typically studied with the oversimplified goal of selecting a single seed set. Many real-world scenarios call for multiple sets of seeds, particularly on social media platforms where various viral marketing campaigns need different sets of seeds to propagate effectively. To this end, previous works have formulated various IM variants, central to which is the requirement of multiple seed sets, naturally modeled as a matroid constraint. However, the current best-known solutions for these variants either offer a weak $(1/2-ε)$-approximation, or offer a $(1-1/e-ε)$-approximation algorithm that is very expensive. We propose an efficient seed selection method called AMP, an algorithm with a $(1-1/e-ε)$-approximation guarantee for this family of IM variants. To further improve efficiency, we also devise a fast implementation, called RAMP. We extensively evaluate the performance of our proposal against 6 competitors across 4 IM variants and on 7 real-world networks, demonstrating that our proposal outperforms all competitors in terms of result quality, running time, and memory usage. We have also deployed RAMP in a real industry strength application involving online gaming, where we show that our deployed solution significantly improves upon the baselines.
△ Less
Submitted 21 October, 2024;
originally announced October 2024.
-
Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer
Authors:
Xiang Zhang,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
The Transformer architecture excels in a variety of language modeling tasks, outperforming traditional neural architectures such as RNN and LSTM. This is partially due to its elimination of recurrent connections, which allows for parallel training and a smoother flow of gradients. However, this move away from recurrent structures places the Transformer model at the lower end of Chomsky's computati…
▽ More
The Transformer architecture excels in a variety of language modeling tasks, outperforming traditional neural architectures such as RNN and LSTM. This is partially due to its elimination of recurrent connections, which allows for parallel training and a smoother flow of gradients. However, this move away from recurrent structures places the Transformer model at the lower end of Chomsky's computational hierarchy, imposing limitations on its computational abilities. Consequently, even advanced Transformer-based models face considerable difficulties in tasks like counting, string reversal, and multiplication. These tasks, though seemingly elementary, require a level of computational complexity that exceeds the capabilities of the Transformer architecture. Concurrently, the emergence of ``Chain of Thought" (CoT) prompting has enabled Transformer-based language models to tackle tasks that were previously impossible or poorly executed. In this work, we thoroughly investigate the influence of recurrent structures in neural models on their reasoning abilities and computability, contrasting the role autoregression plays in the neural models' computational power. We then shed light on how the CoT approach can mimic recurrent computation and act as a bridge between autoregression and recurrence in the context of language models. It is this approximated recurrence that notably improves the model's performance and computational capacity. Moreover, we revisit recent recurrent-based Transformer model designs, focusing on their computational abilities through our proposed concept of ``recurrence-completeness" and identify key theoretical limitations in models like Linear Transformer and RWKV. Through this, we aim to provide insight into the neural model architectures and prompt better model design.
△ Less
Submitted 20 September, 2024; v1 submitted 13 September, 2024;
originally announced September 2024.
-
Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency
Authors:
Ningyi Liao,
Haoyu Liu,
Zulun Zhu,
Siqiang Luo,
Laks V. S. Lakshmanan
Abstract:
With the recent advancements in graph neural networks (GNNs), spectral GNNs have received increasing popularity by virtue of their specialty in capturing graph signals in the frequency domain, demonstrating promising capability in specific tasks. However, few systematic studies have been conducted on assessing their spectral characteristics. This emerging family of models also varies in terms of d…
▽ More
With the recent advancements in graph neural networks (GNNs), spectral GNNs have received increasing popularity by virtue of their specialty in capturing graph signals in the frequency domain, demonstrating promising capability in specific tasks. However, few systematic studies have been conducted on assessing their spectral characteristics. This emerging family of models also varies in terms of designs and settings, leading to difficulties in comparing their performance and deciding on the suitable model for specific scenarios, especially for large-scale tasks. In this work, we extensively benchmark spectral GNNs with a focus on the frequency perspective. We analyze and categorize over 30 GNNs with 27 corresponding filters. Then, we implement these spectral models under a unified framework with dedicated graph computations and efficient training schemes. Thorough experiments are conducted on the spectral models with inclusive metrics on effectiveness and efficiency, offering practical guidelines on evaluating and selecting spectral GNNs with desirable performance. Our implementation enables application on larger graphs with comparable performance and less overhead, which is available at: https://github.com/gdmnl/Spectral-GNN-Benchmark.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Predicting Cascading Failures with a Hyperparametric Diffusion Model
Authors:
Bin Xiang,
Bogdan Cautis,
Xiaokui Xiao,
Olga Mula,
Dusit Niyato,
Laks V. S. Lakshmanan
Abstract:
In this paper, we study cascading failures in power grids through the lens of information diffusion models. Similar to the spread of rumors or influence in an online social network, it has been observed that failures (outages) in a power grid can spread contagiously, driven by viral spread mechanisms. We employ a stochastic diffusion model that is Markovian (memoryless) and local (the activation o…
▽ More
In this paper, we study cascading failures in power grids through the lens of information diffusion models. Similar to the spread of rumors or influence in an online social network, it has been observed that failures (outages) in a power grid can spread contagiously, driven by viral spread mechanisms. We employ a stochastic diffusion model that is Markovian (memoryless) and local (the activation of one node, i.e., transmission line, can only be caused by its neighbors). Our model integrates viral diffusion principles with physics-based concepts, by correlating the diffusion weights (contagion probabilities between transmission lines) with the hyperparametric Information Cascades (IC) model. We show that this diffusion model can be learned from traces of cascading failures, enabling accurate modeling and prediction of failure propagation. This approach facilitates actionable information through well-understood and efficient graph analysis methods and graph diffusion simulations. Furthermore, by leveraging the hyperparametric model, we can predict diffusion and mitigate the risks of cascading failures even in unseen grid configurations, whereas existing methods falter due to a lack of training data. Extensive experiments based on a benchmark power grid and simulations therein show that our approach effectively captures the failure diffusion phenomena and guides decisions to strengthen the grid, reducing the risk of large-scale cascading failures. Additionally, we characterize our model's sample complexity, improving upon the existing bound.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
Authors:
Dujian Ding,
Bicheng Xu,
Laks V. S. Lakshmanan
Abstract:
Classification tasks play a fundamental role in various applications, spanning domains such as healthcare, natural language processing and computer vision. With the growing popularity and capacity of machine learning models, people can easily access trained classifiers as a service online or offline. However, model use comes with a cost and classifiers of higher capacity (such as large foundation…
▽ More
Classification tasks play a fundamental role in various applications, spanning domains such as healthcare, natural language processing and computer vision. With the growing popularity and capacity of machine learning models, people can easily access trained classifiers as a service online or offline. However, model use comes with a cost and classifiers of higher capacity (such as large foundation models) usually incur higher inference costs. To harness the respective strengths of different classifiers, we propose a principled approach, OCCAM, to compute the best classifier assignment strategy over classification queries (termed as the optimal model portfolio) so that the aggregated accuracy is maximized, under user-specified cost budgets. Our approach uses an unbiased and low-variance accuracy estimator and effectively computes the optimal solution by solving an integer linear programming problem. On a variety of real-world datasets, OCCAM achieves 40% cost reduction with little to no accuracy drop.
△ Less
Submitted 24 February, 2025; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
Authors:
Dujian Ding,
Ankur Mallick,
Chi Wang,
Robert Sim,
Subhabrata Mukherjee,
Victor Ruhle,
Laks V. S. Lakshmanan,
Ahmed Hassan Awadallah
Abstract:
Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our ap…
▽ More
Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our approach uses a router that assigns queries to the small or large model based on the predicted query difficulty and the desired quality level. The desired quality level can be tuned dynamically at test time to seamlessly trade quality for cost as per the scenario requirements. In experiments our approach allows us to make up to 40% fewer calls to the large model, with no drop in response quality.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
DetoxLLM: A Framework for Detoxification with Explanations
Authors:
Md Tawkat Islam Khondaker,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
Prior works on detoxification are scattered in the sense that they do not cover all aspects of detoxification needed in a real-world scenario. Notably, prior works restrict the task of developing detoxification models to only a seen subset of platforms, leaving the question of how the models would perform on unseen platforms unexplored. Additionally, these works do not address non-detoxifiability,…
▽ More
Prior works on detoxification are scattered in the sense that they do not cover all aspects of detoxification needed in a real-world scenario. Notably, prior works restrict the task of developing detoxification models to only a seen subset of platforms, leaving the question of how the models would perform on unseen platforms unexplored. Additionally, these works do not address non-detoxifiability, a phenomenon whereby the toxic text cannot be detoxified without altering the meaning. We propose DetoxLLM, the first comprehensive end-to-end detoxification framework, which attempts to alleviate the aforementioned limitations. We first introduce a cross-platform pseudo-parallel corpus applying multi-step data processing and generation strategies leveraging ChatGPT. We then train a suite of detoxification models with our cross-platform corpus. We show that our detoxification models outperform the SoTA model trained with human-annotated parallel corpus. We further introduce explanation to promote transparency and trustworthiness. DetoxLLM additionally offers a unique paraphrase detector especially dedicated for the detoxification task to tackle the non-detoxifiable cases. Through experimental analysis, we demonstrate the effectiveness of our cross-platform corpus and the robustness of DetoxLLM against adversarial toxicity.
△ Less
Submitted 3 October, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
LLM Performance Predictors are good initializers for Architecture Search
Authors:
Ganesh Jawahar,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan,
Dujian Ding
Abstract:
In this work, we utilize Large Language Models (LLMs) for a novel use case: constructing Performance Predictors (PP) that estimate the performance of specific deep neural network architectures on downstream tasks. We create PP prompts for LLMs, comprising (i) role descriptions, (ii) instructions for the LLM, (iii) hyperparameter definitions, and (iv) demonstrations presenting sample architectures…
▽ More
In this work, we utilize Large Language Models (LLMs) for a novel use case: constructing Performance Predictors (PP) that estimate the performance of specific deep neural network architectures on downstream tasks. We create PP prompts for LLMs, comprising (i) role descriptions, (ii) instructions for the LLM, (iii) hyperparameter definitions, and (iv) demonstrations presenting sample architectures with efficiency metrics and `training from scratch' performance. In machine translation (MT) tasks, GPT-4 with our PP prompts (LLM-PP) achieves a SoTA mean absolute error and a slight degradation in rank correlation coefficient compared to baseline predictors. Additionally, we demonstrate that predictions from LLM-PP can be distilled to a compact regression model (LLM-Distill-PP), which surprisingly retains much of the performance of LLM-PP. This presents a cost-effective alternative for resource-intensive performance estimation. Specifically, for Neural Architecture Search (NAS), we introduce a Hybrid-Search algorithm (HS-NAS) employing LLM-Distill-PP for the initial search stages and reverting to the baseline predictor later. HS-NAS performs similarly to SoTA NAS, reducing search hours by approximately 50%, and in some cases, improving latency, GFLOPs, and model size. The code can be found at: https://github.com/UBC-NLP/llmas.
△ Less
Submitted 7 August, 2024; v1 submitted 25 October, 2023;
originally announced October 2023.
-
KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation
Authors:
Yuxi Feng,
Xiaoyuan Yi,
Laks V. S. Lakshmanan,
Xing Xie
Abstract:
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distrib…
▽ More
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely decelerating the training process. In this work, we propose KEST, a novel and efficient self-training framework to handle these problems. KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that KEST can benefit from more diverse pseudo text in an efficient manner, which allows not only refining and exploiting the previously fitted distribution but also enhanced exploration towards a larger potential text space, providing a guarantee of improved performance. Experiments on three controllable generation tasks demonstrate that KEST significantly improves control accuracy while maintaining comparable text fluency and generation diversity against several strong baselines.
△ Less
Submitted 17 June, 2023;
originally announced June 2023.
-
A Survey of Densest Subgraph Discovery on Large Graphs
Authors:
Wensheng Luo,
Chenhao Ma,
Yixiang Fang,
Laks V. S. Lakshmanan
Abstract:
With the prevalence of graphs for modeling complex relationships among objects, the topic of graph mining has attracted a great deal of attention from both academic and industrial communities in recent years. As one of the most fundamental problems in graph mining, the densest subgraph discovery (DSD) problem has found a wide spectrum of real applications, such as discovery of filter bubbles in so…
▽ More
With the prevalence of graphs for modeling complex relationships among objects, the topic of graph mining has attracted a great deal of attention from both academic and industrial communities in recent years. As one of the most fundamental problems in graph mining, the densest subgraph discovery (DSD) problem has found a wide spectrum of real applications, such as discovery of filter bubbles in social media, finding groups of actors propagating misinformation in social media, social network community detection, graph index construction, regulatory motif discovery in DNA, fake follower detection, and so on. Theoretically, DSD closely relates to other fundamental graph problems, such as network flow and bipartite matching. Triggered by these applications and connections, DSD has garnered much attention from the database, data mining, theory, and network communities.
In this survey, we first highlight the importance of DSD in various real-world applications and the unique challenges that need to be addressed. Subsequently, we classify existing DSD solutions into several groups, which cover around 50 research papers published in many well-known venues (e.g., SIGMOD, PVLDB, TODS, WWW), and conduct a thorough review of these solutions in each group. Afterwards, we analyze and compare the models and solutions in these works. Finally, we point out a list of promising future research directions. It is our hope that this survey not only helps researchers have a better understanding of existing densest subgraph models and solutions, but also provides insights and identifies directions for future study.
△ Less
Submitted 13 June, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts
Authors:
Ganesh Jawahar,
Haichuan Yang,
Yunyang Xiong,
Zechun Liu,
Dilin Wang,
Fei Sun,
Meng Li,
Aasish Pappu,
Barlas Oguz,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan,
Raghuraman Krishnamoorthi,
Vikas Chandra
Abstract:
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed due to weight sharing. In NLP tasks like machine translation and pre-trained language modeling, there is a significant performance gap between superne…
▽ More
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed due to weight sharing. In NLP tasks like machine translation and pre-trained language modeling, there is a significant performance gap between supernet and training from scratch for the same model architecture, necessitating retraining post optimal architecture identification.
This study introduces a solution called mixture-of-supernets, a generalized supernet formulation leveraging mixture-of-experts (MoE) to enhance supernet model expressiveness with minimal training overhead. Unlike conventional supernets, this method employs an architecture-based routing mechanism, enabling indirect sharing of model weights among subnetworks. This customization of weights for specific architectures, learned through gradient descent, minimizes retraining time, significantly enhancing training efficiency in NLP. The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models, exhibiting a superior latency-BLEU tradeoff compared to HAT, the SoTA NAS framework for machine translation. Furthermore, it excels in NAS for building memory-efficient task-agnostic BERT models, surpassing NAS-BERT and AutoDistil across various model sizes. The code can be found at: https://github.com/UBC-NLP/MoS.
△ Less
Submitted 7 August, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Authors:
Yuxi Feng,
Xiaoyuan Yi,
Xiting Wang,
Laks V. S. Lakshmanan,
Xing Xie
Abstract:
Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of pre-trained language models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable language generation. Augmented by only self-generated pseudo text, generation models over-emphasize exploitation of the previously learned space, suffering from…
▽ More
Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of pre-trained language models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable language generation. Augmented by only self-generated pseudo text, generation models over-emphasize exploitation of the previously learned space, suffering from a constrained generalization boundary. We revisit ST and propose a novel method, DuNST to alleviate this problem. DuNST jointly models text generation and classification with a shared Variational AutoEncoder and corrupts the generated pseudo text by two kinds of flexible noise to disturb the space. In this way, our model could construct and utilize both pseudo text from given labels and pseudo labels from available unlabeled text, which are gradually refined during the ST process. We theoretically demonstrate that DuNST can be regarded as enhancing exploration towards the potential real text space, providing a guarantee of improved performance. Experiments on three controllable generation tasks show that DuNST could significantly boost control accuracy while maintaining comparable generation fluency and diversity against several strong baselines.
△ Less
Submitted 6 June, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning
Authors:
Md Tawkat Islam Khondaker,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
The prevalence of abusive language on different online platforms has been a major concern that raises the need for automated cross-platform abusive language detection. However, prior works focus on concatenating data from multiple platforms, inherently adopting Empirical Risk Minimization (ERM) method. In this work, we address this challenge from the perspective of domain generalization objective.…
▽ More
The prevalence of abusive language on different online platforms has been a major concern that raises the need for automated cross-platform abusive language detection. However, prior works focus on concatenating data from multiple platforms, inherently adopting Empirical Risk Minimization (ERM) method. In this work, we address this challenge from the perspective of domain generalization objective. We design SCL-Fish, a supervised contrastive learning integrated meta-learning algorithm to detect abusive language on unseen platforms. Our experimental analysis shows that SCL-Fish achieves better performance over ERM and the existing state-of-the-art models. We also show that SCL-Fish is data-efficient and achieves comparable performance with the large-scale pre-trained models upon finetuning for the abusive language detection task.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
A Benchmark Study of Contrastive Learning for Arabic Social Meaning
Authors:
Md Tawkat Islam Khondaker,
El Moatez Billah Nagoudi,
AbdelRahim Elmadany,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
Contrastive learning (CL) brought significant progress to various NLP tasks. Despite this progress, CL has not been applied to Arabic NLP to date. Nor is it clear how much benefits it could bring to particular classes of tasks such as those involved in Arabic social meaning (e.g., sentiment analysis, dialect identification, hate speech detection). In this work, we present a comprehensive benchmark…
▽ More
Contrastive learning (CL) brought significant progress to various NLP tasks. Despite this progress, CL has not been applied to Arabic NLP to date. Nor is it clear how much benefits it could bring to particular classes of tasks such as those involved in Arabic social meaning (e.g., sentiment analysis, dialect identification, hate speech detection). In this work, we present a comprehensive benchmark study of state-of-the-art supervised CL methods on a wide array of Arabic social meaning tasks. Through extensive empirical analyses, we show that CL methods outperform vanilla finetuning on most tasks we consider. We also show that CL can be data efficient and quantify this efficiency. Overall, our work allows us to demonstrate the promise of CL methods, including in low-resource settings.
△ Less
Submitted 21 October, 2022;
originally announced October 2022.
-
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation
Authors:
Ganesh Jawahar,
Subhabrata Mukherjee,
Xiaodong Liu,
Young Jin Kim,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan,
Ahmed Hassan Awadallah,
Sebastien Bubeck,
Jianfeng Gao
Abstract:
Mixture-of-Expert (MoE) models have obtained state-of-the-art performance in Neural Machine Translation (NMT) tasks. Existing works in MoE mostly consider a homogeneous design where the same number of experts of the same size are placed uniformly throughout the network. Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design. To this e…
▽ More
Mixture-of-Expert (MoE) models have obtained state-of-the-art performance in Neural Machine Translation (NMT) tasks. Existing works in MoE mostly consider a homogeneous design where the same number of experts of the same size are placed uniformly throughout the network. Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design. To this end, we develop AutoMoE -- a framework for designing heterogeneous MoE's under computational constraints. AutoMoE leverages Neural Architecture Search (NAS) to obtain efficient sparse MoE sub-transformers with 4x inference speedup (CPU) and FLOPs reduction over manually designed Transformers, with parity in BLEU score over dense Transformer and within 1 BLEU point of MoE SwitchTransformer, on aggregate over benchmark datasets for NMT. Heterogeneous search space with dense and sparsely activated Transformer modules (e.g., how many experts? where to place them? what should be their sizes?) allows for adaptive compute -- where different amounts of computations are used for different tokens in the input. Adaptivity comes naturally from routing decisions which send tokens to experts of different sizes. AutoMoE code, data, and trained models are available at https://aka.ms/AutoMoE.
△ Less
Submitted 7 June, 2023; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints
Authors:
Ganesh Jawahar,
Subhabrata Mukherjee,
Debadeepta Dey,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan,
Caio Cesar Teodoro Mendes,
Gustavo Henrique de Rosa,
Shital Shah
Abstract:
Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been quite effective. In this work, we study the more cha…
▽ More
Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been quite effective. In this work, we study the more challenging open-domain setting consisting of low frequency user prompt patterns (or broad prompts, e.g., prompt about 93rd academy awards) and demonstrate the effectiveness of character-based language models. We study this problem under memory-constrained settings (e.g., edge devices and smartphones), where character-based representation is effective in reducing the overall model size (in terms of parameters). We use WikiText-103 benchmark to simulate broad prompts and demonstrate that character models rival word models in exact match accuracy for the autocomplete task, when controlled for the model size. For instance, we show that a 20M parameter character model performs similar to an 80M parameter word model in the vanilla setting. We further propose novel methods to improve character models by incorporating inductive bias in the form of compositional information and representation transfer from large word models. Datasets and code used in this work are available at https://github.com/UBC-NLP/char_autocomplete.
△ Less
Submitted 7 June, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Voting-based Opinion Maximization
Authors:
Arkaprava Saha,
Xiangyu Ke,
Arijit Khan,
Laks V. S. Lakshmanan
Abstract:
We investigate the novel problem of voting-based opinion maximization in a social network: Find a given number of seed nodes for a target campaigner, in the presence of other competing campaigns, so as to maximize a voting-based score for the target campaigner at a given time horizon.
The bulk of the influence maximization literature assumes that social network users can switch between only two…
▽ More
We investigate the novel problem of voting-based opinion maximization in a social network: Find a given number of seed nodes for a target campaigner, in the presence of other competing campaigns, so as to maximize a voting-based score for the target campaigner at a given time horizon.
The bulk of the influence maximization literature assumes that social network users can switch between only two discrete states, inactive and active, and the choice to switch is frozen upon one-time activation. In reality, even when having a preferred opinion, a user may not completely despise the other opinions, and the preference level may vary over time due to social influence. To this end, we employ models rooted in opinion formation and diffusion, and use several voting-based scores to determine a user's vote for each of the multiple campaigners at a given time horizon.
Our problem is NP-hard and non-submodular for various scores. We design greedy seed selection algorithms with quality guarantees for our scoring functions via sandwich approximation. To improve the efficiency, we develop random walk and sketch-based opinion computation, with quality guarantees. Empirical results validate our effectiveness, efficiency, and scalability.
△ Less
Submitted 14 September, 2022;
originally announced September 2022.
-
FirmCore Decomposition of Multilayer Networks
Authors:
Farnoosh Hashemi,
Ali Behrouz,
Laks V. S. Lakshmanan
Abstract:
A key graph mining primitive is extracting dense structures from graphs, and this has led to interesting notions such as $k$-cores which subsequently have been employed as building blocks for capturing the structure of complex networks and for designing efficient approximation algorithms for challenging problems such as finding the densest subgraph. In applications such as biological, social, and…
▽ More
A key graph mining primitive is extracting dense structures from graphs, and this has led to interesting notions such as $k$-cores which subsequently have been employed as building blocks for capturing the structure of complex networks and for designing efficient approximation algorithms for challenging problems such as finding the densest subgraph. In applications such as biological, social, and transportation networks, interactions between objects span multiple aspects. Multilayer (ML) networks have been proposed for accurately modeling such applications. In this paper, we present FirmCore, a new family of dense subgraphs in ML networks, and show that it satisfies many of the nice properties of $k$-cores in single-layer graphs. Unlike the state of the art core decomposition of ML graphs, FirmCores have a polynomial time algorithm, making them a powerful tool for understanding the structure of massive ML networks. We also extend FirmCore for directed ML graphs. We show that FirmCores and directed FirmCores can be used to obtain efficient approximation algorithms for finding the densest subgraphs of ML graphs and their directed counterparts. Our extensive experiments over several real ML graphs show that our FirmCore decomposition algorithm is significantly more efficient than known algorithms for core decompositions of ML graphs. Furthermore, it returns solutions of matching or better quality for the densest subgraph problem over (possibly directed) ML graphs.
△ Less
Submitted 23 August, 2022;
originally announced August 2022.
-
Misinformation Mitigation under Differential Propagation Rates and Temporal Penalties
Authors:
Michael Simpson,
Farnoosh Hashemi,
Laks V. S. Lakshmanan
Abstract:
We propose an information propagation model that captures important temporal aspects that have been well observed in the dynamics of fake news diffusion, in contrast with the diffusion of truth. The model accounts for differential propagation rates of truth and misinformation and for user reaction times. We study a time-sensitive variant of the \textit{misinformation mitigation} problem, where…
▽ More
We propose an information propagation model that captures important temporal aspects that have been well observed in the dynamics of fake news diffusion, in contrast with the diffusion of truth. The model accounts for differential propagation rates of truth and misinformation and for user reaction times. We study a time-sensitive variant of the \textit{misinformation mitigation} problem, where $k$ seeds are to be selected to activate a truth campaign so as to minimize the number of users that adopt misinformation propagating through a social network. We show that the resulting objective is non-submodular and employ a sandwiching technique by defining submodular upper and lower bounding functions, providing data-dependent guarantees. In order to enable the use of a reverse sampling framework, we introduce a weighted version of reverse reachability sets that captures the associated differential propagation rates and establish a key equivalence between weighted set coverage probabilities and mitigation with respect to the sandwiching functions. Further, we propose an offline reverse sampling framework that provides $(1 - 1/e - ε)$-approximate solutions to our bounding functions and introduce an importance sampling technique to reduce the sample complexity of our solution. Finally, we show how our framework can provide an anytime solution to the problem. Experiments over five datasets show that our approach outperforms previous approaches and is robust to uncertainty in the model parameters.
△ Less
Submitted 22 June, 2022;
originally announced June 2022.
-
On Efficient Approximate Queries over Machine Learning Models
Authors:
Dujian Ding,
Sihem Amer-Yahia,
Laks VS Lakshmanan
Abstract:
The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because the cost of finding high quality answers corresponds to invoking an oracle such as a human expert or an expensive deep neural network model on every single item in the DB and then applying the query. We develop a novel unified framework for approximate qu…
▽ More
The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because the cost of finding high quality answers corresponds to invoking an oracle such as a human expert or an expensive deep neural network model on every single item in the DB and then applying the query. We develop a novel unified framework for approximate query answering by leveraging a proxy to minimize the oracle usage of finding high quality answers for both Precision-Target (PT) and Recall-Target (RT) queries. Our framework uses a judicious combination of invoking the expensive oracle on data samples and applying the cheap proxy on the objects in the DB. It relies on two assumptions. Under the Proxy Quality assumption, proxy quality can be quantified in a probabilistic manner w.r.t. the oracle. This allows us to develop two algorithms: PQA that efficiently finds high quality answers with high probability and no oracle calls, and PQE, a heuristic extension that achieves empirically good performance with a small number of oracle calls. Alternatively, under the Core Set Closure assumption, we develop two algorithms: CSC that efficiently returns high quality answers with high probability and minimal oracle usage, and CSE, which extends it to more general settings. Our extensive experiments on five real-world datasets on both query types, PT and RT, demonstrate that our algorithms outperform the state-of-the-art and achieve high result quality with provable statistical guarantees.
△ Less
Submitted 17 November, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
FirmTruss Community Search in Multilayer Networks
Authors:
Ali Behrouz,
Farnoosh Hashemi,
Laks V. S. Lakshmanan
Abstract:
In applications such as biological, social, and transportation networks, interactions between objects span multiple aspects. For accurately modeling such applications, multilayer networks have been proposed. Community search allows for personalized community discovery and has a wide range of applications in large real-world networks. While community search has been widely explored for single-layer…
▽ More
In applications such as biological, social, and transportation networks, interactions between objects span multiple aspects. For accurately modeling such applications, multilayer networks have been proposed. Community search allows for personalized community discovery and has a wide range of applications in large real-world networks. While community search has been widely explored for single-layer graphs, the problem for multilayer graphs has just recently attracted attention. Existing community models in multilayer graphs have several limitations, including disconnectivity, free-rider effect, resolution limits, and inefficiency. To address these limitations, we study the problem of community search over large multilayer graphs. We first introduce FirmTruss, a novel dense structure in multilayer networks, which extends the notion of truss to multilayer graphs. We show that FirmTrusses possess nice structural and computational properties and bring many advantages compared to the existing models. Building on this, we present a new community model based on FirmTruss, called FTCS, and show that finding an FTCS community is NP-hard. We propose two efficient 2-approximation algorithms, and show that no polynomial-time algorithm can have a better approximation guarantee unless P = NP. We propose an index-based method to further improve the efficiency of the algorithms. We then consider attributed multilayer networks and propose a new community model based on network homophily. We show that community search in attributed multilayer graphs is NP-hard and present an effective and efficient approximation algorithm. Experimental studies on real-world graphs with ground-truth communities validate the quality of the solutions we obtain and the efficiency of the proposed algorithms.
△ Less
Submitted 17 November, 2022; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Automatic Detection of Entity-Manipulated Text using Factual Knowledge
Authors:
Ganesh Jawahar,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
In this work, we focus on the problem of distinguishing a human written news article from a news article that is created by manipulating entities in a human written news article (e.g., replacing entities with factually incorrect entities). Such manipulated articles can mislead the reader by posing as a human written news article. We propose a neural network based detector that detects manipulated…
▽ More
In this work, we focus on the problem of distinguishing a human written news article from a news article that is created by manipulating entities in a human written news article (e.g., replacing entities with factually incorrect entities). Such manipulated articles can mislead the reader by posing as a human written news article. We propose a neural network based detector that detects manipulated news articles by reasoning about the facts mentioned in the article. Our proposed detector exploits factual knowledge via graph convolutional neural network along with the textual information in the news article. We also create challenging datasets for this task by considering various strategies to generate the new replacement entity (e.g., entity generation from GPT-2). In all the settings, our proposed model either matches or outperforms the state-of-the-art detector in terms of accuracy. Our code and data are available at https://github.com/UBC-NLP/manipulated_entity_detection.
△ Less
Submitted 19 March, 2022;
originally announced March 2022.
-
Efficient and Effective Algorithms for Revenue Maximization in Social Advertising
Authors:
Kai Han,
Benwei Wu,
Jing Tang,
Shuang Cui,
Cigdem Aslay,
Laks V. S. Lakshmanan
Abstract:
We consider the revenue maximization problem in social advertising, where a social network platform owner needs to select seed users for a group of advertisers, each with a payment budget, such that the total expected revenue that the owner gains from the advertisers by propagating their ads in the network is maximized. Previous studies on this problem show that it is intractable and present appro…
▽ More
We consider the revenue maximization problem in social advertising, where a social network platform owner needs to select seed users for a group of advertisers, each with a payment budget, such that the total expected revenue that the owner gains from the advertisers by propagating their ads in the network is maximized. Previous studies on this problem show that it is intractable and present approximation algorithms. We revisit this problem from a fresh perspective and develop novel efficient approximation algorithms, both under the setting where an exact influence oracle is assumed and under one where this assumption is relaxed. Our approximation ratios significantly improve upon the previous ones. Furthermore, we empirically show, using extensive experiments on four datasets, that our algorithms considerably outperform the existing methods on both the solution quality and computation efficiency.
△ Less
Submitted 26 July, 2021; v1 submitted 11 July, 2021;
originally announced July 2021.
-
Exploring Text-to-Text Transformers for English to Hinglish Machine Translation with Synthetic Code-Mixing
Authors:
Ganesh Jawahar,
El Moatez Billah Nagoudi,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
We describe models focused at the understudied problem of translating between monolingual and code-mixed language pairs. More specifically, we offer a wide range of models that convert monolingual English text into Hinglish (code-mixed Hindi and English). Given the recent success of pretrained language models, we also test the utility of two recent Transformer-based encoder-decoder models (i.e., m…
▽ More
We describe models focused at the understudied problem of translating between monolingual and code-mixed language pairs. More specifically, we offer a wide range of models that convert monolingual English text into Hinglish (code-mixed Hindi and English). Given the recent success of pretrained language models, we also test the utility of two recent Transformer-based encoder-decoder models (i.e., mT5 and mBART) on the task finding both to work well. Given the paucity of training data for code-mixing, we also propose a dependency-free method for generating code-mixed texts from bilingual distributed representations that we exploit for improving language model performance. In particular, armed with this additional data, we adopt a curriculum learning approach where we first finetune the language models on synthetic data then on gold code-mixed data. We find that, although simple, our synthetic code-mixing method is competitive with (and in some cases is even superior to) several standard methods (backtranslation, method based on equivalence constraint theory) under a diverse set of conditions. Our work shows that the mT5 model, finetuned following the curriculum learning procedure, achieves best translation performance (12.67 BLEU). Our models place first in the overall ranking of the English-Hinglish official shared task.
△ Less
Submitted 18 May, 2021;
originally announced May 2021.
-
Maximizing Social Welfare in a Competitive Diffusion Model
Authors:
Prithu Banerjee,
Wei Chen,
Laks V. S. Lakshmanan
Abstract:
Influence maximization (IM) has garnered a lot of attention in the literature owing to applications such as viral marketing and infection containment. It aims to select a small number of seed users to adopt an item such that adoption propagates to a large number of users in the network. Competitive IM focuses on the propagation of competing items in the network. Existing works on competitive IM ha…
▽ More
Influence maximization (IM) has garnered a lot of attention in the literature owing to applications such as viral marketing and infection containment. It aims to select a small number of seed users to adopt an item such that adoption propagates to a large number of users in the network. Competitive IM focuses on the propagation of competing items in the network. Existing works on competitive IM have several limitations. (1) They fail to incorporate economic incentives in users' decision making in item adoptions. (2) Majority of the works aim to maximize the adoption of one particular item, and ignore the collective role that different items play. (3) They focus mostly on one aspect of competition -- pure competition. To address these concerns we study competitive IM under a utility-driven propagation model called UIC, and study social welfare maximization. The problem in general is not only NP-hard but also NP-hard to approximate within any constant factor. We, therefore, devise instant dependent efficient approximation algorithms for the general case as well as a $(1-1/e-ε)$-approximation algorithm for a restricted setting. Our algorithms outperform different baselines on competitive IM, both in terms of solution quality and running time on large real networks under both synthetic and real utility configurations.
△ Less
Submitted 6 December, 2020;
originally announced December 2020.
-
Automatic Detection of Machine Generated Text: A Critical Survey
Authors:
Ganesh Jawahar,
Muhammad Abdul-Mageed,
Laks V. S. Lakshmanan
Abstract:
Text generative models (TGMs) excel in producing text that matches the style of human language reasonably well. Such TGMs can be misused by adversaries, e.g., by automatically generating fake news and fake product reviews that can look authentic and fool humans. Detectors that can distinguish text generated by TGM from human written text play a vital role in mitigating such misuse of TGMs. Recentl…
▽ More
Text generative models (TGMs) excel in producing text that matches the style of human language reasonably well. Such TGMs can be misused by adversaries, e.g., by automatically generating fake news and fake product reviews that can look authentic and fool humans. Detectors that can distinguish text generated by TGM from human written text play a vital role in mitigating such misuse of TGMs. Recently, there has been a flurry of works from both natural language processing (NLP) and machine learning (ML) communities to build accurate detectors for English. Despite the importance of this problem, there is currently no work that surveys this fast-growing literature and introduces newcomers to important research challenges. In this work, we fill this void by providing a critical survey and review of this literature to facilitate a comprehensive understanding of this problem. We conduct an in-depth error analysis of the state-of-the-art detector and discuss research directions to guide future work in this exciting area.
△ Less
Submitted 2 November, 2020;
originally announced November 2020.
-
Efficient Approximation Algorithms for Adaptive Seed Minimization
Authors:
Jing Tang,
Keke Huang,
Xiaokui Xiao,
Laks V. S. Lakshmanan,
Xueyan Tang,
Aixin Sun,
Andrew Lim
Abstract:
As a dual problem of influence maximization, the seed minimization problem asks for the minimum number of seed nodes to influence a required number $η$ of users in a given social network $G$. Existing algorithms for seed minimization mostly consider the non-adaptive setting, where all seed nodes are selected in one batch without observing how they may influence other users. In this paper, we study…
▽ More
As a dual problem of influence maximization, the seed minimization problem asks for the minimum number of seed nodes to influence a required number $η$ of users in a given social network $G$. Existing algorithms for seed minimization mostly consider the non-adaptive setting, where all seed nodes are selected in one batch without observing how they may influence other users. In this paper, we study seed minimization in the adaptive setting, where the seed nodes are selected in several batches, such that the choice of a batch may exploit information about the actual influence of the previous batches. We propose a novel algorithm, ASTI, which addresses the adaptive seed minimization problem in $O\Big(\frac{η\cdot (m+n)}{\varepsilon^2}\ln n \Big)$ expected time and offers an approximation guarantee of $\frac{(\ln η+1)^2}{(1 - (1-1/b)^b) (1-1/e)(1-\varepsilon)}$ in expectation, where $η$ is the targeted number of influenced nodes, $b$ is size of each seed node batch, and $\varepsilon \in (0, 1)$ is a user-specified parameter. To the best of our knowledge, ASTI is the first algorithm that provides such an approximation guarantee without incurring prohibitive computation overhead. With extensive experiments on a variety of datasets, we demonstrate the effectiveness and efficiency of ASTI over competing methods.
△ Less
Submitted 22 July, 2019;
originally announced July 2019.
-
Efficient Algorithms for Densest Subgraph Discovery
Authors:
Yixiang Fang,
Kaiqiang Yu,
Reynold Cheng,
Laks V. S. Lakshmanan,
Xuemin Lin
Abstract:
Densest subgraph discovery (DSD) is a fundamental problem in graph mining. It has been studied for decades, and is widely used in various areas, including network science, biological analysis, and graph databases. Given a graph G, DSD aims to find a subgraph D of G with the highest density (e.g., the number of edges over the number of vertices in D). Because DSD is difficult to solve, we propose a…
▽ More
Densest subgraph discovery (DSD) is a fundamental problem in graph mining. It has been studied for decades, and is widely used in various areas, including network science, biological analysis, and graph databases. Given a graph G, DSD aims to find a subgraph D of G with the highest density (e.g., the number of edges over the number of vertices in D). Because DSD is difficult to solve, we propose a new solution paradigm in this paper. Our main observation is that a densest subgraph can be accurately found through a k-core (a kind of dense subgraph of G), with theoretical guarantees. Based on this intuition, we develop efficient exact and approximation solutions for DSD. Moreover, our solutions are able to find the densest subgraphs for a wide range of graph density definitions, including clique-based and general pattern-based density. We have performed extensive experimental evaluation on eleven real datasets. Our results show that our algorithms are up to four orders of magnitude faster than existing approaches.
△ Less
Submitted 7 August, 2019; v1 submitted 2 June, 2019;
originally announced June 2019.
-
Maximizing Welfare in Social Networks under a Utility Driven Influence Diffusion Model
Authors:
Prithu Banerjee,
Wei Chen,
Laks V. S. Lakshmanan
Abstract:
Motivated by applications such as viral marketing, the problem of influence maximization (IM) has been extensively studied in the literature. The goal is to select a small number of users to adopt an item such that it results in a large cascade of adoptions by others. Existing works have three key limitations. (1) They do not account for economic considerations of a user in buying/adopting items.…
▽ More
Motivated by applications such as viral marketing, the problem of influence maximization (IM) has been extensively studied in the literature. The goal is to select a small number of users to adopt an item such that it results in a large cascade of adoptions by others. Existing works have three key limitations. (1) They do not account for economic considerations of a user in buying/adopting items. (2) Most studies on multiple items focus on competition, with complementary items receiving limited attention. (3) For the network owner, maximizing social welfare is important to ensure customer loyalty, which is not addressed in prior work in the IM literature. In this paper, we address all three limitations and propose a novel model called UIC that combines utility-driven item adoption with influence propagation over networks. Focusing on the mutually complementary setting, we formulate the problem of social welfare maximization in this novel setting. We show that while the objective function is neither submodular nor supermodular, surprisingly a simple greedy allocation algorithm achieves a factor of $(1-1/e-ε)$ of the optimum expected social welfare. We develop \textsf{bundleGRD}, a scalable version of this approximation algorithm, and demonstrate, with comprehensive experiments on real and synthetic datasets, that it significantly outperforms all baselines.
△ Less
Submitted 30 May, 2019; v1 submitted 6 July, 2018;
originally announced July 2018.
-
Refutations on "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study"
Authors:
Wei Lu,
Xiaokui Xiao,
Amit Goyal,
Keke Huang,
Laks V. S. Lakshmanan
Abstract:
In a recent SIGMOD paper titled "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study", Arora et al. [1] undertake a performance benchmarking study of several well-known algorithms for influence maximization. In the process, they contradict several published results, and claim to have unearthed and debunked several "myths" that existed around the research of influence maxi…
▽ More
In a recent SIGMOD paper titled "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study", Arora et al. [1] undertake a performance benchmarking study of several well-known algorithms for influence maximization. In the process, they contradict several published results, and claim to have unearthed and debunked several "myths" that existed around the research of influence maximization. It is the goal of this article to examine their claims objectively and critically, and refute the erroneous ones. Our investigation discovers that first, the overall experimental methodology in Arora et al. [1] is flawed and leads to scientifically incorrect conclusions. Second, the paper [1] is riddled with issues specific to a variety of influence maximization algorithms, including buggy experiments, and draws many misleading conclusions regarding those algorithms. Importantly, they fail to appreciate the trade-off between running time and solution quality, and did not incorporate it correctly in their experimental methodology. In this article, we systematically point out the issues present in [1] and refute 11 of their misclaims.
△ Less
Submitted 9 June, 2017; v1 submitted 15 May, 2017;
originally announced May 2017.
-
Horde of Bandits using Gaussian Markov Random Fields
Authors:
Sharan Vaswani,
Mark Schmidt,
Laks V. S. Lakshmanan
Abstract:
The gang of bandits (GOB) model \cite{cesa2013gang} is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph. This model is useful in problems like recommender systems where the large number of users makes it vital to transfer information between users. Despite its effectiveness, the existing GOB model can only be…
▽ More
The gang of bandits (GOB) model \cite{cesa2013gang} is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph. This model is useful in problems like recommender systems where the large number of users makes it vital to transfer information between users. Despite its effectiveness, the existing GOB model can only be applied to small problems due to its quadratic time-dependence on the number of nodes. Existing solutions to combat the scalability issue require an often-unrealistic clustering assumption. By exploiting a connection to Gaussian Markov random fields (GMRFs), we show that the GOB model can be made to scale to much larger graphs without additional assumptions. In addition, we propose a Thompson sampling algorithm which uses the recent GMRF sampling-by-perturbation technique, allowing it to scale to even larger problems (leading to a "horde" of bandits). We give regret bounds and experimental results for GOB with Thompson sampling and epoch-greedy algorithms, indicating that these methods are as good as or significantly better than ignoring the graph or adopting a clustering-based approach. Finally, when an existing graph is not available, we propose a heuristic for learning it on the fly and show promising results.
△ Less
Submitted 7 March, 2017;
originally announced March 2017.
-
Combating the Cold Start User Problem in Model Based Collaborative Filtering
Authors:
Sampoorna Biswas,
Laks V. S. Lakshmanan,
Senjuti Basu Ray
Abstract:
For tackling the well known cold-start user problem in model-based recommender systems, one approach is to recommend a few items to a cold-start user and use the feedback to learn a profile. The learned profile can then be used to make good recommendations to the cold user. In the absence of a good initial profile, the recommendations are like random probes, but if not chosen judiciously, both bad…
▽ More
For tackling the well known cold-start user problem in model-based recommender systems, one approach is to recommend a few items to a cold-start user and use the feedback to learn a profile. The learned profile can then be used to make good recommendations to the cold user. In the absence of a good initial profile, the recommendations are like random probes, but if not chosen judiciously, both bad recommendations and too many recommendations may turn off a user. We formalize the cold-start user problem by asking what are the $b$ best items we should recommend to a cold-start user, in order to learn her profile most accurately, where $b$, a given budget, is typically a small number. We formalize the problem as an optimization problem and present multiple non-trivial results, including NP-hardness as well as hardness of approximation. We furthermore show that the objective function, i.e., the least square error of the learned profile w.r.t. the true user profile, is neither submodular nor supermodular, suggesting efficient approximations are unlikely to exist. Finally, we discuss several scalable heuristic approaches for identifying the $b$ best items to recommend to the user and experimentally evaluate their performance on 4 real datasets. Our experiments show that our proposed accelerated algorithms significantly outperform the prior art in runnning time, while achieving similar error in the learned user profile as well as in the rating predictions.
△ Less
Submitted 17 February, 2017;
originally announced March 2017.
-
Revenue Maximization in Incentivized Social Advertising
Authors:
Cigdem Aslay,
Francesco Bonchi,
Laks V. S. Lakshmanan,
Wei Lu
Abstract:
Incentivized social advertising, an emerging marketing model, provides monetization opportunities not only to the owners of the social networking platforms but also to their influential users by offering a "cut" on the advertising revenue. We consider a social network (the host) that sells ad-engagements to advertisers by inserting their ads, in the form of promoted posts, into the feeds of carefu…
▽ More
Incentivized social advertising, an emerging marketing model, provides monetization opportunities not only to the owners of the social networking platforms but also to their influential users by offering a "cut" on the advertising revenue. We consider a social network (the host) that sells ad-engagements to advertisers by inserting their ads, in the form of promoted posts, into the feeds of carefully selected "initial endorsers" or seed users: these users receive monetary incentives in exchange for their endorsements. The endorsements help propagate the ads to the feeds of their followers. In this context, the problem for the host is is to allocate ads to influential users, taking into account the propensity of ads for viral propagation, and carefully apportioning the monetary budget of each of the advertisers between incentives to influential users and ad-engagement costs, with the rational goal of maximizing its own revenue. We consider a monetary incentive for the influential users, which is proportional to their influence potential. We show that revenue maximization in incentivized social advertising corresponds to the problem of monotone submodular function maximization, subject to a partition matroid constraint on the ads-to-seeds allocation, and submodular knapsack constraints on the advertisers' budgets. This problem is NP-hard and we devise 2 greedy algorithms with provable approximation guarantees, which differ in their sensitivity to seed user incentive costs. Our approximation algorithms require repeatedly estimating the expected marginal gain in revenue as well as in advertiser payment. By exploiting a connection to the recent advances made in scalable estimation of expected influence spread, we devise efficient and scalable versions of the greedy algorithms.
△ Less
Submitted 22 June, 2021; v1 submitted 1 December, 2016;
originally announced December 2016.
-
Attribute Truss Community Search
Authors:
Xin Huang,
Laks V. S. Lakshmanan
Abstract:
Recently, community search over graphs has attracted significant attention and many algorithms have been developed for finding dense subgraphs from large graphs that contain given query nodes. In applications such as analysis of protein protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, previously developed community searc…
▽ More
Recently, community search over graphs has attracted significant attention and many algorithms have been developed for finding dense subgraphs from large graphs that contain given query nodes. In applications such as analysis of protein protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, previously developed community search algorithms ignore these attributes and result in communities with poor cohesion w.r.t. their node attributes. In this paper, we study the problem of attribute-driven community search, that is, given an undirected graph $G$ where nodes are associated with attributes, and an input query $Q$ consisting of nodes $V_q$ and attributes $W_q$, find the communities containing $V_q$, in which most community members are densely inter-connected and have similar attributes.
We formulate our problem of finding attributed truss communities (ATC), as finding all connected and close k-truss subgraphs containing $V_q$, that are locally maximal and have the largest attribute relevance score among such subgraphs. We design a novel attribute relevance score function and establish its desirable properties. The problem is shown to be NP-hard. However, we develop an efficient greedy algorithmic framework, which finds a maximal $k$-truss containing $V_q$, and then iteratively removes the nodes with the least popular attributes and shrinks the graph so as to satisfy community constraints. We also build an elegant index to maintain the known $k$-truss structure and attribute information, and propose efficient query processing algorithms. Extensive experiments on large real-world networks with ground-truth communities shows the efficiency and effectiveness of our proposed methods.
△ Less
Submitted 14 February, 2017; v1 submitted 31 August, 2016;
originally announced September 2016.
-
Adaptive Influence Maximization in Social Networks: Why Commit when You can Adapt?
Authors:
Sharan Vaswani,
Laks V. S. Lakshmanan
Abstract:
Most previous work on influence maximization in social networks is limited to the non-adaptive setting in which the marketer is supposed to select all of the seed users, to give free samples or discounts to, up front. A disadvantage of this setting is that the marketer is forced to select all the seeds based solely on a diffusion model. If some of the selected seeds do not perform well, there is n…
▽ More
Most previous work on influence maximization in social networks is limited to the non-adaptive setting in which the marketer is supposed to select all of the seed users, to give free samples or discounts to, up front. A disadvantage of this setting is that the marketer is forced to select all the seeds based solely on a diffusion model. If some of the selected seeds do not perform well, there is no opportunity to course-correct. A more practical setting is the adaptive setting in which the marketer initially selects a batch of users and observes how well seeding those users leads to a diffusion of product adoptions. Based on this market feedback, she formulates a policy for choosing the remaining seeds. In this paper, we study adaptive offline strategies for two problems: (a) MAXSPREAD -- given a budget on number of seeds and a time horizon, maximize the spread of influence and (b) MINTSS -- given a time horizon and an expected number of target users to be influenced, minimize the number of seeds that will be required. In particular, we present theoretical bounds and empirical results for an adaptive strategy and quantify its practical benefit over the non-adaptive strategy. We evaluate adaptive and non-adaptive policies on three real data sets. We conclude that while benefit of going adaptive for the MAXSPREAD problem is modest, adaptive policies lead to significant savings for the MINTSS problem.
△ Less
Submitted 27 April, 2016;
originally announced April 2016.
-
From Competition to Complementarity: Comparative Influence Diffusion and Maximization
Authors:
Wei Lu,
Wei Chen,
Laks V. S. Lakshmanan
Abstract:
Influence maximization is a well-studied problem that asks for a small set of influential users from a social network, such that by targeting them as early adopters, the expected total adoption through influence cascades over the network is maximized. However, almost all prior work focuses on cascades of a single propagating entity or purely-competitive entities. In this work, we propose the Compa…
▽ More
Influence maximization is a well-studied problem that asks for a small set of influential users from a social network, such that by targeting them as early adopters, the expected total adoption through influence cascades over the network is maximized. However, almost all prior work focuses on cascades of a single propagating entity or purely-competitive entities. In this work, we propose the Comparative Independent Cascade (Com-IC) model that covers the full spectrum of entity interactions from competition to complementarity. In Com-IC, users' adoption decisions depend not only on edge-level information propagation, but also on a node-level automaton whose behavior is governed by a set of model parameters, enabling our model to capture not only competition, but also complementarity, to any possible degree. We study two natural optimization problems, Self Influence Maximization and Complementary Influence Maximization, in a novel setting with complementary entities. Both problems are NP-hard, and we devise efficient and effective approximation algorithms via non-trivial techniques based on reverse-reachable sets and a novel "sandwich approximation". The applicability of both techniques extends beyond our model and problems. Our experiments show that the proposed algorithms consistently outperform intuitive baselines in four real-world social networks, often by a significant margin. In addition, we learn model parameters from real user action logs.
△ Less
Submitted 4 November, 2015; v1 submitted 1 July, 2015;
originally announced July 2015.
-
Approximate Closest Community Search in Networks
Authors:
Xin Huang,
Laks V. S. Lakshmanan,
Jeffrey Xu Yu,
Hong Cheng
Abstract:
Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some…
▽ More
Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice.
In this paper, given an undirected graph G and a set of query nodes Q, we study community search using the k-truss based community model. We formulate our problem of finding a closest truss community (CTC), as finding a connected k-truss subgraph with the largest k that contains Q, and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor $(2-\varepsilon)$, for any $\varepsilon >0 $. However, we develop a greedy algorithmic framework, which first finds a CTC containing Q, and then iteratively removes the furthest nodes from Q, from the graph. The method achieves 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for k-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms.
△ Less
Submitted 12 September, 2015; v1 submitted 22 May, 2015;
originally announced May 2015.
-
From Group Recommendations to Group Formation
Authors:
Senjuti Basu Roy,
Laks V. S. Lakshmanan,
Rui Liu
Abstract:
There has been significant recent interest in the area of group recommendations, where, given groups of users of a recommender system, one wants to recommend top-k items to a group that maximize the satisfaction of the group members, according to a chosen semantics of group satisfaction. Examples semantics of satisfaction of a recommended itemset to a group include the so-called least misery (LM)…
▽ More
There has been significant recent interest in the area of group recommendations, where, given groups of users of a recommender system, one wants to recommend top-k items to a group that maximize the satisfaction of the group members, according to a chosen semantics of group satisfaction. Examples semantics of satisfaction of a recommended itemset to a group include the so-called least misery (LM) and aggregate voting (AV). We consider the complementary problem of how to form groups such that the users in the formed groups are most satisfied with the suggested top-k recommendations. We assume that the recommendations will be generated according to one of the two group recommendation semantics - LM or AV. Rather than assuming groups are given, or rely on ad hoc group formation dynamics, our framework allows a strategic approach for forming groups of users in order to maximize satisfaction. We show that the problem is NP-hard to solve optimally under both semantics. Furthermore, we develop two efficient algorithms for group formation under LM and show that they achieve bounded absolute error. We develop efficient heuristic algorithms for group formation under AV. We validate our results and demonstrate the scalability and effectiveness of our group formation algorithms on two large real data sets.
△ Less
Submitted 11 March, 2015;
originally announced March 2015.
-
Influence Maximization with Bandits
Authors:
Sharan Vaswani,
Laks. V. S. Lakshmanan,
Mark Schmidt
Abstract:
We consider the problem of \emph{influence maximization}, the problem of maximizing the number of people that become aware of a product by finding the `best' set of `seed' users to expose the product to. Most prior work on this topic assumes that we know the probability of each user influencing each other user, or we have data that lets us estimate these influences. However, this information is ty…
▽ More
We consider the problem of \emph{influence maximization}, the problem of maximizing the number of people that become aware of a product by finding the `best' set of `seed' users to expose the product to. Most prior work on this topic assumes that we know the probability of each user influencing each other user, or we have data that lets us estimate these influences. However, this information is typically not initially available or is difficult to obtain. To avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm that estimates the influence probabilities as we sequentially try different seed sets. We establish bounds on the performance of this procedure under the existing edge-level feedback as well as a novel and more realistic node-level feedback. Beyond our theoretical results, we describe a practical implementation and experimentally demonstrate its efficiency and effectiveness on four real datasets.
△ Less
Submitted 27 April, 2016; v1 submitted 27 February, 2015;
originally announced March 2015.
-
Viral Marketing Meets Social Advertising: Ad Allocation with Minimum Regret
Authors:
Cigdem Aslay,
Wei Lu,
Francesco Bonchi,
Amit Goyal,
Laks V. S. Lakshmanan
Abstract:
In this paper, we study the problem of allocating ads to users through the viral-marketing lens. Advertisers approach the host with a budget in return for the marketing campaign service provided by the host. We show that allocation that takes into account the propensity of ads for viral propagation can achieve significantly better performance. However, uncontrolled virality could be undesirable fo…
▽ More
In this paper, we study the problem of allocating ads to users through the viral-marketing lens. Advertisers approach the host with a budget in return for the marketing campaign service provided by the host. We show that allocation that takes into account the propensity of ads for viral propagation can achieve significantly better performance. However, uncontrolled virality could be undesirable for the host as it creates room for exploitation by the advertisers: hoping to tap uncontrolled virality, an advertiser might declare a lower budget for its marketing campaign, aiming at the same large outcome with a smaller cost.
This creates a challenging trade-off: on the one hand, the host aims at leveraging virality and the network effect to improve advertising efficacy, while on the other hand the host wants to avoid giving away free service due to uncontrolled virality. We formalize this as the problem of ad allocation with minimum regret, which we show is NP-hard and inapproximable w.r.t. any factor. However, we devise an algorithm that provides approximation guarantees w.r.t. the total budget of all advertisers. We develop a scalable version of our approximation algorithm, which we extensively test on four real-world data sets, confirming that our algorithm delivers high quality solutions, is scalable, and significantly outperforms several natural baselines.
△ Less
Submitted 24 August, 2015; v1 submitted 3 December, 2014;
originally announced December 2014.
-
Show Me the Money: Dynamic Recommendations for Revenue Maximization
Authors:
Wei Lu,
Shanshan Chen,
Keqian Li,
Laks V. S. Lakshmanan
Abstract:
Recommender Systems (RS) play a vital role in applications such as e-commerce and on-demand content streaming. Research on RS has mainly focused on the customer perspective, i.e., accurate prediction of user preferences and maximization of user utilities. As a result, most existing techniques are not explicitly built for revenue maximization, the primary business goal of enterprises. In this work,…
▽ More
Recommender Systems (RS) play a vital role in applications such as e-commerce and on-demand content streaming. Research on RS has mainly focused on the customer perspective, i.e., accurate prediction of user preferences and maximization of user utilities. As a result, most existing techniques are not explicitly built for revenue maximization, the primary business goal of enterprises. In this work, we explore and exploit a novel connection between RS and the profitability of a business. As recommendations can be seen as an information channel between a business and its customers, it is interesting and important to investigate how to make strategic dynamic recommendations leading to maximum possible revenue. To this end, we propose a novel \model that takes into account a variety of factors including prices, valuations, saturation effects, and competition amongst products. Under this model, we study the problem of finding revenue-maximizing recommendation strategies over a finite time horizon. We show that this problem is NP-hard, but approximation guarantees can be obtained for a slightly relaxed version, by establishing an elegant connection to matroid theory. Given the prohibitively high complexity of the approximation algorithm, we also design intelligent heuristics for the original problem. Finally, we conduct extensive experiments on two real and synthetic datasets and demonstrate the efficiency, scalability, and effectiveness our algorithms, and that they significantly outperform several intuitive baselines.
△ Less
Submitted 25 August, 2015; v1 submitted 30 August, 2014;
originally announced September 2014.
-
Modeling Non-Progressive Phenomena for Influence Propagation
Authors:
Vincent Yun Lou,
Smriti Bhagat,
Laks V. S. Lakshmanan,
Sharan Vaswani
Abstract:
Recent work on modeling influence propagation focus on progressive models, i.e., once a node is influenced (active) the node stays in that state and cannot become inactive. However, this assumption is unrealistic in many settings where nodes can transition between active and inactive states. For instance, a user of a social network may stop using an app and become inactive, but again activate when…
▽ More
Recent work on modeling influence propagation focus on progressive models, i.e., once a node is influenced (active) the node stays in that state and cannot become inactive. However, this assumption is unrealistic in many settings where nodes can transition between active and inactive states. For instance, a user of a social network may stop using an app and become inactive, but again activate when instigated by a friend, or when the app adds a new feature or releases a new version. In this work, we study such non-progressive phenomena and propose an efficient model of influence propagation. Specifically, we model in influence propagation as a continuous-time Markov process with 2 states: active and inactive. Such a model is both highly scalable (we evaluated on graphs with over 2 million nodes), 17-20 times faster, and more accurate for estimating the spread of influence, as compared with state-of-the-art progressive models for several applications where nodes may switch states.
△ Less
Submitted 26 August, 2014;
originally announced August 2014.
-
Event Evolution Tracking from Streaming Social Posts
Authors:
Pei Lee,
Laks V. S. Lakshmanan,
Evangelos E. Milios
Abstract:
Online social post streams such as Twitter timelines and forum discussions have emerged as important channels for information dissemination. They are noisy, informal, and surge quickly. Real life events, which may happen and evolve every minute, are perceived and circulated in post streams by social users. Intuitively, an event can be viewed as a dense cluster of posts with a life cycle sharing th…
▽ More
Online social post streams such as Twitter timelines and forum discussions have emerged as important channels for information dissemination. They are noisy, informal, and surge quickly. Real life events, which may happen and evolve every minute, are perceived and circulated in post streams by social users. Intuitively, an event can be viewed as a dense cluster of posts with a life cycle sharing the same descriptive words. There are many previous works on event detection from social streams. However, there has been surprisingly little work on tracking the evolution patterns of events, e.g., birth/death, growth/decay, merge/split, which we address in this paper. To define a tracking scope, we use a sliding time window, where old posts disappear and new posts appear at each moment. Following that, we model a social post stream as an evolving network, where each social post is a node, and edges between posts are constructed when the post similarity is above a threshold. We propose a framework which summarizes the information in the stream within the current time window as a ``sketch graph'' composed of ``core'' posts. We develop incremental update algorithms to handle highly dynamic social streams and track event evolution patterns in real time. Moreover, we visualize events as word clouds to aid human perception. Our evaluation on a real data set consisting of 5.2 million posts demonstrates that our method can effectively track event dynamics in the whole life cycle from very large volumes of social streams on the fly.
△ Less
Submitted 23 November, 2013;
originally announced November 2013.