-
Detection of Active Emergency Vehicles using Per-Frame CNNs and Output Smoothing
Authors:
Meng Fan,
Craig Bidstrup,
Zhaoen Su,
Jason Owens,
Gary Yang,
Nemanja Djuric
Abstract:
While inferring common actor states (such as position or velocity) is an important and well-explored task of the perception system aboard a self-driving vehicle (SDV), it may not always provide sufficient information to the SDV. This is especially true in the case of active emergency vehicles (EVs), where light-based signals also need to be captured to provide a full context. We consider this problem and propose a sequential methodology for the detection of active EVs, using an off-the-shelf CNN model operating at a frame level and a downstream smoother that accounts for the temporal aspect of flashing EV lights. We also explore model improvements through data augmentation and training with additional hard samples.
Submitted 27 December, 2022;
originally announced December 2022.
-
Convolutions for Spatial Interaction Modeling
Authors:
Zhaoen Su,
Chao Wang,
David Bradley,
Carlos Vallespi-Gonzalez,
Carl Wellington,
Nemanja Djuric
Abstract:
In many different fields interactions between objects play a critical role in determining their behavior. Graph neural networks (GNNs) have emerged as a powerful tool for modeling interactions, although often at the cost of adding considerable complexity and latency. In this paper, we consider the problem of spatial interaction modeling in the context of predicting the motion of actors around autonomous vehicles, and investigate alternatives to GNNs. We revisit 2D convolutions and show that they can demonstrate comparable performance to graph networks in modeling spatial interactions with lower latency, thus providing an effective and efficient alternative in time-critical systems. Moreover, we propose a novel interaction loss to further improve the interaction modeling of the considered methods.
Submitted 8 June, 2022; v1 submitted 14 April, 2021;
originally announced April 2021.
-
Investigating the Effect of Sensor Modalities in Multi-Sensor Detection-Prediction Models
Authors:
Abhishek Mohta,
Fang-Chieh Chou,
Brian C. Becker,
Carlos Vallespi-Gonzalez,
Nemanja Djuric
Abstract:
Detection of surrounding objects and their motion prediction are critical components of a self-driving system. Recently proposed models that jointly address these tasks rely on a number of sensors to achieve state-of-the-art performance. However, this increases system complexity and may result in a brittle model that overfits to any single sensor modality while ignoring others, leading to reduced generalization. We focus on this important problem and analyze the contribution of sensor modalities towards the model performance. In addition, we investigate the use of sensor dropout to mitigate the above-mentioned issues, leading to a more robust, better-performing model on real-world driving data.
Submitted 8 January, 2021;
originally announced January 2021.
-
Ellipse Loss for Scene-Compliant Motion Prediction
Authors:
Henggang Cui,
Hoda Shajari,
Sai Yalamanchi,
Nemanja Djuric
Abstract:
Motion prediction is a critical part of self-driving technology, responsible for inferring the future behavior of traffic actors in the autonomous vehicle's surroundings. In order to ensure safe and efficient operations, prediction models need to output accurate trajectories that obey the map constraints. In this paper, we address this task and propose a novel ellipse loss that allows the models to better reason about scene compliance and predict more realistic trajectories. Ellipse loss penalizes off-road predictions directly in a supervised manner, by projecting the output trajectories into the top-down map frame using a differentiable trajectory rasterizer module. Moreover, it takes into account actor dimensions and orientation, providing more direct training signals to the model. We applied ellipse loss to a recently proposed state-of-the-art joint detection-prediction model to showcase its benefits. Evaluation on large-scale autonomous driving data strongly indicates that the method allows for more accurate and more realistic trajectory predictions.
Submitted 25 March, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models
Authors:
Henggang Cui,
Fang-Chieh Chou,
Jake Charland,
Carlos Vallespi-Gonzalez,
Nemanja Djuric
Abstract:
Object detection is a critical component of a self-driving system, tasked with inferring the current states of the surrounding traffic actors. While there exist a number of studies on the problem of inferring the position and shape of vehicle actors, understanding actors' orientation remains a challenge for existing state-of-the-art detectors. Orientation is an important property for downstream modules of an autonomous system, particularly relevant for motion prediction of stationary or reversing actors where current approaches struggle. We focus on this task and present a method that extends the existing models that perform joint object detection and motion prediction, allowing us to more accurately infer vehicle orientations. In addition, the approach is able to quantify prediction uncertainty, outputting the probability that the inferred orientation is flipped, which allows for improved motion prediction and safer autonomous operations. Empirical results show the benefits of the approach, obtaining state-of-the-art performance on the open-sourced nuScenes data set.
Submitted 5 November, 2020;
originally announced November 2020.
-
Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization
Authors:
Zhaoen Su,
Chao Wang,
Henggang Cui,
Nemanja Djuric,
Carlos Vallespi-Gonzalez,
David Bradley
Abstract:
A commonly-used representation for motion prediction of actors is a sequence of waypoints (comprising positions and orientations) for each actor at discrete future time-points. While this approach is simple and flexible, it can exhibit unrealistic higher-order derivatives (such as acceleration) and approximation errors at intermediate time steps. To address this issue we propose a simple and general representation for temporally continuous probabilistic trajectory prediction that is based on polynomial trajectory parameterization. We evaluate the proposed representation on supervised trajectory prediction tasks using two large self-driving data sets. The results show realistic higher-order derivatives and better accuracy at interpolated time-points, as well as the benefits of the inferred noise distributions over the trajectories. Extensive experimental studies based on existing state-of-the-art models demonstrate the effectiveness of the proposed approach relative to other representations in predicting the future motions of vehicle, bicyclist, and pedestrian traffic actors.
Submitted 31 October, 2020;
originally announced November 2020.
-
Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving
Authors:
Sudeep Fadadu,
Shreyash Pandey,
Darshan Hegde,
Yi Shi,
Fang-Chieh Chou,
Nemanja Djuric,
Carlos Vallespi-Gonzalez
Abstract:
We present an end-to-end method for object detection and trajectory prediction utilizing multi-view representations of LiDAR returns and camera images. In this work, we recognize the strengths and weaknesses of different view representations, and we propose an efficient and generic fusing method that aggregates benefits from all views. Our model builds on a state-of-the-art Bird's-Eye View (BEV) network that fuses voxelized features from a sequence of historical LiDAR data as well as a rasterized high-definition map to perform detection and prediction tasks. We extend this model with additional LiDAR Range-View (RV) features that use the raw LiDAR information in its native, non-quantized representation. The RV feature map is projected into BEV and fused with the BEV features computed from the LiDAR data and the high-definition map. The fused features are then further processed to output the final detections and trajectories, within a single end-to-end trainable network. In addition, the RV fusion of LiDAR and camera is performed in a straightforward and computationally efficient manner using this framework. The proposed multi-view fusion approach improves the state-of-the-art on proprietary large-scale real-world data collected by a fleet of self-driving vehicles, as well as on the public nuScenes data set, with a minimal increase in computational cost.
Submitted 18 October, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Multi-Modal Trajectory Prediction of NBA Players
Authors:
Sandro Hauri,
Nemanja Djuric,
Vladan Radosavljevic,
Slobodan Vucetic
Abstract:
National Basketball Association (NBA) players are highly motivated and skilled experts that solve complex decision making problems at every time point during a game. As a step towards understanding how players make their decisions, we focus on their movement trajectories during games. We propose a method that captures the multi-modal behavior of players, where they might consider multiple trajectories and select the most advantageous one. The method is built on an LSTM-based architecture predicting multiple trajectories and their probabilities, trained by a multi-modal loss function that updates the best trajectories. Experiments on large, fine-grained NBA tracking data show that the proposed method outperforms the state-of-the-art. In addition, the results indicate that the approach generates more realistic trajectories and that it can learn individual playing styles of specific players.
Submitted 18 August, 2020;
originally announced August 2020.
-
MultiXNet: Multiclass Multistage Multimodal Motion Prediction
Authors:
Nemanja Djuric,
Henggang Cui,
Zhaoen Su,
Shangxuan Wu,
Huahua Wang,
Fang-Chieh Chou,
Luisa San Martin,
Song Feng,
Rui Hu,
Yang Xu,
Alyssa Dayan,
Sidney Zhang,
Brian C. Becker,
Gregory P. Meyer,
Carlos Vallespi-Gonzalez,
Carl K. Wellington
Abstract:
One of the critical pieces of the self-driving puzzle is understanding the surroundings of a self-driving vehicle (SDV) and predicting how these surroundings will change in the near future. To address this task we propose MultiXNet, an end-to-end approach for detection and motion prediction based directly on lidar sensor data. This approach builds on prior work by handling multiple classes of traffic actors, adding a jointly trained second-stage trajectory refinement step, and producing a multimodal probability distribution over future actor motion that includes both multiple discrete traffic behaviors and calibrated continuous position uncertainties. The method was evaluated on large-scale, real-world data collected by a fleet of SDVs in several cities, with the results indicating that it outperforms existing state-of-the-art approaches.
Submitted 24 May, 2021; v1 submitted 2 June, 2020;
originally announced June 2020.
-
Improving Movement Predictions of Traffic Actors in Bird's-Eye View Models using GANs and Differentiable Trajectory Rasterization
Authors:
Eason Wang,
Henggang Cui,
Sai Yalamanchi,
Mohana Moorthy,
Fang-Chieh Chou,
Nemanja Djuric
Abstract:
One of the most critical pieces of the self-driving puzzle is the task of predicting future movement of surrounding traffic actors, which allows the autonomous vehicle to safely and effectively plan its future route in a complex world. Recently, a number of algorithms have been proposed to address this important problem, spurred by a growing interest of researchers from both industry and academia. Methods based on top-down scene rasterization on one side and Generative Adversarial Networks (GANs) on the other have proven to be particularly successful, obtaining state-of-the-art accuracies on the task of traffic movement prediction. In this paper we build upon these two directions and propose a raster-based conditional GAN architecture, powered by a novel differentiable rasterizer module at the input of the conditional discriminator that maps generated trajectories into the raster space in a differentiable manner. This simplifies the task for the discriminator, as trajectories that are not scene-compliant are easier to discern, and allows the gradients to flow back, forcing the generator to output better, more realistic trajectories. We evaluated the proposed method on a large-scale, real-world data set, showing that it outperforms state-of-the-art GAN-based baselines.
Submitted 11 June, 2020; v1 submitted 13 April, 2020;
originally announced April 2020.
-
Long-term Prediction of Vehicle Behavior using Short-term Uncertainty-aware Trajectories and High-definition Maps
Authors:
Sai Yalamanchi,
Tzu-Kuo Huang,
Galen Clark Haynes,
Nemanja Djuric
Abstract:
Motion prediction of surrounding vehicles is one of the most important tasks handled by a self-driving vehicle, and represents a critical step in the autonomous system necessary to ensure safety for all the involved traffic actors. Recently a number of researchers from both academic and industrial communities have focused on this important problem, proposing ideas ranging from engineered, rule-based methods to learned approaches, shown to perform well at different prediction horizons. In particular, while for longer-term trajectories the engineered methods outperform the competing approaches, the learned methods have proven to be the best choice at short-term horizons. In this work we describe how to overcome the discrepancy between these two research directions, and propose a method that combines the disparate approaches under a single unifying framework. The resulting algorithm fuses learned, uncertainty-aware trajectories with lane-based paths in a principled manner, resulting in improved prediction accuracy at both shorter- and longer-term horizons. Experiments on real-world, large-scale data strongly suggest benefits of the proposed unified method, which outperformed the existing state-of-the-art. Moreover, following offline evaluation the proposed method was successfully tested onboard a self-driving vehicle.
Submitted 12 June, 2020; v1 submitted 13 March, 2020;
originally announced March 2020.
-
Deep Kinematic Models for Kinematically Feasible Vehicle Trajectory Predictions
Authors:
Henggang Cui,
Thi Nguyen,
Fang-Chieh Chou,
Tsung-Han Lin,
Jeff Schneider,
David Bradley,
Nemanja Djuric
Abstract:
Self-driving vehicles (SDVs) hold great potential for improving traffic safety and are poised to positively affect the quality of life of millions of people. To unlock this potential, one of the critical aspects of the autonomous technology is understanding and predicting future movement of vehicles surrounding the SDV. This work presents a deep-learning-based method for kinematically feasible motion prediction of such traffic actors. Previous work did not explicitly encode vehicle kinematics and instead relied on the models to learn the constraints directly from the data, potentially resulting in kinematically infeasible, suboptimal trajectory predictions. To address this issue we propose a method that seamlessly combines ideas from AI with physically grounded vehicle motion models. In this way we get the best of both worlds, coupling powerful learning models with strong feasibility guarantees for their outputs. The proposed approach is general, being applicable to any type of learning method. Extensive experiments using deep convnets on real-world data strongly indicate its benefits, outperforming the existing state-of-the-art.
Submitted 24 October, 2020; v1 submitted 1 August, 2019;
originally announced August 2019.
-
Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets
Authors:
Fang-Chieh Chou,
Tsung-Han Lin,
Henggang Cui,
Vladan Radosavljevic,
Thi Nguyen,
Tzu-Kuo Huang,
Matthew Niedoba,
Jeff Schneider,
Nemanja Djuric
Abstract:
Following detection and tracking of traffic actors, prediction of their future motion is the next critical component of a self-driving vehicle (SDV) technology, allowing the SDV to operate safely and efficiently in its environment. This is particularly important when it comes to vulnerable road users (VRUs), such as pedestrians and bicyclists. These actors need to be handled with special care due to an increased risk of injury, as well as the fact that their behavior is less predictable than that of motorized actors. To address this issue, in the current study we present a deep learning-based method for predicting VRU movement, where we rasterize high-definition maps and actor's surroundings into a bird's-eye view image used as an input to deep convolutional networks. In addition, we propose a fast architecture suitable for real-time inference, and perform an ablation study of various rasterization approaches to find the optimal choice for accurate prediction. The results strongly indicate benefits of using the proposed approach for motion prediction of VRUs, both in terms of accuracy and latency.
Submitted 11 June, 2020; v1 submitted 20 June, 2019;
originally announced June 2019.
-
Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks
Authors:
Henggang Cui,
Vladan Radosavljevic,
Fang-Chieh Chou,
Tsung-Han Lin,
Thi Nguyen,
Tzu-Kuo Huang,
Jeff Schneider,
Nemanja Djuric
Abstract:
Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. However, despite large interest and a number of industry players working in the autonomous domain, there still remains more to be done in order to develop a system capable of operating at a level comparable to the best human drivers. One reason for this is the high uncertainty of traffic behavior and the large number of situations that an SDV may encounter on the roads, making it very difficult to create a fully generalizable system. To ensure safe and efficient operations, an autonomous vehicle is required to account for this uncertainty and to anticipate a multitude of possible behaviors of traffic actors in its surroundings. We address this critical problem and present a method to predict multiple possible trajectories of actors while also estimating their probabilities. The method encodes each actor's surrounding context into a raster image, used as input by deep convolutional networks to automatically derive relevant features for the task. Following extensive offline evaluation and comparison to state-of-the-art baselines, the method was successfully tested on SDVs in closed-course tests.
Submitted 1 March, 2019; v1 submitted 18 September, 2018;
originally announced September 2018.
-
Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving
Authors:
Nemanja Djuric,
Vladan Radosavljevic,
Henggang Cui,
Thi Nguyen,
Fang-Chieh Chou,
Tsung-Han Lin,
Nitin Singh,
Jeff Schneider
Abstract:
We address one of the crucial aspects necessary for safe and efficient operations of autonomous vehicles, namely predicting future state of traffic actors in the autonomous vehicle's surroundings. We introduce a deep learning-based approach that takes into account a current world state and produces raster images of each actor's vicinity. The rasters are then used as inputs to deep convolutional models to infer future movement of actors while also accounting for and capturing inherent uncertainty of the prediction task. Extensive experiments on real-world data strongly suggest benefits of the proposed approach. Moreover, following completion of the offline tests the system was successfully tested onboard self-driving vehicles.
Submitted 4 March, 2020; v1 submitted 17 August, 2018;
originally announced August 2018.
-
Proceedings of the 2017 AdKDD & TargetAd Workshop
Authors:
Abraham Bagherjeiran,
Nemanja Djuric,
Mihajlo Grbovic,
Kuang-Chih Lee,
Kun Liu,
Vladan Radosavljevic,
Suju Rajan
Abstract:
Proceedings of the 2017 AdKDD and TargetAd Workshop, held in conjunction with the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining in Halifax, Nova Scotia, Canada.
Submitted 11 July, 2017;
originally announced July 2017.
-
Travel the World: Analyzing and Predicting Booking Behavior using E-mail Travel Receipts
Authors:
Nemanja Djuric,
Mihajlo Grbovic,
Vladan Radosavljevic,
Jaikit Savla,
Varun Bhagwan,
Doug Sharp
Abstract:
The tourism industry has grown tremendously over the past several decades. Despite its global impact, there still remain a number of open questions related to better understanding of tourists and their habits. In this work we analyze the largest data set of travel receipts considered thus far, and focus on exploring and modeling the booking behavior of online customers. We extract useful, actionable insights into the booking behavior, and tackle the task of predicting the booking time. The presented results can be directly used to improve the booking experience of customers and optimize targeting campaigns of travel operators.
Submitted 28 June, 2016;
originally announced September 2016.
-
Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Authors:
Mihajlo Grbovic,
Nemanja Djuric,
Vladan Radosavljevic,
Fabrizio Silvestri,
Ricardo Baeza-Yates,
Andrew Feng,
Erik Ordentlich,
Lee Yang,
Gavin Owens
Abstract:
Sponsored search represents a major source of revenue for web search engines. This popular advertising model brings a unique possibility for advertisers to target users' immediate intent communicated through a search query, usually by displaying their ads alongside organic search results for queries deemed relevant to their products or services. However, due to a large number of unique queries it is challenging for advertisers to identify all such relevant queries. For this reason search engines often provide a service of advanced matching, which automatically finds additional relevant queries for advertisers to bid on. We present a novel advanced matching approach based on the idea of semantic embeddings of queries and ads. The embeddings were learned using a large data set of user search sessions, consisting of search queries, clicked ads and search links, while utilizing contextual information such as dwell time and skipped ads. To address the large-scale nature of our problem, both in terms of data and vocabulary size, we propose a novel distributed algorithm for training of the embeddings. Finally, we present an approach for overcoming a cold-start problem associated with new ads and queries. We report results of editorial evaluation and online tests on actual search traffic. The results show that our approach significantly outperforms baselines in terms of relevance, coverage, and incremental revenue. Lastly, we open-source learned query embeddings to be used by researchers in computational advertising and related fields.
Submitted 6 July, 2016;
originally announced July 2016.
-
Non-linear Label Ranking for Large-scale Prediction of Long-Term User Interests
Authors:
Nemanja Djuric,
Mihajlo Grbovic,
Vladan Radosavljevic,
Narayan Bhamidipati,
Slobodan Vucetic
Abstract:
We consider the problem of personalization of online services from the viewpoint of ad targeting, where we seek to find the best ad categories to be shown to each user, resulting in improved user experience and increased advertisers' revenue. We propose to address this problem as a task of ranking the ad categories depending on a user's preference, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on a real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems.
Submitted 29 June, 2016;
originally announced June 2016.
-
Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content
Authors:
Nemanja Djuric,
Hao Wu,
Vladan Radosavljevic,
Mihajlo Grbovic,
Narayan Bhamidipati
Abstract:
We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors to represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state of the art by a large margin.
Submitted 28 June, 2016;
originally announced June 2016.
-
Network-Efficient Distributed Word2vec Training System for Large Vocabularies
Authors:
Erik Ordentlich,
Lee Yang,
Andy Feng,
Peter Cnudde,
Mihajlo Grbovic,
Nemanja Djuric,
Vladan Radosavljevic,
Gavin Owens
Abstract:
Word2vec is a popular family of algorithms for unsupervised training of dense vector representations of words on large text corpora. The resulting vectors have been shown to capture semantic relationships among their corresponding words, and have shown promise in reducing a number of natural language processing (NLP) tasks to mathematical operations on these vectors. While heretofore applications of word2vec have centered around vocabularies with a few million words, wherein the vocabulary is the set of words for which vectors are simultaneously trained, novel applications are emerging in areas outside of NLP with vocabularies comprising several hundred million words. Existing word2vec training systems are impractical for training such large vocabularies as they either require that the vectors of all vocabulary words be stored in the memory of a single server or suffer unacceptable training latency due to massive network data transfer. In this paper, we present a novel distributed, parallel training system that enables unprecedented practical training of vectors for vocabularies with several hundred million words on a shared cluster of commodity servers, using far less network traffic than the existing solutions. We evaluate the proposed system on a benchmark dataset, showing that the quality of vectors does not degrade relative to non-distributed training. Finally, for several quarters, the system has been deployed for the purpose of matching queries to ads in Gemini, the sponsored search advertising platform at Yahoo, resulting in significant improvement of business metrics.
Submitted 27 June, 2016;
originally announced June 2016.
-
Gender and Interest Targeting for Sponsored Post Advertising at Tumblr
Authors:
Mihajlo Grbovic,
Vladan Radosavljevic,
Nemanja Djuric,
Narayan Bhamidipati,
Ananth Nagarajan
Abstract:
As one of the leading platforms for creative content, Tumblr offers advertisers a unique way of creating brand identity. Advertisers can tell their story through images, animation, text, music, video, and more, and promote that content by sponsoring it to appear as an advertisement in the streams of Tumblr users. In this paper we present a framework that enabled one of the key targeted advertising components for Tumblr, specifically gender and interest targeting. We describe the main challenges involved in the development of the framework, which include creating the ground truth for training gender prediction models, as well as mapping Tumblr content to an interest taxonomy. For purposes of inferring user interests we propose a novel semi-supervised neural language model for categorization of Tumblr content (i.e., post tags and post keywords). The model was trained on a large-scale data set consisting of 6.8 billion user posts, with a very limited amount of categorized keywords, and was shown to have superior performance over the bag-of-words model. We successfully deployed gender and interest targeting capability in Yahoo production systems, delivering inference for users that cover more than 90% of daily activities at Tumblr. Online performance results indicate advantages of the proposed approach, where we observed a 20% lift in user engagement with sponsored posts as compared to untargeted campaigns.
Submitted 23 June, 2016;
originally announced June 2016.
-
E-commerce in Your Inbox: Product Recommendations at Scale
Authors:
Mihajlo Grbovic,
Vladan Radosavljevic,
Nemanja Djuric,
Narayan Bhamidipati,
Jaikit Savla,
Varun Bhagwan,
Doug Sharp
Abstract:
In recent years online advertising has become increasingly ubiquitous and effective. Advertisements shown to visitors fund sites and apps that publish digital content, manage social networks, and operate e-mail services. Given such a large variety of internet resources, determining an appropriate type of advertising for a given platform has become critical to financial success. Native advertisements, namely ads that are similar in look and feel to content, have had great success in news and social feeds. However, to date there has not been a winning formula for ads in e-mail clients. In this paper we describe a system that leverages user purchase history determined from e-mail receipts to deliver highly personalized product ads to Yahoo Mail users. We propose to use a novel neural language-based algorithm specifically tailored for delivering effective product recommendations, which was evaluated against baselines that included showing popular products and products predicted based on co-occurrence. We conducted rigorous offline testing using a large-scale product purchase data set, covering purchases of more than 29 million users from 172 e-commerce websites. Ads in the form of product recommendations were successfully tested on online traffic, where we observed a steady 9% lift in click-through rates over other ad formats in mail, as well as a comparable lift in conversion rates. Following successful tests, the system was launched into production during the holiday season of 2014.
Submitted 22 June, 2016;
originally announced June 2016.
-
Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior
Authors:
Farshad Kooti,
Kristina Lerman,
Luca Maria Aiello,
Mihajlo Grbovic,
Nemanja Djuric,
Vladan Radosavljevic
Abstract:
Consumer spending accounts for a large fraction of US economic activity. Increasingly, consumer activity is moving to the web, where digital traces of shopping and purchases provide valuable data about consumer behavior. We analyze these data extracted from emails and combine them with demographic information to characterize, model, and predict consumer behavior. Breaking down purchasing by age and gender, we find that the amount of money spent on online purchases grows sharply with age, peaking in the late 30s. Men are more frequent online purchasers and spend more money when compared to women. Linking online shopping to income, we find that shoppers from more affluent areas purchase more expensive items and buy them more frequently, resulting in significantly more money spent on online purchases. We also look at dynamics of purchasing behavior and observe daily and weekly cycles in purchasing behavior, similar to other online activities.
More specifically, we observe temporal patterns in purchasing behavior suggesting shoppers have finite budgets: the more expensive an item, the longer the shopper waits since the last purchase to buy it. We also observe that shoppers who email each other purchase more similar items than socially unconnected shoppers, and this effect is particularly evident among women. Finally, we build a model to predict when shoppers will make a purchase and how much they will spend on it. We find that temporal features improve prediction accuracy over competitive baselines. A better understanding of consumer behavior can help improve marketing efforts and make online shopping more pleasant and efficient.
Submitted 15 December, 2015;
originally announced December 2015.