Showing 1–9 of 9 results for author: Girgis, R

Searching in archive cs.
  1. arXiv:2506.11234  [pdf, ps, other]

    cs.RO cs.CV

    Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

    Authors: Luke Rowe, Rodrigue de Schaetzen, Roger Girgis, Christopher Pal, Liam Paull

    Abstract: We present Poutine, a 3B-parameter vision-language model (VLM) tailored for end-to-end autonomous driving in long-tail driving scenarios. Poutine is trained in two stages. To obtain strong base driving capabilities, we train Poutine-Base in a self-supervised vision-language-trajectory (VLT) next-token prediction fashion on 83 hours of CoVLA nominal driving and 11 hours of Waymo long-tail driving.…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2505.15925  [pdf, ps, other]

    cs.RO cs.AI cs.CV

    VERDI: VLM-Embedded Reasoning for Autonomous Driving

    Authors: Bowen Feng, Zhiting Mei, Baiang Li, Julian Ost, Roger Girgis, Anirudha Majumdar, Felix Heide

    Abstract: While autonomous driving (AD) stacks struggle with decision making under partial observability and real-world complexity, human drivers are capable of commonsense reasoning to make near-optimal decisions with limited information. Recent work has attempted to leverage finetuned Vision-Language Models (VLMs) for trajectory planning at inference time to emulate human behavior. Despite their success i…

    Submitted 23 May, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  3. arXiv:2503.22496  [pdf, other]

    cs.RO cs.CV

    Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

    Authors: Luke Rowe, Roger Girgis, Anthony Gosselin, Liam Paull, Christopher Pal, Felix Heide

    Abstract: We introduce Scenario Dreamer, a fully data-driven generative simulator for autonomous vehicle planning that generates both the initial traffic scene - comprising a lane graph and agent bounding boxes - and closed-loop agent behaviours. Existing methods for generating driving simulation environments encode the initial traffic scene as a rasterized image and, as such, require parameter-heavy networ…

    Submitted 28 March, 2025; originally announced March 2025.

    Comments: CVPR 2025

  4. arXiv:2403.19918  [pdf, other]

    cs.RO cs.AI cs.LG

    CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning

    Authors: Luke Rowe, Roger Girgis, Anthony Gosselin, Bruno Carrez, Florian Golemo, Felix Heide, Liam Paull, Christopher Pal

    Abstract: Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data are not reactive and hard to intuitively control. Existing approaches address these challenges by proposing methods that rely on heuristics or generative models of real-world data but these approaches either lack realism or…

    Submitted 14 October, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: CoRL 2024

  5. arXiv:2112.12228  [pdf, other]

    cs.LG

    Direct Behavior Specification via Constrained Reinforcement Learning

    Authors: Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal

    Abstract: The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors. Most often, practitioners go about the task of behavior specification by manually engineering the reward function, a counter-intuitive process that requires several iterations and is prone to reward hacking by the agent. In this work, we argue that constrained RL, whi…

    Submitted 18 June, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

  6. arXiv:2104.00563  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG cs.MA

    Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction

    Authors: Roger Girgis, Florian Golemo, Felipe Codevilla, Martin Weiss, Jim Aldon D'Souza, Samira Ebrahimi Kahou, Felix Heide, Christopher Pal

    Abstract: Robust multi-agent trajectory prediction is essential for the safe control of robotic systems. A major challenge is to efficiently learn a representation that approximates the true joint distribution of contextual, social, and temporal information to enable planning. We propose Latent Variable Sequential Set Transformers which are encoder-decoder architectures that generate scene-consistent multi-…

    Submitted 10 February, 2022; v1 submitted 19 February, 2021; originally announced April 2021.

    Comments: 26 pages, 17 figures, 8 tables

  7. arXiv:1910.13249  [pdf, other]

    cs.CV cs.HC cs.LG

    Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments

    Authors: Martin Weiss, Simon Chamorro, Roger Girgis, Margaux Luck, Samira E. Kahou, Joseph P. Cohen, Derek Nowrouzezahrai, Doina Precup, Florian Golemo, Chris Pal

    Abstract: Millions of blind and visually-impaired (BVI) people navigate urban environments every day, using smartphones for high-level path-planning and white canes or guide dogs for local information. However, many BVI people still struggle to travel to new places. In our endeavor to create a navigation assistant for the BVI, we found that existing Reinforcement Learning (RL) environments were unsuitable f…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: Accepted at CoRL 2019. Code & video available at https://mweiss17.github.io/SEVN/

  8. arXiv:1811.10120  [pdf, other]

    cs.HC cs.AI

    A Survey of Mobile Computing for the Visually Impaired

    Authors: Martin Weiss, Margaux Luck, Roger Girgis, Chris Pal, Joseph Paul Cohen

    Abstract: The number of visually impaired or blind (VIB) people in the world is estimated at several hundred million. Based on a series of interviews with the VIB and developers of assistive technology, this paper provides a survey of machine-learning based mobile applications and identifies the most relevant applications. We discuss the functionality of these apps, how they align with the needs and require…

    Submitted 27 November, 2018; v1 submitted 25 November, 2018; originally announced November 2018.

  9. arXiv:1004.0771  [pdf]

    cs.NI

    Performance evaluation of a new route optimization technique for mobile IP

    Authors: Moheb R Girgis, Tarek M Mahmoud, Youssef S Takroni, Hassan S Hassan

    Abstract: Mobile IP (MIP) is an Internet protocol that allows mobile nodes to have continuous network connectivity to the Internet without changing their IP addresses while moving to other networks. The packets sent from a correspondent node (CN) to a mobile node (MN) go first through the mobile node's home agent (HA), then the HA tunnels them to the MN's foreign network. One of the main problems in the origi…

    Submitted 6 April, 2010; originally announced April 2010.

    Comments: 11 pages

    Journal ref: International Journal of Network Security & Its Applications 1.3 (2009) 63-73