-
EgoToM: Benchmarking Theory of Mind Reasoning from Egocentric Videos
Authors:
Yuxuan Li,
Vijay Veerabadran,
Michael L. Iuzzolino,
Brett D. Roads,
Asli Celikyilmaz,
Karl Ridgeway
Abstract:
We introduce EgoToM, a new video question-answering benchmark that extends Theory-of-Mind (ToM) evaluation to egocentric domains. Using a causal ToM model, we generate multi-choice video QA instances for the Ego4D dataset to benchmark the ability to predict a camera wearer's goals, beliefs, and next actions. We study the performance of both humans and state of the art multimodal large language models (MLLMs) on these three interconnected inference problems. Our evaluation shows that MLLMs achieve close to human-level accuracy on inferring goals from egocentric videos. However, MLLMs (including the largest ones we tested with over 100B parameters) fall short of human performance when inferring the camera wearers' in-the-moment belief states and future actions that are most consistent with the unseen video future. We believe that our results will shape the future design of an important class of egocentric digital assistants which are equipped with a reasonable model of the user's internal mental states.
Submitted 28 March, 2025;
originally announced March 2025.
-
Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices
Authors:
Xinru Wang,
Mengjie Yu,
Hannah Nguyen,
Michael Iuzzolino,
Tianyi Wang,
Peiqi Tang,
Natasha Lynova,
Co Tran,
Ting Zhang,
Naveen Sendhilnathan,
Hrvoje Benko,
Haijun Xia,
Tanya Jonker
Abstract:
Large Language Models (LLMs) have shown remarkable potential in recommending everyday actions as personal AI assistants, while Explainable AI (XAI) techniques are being increasingly utilized to help users understand why a recommendation is given. Personal AI assistants today are often located on ultra-small devices such as smartwatches, which have limited screen space. The verbosity of LLM-generated explanations, however, makes it challenging to deliver glanceable LLM explanations on such ultra-small devices. To address this, we explored 1) spatially structuring an LLM's explanation text using defined contextual components during prompting and 2) presenting temporally adaptive explanations to users based on confidence levels. We conducted a user study to understand how these approaches impacted user experiences when interacting with LLM recommendations and explanations on ultra-small devices. The results showed that structured explanations reduced users' time to action and cognitive load when reading an explanation. Always-on structured explanations increased users' acceptance of AI recommendations. However, users were less satisfied with structured explanations compared to unstructured ones due to their lack of sufficient, readable details. Additionally, adaptively presenting structured explanations was less effective at improving user perceptions of the AI compared to the always-on structured explanations. Together with users' interview feedback, the results led to design implications to be mindful of when personalizing the content and timing of LLM explanations that are displayed on ultra-small devices.
Submitted 26 February, 2025;
originally announced February 2025.
-
Multiscale Video Pretraining for Long-Term Activity Forecasting
Authors:
Reuben Tan,
Matthias De Lange,
Michael Iuzzolino,
Bryan A. Plummer,
Kate Saenko,
Karl Ridgeway,
Lorenzo Torresani
Abstract:
Long-term activity forecasting is an especially challenging research problem because it requires understanding the temporal relationships between observed actions, as well as the variability and complexity of human activities. Despite relying on strong supervision via expensive human annotations, state-of-the-art forecasting approaches often generalize poorly to unseen data. To alleviate this issue, we propose Multiscale Video Pretraining (MVP), a novel self-supervised pretraining approach that learns robust representations for forecasting by learning to predict contextualized representations of future video clips over multiple timescales. MVP is based on our observation that actions in videos have a multiscale nature, where atomic actions typically occur at a short timescale and more complex actions may span longer timescales. We compare MVP to state-of-the-art self-supervised video learning approaches on downstream long-term forecasting tasks including long-term action anticipation and video summary prediction. Our comprehensive experiments across the Ego4D and Epic-Kitchens-55/100 datasets demonstrate that MVP out-performs state-of-the-art methods by significant margins. Notably, MVP obtains a relative performance gain of over 20% accuracy in video summary forecasting over existing methods.
Submitted 24 July, 2023;
originally announced July 2023.
-
EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video
Authors:
Matthias De Lange,
Hamid Eghbalzadeh,
Reuben Tan,
Michael Iuzzolino,
Franziska Meier,
Karl Ridgeway
Abstract:
In egocentric action recognition a single population model is typically trained and subsequently embodied on a head-mounted device, such as an augmented reality headset. While this model remains static for new users and environments, we introduce an adaptive paradigm of two phases, where after pretraining a population model, the model adapts on-device and online to the user's experience. This setting is highly challenging due to the change from population to user domain and the distribution shifts in the user's data stream. Coping with the latter in-stream distribution shifts is the focus of continual learning, where progress has been rooted in controlled benchmarks but challenges faced in real-world applications often remain unaddressed. We introduce EgoAdapt, a benchmark for real-world egocentric action recognition that facilitates our two-phased adaptive paradigm, and real-world challenges naturally occur in the egocentric video streams from Ego4d, such as long-tailed action distributions and large-scale classification over 2740 actions. We introduce an evaluation framework that directly exploits the user's data stream with new metrics to measure the adaptation gain over the population model, online generalization, and hindsight performance. In contrast to single-stream evaluation in existing works, our framework proposes a meta-evaluation that aggregates the results from 50 independent user streams. We provide an extensive empirical study for finetuning and experience replay.
Submitted 11 July, 2023;
originally announced July 2023.
-
Pretrained Language Models as Visual Planners for Human Assistance
Authors:
Dhruvesh Patel,
Hamid Eghbalzadeh,
Nitin Kamra,
Michael Louis Iuzzolino,
Unnat Jain,
Ruta Desai
Abstract:
In our pursuit of advancing multi-modal AI assistants capable of guiding users to achieve complex multi-step goals, we propose the task of "Visual Planning for Assistance (VPA)". Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc. to realize the specified goal. This requires assessing the user's progress from the (untrimmed) video, and relating it to the requirements of natural language goal, i.e., which actions to select and in what order? Consequently, this requires handling long video history and arbitrarily complex action dependencies. To address these challenges, we decompose VPA into video action segmentation and forecasting. Importantly, we experiment by formulating the forecasting step as a multi-modal sequence modeling problem, allowing us to leverage the strength of pre-trained LMs (as the sequence model). This novel approach, which we call Visual Language Model based Planner (VLaMP), outperforms baselines across a suite of metrics that gauge the quality of the generated plans. Furthermore, through comprehensive ablations, we also isolate the value of each component--language pre-training, visual observations, and goal information. We have open-sourced all the data, model checkpoints, and training code.
Submitted 26 August, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Action Dynamics Task Graphs for Learning Plannable Representations of Procedural Tasks
Authors:
Weichao Mao,
Ruta Desai,
Michael Louis Iuzzolino,
Nitin Kamra
Abstract:
Given video demonstrations and paired narrations of an at-home procedural task such as changing a tire, we present an approach to extract the underlying task structure -- relevant actions and their temporal dependencies -- via action-centric task graphs. Learnt structured representations from our method, Action Dynamics Task Graphs (ADTG), can then be used for understanding such tasks in unseen videos of humans performing them. Furthermore, ADTG can enable providing user-centric guidance to humans in these tasks, either for performing them better or for learning new tasks. Specifically, we show how ADTG can be used for: (1) tracking an ongoing task, (2) recommending next actions, and (3) planning a sequence of actions to accomplish a procedural task. We compare against state-of-the-art Neural Task Graph method and demonstrate substantial gains on 18 procedural tasks from the CrossTask dataset, including 30.1% improvement in task tracking accuracy and 20.3% accuracy gain in next action prediction.
Submitted 11 January, 2023;
originally announced February 2023.
-
Software solutions for numerical modeling of wide-field telescopes
Authors:
Salvatore Savarese,
Pietro Schipani,
Giulio Capasso,
Mirko Colapietro,
Sergio D'Orsi,
Marcella Iuzzolino,
Laurent Marty,
Francesco Perrotta,
Giacomo Basile
Abstract:
This paper presents integrated modeling software to analyze the PSF of wide-field telescopes affected by misalignments. Even relatively small misalignments in the optical system of a telescope can significantly deteriorate the image quality by introducing large aberrations. In particular, wide-field telescopes are critically affected by these errors, so much so that a closed-loop active optics system is usually adopted for continuous correction, rather than relying on sporadic alignment procedures. Typically, ray-tracing software such as Zemax OpticStudio is employed to accurately analyze the system during the optical design. However, an analytical model of the optical system is preferable when the PSF of the telescope must be reconstructed quickly for algorithmic purposes. Here the analytical model is derived through a hybrid approach and developed in a custom software package, designed to be general and flexible so that it can be tailored to different optical configurations. First, leveraging the Zemax OpticStudio API, the ray-tracing software is integrated into a MATLAB pipeline. This allows a statistical analysis to be performed by automatically simulating the system response in a variety of misaligned working conditions. Then, the resulting dataset is employed to populate a database of parameters describing the model.
Submitted 13 December, 2021;
originally announced December 2021.
-
Online Unsupervised Learning of Visual Representations and Categories
Authors:
Mengye Ren,
Tyler R. Scott,
Michael L. Iuzzolino,
Michael C. Mozer,
Richard Zemel
Abstract:
Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution. Furthermore, real world interactions demand learning on-the-fly from few or no class labels. In this work, we propose an unsupervised model that simultaneously performs online visual representation learning and few-shot learning of new categories without relying on any class labels. Our model is a prototype-based memory network with a control component that determines when to form a new class prototype. We formulate it as an online mixture model, where components are created with only a single new example, and assignments do not have to be balanced, which permits an approximation to natural imbalanced distributions from uncurated raw data. Learning includes a contrastive loss that encourages different views of the same image to be assigned to the same prototype. The result is a mechanism that forms categorical representations of objects in nonstationary environments. Experiments show that our method can learn from an online stream of visual input data and its learned representations are significantly better at category recognition compared to state-of-the-art self-supervised learning methods.
Submitted 28 May, 2022; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Improving Anytime Prediction with Parallel Cascaded Networks and a Temporal-Difference Loss
Authors:
Michael L. Iuzzolino,
Michael C. Mozer,
Samy Bengio
Abstract:
Although deep feedforward neural networks share some characteristics with the primate visual system, a key distinction is their dynamics. Deep nets typically operate in serial stages wherein each layer completes its computation before processing begins in subsequent layers. In contrast, biological systems have cascaded dynamics: information propagates from neurons at all layers in parallel but transmission occurs gradually over time, leading to speed-accuracy trade offs even in feedforward architectures. We explore the consequences of biologically inspired parallel hardware by constructing cascaded ResNets in which each residual block has propagation delays but all blocks update in parallel in a stateful manner. Because information transmitted through skip connections avoids delays, the functional depth of the architecture increases over time, yielding anytime predictions that improve with internal-processing time. We introduce a temporal-difference training loss that achieves a strictly superior speed-accuracy profile over standard losses and enables the cascaded architecture to outperform state-of-the-art anytime-prediction methods. The cascaded architecture has intriguing properties, including: it classifies typical instances more rapidly than atypical instances; it is more robust to both persistent and transient noise than is a conventional ResNet; and its time-varying output trace provides a signal that can be exploited to improve information processing and inference.
Submitted 2 November, 2021; v1 submitted 19 February, 2021;
originally announced February 2021.
-
Wandering Within a World: Online Contextualized Few-Shot Learning
Authors:
Mengye Ren,
Michael L. Iuzzolino,
Michael C. Mozer,
Richard S. Zemel
Abstract:
We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online, continual setting. In this setting, episodes do not have separate training and testing phases, and instead models are evaluated online while learning novel classes. As in the real world, where the presence of spatiotemporal context helps us retrieve learned skills in the past, our online few-shot learning setting also features an underlying context that changes throughout time. Object classes are correlated within a context and inferring the correct context can lead to better performance. Building upon this setting, we propose a new few-shot learning dataset based on large scale indoor imagery that mimics the visual experience of an agent wandering within a world. Furthermore, we convert popular few-shot learning approaches into online versions and we also propose a new contextual prototypical memory model that can make use of spatiotemporal contextual information from the recent past.
Submitted 22 April, 2021; v1 submitted 9 July, 2020;
originally announced July 2020.
-
In Automation We Trust: Investigating the Role of Uncertainty in Active Learning Systems
Authors:
Michael L. Iuzzolino,
Tetsumichi Umada,
Nisar R. Ahmed,
Danielle A. Szafir
Abstract:
We investigate how different active learning (AL) query policies coupled with classification uncertainty visualizations affect analyst trust in automated classification systems. A current standard policy for AL is to query the oracle (e.g., the analyst) to refine labels for datapoints where the classifier has the highest uncertainty. This is an optimal policy for the automation system as it yields maximal information gain. However, model-centric policies neglect the effects of this uncertainty on the human component of the system and the consequent manner in which the human will interact with the system post-training. In this paper, we present an empirical study evaluating how AL query policies and visualizations lending transparency to classification influence trust in automated classification of image data. We found that query policy significantly influences an analyst's trust in an image classification system, and we use these results to propose a set of oracle query policies and visualizations for use during AL training phases that can influence analyst trust in classification.
Submitted 1 April, 2020;
originally announced April 2020.
-
The GAPS Programme at TNG XXI -- A GIARPS case-study of known young planetary candidates: confirmation of HD 285507 b and refutation of AD Leo b
Authors:
I. Carleo,
L. Malavolta,
A. F. Lanza,
M. Damasso,
S. Desidera,
F. Borsa,
M. Mallonn,
M. Pinamonti,
R. Gratton,
E. Alei,
S. Benatti,
L. Mancini,
J. Maldonado,
K. Biazzo,
M. Esposito,
G. Frustagli,
E. González-Álvarez,
G. Micela,
G. Scandariato,
A. Sozzetti,
L. Affer,
A. Bignamini,
A. S. Bonomo,
R. Claudi,
R. Cosentino
, et al. (45 additional authors not shown)
Abstract:
The existence of hot Jupiters is still not well understood. Two main channels are thought to be responsible for their current location: a smooth planet migration through the proto-planetary disk or the circularization of an initially highly eccentric orbit by tidal dissipation, leading to a strong decrease of the semimajor axis. Different formation scenarios result in different observable effects, such as orbital parameters (obliquity/eccentricity), or frequency of planets at different stellar ages. In the context of the GAPS Young-Objects project, we are carrying out a radial velocity survey with the aim of searching for and characterizing young hot-Jupiter planets. Our purpose is to put constraints on evolutionary models and establish statistical properties, such as the frequency of these planets from a homogeneous sample. Since young stars are in general magnetically very active, we performed multi-band (visible and near-infrared) spectroscopy with the simultaneous GIANO-B + HARPS-N (GIARPS) observing mode at TNG. This helps to deal with stellar activity and distinguish the nature of radial velocity variations: stellar activity will introduce a wavelength-dependent radial velocity amplitude, whereas a Keplerian signal is achromatic. As a pilot study, we present here the cases of two already claimed hot Jupiters orbiting young stars: HD 285507 b and AD Leo b. Our analysis of simultaneous high-precision GIARPS spectroscopic data confirms the Keplerian nature of HD 285507's radial velocity variations and refines the orbital parameters of the hot Jupiter, obtaining an eccentricity consistent with a circular orbit. On the other hand, our analysis does not confirm the signal previously attributed to a planet orbiting AD Leo. This demonstrates the power of the multi-band spectroscopic technique when observing active stars.
Submitted 24 February, 2020;
originally announced February 2020.
-
MMTM: Multimodal Transfer Module for CNN Fusion
Authors:
Hamid Reza Vaezi Joze,
Amirreza Shaban,
Michael L. Iuzzolino,
Kazuhito Koishida
Abstract:
In late fusion, each modality is processed in a separate unimodal Convolutional Neural Network (CNN) stream and the scores of each modality are fused at the end. Due to its simplicity, late fusion is still the predominant approach in many state-of-the-art multimodal applications. In this paper, we present a simple neural network module for leveraging the knowledge from multiple modalities in convolutional neural networks. The proposed unit, named Multimodal Transfer Module (MMTM), can be added at different levels of the feature hierarchy, enabling slow modality fusion. Using squeeze and excitation operations, MMTM utilizes the knowledge of multiple modalities to recalibrate the channel-wise features in each CNN stream. Unlike other intermediate fusion methods, the proposed module can be used for feature modality fusion in convolution layers with different spatial dimensions. Another advantage of the proposed method is that it can be added among unimodal branches with minimal changes to their network architectures, allowing each branch to be initialized with existing pretrained weights. Experimental results show that our framework improves the recognition accuracy of well-known multimodal networks. We demonstrate state-of-the-art or competitive performance on four datasets that span the task domains of dynamic hand gesture recognition, speech enhancement, and action recognition with RGB and body joints.
Submitted 30 March, 2020; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Experimental characterization of modal noise in multimode fibers for astronomical spectrometers
Authors:
E. Oliva,
M. Rainer,
A. Tozzi,
N. Sanna,
M. Iuzzolino,
A. Brucalassi
Abstract:
Starting from our puzzling on-sky experience with the GIANO-TNG spectrometer, we set up an infrared high-resolution spectrometer in our laboratory and used this instrument to characterize the modal noise generated in fibers of different types (circular and octagonal) and sizes. Our experiment includes two conventional scrambling systems for fibers: a mechanical agitator and an optical double scrambler. We find that the strength of the modal noise primarily depends on how the fiber is illuminated. It dramatically increases when the fiber is under-illuminated, either in the near field or in the far field. The modal noise is similar in circular and octagonal fibers. The Fourier spectrum of the noise decreases exponentially with frequency; i.e., the modal noise is not white but favors broad spectral features. Using the optical double scrambler has no effect on modal noise. The mechanical agitator has effects that vary between different types of fibers and input illuminations. In some cases this agitator has virtually no effect. In other cases, it mitigates the modal noise, but flattens the noise spectrum in Fourier space; i.e., the mechanical agitator preferentially filters the broad spectral features. Our results show that modal noise is frustratingly insensitive to the use of octagonal fibers and optical double scramblers; i.e., the conventional systems used to improve the performance of spectrographs fed via unevenly illuminated fibers. Fiber agitation may help in some cases, but its effect has to be verified on a case-by-case basis. More generally, our results indicate that the design of the fiber link feeding a spectrograph should be coupled with laboratory measurements that reproduce, as closely as possible, the conditions expected at the telescope.
Submitted 23 October, 2019; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Convolutional Bipartite Attractor Networks
Authors:
Michael Iuzzolino,
Yoram Singer,
Michael C. Mozer
Abstract:
In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well matched to an early and often overlooked architecture, the attractor network---a recurrent neural net that performs constraint satisfaction, imputation of missing features, and clean up of noisy data via energy minimization dynamics. We revisit attractor nets in light of modern deep learning methods and propose a convolutional bipartite architecture with a novel training loss, activation function, and connectivity constraints. We tackle larger problems than have been previously explored with attractor nets and demonstrate their potential for image completion and super-resolution. We argue that this architecture is better motivated than ever-deeper feedforward models and is a viable alternative to more costly sampling-based generative methods on a range of supervised and unsupervised tasks.
Submitted 26 September, 2019; v1 submitted 8 June, 2019;
originally announced June 2019.
-
Virtual-to-Real-World Transfer Learning for Robots on Wilderness Trails
Authors:
Michael L. Iuzzolino,
Michael E. Walker,
Daniel Szafir
Abstract:
Robots hold promise in many scenarios involving outdoor use, such as search-and-rescue, wildlife management, and collecting data to improve environment, climate, and weather forecasting. However, autonomous navigation of outdoor trails remains a challenging problem. Recent work has sought to address this issue using deep learning. Although this approach has achieved state-of-the-art results, the deep learning paradigm may be limited due to a reliance on large amounts of annotated training data. Collecting and curating training datasets may not be feasible or practical in many situations, especially as trail conditions may change due to seasonal weather variations, storms, and natural erosion. In this paper, we explore an approach to address this issue through virtual-to-real-world transfer learning using a variety of deep learning models trained to classify the direction of a trail in an image. Our approach utilizes synthetic data gathered from virtual environments for model training, bypassing the need to collect a large amount of real images of the outdoors. We validate our approach in three main ways. First, we demonstrate that our models achieve classification accuracies upwards of 95% on our synthetic data set. Next, we utilize our classification models in the control system of a simulated robot to demonstrate feasibility. Finally, we evaluate our models on real-world trail data and demonstrate the potential of virtual-to-real-world transfer learning.
Submitted 16 January, 2019;
originally announced January 2019.
-
GIARPS: commissioning and first scientific results
Authors:
R. Claudi,
S. Benatti,
I. Carleo,
A. Ghedina,
J. Guerra,
F. Ghinassi,
A. Harutyunyan,
G. Micela,
E. Molinari,
E. Oliva,
M. Rainer,
A. Tozzi,
C. Baffa,
A. Baruffolo,
V. Biliotti,
N. Buchschacher,
M. Cecconi,
R. Cosentino,
G. Falcini,
D. Fantinel,
L. Fini,
E. Giani,
E. Gonzalez-Alvarez,
M. Gonzalez,
C. Gonzalez
, et al. (20 additional authors not shown)
Abstract:
GIARPS (GIAno & haRPS) is a project devoted to having both high-resolution spectrographs, HARPS-N (VIS) and GIANO-B (NIR), working simultaneously on the same focal station of the Telescopio Nazionale Galileo (TNG). It could be considered the first and only instrument worldwide providing cross-dispersed echelle spectroscopy at a resolution of 50,000 in the NIR range and 115,000 in the VIS over a wide spectral range ($0.383 - 2.45\ μ$m) in a single exposure. The science case is very broad, given the versatility of such an instrument and its large wavelength range. A number of outstanding science cases can be considered, mainly in extra-solar planet science, from the search for rocky planets and hot Jupiters to atmosphere characterization. Furthermore, both instruments can measure high-precision radial velocities in a single exposure by means of the simultaneous thorium technique (HARPS-N) and the absorbing cell technique (GIANO-B). Other science cases are also possible. GIARPS, a brand-new observing mode of the TNG, started after GIANO-A (a fiber-fed spectrograph) was moved from Nasmyth-A to Nasmyth-B, where it was reborn as GIANO-B (no longer a fiber-fed spectrograph). The official commissioning finished in March 2017, and the mode was then offered to the community. Nevertheless, the work is not yet finished. In this paper we describe the preliminary scientific results obtained with GIANO-B and the GIARPS observing mode, using data taken during commissioning and the first open-time observations.
Submitted 9 August, 2018;
originally announced August 2018.
-
Multi-band high resolution spectroscopy rules out the hot Jupiter BD+20 1790b - First data from the GIARPS Commissioning
Authors:
I. Carleo,
S. Benatti,
A. F. Lanza,
R. Gratton,
R. Claudi,
S. Desidera,
G. N. Mace,
S. Messina,
N. Sanna,
E. Sissa,
A. Ghedina,
F. Ghinassi,
J. Guerra,
A. Harutyunyan,
G. Micela,
E. Molinari,
E. Oliva,
A. Tozzi,
C. Baffa,
A. Baruffolo,
A. Bignamini,
N. Buchschacher,
M. Cecconi,
R. Cosentino,
M. Endl
, et al. (29 additional authors not shown)
Abstract:
Context. Stellar activity is currently challenging the detection of young planets via the radial velocity (RV) technique. Aims. We attempt to definitively discriminate the nature of the RV variations for the young active K5 star BD+20 1790, for which visible (VIS) RV measurements show divergent results on the existence of a substellar companion. Methods. We compare VIS data with high precision RVs in the near infrared (NIR) range by using the GIANO-B and IGRINS spectrographs. In addition, we present for the first time simultaneous VIS-NIR observations obtained with GIARPS (GIANO-B and HARPS-N) at Telescopio Nazionale Galileo (TNG). Orbital RVs are achromatic, so the RV amplitude does not change at different wavelengths, while stellar activity induces wavelength-dependent RV variations, which are significantly reduced in the NIR range with respect to the VIS. Results. The NIR radial velocity measurements from GIANO-B and IGRINS show an average amplitude of about one quarter with respect to previously published VIS data, as expected when the RV jitter is due to stellar activity. Coeval multi-band photometry surprisingly shows larger amplitudes in the NIR range, explainable with a mixture of cool and hot spots in the same active region. Conclusions. In this work, the claimed massive planet around BD+20 1790 is ruled out by our data. We exploited the crucial role of multi-wavelength spectroscopy when observing young active stars: thanks to facilities like GIARPS that provide simultaneous observations, this method can reach its maximum potential.
Submitted 3 May, 2018;
originally announced May 2018.
-
GIARPS: the unique VIS-NIR high precision radial velocity facility in this world
Authors:
Riccardo Claudi,
Serena Benatti,
Ilaria Carleo,
Adriano Ghedina,
Emilio Molinari,
Ernesto Oliva,
Andrea Tozzi,
Andrea Baruffolo,
Massimo Cecconi,
Rosario Cosentino,
Daniela Fantinel,
Luca Fini,
Francesca Ghinassi,
Manuel Gonzalez,
Raffaele Gratton,
Jose Guerra,
Avet Harutyunyan,
Nauzet Hernandez,
Marcella Iuzzolino,
Marcello Lodi,
Luca Malavolta,
Jesus Maldonado,
Giusi Micela,
Nicoletta Sanna,
Jose Sanjuan
, et al. (8 additional authors not shown)
Abstract:
GIARPS (GIAno & haRPS) is a project devoted to having both high-resolution spectrographs, HARPS-N (VIS) and GIANO (NIR), working simultaneously on the same focal station of the Telescopio Nazionale Galileo (TNG). It could be considered the first and only instrument worldwide providing cross-dispersed echelle spectroscopy at high resolution (R=115,000 in the visual and R=50,000 in the IR) over a wide spectral range (0.383 - 2.45 micron) in a single exposure. The science case is very broad, given the versatility of such an instrument and the large wavelength range. A number of outstanding science cases can be considered, mainly in extra-solar planet science, from the search for rocky planets and hot Jupiters to atmosphere characterization. Furthermore, both instruments can measure high-precision radial velocities in a single exposure by means of the simultaneous thorium technique (HARPS-N) and the absorbing cell technique (GIANO). Other science cases are also possible. Young stars and proto-planetary disks, cool stars and stellar populations, moving minor bodies in the solar system, bursting young stellar objects, cataclysmic variables and X-ray binary transients in our Galaxy, and supernovae up to gamma-ray bursts in the very distant and young Universe can all take advantage of the uniqueness of this facility in terms of both simultaneous wide wavelength coverage and high-resolution spectroscopy.
Submitted 22 November, 2016;
originally announced November 2016.
-
The new SOXS instrument for the ESO NTT
Authors:
P. Schipani,
R. Claudi,
S. Campana,
A. Baruffolo,
S. Basa,
S. Basso,
E. Cappellaro,
E. Cascone,
R. Cosentino,
F. D'Alessio,
V. De Caprio,
M. Della Valle,
A. de Ugarte Postigo,
S. D'Orsi,
R. Franzen,
J. Fynbo,
A. Gal-Yam,
D. Gardiol,
E. Giro,
M. Hamuy,
M. Iuzzolino,
D. Loreggia,
S. Mattila,
M. Munari,
G. Pignata
, et al. (6 additional authors not shown)
Abstract:
SOXS (Son Of X-Shooter) will be a unique spectroscopic facility for the ESO-NTT 3.5-m telescope in La Silla (Chile), able to cover the optical/NIR band (350-1750 nm). The design foresees a high-efficiency spectrograph with a resolution-slit product of ~4,500, capable of simultaneously observing the complete spectral range 350 - 1750 nm with a good sensitivity, with light imaging capabilities in the visible band. This paper outlines the status of the project.
Submitted 13 July, 2016;
originally announced July 2016.
-
GIANO-TNG spectroscopy of red supergiants in the young star cluster RSGC3
Authors:
L. Origlia,
E. Oliva,
N. Sanna,
A. Mucciarelli,
E. Dalessandro,
S. Scuderi,
C. Baffa,
V. Biliotti,
L. Carbonaro,
G. Falcini,
E. Giani,
M. Iuzzolino,
F. Massi,
M. Sozzi,
A. Tozzi,
A. Ghedina,
F. Ghinassi,
M. Lodi,
A. Harutyunyan,
M. Pedani
Abstract:
The Scutum complex in the inner disk of the Galaxy has a number of young star clusters dominated by red supergiants that are heavily obscured by dust extinction and observable only at infrared wavelengths. These clusters are important tracers of the recent star formation and chemical enrichment history in the inner Galaxy. During the technical commissioning and as a first science verification of the GIANO spectrograph at the Telescopio Nazionale Galileo, we secured high-resolution (R=50,000) near-infrared spectra of five red supergiants in the young Scutum cluster RSGC3. Taking advantage of the full YJHK spectral coverage of GIANO in a single exposure, we were able to measure several tens of atomic and molecular lines that were suitable for determining chemical abundances. By means of spectral synthesis and line equivalent width measurements, we obtained abundances of Fe and iron-peak elements such as Ni, Cr, and Cu, alpha (O, Mg, Si, Ca, Ti), other light elements (C, N, F, Na, Al, and Sc), and some s-process elements (Y, Sr). We found average half-solar iron abundances and solar-scaled [X/Fe] abundance patterns for most of the elements, consistent with a thin-disk chemistry. We found depletion of [C/Fe] and enhancement of [N/Fe], consistent with standard CN burning, and low 12C/13C abundance ratios (between 9 and 11), which require extra-mixing processes in the stellar interiors during the post-main sequence evolution. We also found local standard of rest V(LSR)=106 km/s and heliocentric V(HEL)=90 km/s radial velocities with a dispersion of 2.3 km/s. The inferred radial velocities, abundances, and abundance patterns of RSGC3 are very similar to those previously measured in the other two young clusters of the Scutum complex, RSGC1 and RSGC2, suggesting a common kinematics and chemistry within the Scutum complex.
Submitted 23 October, 2015;
originally announced October 2015.
-
Lines and continuum sky emission in the near infrared: observational constraints from deep high spectral resolution spectra with GIANO-TNG
Authors:
E. Oliva,
L. Origlia,
S. Scuderi,
S. Benatti,
I. Carleo,
E. Lapenna,
A. Mucciarelli,
C. Baffa,
V. Biliotti,
L. Carbonaro,
G. Falcini,
E. Giani,
M. Iuzzolino,
F. Massi,
N. Sanna,
M. Sozzi,
A. Tozzi,
A. Ghedina,
F. Ghinassi,
M. Lodi,
A. Harutyunyan,
M. Pedani
Abstract:
Aims. Determining the intensity of lines and continuum airglow emission in the H-band is important for the design of faint-object infrared spectrographs. Existing spectra at low/medium resolution cannot disentangle the true sky-continuum from instrumental effects (e.g. diffuse light in the wings of strong lines). We aim to obtain, for the first time, a high resolution infrared spectrum deep enough to set significant constraints on the continuum emission between the lines in the H-band. Methods. During the second commissioning run of the GIANO high-resolution infrared spectrograph at La Palma Observatory, we pointed the instrument directly to the sky and obtained a deep spectrum that extends from 0.97 to 2.4 micron. Results. The spectrum shows about 1500 emission lines, a factor of two more than in previous works. Of these, 80% are identified as OH transitions; half of these are from highly excited molecules (hot-OH component) that are not included in the OH airglow emission models normally used for astronomical applications. The other lines are attributable to O2 or unidentified. Several of the faint lines are in spectral regions that were previously believed to be free of line emission. The continuum in the H-band is marginally detected at a level of about 300 photons/m^2/s/arcsec^2/micron, equivalent to 20.1 AB-mag/arcsec^2. The observed spectrum and the list of observed sky-lines are published in electronic format. Conclusions. Our measurements indicate that the sky continuum in the H-band could be even darker than previously believed. However, the myriad of airglow emission lines severely limits the spectral ranges where very low background can be effectively achieved with low/medium resolution spectrographs. We identify a few spectral bands that could still remain quite dark at the resolving power foreseen for VLT-MOONS (R ~6,600).
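The equivalence quoted above between a photon rate and an AB surface brightness can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes an effective H-band wavelength of 1.65 micron (the abstract does not state the wavelength used), and the function name `photon_rate_to_ab_mag` is hypothetical. It uses the standard AB zero point of 3631 Jy.

```python
import math

PLANCK_H = 6.62607015e-34   # Planck constant, J s
JANSKY = 1e-26              # 1 Jy in W m^-2 Hz^-1
AB_ZERO_POINT_JY = 3631.0   # standard AB magnitude zero point

def photon_rate_to_ab_mag(photons_per_m2_s_um, wavelength_um):
    """Convert a photon flux density (photons/m^2/s/micron) to an AB magnitude.

    A surface brightness given per arcsec^2 converts the same way and
    yields AB mag per arcsec^2.
    """
    wavelength_m = wavelength_um * 1e-6
    photons_per_m2_s_m = photons_per_m2_s_um * 1e6  # per metre of wavelength
    # f_lambda = N * h * c / lambda and f_nu = f_lambda * lambda^2 / c,
    # so the speed of light cancels: f_nu = N * h * lambda.
    f_nu = photons_per_m2_s_m * PLANCK_H * wavelength_m  # W m^-2 Hz^-1
    return -2.5 * math.log10(f_nu / (AB_ZERO_POINT_JY * JANSKY))

# Assumption: 1.65 micron as a representative H-band wavelength.
H_BAND_WAVELENGTH_UM = 1.65
sky_ab_mag = photon_rate_to_ab_mag(300.0, H_BAND_WAVELENGTH_UM)
print(round(sky_ab_mag, 1))
```

Under that wavelength assumption the result agrees with the 20.1 AB-mag/arcsec^2 quoted in the abstract; a different choice of effective wavelength within the H-band shifts the value by only a few hundredths of a magnitude.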
Submitted 30 June, 2015;
originally announced June 2015.
-
The fiber-fed preslit of GIANO at T.N.G
Authors:
A. Tozzi,
E. Oliva,
L. Origlia,
C. Baffa,
V. Biliotti,
G. Falcini,
E. Giani,
M. Iuzzolino,
F. Massi,
N. Sanna,
S. Scuderi,
M. Sozzi
Abstract:
Giano is a cryogenic spectrograph located at the T.N.G. (Spain) and commissioned in 2013. It works in the range 950-2500 nm with a resolving power of 50000. This instrument was designed and built for direct feeding from the telescope [2]. However, due to constraints imposed on the telescope interfacing during the pre-commissioning phase, it had to be positioned on the rotating building, far from the telescope focus. Therefore, a new interface to the telescope, based on IR-transmitting ZBLAN fibers with 85 μm core, was developed. Giano was originally designed to work directly at the $f/11$ Nasmyth focus of the telescope; in 2011 it was decided to feed it with a fiber. The beam from the telescope is focused on a double fiber bundle by a Preslit Optical Bench attached to the Nasmyth A interface of the telescope. This Optical Bench contains the fiber feeding system and other important features such as a guiding system, a fiber viewer, a fiber feed calibration lamp and a nodding facility between the two fibers. The use of two fibers allows us to record two spectrograms side by side in the echellogram in the same acquisition: either one of the star and the other of the sky, or the star and a calibration lamp simultaneously. Before entering the cryostat, the light from the fiber is collected by a second Preslit Optical Bench attached directly to the Giano cryostat: this bench generates the correct f-number to illuminate the cold stop and also holds an image slicer that increases the efficiency of the system.
Submitted 11 July, 2014;
originally announced July 2014.
-
Updated optical design and trade-off study for MOONS, the Multi-Object Optical and Near Infrared spectrometer for the VLT
Authors:
E. Oliva,
S. Todd,
M. Cirasuolo,
H. Schnetler,
D. Lunney,
P. Rees,
A. Bianco,
E. Diolaiti,
D. Ferruzzi,
M. Fisher,
I. Guinouard,
M. Iuzzolino,
I. Parry,
X. Sun,
A. Tozzi,
F. Vitali
Abstract:
This paper presents the latest optical design for the MOONS triple-arm spectrographs. MOONS will be a Multi-Object Optical and Near-infrared Spectrograph and will be installed on one of the European Southern Observatory (ESO) Very Large Telescopes (VLT). Included in this paper is a trade-off analysis of different types of collimators, cameras, dichroics and filters.
Submitted 11 July, 2014;
originally announced July 2014.
-
Preliminary results on the characterization and performances of ZBLAN fiber for infrared spectrographs
Authors:
M. Iuzzolino,
A. Tozzi,
N. Sanna,
L. Zangrilli,
E. Oliva
Abstract:
Present telescopes and future extremely large telescopes make use of fiber-fed spectrographs to observe at optical and infrared wavelengths. The use of fibers largely simplifies the interfacing of the spectrograph to the telescope. At a high spectral resolution (R>50,000) the fibers can be used to achieve very high spectral accuracy. GIANO is an infrared (0.95-2.5 μm) high resolution (R=50,000) spectrometer [1] [2] [3] that was recently commissioned at the TNG telescope (La Palma). This instrument was designed and built for direct feeding from the telescope [4]. However, due to constraints imposed on the telescope interfacing during the pre-commissioning phase, it had to be positioned on the rotating building, far from the telescope focus. Therefore, a new interface to the telescope, based on IR-transmitting ZBLAN fibers with 85 μm core, was developed. In this article we report the first, preliminary results on the effects of these fibers on the quality of the spectra recorded with GIANO and with a similar spectrograph that we set up in the laboratory. The effects can be primarily associated with modal noise (MN), which is much more evident in GIANO than in optical spectrometers because of the much longer wavelengths.
Submitted 11 July, 2014;
originally announced July 2014.