-
ARMADA: Augmented Reality for Robot Manipulation and Robot-Free Data Acquisition
Authors:
Nataliya Nechyporenko,
Ryan Hoque,
Christopher Webb,
Mouli Sivapurapu,
Jian Zhang
Abstract:
Teleoperation for robot imitation learning is bottlenecked by hardware availability. Can high-quality robot data be collected without a physical robot? We present a system for augmenting Apple Vision Pro with real-time virtual robot feedback. By providing users with an intuitive understanding of how their actions translate to robot motions, we enable the collection of natural barehanded human data that is compatible with the limitations of physical robot hardware. We conducted a user study with 15 participants demonstrating 3 different tasks each under 3 different feedback conditions and directly replayed the collected trajectories on physical robot hardware. Results suggest that live robot feedback dramatically improves the quality of the collected data, pointing to a new avenue for scalable human data collection without access to robot hardware. Videos and more are available at https://nataliya.dev/armada.
Submitted 13 December, 2024;
originally announced December 2024.
-
Pre-Trained Foundation Model representations to uncover Breathing patterns in Speech
Authors:
Vikramjit Mitra,
Anirban Chatterjee,
Ke Zhai,
Helen Weng,
Ayuko Hill,
Nicole Hay,
Christopher Webb,
Jamie Cheng,
Erdrin Azemi
Abstract:
The process of human speech production involves coordinated respiratory action to elicit acoustic speech signals. Typically, speech is produced when air is forced from the lungs and is modulated by the vocal tract, where such actions are interspersed by moments of breathing in air (inhalation) to refill the lungs again. Respiratory rate (RR) is a vital metric that is used to assess the overall health, fitness, and general well-being of an individual. Existing approaches to measuring RR (the number of breaths one takes in a minute) require specialized equipment or training. Studies have demonstrated that machine learning algorithms can be used to estimate RR using bio-sensor signals as input. Speech-based estimation of RR can offer an effective approach to measure the vital metric without requiring any specialized equipment or sensors. This work investigates a machine-learning-based approach to estimate RR from speech segments obtained from subjects speaking to a close-talking microphone device. Data were collected from N=26 individuals, where the ground-truth RR was obtained through commercial-grade chest belts and then manually corrected for any errors. A convolutional long short-term memory network (Conv-LSTM) is proposed to estimate respiration time-series data from the speech signal. We demonstrate that pre-trained representations obtained from a foundation model, such as Wav2Vec2, can be used to estimate the respiration time series with low root-mean-squared error and high correlation coefficient when compared with the baseline. The model-driven time series can be used to estimate RR with a low mean absolute error (MAE) of approximately 1.6 breaths/min.
Submitted 17 July, 2024;
originally announced July 2024.
-
Applications of fast triangulation simplification
Authors:
Mark C. Bell,
Richard C. H. Webb
Abstract:
We describe a new algorithm to compute the geometric intersection number between two curves, given as edge vectors on an ideal triangulation. Most importantly, this algorithm runs in polynomial time in the bit-size of the two edge vectors.
In its simplest instances, this algorithm works by finding the minimal position of the two curves. We achieve this by phrasing the problem as a collection of linear programming problems. We describe how to reduce the more general case down to one of these simplest instances in polynomial time. This reduction relies on an algorithm by the first author to quickly switch to a new triangulation in which an edge vector is significantly smaller.
Submitted 11 May, 2016;
originally announced May 2016.
-
Sample NLPDE and NLODE Social-Media Modeling of Information Transmission for Infectious Diseases: Case Study Ebola
Authors:
Armin Smailhodvic,
Keith Andrew,
Lance Hahn,
Phillip C. Womble,
Cathleen Webb
Abstract:
We investigate the spreading of information through Twitter messaging related to the spread of Ebola in western Africa using epidemic-based dynamic models. Diffusive spreading leads to NLPDE models, and fixed-point analysis yields systems of NLODE models. When tweets are mapped as connected nodes in a graph and are treated as a time-sequenced Markov chain (TSMC), then by the Kurtz theorem these specific paths can be identified as being near solutions to systems of ordinary differential equations that, in the large-N limit, retain many of the features of the original Tweet dynamics. Constraints on the model related to Tweet and re-Tweet rates lead to different versions of the system of equations. We use Ebola Twitter meme-based data to investigate a modified four-parameter model and apply the resulting fit to an accuracy metric for a set of Ebola memes. In principle, the temporal and spatial evolution equations describing the propagation of the Twitter-based memes can help inform decision makers about the nature of the spreading and containment of an epidemic of this type.
Submitted 31 December, 2014;
originally announced January 2015.