VideoAgent: Self-Improving Video Generation

Authors: Achint Soni, Sreyas Venkataraman, Abhranil Chandra, Sebastian Fischmeister, Percy Liang, Bo Dai, Sherry Yang

Abstract: Video generation has been used to generate visual plans for controlling robotic systems. Given an image observation and a language instruction, previous work has generated video plans which are then converted to robot controls to be executed. However, a major bottleneck in leveraging video generation for control lies in the quality of the generated videos, which often suffer from hallucinatory content and unrealistic physics, resulting in low task success when control actions are extracted from the generated videos. While scaling up dataset and model size provides a partial solution, integrating external feedback is both natural and essential for grounding video generation in the real world. With this observation, we propose VideoAgent for self-improving generated video plans based on external feedback. Instead of directly executing the generated video plan, VideoAgent first refines the generated video plans using a novel procedure which we call self-conditioning consistency, allowing inference-time compute to be turned into better generated video plans. As the refined video plan is being executed, VideoAgent can collect additional data from the environment to further improve video plan generation. Experiments in simulated robotic manipulation from MetaWorld and iTHOR show that VideoAgent drastically reduces hallucination, thereby boosting success rate of downstream manipulation tasks. We further illustrate that VideoAgent can effectively refine real-robot videos, providing an early indicator that robots can be an effective tool in grounding video generation in the physical world. Video demos and code can be found at https://video-as-agent.github.io.

Submitted 9 February, 2025; v1 submitted 13 October, 2024; originally announced October 2024.

arXiv:2407.04144 [pdf, ps, other]

doi 10.1109/ICSTW60967.2024.00021

Annotating Control-Flow Graphs for Formalized Test Coverage Criteria

Authors: Sean Kauffman, Carlos Moreno, Sebastian Fischmeister

Abstract: Control flow coverage criteria are an important part of the process of qualifying embedded software for safety-critical systems. Criteria such as modified condition/decision coverage (MC/DC) as defined by DO-178B are used by regulators to judge the adequacy of testing and by QA engineers to design tests when full path coverage is impossible. Despite their importance, these coverage criteria are often misunderstood. One problem is that their definitions are typically written in natural language specification documents, making them imprecise. Other works have proposed formal definitions using binary predicate logic, but these definitions are difficult to apply to the analysis of real programs. Control-Flow Graphs (CFGs) are the most common model for analyzing program logic in compilers, and seem to be a good fit for defining and analyzing coverage criteria. However, CFGs discard the explicit concept of a decision, making their use for this task seem impossible. In this paper, we show how to annotate a CFG with decision information inferred from the graph itself. We call this annotated model a Control-Flow Decision Graph (CFDG) and we use it to formally define several common coverage criteria. We have implemented our algorithms in a tool which we show can be applied to automatically annotate CFGs output from popular compilers.

Submitted 4 July, 2024; originally announced July 2024.
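
The abstract above mentions MC/DC without giving a definition. As a rough illustration only, the sketch below enumerates "independence pairs" for a single decision: pairs of test vectors that differ in exactly one condition and flip the decision outcome (the unique-cause reading of MC/DC). The decision `a and (b or c)` and the function names are hypothetical choices for this example; this is not the paper's CFDG construction or its formal definition.

```python
from itertools import product

def decision(a: bool, b: bool, c: bool) -> bool:
    # Hypothetical decision with three conditions, chosen only for illustration.
    return a and (b or c)

def independence_pairs(fn, n_conditions):
    """For each condition index, collect pairs of test vectors that differ
    only in that condition and yield different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for vec in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vec[i]:
                continue  # start from the False side so each pair is seen once
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if fn(*vec) != fn(*flipped):
                pairs[i].append((vec, flipped))
    return pairs

pairs = independence_pairs(decision, 3)
for index, found in pairs.items():
    print(f"condition {index}: {len(found)} independence pair(s)")
```

A test suite that includes at least one pair for every condition shows that each condition can independently affect the outcome of this decision.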

arXiv:2006.09504 [pdf, other]

A generalizable saliency map-based interpretation of model outcome

Authors: Shailja Thakur, Sebastian Fischmeister

Abstract: One of the significant challenges of deep neural networks is that the complex nature of the network prevents human comprehension of the outcome of the network. Consequently, the applicability of complex machine learning models is limited in the safety-critical domains, which incurs risk to life and property. To fully exploit the capabilities of complex neural networks, we propose a non-intrusive interpretability technique that uses the input and output of the model to generate a saliency map. The method works by empirically optimizing a randomly initialized input mask by localizing and weighing individual pixels according to their sensitivity towards the target class. Our experiments show that the proposed model interpretability approach performs better than the existing saliency map-based methods at localizing the relevant input pixels. Furthermore, to obtain a global perspective on the target-specific explanation, we propose a saliency map reconstruction approach to generate acceptable variations of the salient inputs from the space of input data distribution for which the model outcome remains unaltered. Experiments show that our interpretability method can reconstruct the salient part of the input with a classification accuracy of 89%.

Submitted 19 June, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

arXiv:2006.06993 [pdf, ps, other]

CANOA: CAN Origin Authentication Through Power Side-Channel Monitoring

Authors: Shailja Thakur, Carlos Moreno, Sebastian Fischmeister

Abstract: The lack of any sender authentication mechanism makes CAN (Controller Area Network) vulnerable to security threats. For instance, an attacker can impersonate an ECU (Electronic Control Unit) on the bus and send spoofed messages unobtrusively with the identifier of the impersonated ECU. To address this problem, we propose a novel sender authentication technique that uses power consumption measurements of the ECU to authenticate the sender of a message. When an ECU is transmitting, its power requirement is affected, and a characteristic pattern appears in its power consumption. Our technique exploits the power consumption of each ECU during the transmission of a message to determine whether the message actually originated from the purported sender. We evaluate our approach in both a lab setup and a real vehicle. We also evaluate our approach against factors that can impact the power consumption measurement of the ECU. The results of the evaluation show that the proposed technique is applicable in a broad range of operating conditions, with reasonable computational power requirements and good accuracy.

Submitted 12 June, 2020; originally announced June 2020.

arXiv:1904.05411 [pdf, other]

Deep Learning for System Trace Restoration

Authors: Ilia Sucholutsky, Apurva Narayan, Matthias Schonlau, Sebastian Fischmeister

Abstract: Most real-world datasets, and particularly those collected from physical systems, are full of noise, packet loss, and other imperfections. However, most specification mining, anomaly detection and other such algorithms assume, or even require, perfect data quality to function properly. Such algorithms may work in lab conditions when given clean, controlled data, but will fail in the field when given imperfect data. We propose a method for accurately reconstructing discrete temporal or sequential system traces affected by data loss, using Long Short-Term Memory Networks (LSTMs). The model works by learning to predict the next event in a sequence of events, and uses its own output as an input to continue predicting future events. As a result, this method can be used for data restoration even with streamed data. Such a method can reconstruct even long sequences of missing events, and can also help validate and improve data quality for noisy data. The output of the model will be a close reconstruction of the true data, and can be fed to algorithms that rely on clean data. We demonstrate our method by reconstructing automotive CAN traces consisting of long sequences of discrete events. We show that given even small parts of a CAN trace, our LSTM model can predict future events with an accuracy of almost 90%, and can successfully reconstruct large portions of the original trace, greatly outperforming a Markov Model benchmark. We separately feed the original, lossy, and reconstructed traces into a specification mining framework to perform downstream analysis of the effect of our method on state-of-the-art models that use these traces for understanding the behavior of complex systems.

Submitted 10 April, 2019; originally announced April 2019.

Comments: Pre-print (accepted to IJCNN 2019)
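
The abstract above describes restoring a trace by predicting the next event and feeding each prediction back in as input. The sketch below shows that autoregressive loop using a first-order Markov predictor, i.e. the kind of simpler baseline the abstract compares against, not the paper's LSTM. The event names, the training trace, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

def fit_markov(trace):
    """Count first-order transitions between consecutive events."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(trace, trace[1:]):
        counts[prev][nxt] += 1
    return counts

def restore(prefix, n_missing, counts):
    """Append up to n_missing predicted events after prefix, feeding each
    prediction back in as the context for the next one."""
    restored = list(prefix)
    for _ in range(n_missing):
        followers = counts.get(restored[-1])
        if not followers:
            break  # no known successor for this event, so stop early
        restored.append(followers.most_common(1)[0][0])
    return restored

# Hypothetical periodic trace standing in for a stream of discrete events.
training = ["A", "B", "C", "A", "B", "C", "A", "B", "C"]
model = fit_markov(training)
filled = restore(["A"], 4, model)
print(filled)
```

Because every predicted event becomes the context for the next prediction, an early mistake propagates through the rest of the gap; a model with longer memory, such as an LSTM, conditions on more history than this one-step baseline does.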

arXiv:1704.03397 [pdf, other]

Debugging Behaviour of Embedded-Software Developers: An Exploratory Study

Authors: Pansy Arafa, Daniel Solomon, Samaneh Navabpour, Sebastian Fischmeister

Abstract: Many researchers have studied the behaviour of successful developers while debugging desktop software. In this paper, we investigate embedded-software debugging by intermediate programmers through an exploratory study. The bugs are semantic low-level errors, and the participants are students who completed a real-time operating systems course in addition to five other programming courses. We compare the behaviour involved in successful debugging attempts with that involved in unsuccessful ones. We describe some characteristics of smooth and successful debugging behaviour.

Submitted 11 April, 2017; originally announced April 2017.

Comments: 5 pages

arXiv:1703.02873 [pdf, other]

Redundancy Suppression In Time-Aware Dynamic Binary Instrumentation

Authors: Pansy Arafa, Hany Kashif, Sebastian Fischmeister

Abstract: Software tracing techniques are well-established and used by instrumentation tools to extract run-time information for program analysis and debugging. Dynamic binary instrumentation is one such technique: it instruments program binaries to extract information. Unfortunately, instrumentation causes perturbation that is unacceptable for time-sensitive applications. Consequently, we developed DIME*, a tool for dynamic binary instrumentation that considers timing constraints. DIME* uses Pin and a rate-based server approach to extract information only as long as user-specified constraints are maintained. Due to the large amount of redundancy in program traces, DIME* reduces the instrumentation overhead by one to three orders of magnitude compared to native Pin while extracting up to 99% of the information. We instrument VLC and PostgreSQL to demonstrate the usability of DIME*.

Submitted 7 March, 2017; originally announced March 2017.

Comments: 11 pages

arXiv:1503.00793 [pdf, ps, other]

DAG-width of Control Flow Graphs with Applications to Model Checking

Authors: Therese Biedl, Sebastian Fischmeister, Neeraj Kumar

Abstract: The treewidth of control flow graphs arising from structured programs is known to be at most six. However, as a control flow graph is inherently directed, it makes sense to consider a measure of width for digraphs instead. We use the so-called DAG-width and show that the DAG-width of control flow graphs arising from structured (goto-free) programs is at most three. We also give a linear time algorithm to compute the DAG decomposition of these control flow graphs. One consequence of this result is that parity games (and hence the μ-calculus model checking problem), which are known to be tractable on graphs of bounded DAG-width, can be solved efficiently in practice on control flow graphs.

Submitted 2 March, 2015; originally announced March 2015.

Comments: 12 pages, 4 figures

arXiv:1411.2239 [pdf, other]

Accelerated Runtime Verification of LTL Specifications with Counting Semantics

Authors: Ramy Medhat, Yogi Joshi, Borzoo Bonakdarpour, Sebastian Fischmeister

Abstract: Runtime verification is an effective automated method for specification-based offline testing and analysis as well as online monitoring of complex systems. The specification language is often a variant of regular expressions or a popular temporal logic, such as LTL. This paper presents a novel and efficient parallel algorithm for verifying a more expressive version of LTL specifications that incorporates counting semantics, where nested quantifiers can be subject to numerical constraints. Such constraints are useful in evaluating thresholds (e.g., expected uptime of a web server). The significance of this extension is that it enables us to reason about the correctness of a large class of systems, such as web servers, OS kernels, and network behavior, where properties are required to be instantiated for parameterized requests, kernel objects, network nodes, etc. Our algorithm uses the popular MapReduce architecture to split a program trace into variable-based clusters at run time. Each cluster is then mapped to its respective monitor instances, verified, and reduced collectively on a multi-core CPU or the GPU. Our algorithm is fully implemented and we report very encouraging experimental results, where the monitoring overhead is negligible on real-world data sets.

Submitted 9 November, 2014; originally announced November 2014.

arXiv:1410.6824 [pdf, other]

Power Redistribution for Optimizing Performance in MPI Clusters

Authors: Ramy Medhat, Borzoo Bonakdarpour, Sebastian Fischmeister

Abstract: Power efficiency has recently become a major concern in the high-performance computing domain. HPC centers are provisioned with a power bound, which impacts execution time. Naturally, a tradeoff arises between power efficiency and computational efficiency. This paper tackles the problem of performance optimization for MPI applications, where a power bound is assumed. The paper exposes a subset of HPC applications that leverage cluster parallelism using MPI, where nodes encounter multiple synchronization points and exhibit inter-node dependency. We abstract this structure into a dependency graph, and leverage the asymmetry in execution time of parallel jobs on different nodes by redistributing power gained from idling a blocked node to nodes that are lagging in their jobs. We introduce an optimal power distribution algorithm based on integer linear programming (ILP) that minimizes total execution time while maintaining an upper power bound. We then present an online heuristic that dynamically redistributes power at run time. The heuristic shows significant reductions in the total execution time of a set of parallel benchmarks, with speedups of up to 2.25x.

Submitted 24 October, 2014; originally announced October 2014.

Showing 1–10 of 10 results for author: Fischmeister, S