-
EvalGIM: A Library for Evaluating Generative Image Models
Authors:
Melissa Hall,
Oscar Mañas,
Reyhane Askari-Hemmat,
Mark Ibrahim,
Candace Ross,
Pietro Astolfi,
Tariq Berrada Ifriqi,
Marton Havasi,
Yohann Benchetrit,
Karen Ullrich,
Carolina Braga,
Abhishek Charnalia,
Maeve Ryan,
Mike Rabbat,
Michal Drozdzal,
Jakob Verbeek,
Adriana Romero-Soriano
Abstract:
As the use of text-to-image generative models increases, so does the adoption of automatic benchmarking methods used in their evaluation. However, while metrics and datasets abound, there are few unified benchmarking libraries that provide a framework for performing evaluations across many datasets and metrics. Furthermore, the rapid introduction of increasingly robust benchmarking methods require…
▽ More
As the use of text-to-image generative models increases, so does the adoption of automatic benchmarking methods used in their evaluation. However, while metrics and datasets abound, there are few unified benchmarking libraries that provide a framework for performing evaluations across many datasets and metrics. Furthermore, the rapid introduction of increasingly robust benchmarking methods requires that evaluation libraries remain flexible to new datasets and metrics. Finally, there remains a gap in synthesizing evaluations in order to deliver actionable takeaways about model performance. To enable unified, flexible, and actionable evaluations, we introduce EvalGIM (pronounced ''EvalGym''), a library for evaluating generative image models. EvalGIM contains broad support for datasets and metrics used to measure quality, diversity, and consistency of text-to-image generative models. In addition, EvalGIM is designed with flexibility for user customization as a top priority and contains a structure that allows plug-and-play additions of new datasets and metrics. To enable actionable evaluation insights, we introduce ''Evaluation Exercises'' that highlight takeaways for specific evaluation questions. The Evaluation Exercises contain easy-to-use and reproducible implementations of two state-of-the-art evaluation methods of text-to-image generative models: consistency-diversity-realism Pareto Fronts and disaggregated measurements of performance disparities across groups. EvalGIM also contains Evaluation Exercises that introduce two new analysis methods for text-to-image generative models: robustness analyses of model rankings and balanced evaluations across different prompt styles. We encourage text-to-image model exploration with EvalGIM and invite contributions at https://github.com/facebookresearch/EvalGIM/.
△ Less
Submitted 18 December, 2024; v1 submitted 13 December, 2024;
originally announced December 2024.
-
B Maude: A formal executable environment for Abstract Machine Notation Descriptions
Authors:
Christiano Braga,
Narciso Martí-Oliet
Abstract:
We propose B Maude, a prototype executable environment for the Abstract Machine Notation implemented in the Maude language. B Maude is formally defined and results from the implementation of the semantics of AMN as denotations in the $π$ Framework, a realization of Mosses' Component-based Semantics and Plotkin's Interpreting Automata. B Maude endows the B method with execution by rewriting, symbol…
▽ More
We propose B Maude, a prototype executable environment for the Abstract Machine Notation implemented in the Maude language. B Maude is formally defined and results from the implementation of the semantics of AMN as denotations in the $π$ Framework, a realization of Mosses' Component-based Semantics and Plotkin's Interpreting Automata. B Maude endows the B method with execution by rewriting, symbolic search with narrowing and Linear Temporal Logic model checking of AMN descriptions.
△ Less
Submitted 17 August, 2021;
originally announced August 2021.
-
Tracking Pedestrian Heads in Dense Crowd
Authors:
Ramana Sundararaman,
Cedric De Almeida Braga,
Eric Marchand,
Julien Pettre
Abstract:
Tracking humans in crowded video sequences is an important constituent of visual scene understanding. Increasing crowd density challenges visibility of humans, limiting the scalability of existing pedestrian trackers to higher crowd densities. For that reason, we propose to revitalize head tracking with Crowd of Heads Dataset (CroHD), consisting of 9 sequences of 11,463 frames with over 2,276,838…
▽ More
Tracking humans in crowded video sequences is an important constituent of visual scene understanding. Increasing crowd density challenges visibility of humans, limiting the scalability of existing pedestrian trackers to higher crowd densities. For that reason, we propose to revitalize head tracking with Crowd of Heads Dataset (CroHD), consisting of 9 sequences of 11,463 frames with over 2,276,838 heads and 5,230 tracks annotated in diverse scenes. For evaluation, we proposed a new metric, IDEucl, to measure an algorithm's efficacy in preserving a unique identity for the longest stretch in image coordinate space, thus building a correspondence between pedestrian crowd motion and the performance of a tracking algorithm. Moreover, we also propose a new head detector, HeadHunter, which is designed for small head detection in crowded scenes. We extend HeadHunter with a Particle Filter and a color histogram based re-identification module for head tracking. To establish this as a strong baseline, we compare our tracker with existing state-of-the-art pedestrian trackers on CroHD and demonstrate superiority, especially in identity preserving tracking metrics. With a light-weight head detector and a tracker which is efficient at identity preservation, we believe our contributions will serve useful in advancement of pedestrian tracking in dense crowds.
△ Less
Submitted 24 March, 2021;
originally announced March 2021.
-
π: Towards a Simple Formal Semantic Framework for Compiler Construction
Authors:
Christiano Braga
Abstract:
This paper proposes π, a formal semantic framework for compiler construction together with program validation. π is comprised by π Lib, a set of programming languages constructs inspired by Peter Mosses' Component-Based Semantics and π Automata, an automata-based formalism to describe the operational semantics of programming languages, that generalizes Gordon Plotkin's Interpreting Automata.
This paper proposes π, a formal semantic framework for compiler construction together with program validation. π is comprised by π Lib, a set of programming languages constructs inspired by Peter Mosses' Component-Based Semantics and π Automata, an automata-based formalism to describe the operational semantics of programming languages, that generalizes Gordon Plotkin's Interpreting Automata.
△ Less
Submitted 31 July, 2018; v1 submitted 12 May, 2018;
originally announced May 2018.
-
Towards the Modular Specification and Validation of Cyber-Physical Systems
Authors:
Andre Metelo,
Christiano Braga,
Diego Brandão
Abstract:
Cyber-Physical Systems (CPS) are systems controlled by one or more computer-based components tightly integrated with a set of physical components, typically described as sensors and actuators, that can either be directly attached to the computer components, or at a remote location, and accessible through a network connection. The modeling and verification of such systems is a hard task and error p…
▽ More
Cyber-Physical Systems (CPS) are systems controlled by one or more computer-based components tightly integrated with a set of physical components, typically described as sensors and actuators, that can either be directly attached to the computer components, or at a remote location, and accessible through a network connection. The modeling and verification of such systems is a hard task and error prone that require rigorous techniques. Hybrid automata is a formalism that extends finite-state automata with continuous behavior, described by ordinary differential equations. This paper uses a rewriting logic-based technique to model and validate CPS, thus exploring the use of a formal technique to develop such systems that combines expressive specification with efficient state-based analysis. Moreover, we aim at the modular specification of such systems such that each CPS component is independently specified and the final system emerges as the synchronous product of its constituent components. We model CPSs using Linear Hybrid Automaton and implement them in Real-Time Maude, a rewriting logic tool for real-time systems. With this method, we develop a specification for the $n$-reservoir problem, a CPS that controls a hose to fill a number of reservoirs according to the physical properties of the hose and the reservoirs.
△ Less
Submitted 8 March, 2018;
originally announced March 2018.
-
Modeling Normative Multi-Agent Systems from a Kelsenian Perspective
Authors:
Christiano Braga,
Edward Hermann Haeusler,
Jéssica S. Santos
Abstract:
Standard Deontic Logic (SDL) has been used as the underlying logic to model and reason over Multi-Agent Systems governed by norms (NorMAS). It is known that SDL is not able to represent contrary-to-duty (CTD) scenarios in a consistent way. That is the case, for example, of the so-called Chisholm paradox, which models a situation in which a conditional obligation that specifies what must be done wh…
▽ More
Standard Deontic Logic (SDL) has been used as the underlying logic to model and reason over Multi-Agent Systems governed by norms (NorMAS). It is known that SDL is not able to represent contrary-to-duty (CTD) scenarios in a consistent way. That is the case, for example, of the so-called Chisholm paradox, which models a situation in which a conditional obligation that specifies what must be done when a primary obligation is violated holds. In SDL, the set of sentences that represent the Chisholm paradox derives inconsistent sentences. Due to the autonomy of the software agents of a NorMAS, norms may be violated and the underlying logic used to model the NorMAS should be able to represent violation scenarios. The contribution of this paper is threefold: (i) we present how Kelsenian thinking, from his jurisprudence in the context of legal ontologies, and Intuitionist Hybrid Logic can be adopted in the modeling of NorMAS, (ii) discuss how this approach overcomes limitations of the SDL and (iii) present a discussion about normative conflict identification according to Hill's functional taxonomy, that generalizes from standard identification by impossibility-of-joint-compliance test.
△ Less
Submitted 26 March, 2018; v1 submitted 6 September, 2017;
originally announced September 2017.