-
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Authors:
Alberto Maté,
Mariella Dimiccoli
Abstract:
This paper proposes a method for long-term action anticipation (LTA), the task of predicting action labels and their duration in a video given the observation of an initial untrimmed video interval. We build on an encoder-decoder architecture with parallel decoding and make two key contributions. First, we introduce a bi-directional action context regularizer module on the top of the decoder that…
▽ More
This paper proposes a method for long-term action anticipation (LTA), the task of predicting action labels and their duration in a video given the observation of an initial untrimmed video interval. We build on an encoder-decoder architecture with parallel decoding and make two key contributions. First, we introduce a bi-directional action context regularizer module on the top of the decoder that ensures temporal context coherence in temporally adjacent segments. Second, we learn from classified segments a transition matrix that models the probability of transitioning from one action to another and the sequence is optimized globally over the full prediction interval. In addition, we use a specialized encoder for the task of action segmentation to increase the quality of the predictions in the observation interval at inference time, leading to a better understanding of the past. We validate our methods on four benchmark datasets for LTA, the EpicKitchen-55, EGTEA+, 50Salads and Breakfast demonstrating superior or comparable performance to state-of-the-art methods, including probabilistic models and also those based on Large Language Models, that assume trimmed video as input. The code will be released upon acceptance.
△ Less
Submitted 26 December, 2024;
originally announced December 2024.
-
2by2: Weakly-Supervised Learning for Global Action Segmentation
Authors:
Elena Bueno-Benito,
Mariella Dimiccoli
Abstract:
This paper presents a simple yet effective approach for the poorly investigated task of global action segmentation, aiming at grouping frames capturing the same action across videos of different activities. Unlike the case of videos depicting all the same activity, the temporal order of actions is not roughly shared among all videos, making the task even more challenging. We propose to use activit…
▽ More
This paper presents a simple yet effective approach for the poorly investigated task of global action segmentation, aiming at grouping frames capturing the same action across videos of different activities. Unlike the case of videos depicting all the same activity, the temporal order of actions is not roughly shared among all videos, making the task even more challenging. We propose to use activity labels to learn, in a weakly-supervised fashion, action representations suitable for global action segmentation. For this purpose, we introduce a triadic learning approach for video pairs, to ensure intra-video action discrimination, as well as inter-video and inter-activity action association. For the backbone architecture, we use a Siamese network based on sparse transformers that takes as input video pairs and determine whether they belong to the same activity. The proposed approach is validated on two challenging benchmark datasets: Breakfast and YouTube Instructions, outperforming state-of-the-art methods.
△ Less
Submitted 17 December, 2024;
originally announced December 2024.
-
3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction
Authors:
Mehdi Fatan,
Emanuele Mincato,
Dimitra Pintzou,
Mariella Dimiccoli
Abstract:
Predicting turn-taking in multiparty conversations has many practical applications in human-computer/robot interaction. However, the complexity of human communication makes it a challenging task. Recent advances have shown that synchronous multi-perspective egocentric data can significantly improve turn-taking prediction compared to asynchronous, single-perspective transcriptions. Building on this…
▽ More
Predicting turn-taking in multiparty conversations has many practical applications in human-computer/robot interaction. However, the complexity of human communication makes it a challenging task. Recent advances have shown that synchronous multi-perspective egocentric data can significantly improve turn-taking prediction compared to asynchronous, single-perspective transcriptions. Building on this research, we propose a new multimodal transformer-based architecture for predicting turn-taking in embodied, synchronized multi-perspective data. Our experimental results on the recently introduced EgoCom dataset show a substantial performance improvement of up to 14.01% on average compared to existing baselines and alternative transformer-based approaches. The source code, and the pre-trained models of our 3M-Transformer will be available upon acceptance.
△ Less
Submitted 21 December, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Leveraging triplet loss for unsupervised action segmentation
Authors:
E. Bueno-Benito,
B. Tura,
M. Dimiccoli
Abstract:
In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively…
▽ More
In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality compared with existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task and it achieves competitive performance by applying a generic clustering algorithm on the learned representations.
△ Less
Submitted 19 July, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning
Authors:
Wencan Zhang,
Mariella Dimiccoli,
Brian Y. Lim
Abstract:
Model explanations such as saliency maps can improve user trust in AI by highlighting important features for a prediction. However, these become distorted and misleading when explaining predictions of images that are subject to systematic error (bias). Furthermore, the distortions persist despite model fine-tuning on images biased by different factors (blur, color temperature, day/night). We prese…
▽ More
Model explanations such as saliency maps can improve user trust in AI by highlighting important features for a prediction. However, these become distorted and misleading when explaining predictions of images that are subject to systematic error (bias). Furthermore, the distortions persist despite model fine-tuning on images biased by different factors (blur, color temperature, day/night). We present Debiased-CAM to recover explanation faithfulness across various bias types and levels by training a multi-input, multi-task model with auxiliary tasks for explanation and bias level predictions. In simulation studies, the approach not only enhanced prediction accuracy, but also generated highly faithful explanations about these predictions as if the images were unbiased. In user studies, debiased explanations improved user task performance, perceived truthfulness and perceived helpfulness. Debiased training can provide a versatile platform for robust performance and explanation faithfulness for a wide range of applications with data biases.
△ Less
Submitted 28 February, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
Enhancing Egocentric 3D Pose Estimation with Third Person Views
Authors:
Ameya Dhamanaskar,
Mariella Dimiccoli,
Enric Corona,
Albert Pumarola,
Francesc Moreno-Noguer
Abstract:
In this paper, we propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The key idea is to leverage high-level features linking first- and third-views in a joint embedding space. To learn such embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2,000 videos depicting human acti…
▽ More
In this paper, we propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The key idea is to leverage high-level features linking first- and third-views in a joint embedding space. To learn such embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2,000 videos depicting human activities captured from both first- and third-view perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedded space learned with our dataset is useful to extract discriminatory features from arbitrary single-view egocentric videos, without needing domain adaptation nor knowledge of camera parameters. We achieve significant improvement of egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. Our dataset and code will be available for research purposes.
△ Less
Submitted 15 June, 2022; v1 submitted 6 January, 2022;
originally announced January 2022.
-
Learning grounded word meaning representations on similarity graphs
Authors:
Mariella Dimiccoli,
Herwig Wendt,
Pau Batlle
Abstract:
This paper introduces a novel approach to learn visually grounded meaning representations of words as low-dimensional node embeddings on an underlying graph hierarchy. The lower level of the hierarchy models modality-specific word representations through dedicated but communicating graphs, while the higher level puts these representations together on a single graph to learn a representation jointl…
▽ More
This paper introduces a novel approach to learn visually grounded meaning representations of words as low-dimensional node embeddings on an underlying graph hierarchy. The lower level of the hierarchy models modality-specific word representations through dedicated but communicating graphs, while the higher level puts these representations together on a single graph to learn a representation jointly from both modalities. The topology of each graph models similarity relations among words, and is estimated jointly with the graph embedding. The assumption underlying this model is that words sharing similar meaning correspond to communities in an underlying similarity graph in a low-dimensional space. We named this model Hierarchical Multi-Modal Similarity Graph Embedding (HM-SGE). Experimental results validate the ability of HM-SGE to simulate human similarity judgements and concept categorization, outperforming the state of the art.
△ Less
Submitted 7 September, 2021;
originally announced September 2021.
-
Graph Constrained Data Representation Learning for Human Motion Segmentation
Authors:
Mariella Dimiccoli,
Lluís Garrido,
Guillem Rodriguez-Corominas,
Herwig Wendt
Abstract:
Recently, transfer subspace learning based approaches have shown to be a valid alternative to unsupervised subspace clustering and temporal data clustering for human motion segmentation (HMS). These approaches leverage prior knowledge from a source domain to improve clustering performance on a target domain, and currently they represent the state of the art in HMS. Bucking this trend, in this pape…
▽ More
Recently, transfer subspace learning based approaches have shown to be a valid alternative to unsupervised subspace clustering and temporal data clustering for human motion segmentation (HMS). These approaches leverage prior knowledge from a source domain to improve clustering performance on a target domain, and currently they represent the state of the art in HMS. Bucking this trend, in this paper, we propose a novel unsupervised model that learns a representation of the data and digs clustering information from the data itself. Our model is reminiscent of temporal subspace clustering, but presents two critical differences. First, we learn an auxiliary data matrix that can deviate from the initial data, hence confer more degrees of freedom to the coding matrix. Second, we introduce a regularization term for this auxiliary data matrix that preserves the local geometrical structure present in the high-dimensional space. The proposed model is efficiently optimized by using an original Alternating Direction Method of Multipliers (ADMM) formulation allowing to learn jointly the auxiliary data representation, a nonnegative dictionary and a coding matrix. Experimental results on four benchmark datasets for HMS demonstrate that our approach achieves significantly better clustering performance then state-of-the-art methods, including both unsupervised and more recent semi-supervised transfer learning approaches.
△ Less
Submitted 27 August, 2021; v1 submitted 28 July, 2021;
originally announced July 2021.
-
Interaction-GCN: A Graph Convolutional Network based framework for social interaction recognition in egocentric videos
Authors:
Simone Felicioni,
Mariella Dimiccoli
Abstract:
In this paper we propose a new framework to categorize social interactions in egocentric videos, we named InteractionGCN. Our method extracts patterns of relational and non-relational cues at the frame level and uses them to build a relational graph from which the interactional context at the frame level is estimated via a Graph Convolutional Network based approach. Then it propagates this context…
▽ More
In this paper we propose a new framework to categorize social interactions in egocentric videos, we named InteractionGCN. Our method extracts patterns of relational and non-relational cues at the frame level and uses them to build a relational graph from which the interactional context at the frame level is estimated via a Graph Convolutional Network based approach. Then it propagates this context over time, together with first-person motion information, through a Gated Recurrent Unit architecture. Ablation studies and experimental evaluation on two publicly available datasets validate the proposed approach and establish state of the art results.
△ Less
Submitted 8 June, 2021; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Modeling long-term interactions to enhance action recognition
Authors:
Alejandro Cartas,
Petia Radeva,
Mariella Dimiccoli
Abstract:
In this paper, we propose a new approach to under-stand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects and calculates th…
▽ More
In this paper, we propose a new approach to under-stand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical LongShort-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks,without relying on motion information
△ Less
Submitted 23 April, 2021;
originally announced April 2021.
-
Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning
Authors:
Wencan Zhang,
Mariella Dimiccoli,
Brian Y. Lim
Abstract:
Model explanations such as saliency maps can improve user trust in AI by highlighting important features for a prediction. However, these become distorted and misleading when explaining predictions of images that are subject to systematic error (bias) by perturbations and corruptions. Furthermore, the distortions persist despite model fine-tuning on images biased by different factors (blur, color…
▽ More
Model explanations such as saliency maps can improve user trust in AI by highlighting important features for a prediction. However, these become distorted and misleading when explaining predictions of images that are subject to systematic error (bias) by perturbations and corruptions. Furthermore, the distortions persist despite model fine-tuning on images biased by different factors (blur, color temperature, day/night). We present Debiased-CAM to recover explanation faithfulness across various bias types and levels by training a multi-input, multi-task model with auxiliary tasks for explanation and bias level predictions. In simulation studies, the approach not only enhanced prediction accuracy, but also generated highly faithful explanations about these predictions as if the images were unbiased. In user studies, debiased explanations improved user task performance, perceived truthfulness and perceived helpfulness. Debiased training can provide a versatile platform for robust performance and explanation faithfulness for a wide range of applications with data biases.
△ Less
Submitted 29 March, 2022; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Seeing and Hearing Egocentric Actions: How Much Can We Learn?
Authors:
Alejandro Cartas,
Jordi Luque,
Petia Radeva,
Carlos Segura,
Mariella Dimiccoli
Abstract:
Our interaction with the world is an inherently multimodal experience. However, the understanding of human-to-object interactions has historically been addressed focusing on a single modality. In particular, a limited number of works have considered to integrate the visual and audio modalities for this purpose. In this work, we propose a multimodal approach for egocentric action recognition in a k…
▽ More
Our interaction with the world is an inherently multimodal experience. However, the understanding of human-to-object interactions has historically been addressed focusing on a single modality. In particular, a limited number of works have considered to integrate the visual and audio modalities for this purpose. In this work, we propose a multimodal approach for egocentric action recognition in a kitchen environment that relies on audio and visual information. Our model combines a sparse temporal sampling strategy with a late fusion of audio, spatial, and temporal streams. Experimental results on the EPIC-Kitchens dataset show that multimodal integration leads to better performance than unimodal approaches. In particular, we achieved a 5.18% improvement over the state of the art on verb classification.
△ Less
Submitted 15 October, 2019;
originally announced October 2019.
-
Learning event representations for temporal segmentation of image sequences by dynamic graph embedding
Authors:
Mariella Dimiccoli,
Herwig Wendt
Abstract:
Recently, self-supervised learning has proved to be effective to learn representations of events suitable for temporal segmentation in image sequences, where events are understood as sets of temporally adjacent images that are semantically perceived as a whole. However, although this approach does not require expensive manual annotations, it is data hungry and suffers from domain adaptation proble…
▽ More
Recently, self-supervised learning has proved to be effective to learn representations of events suitable for temporal segmentation in image sequences, where events are understood as sets of temporally adjacent images that are semantically perceived as a whole. However, although this approach does not require expensive manual annotations, it is data hungry and suffers from domain adaptation problems. As an alternative, in this work, we propose a novel approach for learning event representations named Dynamic Graph Embedding (DGE). The assumption underlying our model is that a sequence of images can be represented by a graph that encodes both semantic and temporal similarity. The key novelty of DGE is to learn jointly the graph and its graph embedding. At its core, DGE works by iterating over two steps: 1) updating the graph representing the semantic and temporal similarity of the data based on the current data representation, and 2) updating the data representation to take into account the current data graph structure. The main advantage of DGE over state-of-the-art self-supervised approaches is that it does not require any training set, but instead learns iteratively from the data itself a low-dimensional embedding that reflects their temporal and semantic similarity. Experimental results on two benchmark datasets of real image sequences captured at regular time intervals demonstrate that the proposed DGE leads to event representations effective for temporal segmentation. In particular, it achieves robust temporal segmentation on the EDUBSeg and EDUBSeg-Desc benchmark datasets, outperforming the state of the art. Additional experiments on two Human Motion Segmentation benchmark datasets demonstrate the generalization capabilities of the proposed DGE.
△ Less
Submitted 10 December, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Enhancing temporal segmentation by nonlocal self-similarity
Authors:
Mariella Dimiccoli,
Herwig Wendt
Abstract:
Temporal segmentation of untrimmed videos and photo-streams is currently an active area of research in computer vision and image processing. This paper proposes a new approach to improve the temporal segmentation of photo-streams. The method consists in enhancing image representations by encoding long-range temporal dependencies. Our key contribution is to take advantage of the temporal stationari…
▽ More
Temporal segmentation of untrimmed videos and photo-streams is currently an active area of research in computer vision and image processing. This paper proposes a new approach to improve the temporal segmentation of photo-streams. The method consists in enhancing image representations by encoding long-range temporal dependencies. Our key contribution is to take advantage of the temporal stationarity assumption of photostreams for modeling each frame by its nonlocal self-similarity function. The proposed approach is put to test on the EDUB-Seg dataset, a standard benchmark for egocentric photostream temporal segmentation. Starting from seven different (CNN based) image features, the method yields consistent improvements in event segmentation quality, leading to an average increase of F-measure of 3.71% with respect to the state of the art.
△ Less
Submitted 14 June, 2019;
originally announced June 2019.
-
How Much Does Audio Matter to Recognize Egocentric Object Interactions?
Authors:
Alejandro Cartas,
Jordi Luque,
Petia Radeva,
Carlos Segura,
Mariella Dimiccoli
Abstract:
Sounds are an important source of information on our daily interactions with objects. For instance, a significant amount of people can discern the temperature of water that it is being poured just by using the sense of hearing. However, only a few works have explored the use of audio for the classification of object interactions in conjunction with vision or as single modality. In this preliminary…
▽ More
Sounds are an important source of information on our daily interactions with objects. For instance, a significant amount of people can discern the temperature of water that it is being poured just by using the sense of hearing. However, only a few works have explored the use of audio for the classification of object interactions in conjunction with vision or as single modality. In this preliminary work, we propose an audio model for egocentric action recognition and explore its usefulness on the parts of the problem (noun, verb, and action classification). Our model achieves a competitive result in terms of verb classification (34.26% accuracy) on a standard benchmark with respect to vision-based state of the art systems, using a comparatively lighter architecture.
△ Less
Submitted 3 June, 2019;
originally announced June 2019.
-
Social Relation Recognition in Egocentric Photostreams
Authors:
Emanuel Sanchez Aimar,
Petia Radeva,
Mariella Dimiccoli
Abstract:
This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera 2fpm, by relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental'…
▽ More
This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera 2fpm, by relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental's social theory, that groups human relations into five social domains with related categories. Our method is a new deep learning architecture that exploits the hierarchical structure of the label space and relies on a set of social attributes estimated at frame level to provide a semantic representation of social interactions. Experimental results on the new EgoSocialRelation dataset demonstrate the effectiveness of our proposal.
△ Less
Submitted 12 May, 2019;
originally announced May 2019.
-
On the Role of Event Boundaries in Egocentric Activity Recognition from Photostreams
Authors:
Alejandro Cartas,
Estefania Talavera,
Petia Radeva,
Mariella Dimiccoli
Abstract:
Event boundaries play a crucial role as a pre-processing step for detection, localization, and recognition tasks of human activities in videos. Typically, although their intrinsic subjectiveness, temporal bounds are provided manually as input for training action recognition algorithms. However, their role for activity recognition in the domain of egocentric photostreams has been so far neglected.…
▽ More
Event boundaries play a crucial role as a pre-processing step for detection, localization, and recognition tasks of human activities in videos. Typically, although their intrinsic subjectiveness, temporal bounds are provided manually as input for training action recognition algorithms. However, their role for activity recognition in the domain of egocentric photostreams has been so far neglected. In this paper, we provide insights of how automatically computed boundaries can impact activity recognition results in the emerging domain of egocentric photostreams. Furthermore, we collected a new annotated dataset acquired by 15 people by a wearable photo-camera and we used it to show the generalization capabilities of several deep learning based architectures to unseen users.
△ Less
Submitted 6 September, 2018; v1 submitted 2 September, 2018;
originally announced September 2018.
-
Smartphone picture organization: A hierarchical approach
Authors:
Stefan Lonn,
Petia Radeva,
Mariella Dimiccoli
Abstract:
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures…
▽ More
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and because smartphones are ubiquitous, they present a larger variability compared to pictures captured by a digital camera. To solve the need of organizing large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neuronal Networks. To validate our approach, we ensemble and make public a large dataset of more than 8,000 smartphone pictures from 10 persons. Experimental results demonstrate better user satisfaction with respect to state of the art solutions in terms of organization.
△ Less
Submitted 6 September, 2019; v1 submitted 15 March, 2018;
originally announced March 2018.
-
Batch-based Activity Recognition from Egocentric Photo-Streams Revisited
Authors:
Alejandro Cartas,
Juan Marin,
Petia Radeva,
Mariella Dimiccoli
Abstract:
Wearable cameras can gather large a\-mounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a…
▽ More
Wearable cameras can gather large a\-mounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a Late Fusion Ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85\%, outperforming state of the art end-to-end methodologies. These results were achieved on a dataset consists of 44,902 egocentric pictures from three persons captured during 26 days in average.
△ Less
Submitted 9 May, 2018; v1 submitted 11 October, 2017;
originally announced October 2017.
-
Social Style Characterization from Egocentric Photo-streams
Authors:
Maedeh Aghaei,
Mariella Dimiccoli,
Cristian Canton Ferrer,
Petia Radeva
Abstract:
This paper proposes a system for automatic social pattern characterization using a wearable photo-camera. The proposed pipeline consists of three major steps. First, detection of people with whom the camera wearer interacts and, second, categorization of the detected social interactions into formal and informal. These two steps act at event-level where each potential social event is modeled as a m…
▽ More
This paper proposes a system for automatic social pattern characterization using a wearable photo-camera. The proposed pipeline consists of three major steps. First, detection of people with whom the camera wearer interacts and, second, categorization of the detected social interactions into formal and informal. These two steps act at event-level where each potential social event is modeled as a multi-dimensional time-series, whose dimensions correspond to a set of relevant features for each task, and a LSTM network is employed for time-series classification. In the last step, recurrences of the same person across the whole set of social interactions are clustered to achieve a comprehensive understanding of the diversity and frequency of the social relations of the user. Experiments over a dataset acquired by a user wearing a photo-camera during a month show promising results on the task of social pattern characterization from egocentric photo-streams.
△ Less
Submitted 18 September, 2017;
originally announced September 2017.
-
Detecting Hands in Egocentric Videos: Towards Action Recognition
Authors:
Alejandro Cartas,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Recently, there has been a growing interest in analyzing human daily activities from data collected by wearable cameras. Since the hands are involved in a vast set of daily tasks, detecting hands in egocentric images is an important step towards the recognition of a variety of egocentric actions. However, besides extreme illumination changes in egocentric images, hand detection is not a trivial ta…
▽ More
Recently, there has been a growing interest in analyzing human daily activities from data collected by wearable cameras. Since the hands are involved in a vast set of daily tasks, detecting hands in egocentric images is an important step towards the recognition of a variety of egocentric actions. However, besides extreme illumination changes in egocentric images, hand detection is not a trivial task because of the intrinsic large variability of hand appearance. We propose a hand detector that exploits skin modeling for fast hand proposal generation and Convolutional Neural Networks for hand recognition. We tested our method on UNIGE-HANDS dataset and we showed that the proposed approach achieves competitive hand detection results.
△ Less
Submitted 8 September, 2017;
originally announced September 2017.
-
Towards social pattern characterization in egocentric photo-streams
Authors:
Maedeh Aghaei,
Mariella Dimiccoli,
Cristian Canton Ferrer,
Petia Radeva
Abstract:
Following the increasingly popular trend of social interaction analysis in egocentric vision, this manuscript presents a comprehensive study for automatic social pattern characterization of a wearable photo-camera user, by relying on the visual analysis of egocentric photo-streams. The proposed framework consists of three major steps. The first step is to detect social interactions of the user whe…
▽ More
Following the increasingly popular trend of social interaction analysis in egocentric vision, this manuscript presents a comprehensive study for automatic social pattern characterization of a wearable photo-camera user, by relying on the visual analysis of egocentric photo-streams. The proposed framework consists of three major steps. The first step is to detect social interactions of the user where the impact of several social signals on the task is explored. The detected social events are inspected in the second step for categorization into different social meetings. These two steps act at event-level where each potential social event is modeled as a multi-dimensional time-series, whose dimensions correspond to a set of relevant features for each task, and LSTM is employed to classify the time-series. The last step of the framework is to characterize social patterns, which is essentially to infer the diversity and frequency of the social relations of the user through discovery of recurrences of the same people across the whole set of social events of the user. Experimental evaluation over a dataset acquired by 9 users demonstrates promising results on the task of social pattern characterization from egocentric photo-streams.
△ Less
Submitted 9 January, 2018; v1 submitted 5 September, 2017;
originally announced September 2017.
-
Batch-Based Activity Recognition from Egocentric Photo-Streams
Authors:
Alejandro Cartas,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Activity recognition from long unstructured egocentric photo-streams has several applications in assistive technology such as health monitoring and frailty detection, just to name a few. However, one of its main technical challenges is to deal with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames. In consequence, important discriminato…
▽ More
Activity recognition from long unstructured egocentric photo-streams has several applications in assistive technology such as health monitoring and frailty detection, just to name a few. However, one of its main technical challenges is to deal with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames. In consequence, important discriminatory low-level features from motion such as optical flow cannot be estimated. In this paper, we present a batch-driven approach for training a deep learning architecture that strongly rely on Long short-term units to tackle this problem. We propose two different implementations of the same approach that process a photo-stream sequence using batches of fixed size with the goal of capturing the temporal evolution of high-level features. The main difference between these implementations is that one explicitly models consecutive batches by overlapping them. Experimental results over a public dataset acquired by three users demonstrate the validity of the proposed architectures to exploit the temporal evolution of convolutional features over time without relying on event boundaries.
△ Less
Submitted 25 August, 2017;
originally announced August 2017.
-
Recognizing Activities of Daily Living from Egocentric Images
Authors:
Alejandro Cartas,
Juan Marín,
Petia Radeva,
Mariella Dimiccoli
Abstract:
Recognizing Activities of Daily Living (ADLs) has a large number of health applications, such as characterize lifestyle for habit improvement, nursing and rehabilitation services. Wearable cameras can daily gather large amounts of image data that provide rich visual information about ADLs than using other wearable sensors. In this paper, we explore the classification of ADLs from images captured b…
▽ More
Recognizing Activities of Daily Living (ADLs) has a large number of health applications, such as characterize lifestyle for habit improvement, nursing and rehabilitation services. Wearable cameras can daily gather large amounts of image data that provide rich visual information about ADLs than using other wearable sensors. In this paper, we explore the classification of ADLs from images captured by low temporal resolution wearable camera (2fpm) by using a Convolutional Neural Networks (CNN) approach. We show that the classification accuracy of a CNN largely improves when its output is combined, through a random decision forest, with contextual information from a fully connected layer. The proposed method was tested on a subset of the NTCIR-12 egocentric dataset, consisting of 18,674 images and achieved an overall accuracy of 86% activity recognition on 21 classes.
△ Less
Submitted 13 April, 2017;
originally announced April 2017.
-
Pay Attention to Those Sets! Learning Quantification from Images
Authors:
Ionut Sorodoc,
Sandro Pezzelle,
Aurélie Herbelot,
Mariella Dimiccoli,
Raffaella Bernardi
Abstract:
Major advances have recently been made in merging language and vision representations. But most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw data to perform certain types of higher-level reasoning, expressed in natural language b…
▽ More
Major advances have recently been made in merging language and vision representations. But most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw data to perform certain types of higher-level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like 'few', 'some' and 'all'. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in 'most fish are red', most encodes the proportion of fish which are red fish. In this paper, we study how well current language and vision strategies model such relations. We show that state-of-the-art attention mechanisms coupled with a traditional linguistic formalisation of quantifiers gives best performance on the task. Additionally, we provide insights on the role of 'gist' representations in quantification. A 'logical' strategy to tackle the task would be to first obtain a numerosity estimation for the two involved sets and then compare their cardinalities. We however argue that precisely identifying the composition of the sets is not only beyond current state-of-the-art models but perhaps even detrimental to a task that is most efficiently performed by refining the approximate numerosity estimator of the system.
△ Less
Submitted 10 April, 2017;
originally announced April 2017.
-
R-Clustering for Egocentric Video Segmentation
Authors:
Estefania Talavera,
Mariella Dimiccoli,
Marc Bolaños,
Maedeh Aghaei,
Petia Radeva
Abstract:
In this paper, we present a new method for egocentric video temporal segmentation based on integrating a statistical mean change detector and agglomerative clustering(AC) within an energy-minimization framework. Given the tendency of most AC methods to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigor…
▽ More
In this paper, we present a new method for egocentric video temporal segmentation based on integrating a statistical mean change detector and agglomerative clustering(AC) within an energy-minimization framework. Given the tendency of most AC methods to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigorous guarantee of performances. ADWIN serves as a statistical upper bound for the clustering-based video segmentation. We integrate both techniques in an energy-minimization framework that serves to disambiguate the decision of both techniques and to complete the segmentation taking into account the temporal continuity of video frames descriptors. We present experiments over egocentric sets of more than 13.000 images acquired with different wearable cameras, showing that our method outperforms state-of-the-art clustering methods.
△ Less
Submitted 10 April, 2017;
originally announced April 2017.
-
Clothing and People - A Social Signal Processing Perspective
Authors:
Maedeh Aghaei,
Federico Parezzan,
Mariella Dimiccoli,
Petia Radeva,
Marco Cristani
Abstract:
In our society and century, clothing is not anymore used only as a means for body protection. Our paper builds upon the evidence, studied within the social sciences, that clothing brings a clear communicative message in terms of social signals, influencing the impression and behaviour of others towards a person. In fact, clothing correlates with personality traits, both in terms of self-assessment…
▽ More
In our society and century, clothing is not anymore used only as a means for body protection. Our paper builds upon the evidence, studied within the social sciences, that clothing brings a clear communicative message in terms of social signals, influencing the impression and behaviour of others towards a person. In fact, clothing correlates with personality traits, both in terms of self-assessment and assessments that unacquainted people give to an individual. The consequences of these facts are important: the influence of clothing on the decision making of individuals has been investigated in the literature, showing that it represents a discriminative factor to differentiate among diverse groups of people. Unfortunately, this has been observed after cumbersome and expensive manual annotations, on very restricted populations, limiting the scope of the resulting claims. With this position paper, we want to sketch the main steps of the very first systematic analysis, driven by social signal processing techniques, of the relationship between clothing and social signals, both sent and perceived. Thanks to human parsing technologies, which exhibit high robustness owing to deep learning architectures, we are now capable to isolate visual patterns characterising a large types of garments. These algorithms will be used to capture statistical relations on a large corpus of evidence to confirm the sociological findings and to go beyond the state of the art.
△ Less
Submitted 7 April, 2017;
originally announced April 2017.
-
All the people around me: face discovery in egocentric photo-streams
Authors:
Maedeh Aghaei,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatic clustering appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our…
▽ More
Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatic clustering appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, later, localizing the appearance of multiple people in them, and finally, grouping various appearances of the same person across different events. Experimental results performed on a dataset acquired by wearing a photo-camera during one month, demonstrate the effectiveness of the proposed approach for the considered purpose.
△ Less
Submitted 12 May, 2017; v1 submitted 6 March, 2017;
originally announced March 2017.
-
With Whom Do I Interact? Detecting Social Interactions in Egocentric Photo-streams
Authors:
Maedeh Aghaei,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Given a user wearing a low frame rate wearable camera during a day, this work aims to automatically detect the moments when the user gets engaged into a social interaction solely by reviewing the automatically captured photos by the worn camera. The proposed method, inspired by the sociological concept of F-formation, exploits distance and orientation of the appearing individuals -with respect to…
▽ More
Given a user wearing a low frame rate wearable camera during a day, this work aims to automatically detect the moments when the user gets engaged into a social interaction solely by reviewing the automatically captured photos by the worn camera. The proposed method, inspired by the sociological concept of F-formation, exploits distance and orientation of the appearing individuals -with respect to the user- in the scene from a bird-view perspective. As a result, the interaction pattern over the sequence can be understood as a two-dimensional time series that corresponds to the temporal evolution of the distance and orientation features over time. A Long-Short Term Memory-based Recurrent Neural Network is then trained to classify each time series. Experimental evaluation over a dataset of 30.000 images has shown promising results on the proposed method for social interaction detection in egocentric photo-streams.
△ Less
Submitted 12 May, 2017; v1 submitted 13 May, 2016;
originally announced May 2016.
-
SR-Clustering: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation
Authors:
Mariella Dimiccoli,
Marc Bolaños,
Estefania Talavera,
Maedeh Aghaei,
Stavri G. Nikolov,
Petia Radeva
Abstract:
While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time consuming processes. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments. First, contextual and semantic information is extracted for each im…
▽ More
While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time consuming processes. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments. First, contextual and semantic information is extracted for each image by employing a Convolutional Neural Networks approach. Later, by integrating language processing, a vocabulary of concepts is defined in a semantic space. Finally, by exploiting the temporal coherence in photo streams, images which share contextual and semantic attributes are grouped together. The resulting temporal segmentation is particularly suited for further analysis, ranging from activity and event recognition to semantic indexing and summarization. Experiments over egocentric sets of nearly 17,000 images, show that the proposed approach outperforms state-of-the-art methods.
△ Less
Submitted 17 October, 2016; v1 submitted 22 December, 2015;
originally announced December 2015.
-
A Computational Model for Amodal Completion
Authors:
Maria Oliver,
Gloria Haro,
Mariella Dimiccoli,
Baptiste Mazin,
Coloma Ballester
Abstract:
This paper presents a computational model to recover the most likely interpretation of the 3D scene structure from a planar image, where some objects may occlude others. The estimated scene interpretation is obtained by integrating some global and local cues and provides both the complete disoccluded objects that form the scene and their ordering according to depth. Our method first computes sever…
▽ More
This paper presents a computational model to recover the most likely interpretation of the 3D scene structure from a planar image, where some objects may occlude others. The estimated scene interpretation is obtained by integrating some global and local cues and provides both the complete disoccluded objects that form the scene and their ordering according to depth. Our method first computes several distal scenes which are compatible with the proximal planar image. To compute these different hypothesized scenes, we propose a perceptually inspired object disocclusion method, which works by minimizing the Euler's elastica as well as by incorporating the relatability of partially occluded contours and the convexity of the disoccluded objects. Then, to estimate the preferred scene we rely on a Bayesian model and define probabilities taking into account the global complexity of the objects in the hypothesized scenes as well as the effort of bringing these objects in their relative position in the planar image, which is also measured by an Euler's elastica-based quantity. The model is illustrated with numerical experiments on, both, synthetic and real images showing the ability of our model to reconstruct the occluded objects and the preferred perceptual order among them. We also present results on images of the Berkeley dataset with provided figure-ground ground-truth labeling.
△ Less
Submitted 29 March, 2016; v1 submitted 26 November, 2015;
originally announced November 2015.
-
Semantic Summarization of Egocentric Photo Stream Events
Authors:
Aniol Lidon,
Marc Bolaños,
Mariella Dimiccoli,
Petia Radeva,
Maite Garolera,
Xavier Giró-i-Nieto
Abstract:
With the rapid increase of users of wearable cameras in recent years and of the amount of data they produce, there is a strong need for automatic retrieval and summarization techniques. This work addresses the problem of automatically summarizing egocentric photo streams captured through a wearable camera by taking an image retrieval perspective. After removing non-informative images by a new CNN-…
▽ More
With the rapid increase of users of wearable cameras in recent years and of the amount of data they produce, there is a strong need for automatic retrieval and summarization techniques. This work addresses the problem of automatically summarizing egocentric photo streams captured through a wearable camera by taking an image retrieval perspective. After removing non-informative images by a new CNN-based filter, images are ranked by relevance to ensure semantic diversity and finally re-ranked by a novelty criterion to reduce redundancy. To assess the results, a new evaluation metric is proposed which takes into account the non-uniqueness of the solution. Experimental results applied on a database of 7,110 images from 6 different subjects and evaluated by experts gave 95.74% of experts satisfaction and a Mean Opinion Score of 4.57 out of 5.0. Source code is available at https://github.com/imatge-upc/egocentric-2017-lta
△ Less
Submitted 18 August, 2017; v1 submitted 2 November, 2015;
originally announced November 2015.
-
Active skeleton for bacteria modeling
Authors:
Jean-Pascal Jacob,
Mariella Dimiccoli,
Lionel Moisan
Abstract:
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes very ambiguous the c…
▽ More
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes very ambiguous the correspondence between aggregates and cells, since computing accurately bacteria boundaries in phase-contrast time-lapse imaging is a challenging task. This paper proposes an active skeleton formulation for bacteria modeling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness, orientation), an improved boundary accuracy in noisy images, and a natural bacteria-centered coordinate system that permits the intrinsic location of molecular components inside the cell. Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimizing an energy function which incorporates bacteria shape constraints. Experimental results on biological images and comparative evaluation of the performances validate the proposed approach for modeling cigar-shaped bacteria like Escherichia coli. The Image-J plugin of the proposed method can be found online at http://fluobactracker.inrialpes.fr.
△ Less
Submitted 22 December, 2016; v1 submitted 23 July, 2015;
originally announced July 2015.
-
Particle detection and tracking in fluorescence time-lapse imaging: a contrario approach
Authors:
Mariella Dimiccoli,
Jean-Pascal Jacob,
Lionel Moisan
Abstract:
This paper proposes a probabilistic approach for the detection and the tracking of particles in fluorescent time-lapse imaging. In the presence of a very noised and poor-quality data, particles and trajectories can be characterized by an a contrario model, that estimates the probability of observing the structures of interest in random data. This approach, first introduced in the modeling of human…
▽ More
This paper proposes a probabilistic approach for the detection and the tracking of particles in fluorescent time-lapse imaging. In the presence of a very noised and poor-quality data, particles and trajectories can be characterized by an a contrario model, that estimates the probability of observing the structures of interest in random data. This approach, first introduced in the modeling of human visual perception and then successfully applied in many image processing tasks, leads to algorithms that neither require a previous learning stage, nor a tedious parameter tuning and are very robust to noise. Comparative evaluations against a well-established baseline show that the proposed approach outperforms the state of the art.
△ Less
Submitted 22 December, 2016; v1 submitted 22 July, 2015;
originally announced July 2015.
-
Towards Storytelling from Visual Lifelogging: An Overview
Authors:
Marc Bolaños,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives, hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. How…
▽ More
Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives, hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis, and in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging.
△ Less
Submitted 20 July, 2016; v1 submitted 22 July, 2015;
originally announced July 2015.
-
Multi-Face Tracking by Extended Bag-of-Tracklets in Egocentric Videos
Authors:
Maedeh Aghaei,
Mariella Dimiccoli,
Petia Radeva
Abstract:
Wearable cameras offer a hands-free way to record egocentric images of daily experiences, where social events are of special interest. The first step towards detection of social events is to track the appearance of multiple persons involved in it. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric videos acquired through a weara…
▽ More
Wearable cameras offer a hands-free way to record egocentric images of daily experiences, where social events are of special interest. The first step towards detection of social events is to track the appearance of multiple persons involved in it. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric videos acquired through a wearable camera. This kind of photo-stream imposes additional challenges to the multi-tracking problem with respect to conventional videos. Due to the free motion of the camera and to its low temporal resolution, abrupt changes in the field of view, in illumination condition and in the target location are highly frequent. To overcome such difficulties, we propose a multi-face tracking method that generates a set of tracklets through finding correspondences along the whole sequence for each detected face and takes advantage of the tracklets redundancy to deal with unreliable ones. Similar tracklets are grouped into the so called extended bag-of-tracklets (eBoT), which is aimed to correspond to a specific person. Finally, a prototype tracklet is extracted for each eBoT, where the occurred occlusions are estimated by relying on a new measure of confidence. We validated our approach over an extensive dataset of egocentric photo-streams and compared it to state of the art methods, demonstrating its effectiveness and robustness.
△ Less
Submitted 13 January, 2016; v1 submitted 16 July, 2015;
originally announced July 2015.
-
Localization of protein aggregation in Escherichia coli is governed by diffusion and nucleoid macromolecular crowding effect
Authors:
Anne-Sophie Coquel,
Jean-Pascal Jacob,
Maël Primet,
Alice Demarez,
Mariella Dimiccoli,
Thomas Julou,
Lionel Moisan,
Ariel B. Lindner,
Hugues Berry
Abstract:
Aggregates of misfolded proteins are a hallmark of many age-related diseases. Recently, they have been linked to aging of Escherichia coli (E. coli) where protein aggregates accumulate at the old pole region of the aging bacterium. Because of the potential of E. coli as a model organism, elucidating aging and protein aggregation in this bacterium may pave the way to significant advances in our glo…
▽ More
Aggregates of misfolded proteins are a hallmark of many age-related diseases. Recently, they have been linked to aging of Escherichia coli (E. coli) where protein aggregates accumulate at the old pole region of the aging bacterium. Because of the potential of E. coli as a model organism, elucidating aging and protein aggregation in this bacterium may pave the way to significant advances in our global understanding of aging. A first obstacle along this path is to decipher the mechanisms by which protein aggregates are targeted to specific intercellular locations. Here, using an integrated approach based on individual-based modeling, time-lapse fluorescence microscopy and automated image analysis, we show that the movement of aging-related protein aggregates in E. coli is purely diffusive (Brownian). Using single-particle tracking of protein aggregates in live E. coli cells, we estimated the average size and diffusion constant of the aggregates. Our results evidence that the aggregates passively diffuse within the cell, with diffusion constants that depend on their size in agreement with the Stokes-Einstein law. However, the aggregate displacements along the cell long axis are confined to a region that roughly corresponds to the nucleoid-free space in the cell pole, thus confirming the importance of increased macromolecular crowding in the nucleoids. We thus used 3d individual-based modeling to show that these three ingredients (diffusion, aggregation and diffusion hindrance in the nucleoids) are sufficient and necessary to reproduce the available experimental data on aggregate localization in the cells. Taken together, our results strongly support the hypothesis that the localization of aging-related protein aggregates in the poles of E. coli results from the coupling of passive diffusion- aggregation with spatially non-homogeneous macromolecular crowding. They further support the importance of "soft" intracellular structuring (based on macromolecular crowding) in diffusion-based protein localization in E. coli.
△ Less
Submitted 8 March, 2013;
originally announced March 2013.