-
Reducing fuzzy relation equations via concept lattices
Authors:
David Lobo,
Víctor López-Marchante,
Jesús Medina
Abstract:
This paper takes advantage of the relationship between Fuzzy Relation Equations (FRE) and Concept Lattices to introduce a procedure for reducing an FRE without losing information. Specifically, attribute reduction theory in property-oriented and object-oriented concept lattices is used to present a mechanism for detecting redundant equations. As a first consequence, the computation of the whole solution set of a solvable FRE is reduced. Moreover, we also introduce a novel method for computing approximate solutions of unsolvable FRE related to a (real) dataset with uncertain or imprecise data.
Submitted 8 October, 2024;
originally announced October 2024.
-
Extended multi-adjoint logic programming
Authors:
M. Eugenia Cornejo,
David Lobo,
Jesús Medina
Abstract:
Extended multi-adjoint logic programming arises as an extension of multi-adjoint normal logic programming in which constraints and a special type of aggregator operator have been included. This general aggregator operator makes it possible to consider, for example, different negation operators in the body of the rules of a logic program. We introduce the syntax and the semantics of this new paradigm, as well as a mechanism for obtaining a multi-adjoint normal logic program from an extended multi-adjoint logic program. This mechanism allows us to establish technical properties relating the stable models of both logic programming frameworks. Moreover, it makes it possible to apply the already developed and future theory associated with stable models of multi-adjoint normal logic programs to extended multi-adjoint logic programs.
Submitted 7 October, 2024;
originally announced October 2024.
-
Bipolar fuzzy relation equations systems based on the product t-norm
Authors:
M. Eugenia Cornejo,
David Lobo,
Jesús Medina
Abstract:
Bipolar fuzzy relation equations arise as a generalization of fuzzy relation equations in which unknown variables are considered together with their logical negations. The simultaneous occurrence of a variable and its negation can give very useful information in frameworks where human reasoning plays a key role. Hence, the resolution of systems of bipolar fuzzy relation equations is a research topic of great interest.
This paper focuses on the study of systems of bipolar fuzzy relation equations based on the max-product t-norm composition. Specifically, the solvability and the algebraic structure of the set of solutions of these systems are studied, including the case in which the systems are composed of equations whose independent term is equal to zero. As a consequence, this paper complements the authors' earlier contribution on the solvability of bipolar max-product fuzzy relation equations.
Submitted 24 September, 2024;
originally announced October 2024.
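For orientation, a system of bipolar max-product fuzzy relation equations of the kind described in the abstract above is commonly written in the following generic form. The symbols $a_{ij}^{+}$, $a_{ij}^{-}$, $b_i$, the negation operator $n$, and the index bounds $m$ and $k$ are notation assumed here for illustration; the abstract itself does not fix them, nor does it state which negation operator is used.

```latex
\bigvee_{j=1}^{m} \Bigl( \bigl(a_{ij}^{+} \cdot x_j\bigr) \vee \bigl(a_{ij}^{-} \cdot n(x_j)\bigr) \Bigr) = b_i,
\qquad i \in \{1, \dots, k\},
\qquad a_{ij}^{+},\, a_{ij}^{-},\, b_i,\, x_j \in [0, 1]
```

In this notation, an equation whose independent term is equal to zero is one with $b_i = 0$.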
-
Syntax and semantics of multi-adjoint normal logic programming
Authors:
M. Eugenia Cornejo,
David Lobo,
Jesús Medina
Abstract:
Multi-adjoint logic programming is a general framework with interesting features, which encompasses other positive logic programming frameworks such as monotonic and residuated logic programming, generalized annotated logic programs, fuzzy logic programming and possibilistic logic programming. One of the most interesting extensions of this framework is the possibility of considering a negation operator in the logic programs, which improves its flexibility and widens its range of real applications.
This paper introduces multi-adjoint normal logic programming, which is an extension of multi-adjoint logic programming including a negation operator in the underlying lattice. Besides introducing the syntax and semantics of this paradigm, we provide sufficient conditions for the existence of stable models defined on a convex compact set of a Euclidean space. Finally, we consider a particular algebraic structure in which sufficient conditions can be given to ensure the uniqueness of stable models of multi-adjoint normal logic programs.
Submitted 24 September, 2024; v1 submitted 23 September, 2024;
originally announced September 2024.
-
SDLNet: Statistical Deep Learning Network for Co-Occurring Object Detection and Identification
Authors:
Binay Kumar Singh,
Niels Da Vitoria Lobo
Abstract:
Despite the growing advances in deep learning based technologies, the detection and identification of co-occurring objects remains a challenging task with many applications in areas such as security and surveillance. In this paper, we propose a novel framework called SDLNet (Statistical analysis with Deep Learning Network) that identifies co-occurring objects in conjunction with base objects in multilabel object categories. The pipeline of the proposed work is implemented in two stages: in the first stage of SDLNet we use multilabel detectors to discover labels, and in the second stage we perform co-occurrence matrix analysis. In co-occurrence matrix analysis, we learn co-occurrence statistics by setting base classes and frequently occurring classes; we then build association rules and generate frequent patterns. The crucial part of SDLNet is recognizing base classes and accounting for co-occurring classes. Finally, the co-occurrence matrix generated from the frequent patterns shows the base classes and their corresponding co-occurring classes. SDLNet is evaluated on two publicly available datasets: Pascal VOC and MS-COCO. The experimental results on these benchmark datasets are reported in Sec. 4.
Submitted 24 July, 2024;
originally announced July 2024.
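The second stage described in the SDLNet abstract above (co-occurrence statistics, a base class, and association rules) can be illustrated with a minimal sketch. It assumes the first stage has already produced one list of labels per image; the function names `cooccurrence` and `cooccurring_with`, the confidence threshold `min_conf`, and the toy labels are hypothetical and not taken from the paper, and a plain confidence threshold stands in for whatever rule-mining procedure the authors actually use.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(label_sets):
    """Count per-class occurrences and pairwise co-occurrences over images.

    label_sets: iterable of label lists, one list per image (assumed to be
    the output of a multilabel detector).
    """
    counts = Counter()  # class -> number of images containing it
    pairs = Counter()   # (class_a, class_b), sorted -> number of shared images
    for labels in label_sets:
        unique = sorted(set(labels))  # ignore repeated detections in one image
        counts.update(unique)
        pairs.update(combinations(unique, 2))
    return counts, pairs


def cooccurring_with(base, counts, pairs, min_conf):
    """Classes c whose rule confidence count(base, c) / count(base) >= min_conf."""
    result = {}
    for (a, b), n in pairs.items():
        if base in (a, b):
            other = b if a == base else a
            confidence = n / counts[base]
            if confidence >= min_conf:
                result[other] = confidence
    return result


# Toy data: hypothetical detector output for four images.
images = [
    ["person", "bicycle"],
    ["person", "bicycle", "car"],
    ["person", "dog"],
    ["car"],
]
counts, pairs = cooccurrence(images)
base_class = counts.most_common(1)[0][0]  # base class = most frequent label
frequent = cooccurring_with(base_class, counts, pairs, min_conf=0.5)
print(base_class, frequent)
```

On the toy data, "person" is the base class and only "bicycle" clears the 0.5 confidence threshold, since it appears in two of the three images that contain a person.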
-
Co-Occurring of Object Detection and Identification towards unlabeled object discovery
Authors:
Binay Kumar Singh,
Niels Da Vitoria Lobo
Abstract:
In this paper, we propose a novel deep learning based approach for identifying co-occurring objects in conjunction with base objects in multilabel object categories. With the advancement of computer vision based techniques, knowing which objects co-occur with a given base object is needed for various purposes. The pipeline of the proposed work is composed of two stages: in the first stage of the proposed model we detect all the bounding boxes present in the image and their corresponding labels, and in the second stage we perform co-occurrence matrix analysis. In co-occurrence matrix analysis, we set base classes based on the maximum occurrences of the labels, build association rules, and generate frequent patterns. These frequent patterns show the base classes and their corresponding co-occurring classes. We performed our experiments on two publicly available datasets: Pascal VOC and MS-COCO. The experimental results on these public benchmark datasets are reported in Sec. 4. Further, we extend this work by considering all frequently occurring objects as unlabeled, including the case in which they are also occluded.
Submitted 25 March, 2024;
originally announced March 2024.
-
Weakly Supervised Grounding for VQA in Vision-Language Transformers
Authors:
Aisha Urooj Khan,
Hilde Kuehne,
Chuang Gan,
Niels Da Vitoria Lobo,
Mubarak Shah
Abstract:
Transformers for visual-language representation learning have attracted a lot of interest and shown tremendous performance on visual question answering (VQA) and grounding. But most systems that perform well on those tasks still rely on pre-trained object detectors during training, which limits their applicability to the object classes available for those detectors. To mitigate this limitation, this paper focuses on the problem of weakly supervised grounding in the context of visual question answering in transformers. The approach leverages capsules by grouping each visual token in the visual encoder and uses activations from language self-attention layers as a text-guided selection module to mask those capsules before they are forwarded to the next layer. We evaluate our approach on the challenging GQA dataset as well as the VQA-HAT dataset for VQA grounding. Our experiments show that, while removing the information of masked objects from standard transformer architectures leads to a significant drop in performance, the integration of capsules significantly improves the grounding ability of such systems and provides new state-of-the-art results compared to other approaches in the field.
Submitted 5 July, 2022;
originally announced July 2022.
-
Event Camera Based Real-Time Detection and Tracking of Indoor Ground Robots
Authors:
Himanshu Patel,
Craig Iaboni,
Deepan Lobo,
Ji-won Choi,
Pramod Abichandani
Abstract:
This paper presents a real-time method to detect and track multiple mobile ground robots using event cameras. The method uses density-based spatial clustering of applications with noise (DBSCAN) to detect the robots and a single k-dimensional ($k$-d) tree to accurately keep track of them as they move in an indoor arena. Robust detections and tracks are maintained in the face of event camera noise and lack of events (due to robots moving slowly or stopping). An off-the-shelf RGB camera-based tracking system was used to provide ground truth. Experiments including up to 4 robots are performed to study the effect of i) varying DBSCAN parameters, ii) the event accumulation time, iii) the number of robots in the arena, iv) the speed of the robots, and v) variation in ambient light conditions on the detection and tracking performance. The experimental results showed 100% detection and tracking fidelity in the face of event camera noise and robots stopping for tests involving up to 3 robots (and upwards of 93% for 4 robots). When the lighting conditions were varied, a graceful degradation in detection and tracking fidelity was observed.
Submitted 2 August, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
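The detect-then-track idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes accumulated events are available as 2-D pixel coordinates, it uses a small pure-Python DBSCAN, and it replaces the k-d tree with a brute-force nearest-neighbour lookup for brevity. The names `dbscan`, `centroids`, `nearest_track` and the parameter values are hypothetical.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbors(i):
        # Brute-force range query (includes the point itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


def centroids(points, labels):
    """Mean position of each cluster, in order of cluster discovery."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        if lab == -1:
            continue
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return [(sx / n, sy / n) for sx, sy, n in sums.values()]


def nearest_track(tracks, detection):
    """Index of the existing track closest to a new detection."""
    return min(range(len(tracks)), key=lambda i: dist(tracks[i], detection))


# Toy event coordinates: two dense blobs (robots) and one isolated noise event.
events = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(events, eps=1.5, min_pts=3)
tracks = centroids(events, labels)
print(labels, tracks, nearest_track(tracks, (9.8, 10.1)))
```

A k-d tree, as named in the abstract, would serve the same role as `nearest_track` while avoiding the linear scan over all tracks.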
-
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering
Authors:
Aisha Urooj Khan,
Amir Mazaheri,
Niels da Vitoria Lobo,
Mubarak Shah
Abstract:
We present MMFT-BERT (MultiModal Fusion Transformer with BERT encodings) to solve Visual Question Answering (VQA) while ensuring individual and combined processing of multiple input modalities. Our approach processes multimodal data (video and text) by adopting BERT encodings individually and using a novel transformer-based fusion method to fuse them together. Our method decomposes the different sources of modalities into different BERT instances with similar architectures but variable weights. This achieves SOTA results on the TVQA dataset. Additionally, we provide TVQA-Visual, an isolated diagnostic subset of TVQA, which strictly requires knowledge of the visual (V) modality based on a human annotator's judgment. This set of questions helps us to study the model's behavior and the challenges TVQA poses that prevent the achievement of superhuman performance. Extensive experiments show the effectiveness and superiority of our method.
Submitted 27 October, 2020;
originally announced October 2020.
-
Deep Photo Cropper and Enhancer
Authors:
Aaron Ott,
Amir Mazaheri,
Niels D. Lobo,
Mubarak Shah
Abstract:
This paper introduces a new type of image enhancement problem. Compared to traditional image enhancement methods, which mostly deal with pixel-wise modifications of a given photo, our proposed task is to crop an image which is embedded within a photo and enhance the quality of the cropped image. We split our proposed approach into two deep networks: deep photo cropper and deep image enhancer. In the photo cropper network, we employ a spatial transformer to extract the embedded image. In the image enhancer, we employ super-resolution to increase the number of pixels in the embedded image and reduce the effect of stretching and distortion of pixels. We use cosine distance loss between image features and ground truth for the cropper and the mean square loss for the enhancer. Furthermore, we propose a new dataset to train and test the proposed method. Finally, we analyze the proposed method with respect to qualitative and quantitative evaluations.
Submitted 2 August, 2020;
originally announced August 2020.
-
Text Synopsis Generation for Egocentric Videos
Authors:
Aidean Sharghi,
Niels da Vitoria Lobo,
Mubarak Shah
Abstract:
Mass utilization of body-worn cameras has led to a huge corpus of available egocentric video. Existing video summarization algorithms can accelerate browsing such videos by selecting (visually) interesting shots from them. Nonetheless, since the system user still has to watch the summary videos, browsing large video databases remains a challenge. Hence, in this work, we propose to generate a textual synopsis, consisting of a few sentences describing the most important events in a long egocentric video. Users can read the short text to gain insight about the video and, more importantly, efficiently search through the content of a large video database using text queries. Since egocentric videos are long and contain many activities and events, using video-to-text algorithms results in thousands of descriptions, many of which are incorrect. Therefore, we propose a multi-task learning scheme to simultaneously generate descriptions for video segments and summarize the resulting descriptions in an end-to-end fashion. We input a set of video shots and the network generates a text description for each shot. Next, a visual-language content matching unit, trained with a weakly supervised objective, identifies the correct descriptions. Finally, the last component of our network, called the purport network, evaluates the descriptions all together to select the ones containing crucial information. Out of thousands of descriptions generated for the video, a few informative sentences are returned to the user. We validate our framework on the challenging UT Egocentric video dataset, where each video is between 3 and 5 hours long and is associated with over 3000 textual descriptions on average. The generated textual summaries, including only 5 percent (or less) of the generated descriptions, are compared to ground-truth summaries in the text domain using well-established metrics in natural language processing.
Submitted 21 September, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Understanding Trajectory Behavior: A Motion Pattern Approach
Authors:
Mahdi M. Kalayeh,
Stephen Mussmann,
Alla Petrakova,
Niels da Vitoria Lobo,
Mubarak Shah
Abstract:
Mining the underlying patterns in gigantic and complex data is of great importance to data analysts. In this paper, we propose a motion pattern approach to mine frequent behaviors in trajectory data. Motion patterns, defined by a set of highly similar flow vector groups in a spatial locality, have been shown to be very effective in extracting dominant motion behaviors in video sequences. Inspired by applications and properties of motion patterns, we have designed a framework that successfully solves the general task of trajectory clustering. Our proposed algorithm consists of four phases: flow vector computation, motion component extraction, motion component reachability set creation, and motion pattern formation. For the first phase, we break down trajectories into flow vectors that indicate instantaneous movements. In the second phase, via a k-means clustering approach, we create motion components by clustering the flow vectors with respect to their location and velocity. Next, we create the motion components' reachability sets in terms of spatial proximity and motion similarity. Finally, for the fourth phase, we cluster motion components using agglomerative clustering with the weighted Jaccard distance between the motion components' signatures, a set created using path reachability. We have evaluated the effectiveness of our proposed method in an extensive set of experiments on diverse datasets. Further, we have shown how our proposed method handles difficulties in the general task of trajectory clustering that challenge the existing state-of-the-art methods.
Submitted 3 January, 2015;
originally announced January 2015.
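The weighted Jaccard distance mentioned in the fourth phase of the abstract above has a standard definition: one minus the ratio of summed element-wise minima to summed element-wise maxima. The sketch below uses that standard definition. Representing a signature as a dictionary from motion component id to a non-negative weight is an assumption made for illustration, as are the function name and the toy values; the abstract does not say how signatures are stored.

```python
def weighted_jaccard_distance(u, v):
    """Weighted Jaccard distance between two non-negative weight maps.

    u, v: dicts mapping an item id to a non-negative weight; missing ids
    count as weight 0. Returns a value in [0, 1].
    """
    keys = set(u) | set(v)
    overlap = sum(min(u.get(k, 0.0), v.get(k, 0.0)) for k in keys)
    total = sum(max(u.get(k, 0.0), v.get(k, 0.0)) for k in keys)
    if total == 0:
        return 0.0  # two empty signatures are treated as identical
    return 1.0 - overlap / total


# Hypothetical signatures of two motion components.
sig_a = {"c1": 1.0, "c2": 0.5}
sig_b = {"c1": 0.5, "c3": 0.5}
print(weighted_jaccard_distance(sig_a, sig_b))  # overlap 0.5, total 2.0
```

A pairwise matrix of these distances is the kind of input that agglomerative clustering routines typically accept.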