-
A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores
Authors:
Leonardo Guerreiro Azevedo,
Renan Francisco Santos Souza,
Elton F. de S. Soares,
Raphael M. Thiago,
Julio Cesar Cardoso Tesolin,
Ann C. Oliveira,
Marcio Ferreira Moreno
Abstract:
Modern applications commonly need to manage dataset types composed of heterogeneous data and schemas, making it difficult to access them in an integrated way. A single data store to manage heterogeneous data using a common data model is not effective in such a scenario, which results in the domain data being fragmented in the data stores that best fit their storage and access requirements (e.g., N…
▽ More
Modern applications commonly need to manage dataset types composed of heterogeneous data and schemas, making it difficult to access them in an integrated way. A single data store to manage heterogeneous data using a common data model is not effective in such a scenario, which results in the domain data being fragmented in the data stores that best fit their storage and access requirements (e.g., NoSQL, relational DBMS, or HDFS). Besides, organization workflows independently consume these fragments, and usually, there is no explicit link among the fragments that would be useful to support an integrated view. The research challenge tackled by this work is to provide the means to query heterogeneous data residing on distinct data repositories that are not explicitly connected. We propose a federated database architecture by providing a single abstract global conceptual schema to users, allowing them to write their queries, encapsulating data heterogeneity, location, and linkage by employing: (i) meta-models to represent the global conceptual schema, the remote data local conceptual schemas, and mappings among them; (ii) provenance to create explicit links among the consumed and generated data residing in separate datasets. We evaluated the architecture through its implementation as a polystore service, following a microservice architecture approach, in a scenario that simulates a real case in Oil \& Gas industry. Also, we compared the proposed architecture to a relational multidatabase system based on foreign data wrappers, measuring the user's cognitive load to write a query (or query complexity) and the query processing time. The results demonstrated that the proposed architecture allows query writing two times less complex than the one written for the relational multidatabase system, adding an excess of no more than 30% in query processing time.
△ Less
Submitted 15 March, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
A Knowledge-Oriented Approach to Enhance Integration and Communicability in the Polkadot Ecosystem
Authors:
Marcio Ferreira Moreno,
Rafael Rossi de Mello Brandão
Abstract:
The Polkadot ecosystem is a disruptive and highly complex multi-chain architecture that poses challenges in terms of data analysis and communicability. Currently, there is a lack of standardized and holistic approaches to retrieve and analyze data across parachains and applications, making it difficult for general users and developers to access ecosystem data consistently. This paper proposes a co…
▽ More
The Polkadot ecosystem is a disruptive and highly complex multi-chain architecture that poses challenges in terms of data analysis and communicability. Currently, there is a lack of standardized and holistic approaches to retrieve and analyze data across parachains and applications, making it difficult for general users and developers to access ecosystem data consistently. This paper proposes a conceptual framework that includes a domain ontology called POnto (a Polkadot Ontology) to address these challenges. POnto provides a structured representation of the ecosystem's concepts and relationships, enabling a formal understanding of the platform. The proposed knowledge-oriented approach enhances integration and communicability, enabling a wider range of users to participate in the ecosystem and facilitating the development of AI-based applications. The paper presents a case study methodology to validate the proposed framework, which includes expert feedback and insights from the Polkadot community. The POnto ontology and the roadmap for a query engine based on a Controlled Natural Language using the ontology, provide valuable contributions to the growth and adoption of the Polkadot ecosystem in heterogeneous socio-technical environments.
△ Less
Submitted 1 August, 2023;
originally announced August 2023.
-
Bridging the Gap between Semantics and Multimedia Processing
Authors:
Marcio Ferreira Moreno,
Guilherme Lima,
Rodrigo Costa Mesquita Santos,
Roberto Azevedo,
Markus Endler
Abstract:
In this paper, we give an overview of the semantic gap problem in multimedia and discuss how machine learning and symbolic AI can be combined to narrow this gap. We describe the gap in terms of a classical architecture for multimedia processing and discuss a structured approach to bridge it. This approach combines machine learning (for mapping signals to objects) and symbolic AI (for linking objec…
▽ More
In this paper, we give an overview of the semantic gap problem in multimedia and discuss how machine learning and symbolic AI can be combined to narrow this gap. We describe the gap in terms of a classical architecture for multimedia processing and discuss a structured approach to bridge it. This approach combines machine learning (for mapping signals to objects) and symbolic AI (for linking objects to meanings). Our main goal is to raise awareness and discuss the challenges involved in this structured approach to multimedia understanding, especially in the view of the latest developments in machine learning and symbolic AI.
△ Less
Submitted 2 December, 2019; v1 submitted 25 November, 2019;
originally announced November 2019.
-
An Introduction to Symbolic Artificial Intelligence Applied to Multimedia
Authors:
Guilherme Lima,
Rodrigo Costa,
Marcio Ferreira Moreno
Abstract:
In this chapter, we give an introduction to symbolic artificial intelligence (AI) and discuss its relation and application to multimedia. We begin by defining what symbolic AI is, what distinguishes it from non-symbolic approaches, such as machine learning, and how it can used in the construction of advanced multimedia applications. We then introduce description logic (DL) and use it to discuss sy…
▽ More
In this chapter, we give an introduction to symbolic artificial intelligence (AI) and discuss its relation and application to multimedia. We begin by defining what symbolic AI is, what distinguishes it from non-symbolic approaches, such as machine learning, and how it can used in the construction of advanced multimedia applications. We then introduce description logic (DL) and use it to discuss symbolic representation and reasoning. DL is the logical underpinning of OWL, the most successful family of ontology languages. After discussing DL, we present OWL and related Semantic Web technologies, such as RDF and SPARQL. We conclude the chapter by discussing a hybrid model for multimedia representation, called Hyperknowledge. Throughout the text, we make references to technologies and extensions specifically designed to solve the kinds of problems that arise in multimedia representation.
△ Less
Submitted 28 November, 2019; v1 submitted 21 November, 2019;
originally announced November 2019.
-
Multimedia Search and Temporal Reasoning
Authors:
Marcio Ferreira Moreno,
Rodrigo Costa Mesquita Santos,
Wallas Henrique Sousa dos Santos,
Sandro Rama Fiorini,
Reinaldo Mozart da Gama Silva
Abstract:
Properly modelling dynamic information that changes over time still is an open issue. Most modern knowledge bases are unable to represent relationships that are valid only during a given time interval. In this work, we revisit a previous extension to the hyperknowledge framework to deal with temporal facts and propose a temporal query language and engine. We validate our proposal by discussing a q…
▽ More
Properly modelling dynamic information that changes over time still is an open issue. Most modern knowledge bases are unable to represent relationships that are valid only during a given time interval. In this work, we revisit a previous extension to the hyperknowledge framework to deal with temporal facts and propose a temporal query language and engine. We validate our proposal by discussing a qualitative analysis of the modelling of a real-world use case in the Oil & Gas industry.
△ Less
Submitted 19 November, 2019;
originally announced November 2019.
-
General Fragment Model for Information Artifacts
Authors:
Sandro Rama Fiorini,
Wallas Sousa dos Santos,
Rodrigo Costa Mesquita,
Guilherme Ferreira Lima,
Marcio F. Moreno
Abstract:
The use of semantic descriptions in data intensive domains require a systematic model for linking semantic descriptions with their manifestations in fragments of heterogeneous information and data objects. Such information heterogeneity requires a fragment model that is general enough to support the specification of anchors from conceptual models to multiple types of information artifacts. While d…
▽ More
The use of semantic descriptions in data intensive domains require a systematic model for linking semantic descriptions with their manifestations in fragments of heterogeneous information and data objects. Such information heterogeneity requires a fragment model that is general enough to support the specification of anchors from conceptual models to multiple types of information artifacts. While diverse proposals of anchoring models exist in the literature, they are usually focused in audiovisual information. We propose a generalized fragment model that can be instantiated to different kinds of information artifacts. Our objective is to systematize the way in which fragments and anchors can be described in conceptual models, without committing to a specific vocabulary.
△ Less
Submitted 9 September, 2019;
originally announced September 2019.