-
International AI Safety Report
Authors:
Yoshua Bengio,
Sören Mindermann,
Daniel Privitera,
Tamay Besiroglu,
Rishi Bommasani,
Stephen Casper,
Yejin Choi,
Philip Fox,
Ben Garfinkel,
Danielle Goldfarb,
Hoda Heidari,
Anson Ho,
Sayash Kapoor,
Leila Khalatbari,
Shayne Longpre,
Sam Manning,
Vasilios Mavroudis,
Mantas Mazeika,
Julian Michael,
Jessica Newman,
Kwan Yee Ng,
Chinasa T. Okolo,
Deborah Raji,
Girish Sastry,
Elizabeth Seger
, et al. (71 additional authors not shown)
Abstract:
The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, repr…
▽ More
The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.
△ Less
Submitted 29 January, 2025;
originally announced January 2025.
-
Characterization of $n$-Dimensional Toric and Burst-Error-Correcting Quantum Codes from Lattice Codes
Authors:
Cibele Cristina Trinca,
Reginaldo Palazzo Jr.,
J. Carmelo Interlando,
Ricardo Augusto Watanabe,
Clarice Dias de Albuquerque,
Edson Donizete de Carvalho,
Antonio Aparecido de Andrade
Abstract:
Quantum error correction is essential for the development of any scalable quantum computer. In this work we introduce a generalization of a quantum interleaving method for combating clusters of errors in toric quantum error-correcting codes. We present new $n$-dimensional toric quantum codes, where $n\geq 5$, which are featured by lattice codes and apply the proposed quantum interleaving method to…
▽ More
Quantum error correction is essential for the development of any scalable quantum computer. In this work we introduce a generalization of a quantum interleaving method for combating clusters of errors in toric quantum error-correcting codes. We present new $n$-dimensional toric quantum codes, where $n\geq 5$, which are featured by lattice codes and apply the proposed quantum interleaving method to such new $n$-dimensional toric quantum codes. Through the application of this method to these novel $n$-dimensional toric quantum codes we derive new $n$-dimensional quantum burst-error-correcting codes. Consequently, $n$-dimensional toric quantum codes and burst-error-correcting quantum codes are provided offering both a good code rate and a significant coding gain when it comes to toric quantum codes. Another important consequence from the presented $n$-dimensional toric quantum codes is that if the Golomb and Welch conjecture in \cite{perfcodes} regarding the Lee sphere in $n$ dimensions for the respective close packings holds true, then it follows that these $n$-dimensional toric quantum codes are the only possible ones to be obtained from lattice codes. Moreover, such a methodology can be applied for burst error correction in cases involving localized errors, quantum data storage and quantum channels with memory.
△ Less
Submitted 26 October, 2024;
originally announced October 2024.
-
Livestock Fish Larvae Counting using DETR and YOLO based Deep Networks
Authors:
Daniel Ortega de Carvalho,
Luiz Felipe Teodoro Monteiro,
Fernanda Marques Bazilio,
Gabriel Toshio Hirokawa Higa,
Hemerson Pistori
Abstract:
Counting fish larvae is an important, yet demanding and time consuming, task in aquaculture. In order to address this problem, in this work, we evaluate four neural network architectures, including convolutional neural networks and transformers, in different sizes, in the task of fish larvae counting. For the evaluation, we present a new annotated image dataset with less data collection requiremen…
▽ More
Counting fish larvae is an important, yet demanding and time consuming, task in aquaculture. In order to address this problem, in this work, we evaluate four neural network architectures, including convolutional neural networks and transformers, in different sizes, in the task of fish larvae counting. For the evaluation, we present a new annotated image dataset with less data collection requirements than preceding works, with images of spotted sorubim and dourado larvae. By using image tiling techniques, we achieve a MAPE of 4.46% ($\pm 4.70$) with an extra large real time detection transformer, and 4.71% ($\pm 4.98$) with a medium-sized YOLOv8.
△ Less
Submitted 9 August, 2024;
originally announced August 2024.
-
An agent-based approach to procedural city generation incorporating Land Use and Transport Interaction models
Authors:
Luiz Fernando Silva Eugênio dos Santos,
Claus Aranha,
André Ponce de Leon F de Carvalho
Abstract:
We apply the knowledge of urban settings established with the study of Land Use and Transport Interaction (LUTI) models to develop reward functions for an agent-based system capable of planning realistic artificial cities. The system aims to replicate in the micro scale the main components of real settlements, such as zoning and accessibility in a road network. Moreover, we propose a novel represe…
▽ More
We apply the knowledge of urban settings established with the study of Land Use and Transport Interaction (LUTI) models to develop reward functions for an agent-based system capable of planning realistic artificial cities. The system aims to replicate in the micro scale the main components of real settlements, such as zoning and accessibility in a road network. Moreover, we propose a novel representation for the agent's environment that efficiently combines the road graph with a discrete model for the land. Our system starts from an empty map consisting only of the road network graph, and the agent incrementally expands it by building new sites while distinguishing land uses between residential, commercial, industrial, and recreational.
△ Less
Submitted 21 October, 2022;
originally announced November 2022.
-
No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling
Authors:
Marília Costa Rosendo Silva,
Felipe Alves Siqueira,
João Pedro Mantovani Tarrega,
João Vitor Pataca Beinotti,
Augusto Sousa Nunes,
Miguel de Mattos Gardini,
Vinícius Adolfo Pereira da Silva,
Nádia Félix Felipe da Silva,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variabi…
▽ More
Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variability depending on the machine learning algorithm. Furthermore, the distortions can be misleading when regarding cluster geometry. Amongst the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology since similar procedures have different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, the factorization, and the clustering algorithms that are directly or indirectly related to the reviewed works.
△ Less
Submitted 2 August, 2022;
originally announced August 2022.
-
Evaluating Meta-Feature Selection for the Algorithm Recommendation Problem
Authors:
Geand Trindade Pereira,
Moises Rocha dos Santos,
Andre Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
With the popularity of Machine Learning (ML) solutions, algorithms and data have been released faster than the capacity of processing them. In this context, the problem of Algorithm Recommendation (AR) is receiving a significant deal of attention recently. This problem has been addressed in the literature as a learning task, often as a Meta-Learning problem where the aim is to recommend the best a…
▽ More
With the popularity of Machine Learning (ML) solutions, algorithms and data have been released faster than the capacity of processing them. In this context, the problem of Algorithm Recommendation (AR) is receiving a significant deal of attention recently. This problem has been addressed in the literature as a learning task, often as a Meta-Learning problem where the aim is to recommend the best alternative for a specific dataset. For such, datasets encoded by meta-features are explored by ML algorithms that try to learn the mapping between meta-representations and the best technique to be used. One of the challenges for the successful use of ML is to define which features are the most valuable for a specific dataset since several meta-features can be used, which increases the meta-feature dimension. This paper presents an empirical analysis of Feature Selection and Feature Extraction in the meta-level for the AR problem. The present study was focused on three criteria: predictive performance, dimensionality reduction, and pipeline runtime. As we verified, applying Dimensionality Reduction (DR) methods did not improve predictive performances in general. However, DR solutions reduced about 80% of the meta-features, obtaining pretty much the same performance as the original setup but with lower runtimes. The only exception was PCA, which presented about the same runtime as the original meta-features. Experimental results also showed that various datasets have many non-informative meta-features and that it is possible to obtain high predictive performance using around 20% of the original meta-features. Therefore, due to their natural trend for high dimensionality, DR methods should be used for Meta-Feature Selection and Meta-Feature Extraction.
△ Less
Submitted 11 June, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Fuzzy clustering algorithms with distance metric learning and entropy regularization
Authors:
Sara Ines Rizo Rodriguez,
Francisco de Assis Tenorio de Carvalho
Abstract:
The clustering methods have been used in a variety of fields such as image processing, data mining, pattern recognition, and statistical analysis. Generally, the clustering algorithms consider all variables equally relevant or not correlated for the clustering task. Nevertheless, in real situations, some variables can be correlated or may be more or less relevant or even irrelevant for this task.…
▽ More
The clustering methods have been used in a variety of fields such as image processing, data mining, pattern recognition, and statistical analysis. Generally, the clustering algorithms consider all variables equally relevant or not correlated for the clustering task. Nevertheless, in real situations, some variables can be correlated or may be more or less relevant or even irrelevant for this task. This paper proposes partitioning fuzzy clustering algorithms based on Euclidean, City-block and Mahalanobis distances and entropy regularization. These methods are an iterative three steps algorithms which provide a fuzzy partition, a representative for each fuzzy cluster, and the relevance weight of the variables or their correlation by minimizing a suitable objective function. Several experiments on synthetic and real datasets, including its application to noisy image texture segmentation, demonstrate the usefulness of these adaptive clustering methods.
△ Less
Submitted 18 February, 2021;
originally announced February 2021.
-
Using dynamical quantization to perform split attempts in online tree regressors
Authors:
Saulo Martiello Mastelini,
Andre Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
A central aspect of online decision tree solutions is evaluating the incoming data and enabling model growth. For such, trees much deal with different kinds of input features and partition them to learn from the data. Numerical features are no exception, and they pose additional challenges compared to other kinds of features, as there is no trivial strategy to choose the best point to make a split…
▽ More
A central aspect of online decision tree solutions is evaluating the incoming data and enabling model growth. For such, trees much deal with different kinds of input features and partition them to learn from the data. Numerical features are no exception, and they pose additional challenges compared to other kinds of features, as there is no trivial strategy to choose the best point to make a split decision. The problem is even more challenging in regression tasks because both the features and the target are continuous. Typical online solutions evaluate and store all the points monitored between split attempts, which goes against the constraints posed in real-time applications. In this paper, we introduce the Quantization Observer (QO), a simple yet effective hashing-based algorithm to monitor and evaluate split point candidates in numerical features for online tree regressors. QO can be easily integrated into incremental decision trees, such as Hoeffding Trees, and it has a monitoring cost of $O(1)$ per instance and sub-linear cost to evaluate split candidates. Previous solutions had a $O(\log n)$ cost per insertion (in the best case) and a linear cost to evaluate split points. Our extensive experimental setup highlights QO's effectiveness in providing accurate split point suggestions while spending much less memory and processing time than its competitors.
△ Less
Submitted 3 December, 2020; v1 submitted 30 November, 2020;
originally announced December 2020.
-
Rethinking Default Values: a Low Cost and Efficient Strategy to Define Hyperparameters
Authors:
Rafael Gomes Mantovani,
André Luis Debiaso Rossi,
Edesio Alcobaça,
Jadson Castro Gertrudes,
Sylvio Barbon Junior,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
Machine Learning (ML) algorithms have been increasingly applied to problems from several different areas. Despite their growing popularity, their predictive performance is usually affected by the values assigned to their hyperparameters (HPs). As consequence, researchers and practitioners face the challenge of how to set these values. Many users have limited knowledge about ML algorithms and the e…
▽ More
Machine Learning (ML) algorithms have been increasingly applied to problems from several different areas. Despite their growing popularity, their predictive performance is usually affected by the values assigned to their hyperparameters (HPs). As consequence, researchers and practitioners face the challenge of how to set these values. Many users have limited knowledge about ML algorithms and the effect of their HP values and, therefore, do not take advantage of suitable settings. They usually define the HP values by trial and error, which is very subjective, not guaranteed to find good values and dependent on the user experience. Tuning techniques search for HP values able to maximize the predictive performance of induced models for a given dataset, but have the drawback of a high computational cost. Thus, practitioners use default values suggested by the algorithm developer or by tools implementing the algorithm. Although default values usually result in models with acceptable predictive performance, different implementations of the same algorithm can suggest distinct default values. To maintain a balance between tuning and using default values, we propose a strategy to generate new optimized default values. Our approach is grounded on a small set of optimized values able to obtain predictive performance values better than default settings provided by popular tools. After performing a large experiment and a careful analysis of the results, we concluded that our approach delivers better default values. Besides, it leads to competitive solutions when compared to tuned values, making it easier to use and having a lower cost. We also extracted simple rules to guide practitioners in deciding whether to use our new methodology or a HP tuning approach.
△ Less
Submitted 8 July, 2021; v1 submitted 31 July, 2020;
originally announced August 2020.
-
Code Review in the Classroom
Authors:
Victor Rivera,
Hamna Aslam,
Alexandr Naumchev,
Daniel de Carvalho,
Mansur Khazeev,
Manuel Mazzara
Abstract:
This paper presents a case study to examine the affinity of the code review process among young developers in an academic setting. Code review is indispensable considering the positive outcomes it generates. However, it is not an individual activity and requires substantial interaction among stakeholders, deliverance, and acceptance of feedback, timely actions upon feedback as well as the ability…
▽ More
This paper presents a case study to examine the affinity of the code review process among young developers in an academic setting. Code review is indispensable considering the positive outcomes it generates. However, it is not an individual activity and requires substantial interaction among stakeholders, deliverance, and acceptance of feedback, timely actions upon feedback as well as the ability to agree on a solution in the wake of diverse viewpoints. Young developers in a classroom setting provide a clear picture of the potential favourable and problematic areas of the code review process. Their feedback suggests that the process has been well received with some points to better the process. This paper can be used as guidelines to perform code reviews in the classroom.
△ Less
Submitted 19 April, 2020;
originally announced April 2020.
-
Online Local Boosting: improving performance in online decision trees
Authors:
Victor G. Turrisi da Costa,
Saulo Martiello Mastelini,
André C. Ponce de Leon Ferreira de Carvalho,
Sylvio Barbon Jr
Abstract:
As more data are produced each day, and faster, data stream mining is growing in importance, making clear the need for algorithms able to fast process these data. Data stream mining algorithms are meant to be solutions to extract knowledge online, specially tailored from continuous data problem. Many of the current algorithms for data stream mining have high processing and memory costs. Often, the…
▽ More
As more data are produced each day, and faster, data stream mining is growing in importance, making clear the need for algorithms able to fast process these data. Data stream mining algorithms are meant to be solutions to extract knowledge online, specially tailored from continuous data problem. Many of the current algorithms for data stream mining have high processing and memory costs. Often, the higher the predictive performance, the higher these costs. To increase predictive performance without largely increasing memory and time costs, this paper introduces a novel algorithm, named Online Local Boosting (OLBoost), which can be combined into online decision tree algorithms to improve their predictive performance without modifying the structure of the induced decision trees. For such, OLBoost applies a boosting to small separate regions of the instances space. Experimental results presented in this paper show that by using OLBoost the online learning decision tree algorithms can significantly improve their predictive performance. Additionally, it can make smaller trees perform as good or better than larger trees.
△ Less
Submitted 16 July, 2019;
originally announced July 2019.
-
A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers
Authors:
Rafael Gomes Mantovani,
André Luis Debiaso Rossi,
Edesio Alcobaça,
Joaquin Vanschoren,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify…
▽ More
For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. Besides, an in-depth analysis is performed to understand what they take into account for their decisions, providing useful insights. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes, without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees.
△ Less
Submitted 11 June, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Towards A Broader Acceptance Of Formal Verification Tools: The Role Of Education
Authors:
Mansur Khazeev,
Manuel Mazzara,
Daniel De Carvalho,
Hamna Aslam
Abstract:
Formal methods yet advantageous, face challenges towards wide acceptance and adoption in software development practices. The major reason being presumed complexity. The issue can be addressed by academia with a thoughtful plan of teaching and practise. The user study detailed in this paper is examining AutoProof tool with the motivation to identify complexities attributed to formal methods. Partic…
▽ More
Formal methods yet advantageous, face challenges towards wide acceptance and adoption in software development practices. The major reason being presumed complexity. The issue can be addressed by academia with a thoughtful plan of teaching and practise. The user study detailed in this paper is examining AutoProof tool with the motivation to identify complexities attributed to formal methods. Participants' (students of Masters program in Computer Science) performance and feedback on the experience with formal methods assisted us in extracting specific problem areas that effect tool usability. The study results infer, along with improvements in verification tool functionalities, teaching program must be modified to include pre-requisite courses to make formal methods easily adapted by students and promoting their usage in software development process.
△ Less
Submitted 4 June, 2019;
originally announced June 2019.
-
Online Multi-target regression trees with stacked leaf models
Authors:
Saulo Martiello Mastelini,
Sylvio Barbon Jr.,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
One of the current challenges in machine learning is how to deal with data coming at increasing rates in data streams. New predictive learning strategies are needed to cope with the high throughput data and concept drift. One of the data stream mining tasks where new learning strategies are needed is multi-target regression, due to its applicability in a high number of real world problems. While r…
▽ More
One of the current challenges in machine learning is how to deal with data coming at increasing rates in data streams. New predictive learning strategies are needed to cope with the high throughput data and concept drift. One of the data stream mining tasks where new learning strategies are needed is multi-target regression, due to its applicability in a high number of real world problems. While reliable and effective learning strategies have been proposed for batch multi-target regression, few have been proposed for multi-target online learning in data streams. Besides, most of the existing solutions do not consider the occurrence of inter-target correlations when making predictions. In this work, we propose a novel online learning strategy for multi-target regression in data streams. The proposed strategy extends existing online decision tree learning algorithm to explore inter-target dependencies while making predictions. For such, the proposed strategy, called Stacked Single-target Hoeffding Tree (SST-HT), uses the inter-target dependencies as an additional information source to enhance predictive accuracy. Throughout an extensive experimental setup, we evaluate our proposal against state-of-the-art decision tree-based algorithms for online multi-target regression. According to the experimental results, SST-HT presents superior predictive accuracy, with a small increase in the processing time and memory requirements.
△ Less
Submitted 10 March, 2020; v1 submitted 29 March, 2019;
originally announced March 2019.
-
Better Trees: An empirical study on hyperparameter tuning of classification decision tree induction algorithms
Authors:
Rafael Gomes Mantovani,
Tomáš Horváth,
André L. D. Rossi,
Ricardo Cerri,
Sylvio Barbon Junior,
Joaquin Vanschoren,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
Machine learning algorithms often contain many hyperparameters (HPs) whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possibilities for these HP configurations and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive performance. However, insights into efficiently explo…
▽ More
Machine learning algorithms often contain many hyperparameters (HPs) whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possibilities for these HP configurations and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive performance. However, insights into efficiently exploring this vast space of configurations and dealing with the trade-off between predictive and runtime performance remain challenging. Furthermore, there are cases where the default HPs fit the suitable configuration. Additionally, for many reasons, including model validation and attendance to new legislation, there is an increasing interest in interpretable models, such as those created by the Decision Tree (DT) induction algorithms. This paper provides a comprehensive approach for investigating the effects of hyperparameter tuning for the two DT induction algorithms most often used, CART and C4.5. DT induction algorithms present high predictive performance and interpretable classification models, though many HPs need to be adjusted. Experiments were carried out with different tuning strategies to induce models and to evaluate HPs' relevance using 94 classification datasets from OpenML. The experimental results point out that different HP profiles for the tuning of each algorithm provide statistically significant improvements in most of the datasets for CART, but only in one-third for C4.5. Although different algorithms may present different tuning scenarios, the tuning techniques generally required few evaluations to find accurate solutions. Furthermore, the best technique for all the algorithms was the IRACE. Finally, we found out that tuning a specific small subset of HPs is a good alternative for achieving optimal predictive performance.
△ Less
Submitted 21 December, 2023; v1 submitted 5 December, 2018;
originally announced December 2018.
-
Strict Very Fast Decision Tree: a memory conservative algorithm for data stream mining
Authors:
Victor Guilherme Turrisi da Costa,
André Carlos Ponce de Leon Ferreira de Carvalho,
Sylvio Barbon Junior
Abstract:
Dealing with memory and time constraints are current challenges when learning from data streams with a massive amount of data. Many algorithms have been proposed to handle these difficulties, among them, the Very Fast Decision Tree (VFDT) algorithm. Although the VFDT has been widely used in data stream mining, in the last years, several authors have suggested modifications to increase its performa…
▽ More
Dealing with memory and time constraints are current challenges when learning from data streams with a massive amount of data. Many algorithms have been proposed to handle these difficulties, among them, the Very Fast Decision Tree (VFDT) algorithm. Although the VFDT has been widely used in data stream mining, in the last years, several authors have suggested modifications to increase its performance, putting aside memory concerns by proposing memory-costly solutions. Besides, most data stream mining solutions have been centred around ensembles, which combine the memory costs of their weak learners, usually VFDTs. To reduce the memory cost, keeping the predictive performance, this study proposes the Strict VFDT (SVFDT), a novel algorithm based on the VFDT. The SVFDT algorithm minimises unnecessary tree growth, substantially reducing memory usage and keeping competitive predictive performance. Moreover, since it creates much more shallow trees than VFDT, SVFDT can achieve a shorter processing time. Experiments were carried out comparing the SVFDT with the VFDT in 11 benchmark data stream datasets. This comparison assessed the trade-off between accuracy, memory, and processing time. Statistical analysis showed that the proposed algorithm obtained similar predictive performance and significantly reduced processing time and memory use. Thus, SVFDT is a suitable option for data stream mining with memory and time limitations, recommended as a weak learner in ensemble-based solutions.
△ Less
Submitted 17 May, 2018; v1 submitted 16 May, 2018;
originally announced May 2018.
-
Implementing distributed λ-calculus interpreter
Authors:
Alexandr Basov,
Daniel de Carvalho,
Manuel Mazzara
Abstract:
This paper describes how one can implement distributed λ-calculus interpreter from scratch. At first, we describe how to implement a monadic parser, than the Krivine Machine is introduced for the interpretation part and as for distribution, the actor model is used. In this work we are not providing general solution for parallelism, but we consider particular patterns, which always can be paralleli…
▽ More
This paper describes how one can implement distributed λ-calculus interpreter from scratch. At first, we describe how to implement a monadic parser, than the Krivine Machine is introduced for the interpretation part and as for distribution, the actor model is used. In this work we are not providing general solution for parallelism, but we consider particular patterns, which always can be parallelized. As a result, the basic extensible implementation of call-by-name distributed machine is introduced and prototype is presented. We achieved computation speed improvement in some cases, but efficient distributed version is not achieved, problems are discussed in evaluation section. This work provides a foundation for further research, completing the implementation it is possible to add concurrency for non-determinism, improve the interpreter using call-by-need semantic or study optimal auto parallelization to generalize what could be done efficiently in parallel.
△ Less
Submitted 19 February, 2018;
originally announced February 2018.
-
Taylor expansion in linear logic is invertible
Authors:
Daniel de Carvalho
Abstract:
Each Multiplicative Exponential Linear Logic (MELL) proof-net can be expanded into a differential net, which is its Taylor expansion. We prove that two different MELL proof-nets have two different Taylor expansions. As a corollary, we prove a completeness result for MELL: We show that the relational model is injective for MELL proof-nets, i.e. the equality between MELL proof-nets in the relational…
▽ More
Each Multiplicative Exponential Linear Logic (MELL) proof-net can be expanded into a differential net, which is its Taylor expansion. We prove that two different MELL proof-nets have two different Taylor expansions. As a corollary, we prove a completeness result for MELL: We show that the relational model is injective for MELL proof-nets, i.e. the equality between MELL proof-nets in the relational model is exactly axiomatized by cut-elimination.
△ Less
Submitted 6 December, 2018; v1 submitted 14 December, 2017;
originally announced December 2017.
-
Teaching Programming and Design-by-Contract
Authors:
Daniel de Carvalho,
Rasheed Hussain,
Adil Khan,
Mansur Khazeev,
JooYong Lee,
Sergey Masiagin,
Manuel Mazzara,
Ruslan Mustafin,
Alexandr Naumchev,
Victor Rivera
Abstract:
This paper summarizes the experience of teaching an introductory course to programming by using a correctness by construction approach at Innopolis University, Russian Federation. In this paper we claim that division in beginner and advanced groups improves the learning outcomes, present the discussion and the data that support the claim.
This paper summarizes the experience of teaching an introductory course to programming by using a correctness by construction approach at Innopolis University, Russian Federation. In this paper we claim that division in beginner and advanced groups improves the learning outcomes, present the discussion and the data that support the claim.
△ Less
Submitted 4 July, 2018; v1 submitted 22 October, 2017;
originally announced October 2017.
-
Jolie Static Type Checker: a prototype
Authors:
Daniel de Carvalho,
Manuel Mazzara,
Bogdan Mingela,
Larisa Safina,
Alexander Tchitchigin,
Nikolay Troshkov
Abstract:
Static verification of a program source code correctness is an important element of software reliability. Formal verification of software programs involves proving that a program satisfies a formal specification of its behavior. Many languages use both static and dynamic type checking. With such approach, the static type checker verifies everything possible at compile time, and dynamic checks the…
▽ More
Static verification of a program source code correctness is an important element of software reliability. Formal verification of software programs involves proving that a program satisfies a formal specification of its behavior. Many languages use both static and dynamic type checking. With such approach, the static type checker verifies everything possible at compile time, and dynamic checks the remaining. The current state of the Jolie programming language includes a dynamic type system. Consequently, it allows avoidable run-time errors. A static type system for the language has been formally defined on paper but lacks an implementation yet. In this paper, we describe a prototype of Jolie Static Type Checker (JSTC), which employs a technique based on a SMT solver. We describe the theory behind and the implementation, and the process of static analysis.
△ Less
Submitted 18 October, 2017; v1 submitted 23 February, 2017;
originally announced February 2017.
-
The relational model is injective for Multiplicative Exponential Linear Logic
Authors:
Daniel de Carvalho
Abstract:
We prove a completeness result for Multiplicative Exponential Linear Logic (MELL): we show that the relational model is injective for MELL proof-nets, i.e. the equality between MELL proof-nets in the relational model is exactly axiomatized by cut-elimination.
We prove a completeness result for Multiplicative Exponential Linear Logic (MELL): we show that the relational model is injective for MELL proof-nets, i.e. the equality between MELL proof-nets in the relational model is exactly axiomatized by cut-elimination.
△ Less
Submitted 10 May, 2016; v1 submitted 9 February, 2015;
originally announced February 2015.
-
A semantic account of strong normalization in Linear Logic
Authors:
Daniel de Carvalho,
Lorenzo Tortora de Falco
Abstract:
We prove that given two cut free nets of linear logic, by means of their relational interpretations one can: 1) first determine whether or not the net obtained by cutting the two nets is strongly normalizable 2) then (in case it is strongly normalizable) compute the maximal length of the reduction sequences starting from that net.
We prove that given two cut free nets of linear logic, by means of their relational interpretations one can: 1) first determine whether or not the net obtained by cutting the two nets is strongly normalizable 2) then (in case it is strongly normalizable) compute the maximal length of the reduction sequences starting from that net.
△ Less
Submitted 26 August, 2014; v1 submitted 24 April, 2013;
originally announced April 2013.
-
Dynamic Clustering of Histogram Data Based on Adaptive Squared Wasserstein Distances
Authors:
Antonio Irpino,
Rosanna Verde,
Francisco de AT De Carvalho
Abstract:
This paper deals with clustering methods based on adaptive distances for histogram data using a dynamic clustering algorithm. Histogram data describes individuals in terms of empirical distributions. These kind of data can be considered as complex descriptions of phenomena observed on complex objects: images, groups of individuals, spatial or temporal variant data, results of queries, environmenta…
▽ More
This paper deals with clustering methods based on adaptive distances for histogram data using a dynamic clustering algorithm. Histogram data describes individuals in terms of empirical distributions. These kind of data can be considered as complex descriptions of phenomena observed on complex objects: images, groups of individuals, spatial or temporal variant data, results of queries, environmental data, and so on. The Wasserstein distance is used to compare two histograms. The Wasserstein distance between histograms is constituted by two components: the first based on the means, and the second, to internal dispersions (standard deviation, skewness, kurtosis, and so on) of the histograms. To cluster sets of histogram data, we propose to use Dynamic Clustering Algorithm, (based on adaptive squared Wasserstein distances) that is a k-means-like algorithm for clustering a set of individuals into $K$ classes that are apriori fixed.
The main aim of this research is to provide a tool for clustering histograms, emphasizing the different contributions of the histogram variables, and their components, to the definition of the clusters. We demonstrate that this can be achieved using adaptive distances. Two kind of adaptive distances are considered: the first takes into account the variability of each component of each descriptor for the whole set of individuals; the second takes into account the variability of each component of each descriptor in each cluster. We furnish interpretative tools of the obtained partition based on an extension of the classical measures (indexes) to the use of adaptive distances in the clustering criterion function. Applications on synthetic and real-world data corroborate the proposed procedure.
△ Less
Submitted 7 October, 2011;
originally announced October 2011.
-
The relational model is injective for Multiplicative Exponential Linear Logic (without weakenings)
Authors:
Daniel de Carvalho,
Lorenzo Tortora de Falco
Abstract:
We show that for Multiplicative Exponential Linear Logic (without weakenings) the syntactical equivalence relation on proofs induced by cut-elimination coincides with the semantic equivalence relation on proofs induced by the multiset based relational model: one says that the interpretation in the model (or the semantics) is injective. We actually prove a stronger result: two cut-free proofs of th…
▽ More
We show that for Multiplicative Exponential Linear Logic (without weakenings) the syntactical equivalence relation on proofs induced by cut-elimination coincides with the semantic equivalence relation on proofs induced by the multiset based relational model: one says that the interpretation in the model (or the semantics) is injective. We actually prove a stronger result: two cut-free proofs of the full multiplicative and exponential fragment of linear logic whose interpretations coincide in the multiset based relational model are the same "up to the connections between the doors of exponential boxes".
△ Less
Submitted 7 February, 2011; v1 submitted 16 February, 2010;
originally announced February 2010.
-
Execution Time of lambda-Terms via Denotational Semantics and Intersection Types
Authors:
Daniel de Carvalho
Abstract:
The multiset based relational model of linear logic induces a semantics of the type free lambda-calculus, which corresponds to a non-idempotent intersection type system, System R. We prove that, in System R, the size of the type derivations and the size of the types are closely related to the execution time of lambda-terms in a particular environment machine, Krivine's machine.
The multiset based relational model of linear logic induces a semantics of the type free lambda-calculus, which corresponds to a non-idempotent intersection type system, System R. We prove that, in System R, the size of the type derivations and the size of the types are closely related to the execution time of lambda-terms in a particular environment machine, Krivine's machine.
△ Less
Submitted 26 May, 2009;
originally announced May 2009.
-
Fourier Analysis and Holographic Representations of 1D and 2D Signals
Authors:
G. A. Giraldi,
B. F. Moutinho,
D. M. L. de Carvalho,
J. C. de Oliveira
Abstract:
In this paper, we focus on Fourier analysis and holographic transforms for signal representation. For instance, in the case of image processing, the holographic representation has the property that an arbitrary portion of the transformed image enables reconstruction of the whole image with details missing. We focus on holographic representation defined through the Fourier Transforms. Thus, We fi…
▽ More
In this paper, we focus on Fourier analysis and holographic transforms for signal representation. For instance, in the case of image processing, the holographic representation has the property that an arbitrary portion of the transformed image enables reconstruction of the whole image with details missing. We focus on holographic representation defined through the Fourier Transforms. Thus, We firstly review some results in Fourier transform and Fourier series. Next, we review the Discrete Holographic Fourier Transform (DHFT) for image representation. Then, we describe the contributions of our work. We show a simple scheme for progressive transmission based on the DHFT. Next, we propose the Continuous Holographic Fourier Transform (CHFT) and discuss some theoretical aspects of it for 1D signals. Finally, some testes are presented in the experimental results
△ Less
Submitted 3 April, 2006; v1 submitted 29 March, 2006;
originally announced March 2006.