Search | arXiv e-print repository

Interpretable Causal Representation Learning for Biological Data in the Pathway Space

Authors: Jesus de la Fuente, Robert Lehmann, Carlos Ruiz-Arenas, Jan Voges, Irene Marin-Goñi, Xabier Martinez-de-Morentin, David Gomez-Cabrero, Idoia Ochoa, Jesper Tegner, Vincenzo Lagani, Mikel Hernaez

Abstract: Predicting the impact of genomic and drug perturbations in cellular function is crucial for understanding gene functions and drug effects, ultimately leading to improved therapies. To this end, Causal Representation Learning (CRL) constitutes one of the most promising approaches, as it aims to identify the latent factors that causally govern biological systems, thus facilitating the prediction of… ▽ More Predicting the impact of genomic and drug perturbations in cellular function is crucial for understanding gene functions and drug effects, ultimately leading to improved therapies. To this end, Causal Representation Learning (CRL) constitutes one of the most promising approaches, as it aims to identify the latent factors that causally govern biological systems, thus facilitating the prediction of the effect of unseen perturbations. Yet, current CRL methods fail in reconciling their principled latent representations with known biological processes, leading to models that are not interpretable. To address this major issue, we present SENA-discrepancy-VAE, a model based on the recently proposed CRL method discrepancy-VAE, that produces representations where each latent factor can be interpreted as the (linear) combination of the activity of a (learned) set of biological processes. To this extent, we present an encoder, SENA-δ, that efficiently compute and map biological processes' activity levels to the latent causal factors. We show that SENA-discrepancy-VAE achieves predictive performances on unseen combinations of interventions that are comparable with its original, non-interpretable counterpart, while inferring causal latent factors that are biologically meaningful. △ Less

Submitted 14 June, 2025; originally announced June 2025.

Comments: ICLR 2025, 28 pages, 14 figures, 10 tables

arXiv:2505.16477 [pdf]

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

Authors: Yanbo Zhang, Sumeer A. Khan, Adnan Mahmud, Huck Yang, Alexander Lavin, Michael Levin, Jeremy Frey, Jared Dunnmon, James Evans, Alan Bundy, Saso Dzeroski, Jesper Tegner, Hector Zenil

Abstract: With recent Nobel Prizes recognising AI contributions to science, Large Language Models (LLMs) are transforming scientific research by enhancing productivity and reshaping the scientific method. LLMs are now involved in experimental design, data analysis, and workflows, particularly in chemistry and biology. However, challenges such as hallucinations and reliability persist. In this contribution,… ▽ More With recent Nobel Prizes recognising AI contributions to science, Large Language Models (LLMs) are transforming scientific research by enhancing productivity and reshaping the scientific method. LLMs are now involved in experimental design, data analysis, and workflows, particularly in chemistry and biology. However, challenges such as hallucinations and reliability persist. In this contribution, we review how Large Language Models (LLMs) are redefining the scientific method and explore their potential applications across different stages of the scientific cycle, from hypothesis testing to discovery. We conclude that, for LLMs to serve as relevant and effective creative engines and productivity enhancers, their deep integration into all steps of the scientific process should be pursued in collaboration and alignment with human scientific goals, with clear evaluation metrics. The transition to AI-driven science raises ethical questions about creativity, oversight, and responsibility. With careful guidance, LLMs could evolve into creative engines, driving transformative breakthroughs across scientific disciplines responsibly and effectively. However, the scientific community must also decide how much it leaves to LLMs to drive science, even when associations with 'reasoning', mostly currently undeserved, are made in exchange for the potential to explore hypothesis and solution regions that might otherwise remain unexplored by human exploration alone. △ Less

Submitted 22 May, 2025; originally announced May 2025.

Comments: 45 pages

Journal ref: npj Artificial Intelligence, 2025

arXiv:2411.08552 [pdf, other]

Leveraging Pre-Trained Neural Networks to Enhance Machine Learning with Variational Quantum Circuits

Authors: Jun Qi, Chao-Han Yang, Samuel Yen-Chi Chen, Pin-Yu Chen, Hector Zenil, Jesper Tegner

Abstract: Quantum Machine Learning (QML) offers tremendous potential but is currently limited by the availability of qubits. We introduce an innovative approach that utilizes pre-trained neural networks to enhance Variational Quantum Circuits (VQC). This technique effectively separates approximation error from qubit count and removes the need for restrictive conditions, making QML more viable for real-world… ▽ More Quantum Machine Learning (QML) offers tremendous potential but is currently limited by the availability of qubits. We introduce an innovative approach that utilizes pre-trained neural networks to enhance Variational Quantum Circuits (VQC). This technique effectively separates approximation error from qubit count and removes the need for restrictive conditions, making QML more viable for real-world applications. Our method significantly improves parameter optimization for VQC while delivering notable gains in representation and generalization capabilities, as evidenced by rigorous theoretical analysis and extensive empirical testing on quantum dot classification tasks. Moreover, our results extend to applications such as human genome analysis, demonstrating the broad applicability of our approach. By addressing the constraints of current quantum hardware, our work paves the way for a new era of advanced QML applications, unlocking the full potential of quantum computing in fields such as machine learning, materials science, medicine, mimetics, and various interdisciplinary areas. △ Less

Submitted 13 November, 2024; originally announced November 2024.

Comments: In submission

arXiv:2408.15108 [pdf, other]

Assembly Theory Reduced to Shannon Entropy and Rendered Redundant by Naive Statistical Algorithms

Authors: Luan Ozelim, Abicumaran Uthamacumaran, Felipe S. Abrahão, Santiago Hernández-Orozco, Narsis A. Kiani, Jesper Tegnér, Hector Zenil

Abstract: We have proven that any implementation of the concept of `copy number' underlying Assembly Theory (AT) and its assembly index (Ai) is equivalent to Shannon entropy, and not fundamentally or methodologically different from algorithms like ZIP compression. Here we show that the weak empirical correlation between Ai and LZW, which the authors have offered as a defence is based on an incomplete and mi… ▽ More We have proven that any implementation of the concept of `copy number' underlying Assembly Theory (AT) and its assembly index (Ai) is equivalent to Shannon entropy, and not fundamentally or methodologically different from algorithms like ZIP compression. Here we show that the weak empirical correlation between Ai and LZW, which the authors have offered as a defence is based on an incomplete and misleading experiment. When the experiment is completed, the fast asymptotic convergence to $\mathbf{LZ}$ compression and Shannon entropy is undeniable, just as their mathematical proof of equivalence remains undisputed. This contribution completes the theoretical and empirical demonstration that any variation of the copy-number concept underlying AT, which entails counting the number of object repetitions `to arrive at a measure for life,' is equivalent to statistical compression and Shannon entropy. We demonstrate that the authors' `we-are-better-because-we-are-worse' argument against compression does not withstand basic scrutiny, and that their empirical results separating organic from inorganic compounds have not only been previously reported, sans claims to unify physics and biology, but are driven solely by molecular length--which they did not control for. We show that Ai is a particular case of our BDM index, introduced almost a decade earlier, and that arguments attributing special stochastic properties to Ai are misleading, since the properties of Ai are not unique, but rather exactly the same as those that Shannon entropy possesses and for which it was designed in the first place--for the quantification of uncertainty--and which we have proven to be equivalent to Ai. Shannon entropy is already not only equipped with stochasticity but was designed for. This makes AT redundant especially when applied to their own experimental data. △ Less

Submitted 13 March, 2025; v1 submitted 27 August, 2024; originally announced August 2024.

Comments: 12 figures, 55 pages (minor tweaks and typo corrections to new sections on misleading arguments related to Time complexity and stochasticity)

arXiv:2407.14270 [pdf, other]

T- Hop: A framework for studying the importance path information in molecular graphs for chemical property prediction

Authors: Abdulrahman Ibraheem, Narsis Kiani, Jesper Tegner

Abstract: This paper studies the usefulness of incorporating path information in predicting chemical properties from molecular graphs, in the domain of QSAR (Quantitative Structure-Activity Relationship). Towards this, we developed a GNN-style model which can be toggled to operate in one of two modes: a non-degenerate mode which incorporates path information, and a degenerate mode which leaves out path info… ▽ More This paper studies the usefulness of incorporating path information in predicting chemical properties from molecular graphs, in the domain of QSAR (Quantitative Structure-Activity Relationship). Towards this, we developed a GNN-style model which can be toggled to operate in one of two modes: a non-degenerate mode which incorporates path information, and a degenerate mode which leaves out path information. Thus, by comparing the performance of the non-degenerate mode versus the degenerate mode on relevant QSAR datasets, we were able to directly assess the significance of path information on those datasets. Our results corroborate previous works, by suggesting that the usefulness of path information is datasetdependent. Unlike previous studies however, we took the very first steps towards building a model that could predict upfront whether or not path information would be useful for a given dataset at hand. Moreover, we also found that, albeit its simplicity, the degenerate mode of our model yielded rather surprising results, which outperformed more sophisticated SOTA models in certain cases. △ Less

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2403.06629 [pdf, other]

Assembly Theory is an approximation to algorithmic complexity based on LZ compression that does not explain selection or evolution

Authors: Felipe S. Abrahão, Santiago Hernández-Orozco, Narsis A. Kiani, Jesper Tegnér, Hector Zenil

Abstract: We prove the full equivalence between Assembly Theory (AT) and Shannon Entropy via a method based upon the principles of statistical compression renamed `assembly index' that belongs to the LZ family of popular compression algorithms (ZIP, GZIP, JPEG). Such popular algorithms have been shown to empirically reproduce the results of AT, results that have also been reported before in successful appli… ▽ More We prove the full equivalence between Assembly Theory (AT) and Shannon Entropy via a method based upon the principles of statistical compression renamed `assembly index' that belongs to the LZ family of popular compression algorithms (ZIP, GZIP, JPEG). Such popular algorithms have been shown to empirically reproduce the results of AT, results that have also been reported before in successful applications to separating organic from non-organic molecules and in the context of the study of selection and evolution. We show that the assembly index value is equivalent to the size of a minimal context-free grammar. The statistical compressibility of such a method is bounded by Shannon Entropy and other equivalent traditional LZ compression schemes, such as LZ77, LZ78, or LZW. In addition, we demonstrate that AT, and the algorithms supporting its pathway complexity, assembly index, and assembly number, define compression schemes and methods that are subsumed into the theory of algorithmic (Kolmogorov-Solomonoff-Chaitin) complexity. Due to AT's current lack of logical consistency in defining causality for non-stochastic processes and the lack of empirical evidence that it outperforms other complexity measures found in the literature capable of explaining the same phenomena, we conclude that the assembly index and the assembly number do not lead to an explanation or quantification of biases in generative (physical or biological) processes, including those brought about by (abiotic or Darwinian) selection and evolution, that could not have been arrived at using Shannon Entropy or that have not been reported before using classical information theory or algorithmic complexity. △ Less

Submitted 1 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 15 pages + appendix, 2 figures

arXiv:2310.06434 [pdf, other]

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

Authors: Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, Rohit Kumar, Narsis A. Kiani, David Gomez-Cabrero, Jesper N. Tegner

Abstract: We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate speech transcription contexts. This marks a step towards a fresh paradigm in generative error correction within the realm of n-best hypotheses. Unlike the exis… ▽ More We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate speech transcription contexts. This marks a step towards a fresh paradigm in generative error correction within the realm of n-best hypotheses. Unlike the existing ranking-based rescoring methods, our approach adeptly uses distinct initialization techniques and parameter-efficient algorithms to boost ASR performance derived from pre-trained speech and text models. Through evaluation across diverse ASR datasets, we evaluate the stability and reproducibility of our fusion technique, demonstrating its improved word error rate relative (WERR) performance in comparison to n-best hypotheses by relatively 37.66%. To encourage future research, we have made our code and pre-trained models open source at https://github.com/Srijith-rkr/Whispering-LLaMA. △ Less

Submitted 16 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Accepted to EMNLP 2023 as main paper. 10 pages. Revised math notations. GitHub: https://github.com/Srijith-rkr/Whispering-LLaMA

arXiv:2307.07522 [pdf, other]

The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence

Authors: Hector Zenil, Jesper Tegnér, Felipe S. Abrahão, Alexander Lavin, Vipin Kumar, Jeremy G. Frey, Adrian Weller, Larisa Soldatova, Alan R. Bundy, Nicholas R. Jennings, Koichi Takahashi, Lawrence Hunter, Saso Dzeroski, Andrew Briggs, Frederick D. Gregory, Carla P. Gomes, Jon Rowe, James Evans, Hiroaki Kitano, Ross King

Abstract: Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches that require access to large training data sets and clear performance evaluation criteria, ranging from pattern recognition and classification to generative models. Yet,… ▽ More Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches that require access to large training data sets and clear performance evaluation criteria, ranging from pattern recognition and classification to generative models. Yet, AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Generative AI, in general, and Large Language Models in particular, may represent an opportunity to augment and accelerate the scientific discovery of fundamental deep science with quantitative models. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space. Integrating AI-driven automation into the practice of science would mitigate current problems, including the replication of findings, systematic production of data, and ultimately democratisation of the scientific process. Realising these possibilities requires a vision for augmented AI coupled with a diversity of AI approaches able to deal with fundamental aspects of causality analysis and model discovery while enabling unbiased search across the space of putative explanations. These advances hold the promise to unleash AI's potential for searching and discovering the fundamental structure of our world beyond what human scientists have been able to achieve. Such a vision would push the boundaries of new fundamental science rather than automatize current workflows and instead open doors for technological innovation to tackle some of the greatest challenges facing humanity today. △ Less

Submitted 29 August, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

Comments: 35 pages, first draft of the final report from the Alan Turing Institute on AI for Scientific Discovery

arXiv:2305.11244 [pdf, other]

doi 10.21437/Interspeech.2023-1407

A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model

Authors: Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, Narsis A. Kiani, David Gomez-Cabrero, Jesper N. Tegner

Abstract: In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI). Specifically, we investigate different setups to incorporate trainable features into a multi-layer encoder-decoder GSM formulation under frozen pre-trained settings. Our architecture includes residual adapter and model reprogramming (inpu… ▽ More In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI). Specifically, we investigate different setups to incorporate trainable features into a multi-layer encoder-decoder GSM formulation under frozen pre-trained settings. Our architecture includes residual adapter and model reprogramming (input-prompting). We design a token-level label mapping to condition the GSM for Arabic Dialect Identification (ADI). This is challenging due to the high variation in vocabulary and pronunciation among the numerous regional dialects. We achieve new state-of-the-art accuracy on the ADI-17 dataset by vanilla fine-tuning. We further reduce the training budgets with the PEL method, which performs within 1.86% accuracy to fine-tuning using only 2.5% of (extra) network trainable parameters. Our study demonstrates how to identify Arabic dialects using a small dataset and limited computation with open source code and pre-trained models. △ Less

Submitted 3 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

Comments: Accepted to Interspeech 2023, 5 pages. Code is available at: https://github.com/Srijith-rkr/KAUST-Whisper-Adapter under MIT license

arXiv:2304.02443 [pdf]

IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems

Authors: Juan Munoz, Subash Balsamy, Juan P. Bernal-Tamayo, Ali Balubaid, Alberto Maillo Ruiz de Infante, Vincenzo Lagani, David Gomez-Cabrero, Narsis A. Kiani, Jesper Tegner

Abstract: Discovering non-linear dynamical models from data is at the core of science. Recent progress hinges upon sparse regression of observables using extensive libraries of candidate functions. However, it remains challenging to model hidden non-observable control variables governing switching between different dynamical regimes. Here we develop a data-efficient derivative-free method, IHCV, for the Ide… ▽ More Discovering non-linear dynamical models from data is at the core of science. Recent progress hinges upon sparse regression of observables using extensive libraries of candidate functions. However, it remains challenging to model hidden non-observable control variables governing switching between different dynamical regimes. Here we develop a data-efficient derivative-free method, IHCV, for the Identification of Hidden Control Variables. First, the performance and robustness of IHCV against noise are evaluated by benchmarking the IHCV method using well-known bifurcation models (saddle-node, transcritical, pitchfork, Hopf). Next, we demonstrate that IHCV discovers hidden driver variables in the Lorenz, van der Pol, Hodgkin-Huxley, and Fitzhugh-Nagumo models. Finally, IHCV generalizes to the case when only partial observational is given, as demonstrated using the toggle switch model, the genetic repressilator oscillator, and a Waddington landscape model. Our proof-of-principle illustrates that utilizing normal forms could facilitate the data-efficient and scalable discovery of hidden variables controlling transitions between different dynamical regimes and non-linear models. △ Less

Submitted 5 April, 2023; originally announced April 2023.

Comments: 12 pages, 2 figures

MSC Class: 37M20

arXiv:2011.10132 [pdf, other]

VLG-Net: Video-Language Graph Matching Network for Video Grounding

Authors: Mattia Soldan, Mengmeng Xu, Sisi Qu, Jesper Tegner, Bernard Ghanem

Abstract: Grounding language queries in videos aims at identifying the time interval (or moment) semantically relevant to a language query. The solution to this challenging task demands understanding videos' and queries' semantic content and the fine-grained reasoning about their multi-modal interactions. Our key idea is to recast this challenge into an algorithmic graph matching problem. Fueled by recent a… ▽ More Grounding language queries in videos aims at identifying the time interval (or moment) semantically relevant to a language query. The solution to this challenging task demands understanding videos' and queries' semantic content and the fine-grained reasoning about their multi-modal interactions. Our key idea is to recast this challenge into an algorithmic graph matching problem. Fueled by recent advances in Graph Neural Networks, we propose to leverage Graph Convolutional Networks to model video and textual information as well as their semantic alignment. To enable the mutual exchange of information across the modalities, we design a novel Video-Language Graph Matching Network (VLG-Net) to match video and query graphs. Core ingredients include representation graphs built atop video snippets and query tokens separately and used to model intra-modality relationships. A Graph Matching layer is adopted for cross-modal context modeling and multi-modal fusion. Finally, moment candidates are created using masked moment attention pooling by fusing the moment's enriched snippet features. We demonstrate superior performance over state-of-the-art grounding methods on three widely used datasets for temporal localization of moments in videos with language queries: ActivityNet-Captions, TACoS, and DiDeMo. △ Less

Submitted 16 August, 2021; v1 submitted 19 November, 2020; originally announced November 2020.

Comments: 14 pages, 7 figures, In proceeding of the ICCV21 workshop: AI for Creative Video Editing and Understanding 2021

arXiv:2011.00569 [pdf, other]

DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation

Authors: Jia-Hong Huang, Chao-Han Huck Yang, Fangyu Liu, Meng Tian, Yi-Chieh Liu, Ting-Wei Wu, I-Hung Lin, Kang Wang, Hiromasa Morikawa, Hernghua Chang, Jesper Tegner, Marcel Worring

Abstract: In this work, we propose an AI-based method that intends to improve the conventional retinal disease treatment procedure and help ophthalmologists increase diagnosis efficiency and accuracy. The proposed method is composed of a deep neural networks-based (DNN-based) module, including a retinal disease identifier and clinical description generator, and a DNN visual explanation module. To train and… ▽ More In this work, we propose an AI-based method that intends to improve the conventional retinal disease treatment procedure and help ophthalmologists increase diagnosis efficiency and accuracy. The proposed method is composed of a deep neural networks-based (DNN-based) module, including a retinal disease identifier and clinical description generator, and a DNN visual explanation module. To train and validate the effectiveness of our DNN-based module, we propose a large-scale retinal disease image dataset. Also, as ground truth, we provide a retinal image dataset manually labeled by ophthalmologists to qualitatively show, the proposed AI-based method is effective. With our experimental results, we show that the proposed method is quantitatively and qualitatively effective. Our method is capable of creating meaningful retinal image descriptions and visual explanations that are clinically relevant. △ Less

Submitted 1 November, 2020; originally announced November 2020.

Comments: Accepted to IEEE WACV 2021

arXiv:2007.05401 [pdf, other]

Learning Heat Diffusion for Network Alignment

Authors: Sisi Qu, Mengmeng Xu, Bernard Ghanem, Jesper Tegner

Abstract: Networks are abundant in the life sciences. Outstanding challenges include how to characterize similarities between networks, and in extension how to integrate information across networks. Yet, network alignment remains a core algorithmic problem. Here, we present a novel learning algorithm called evolutionary heat diffusion-based network alignment (EDNA) to address this challenge. EDNA uses the d… ▽ More Networks are abundant in the life sciences. Outstanding challenges include how to characterize similarities between networks, and in extension how to integrate information across networks. Yet, network alignment remains a core algorithmic problem. Here, we present a novel learning algorithm called evolutionary heat diffusion-based network alignment (EDNA) to address this challenge. EDNA uses the diffusion signal as a proxy for computing node similarities between networks. Comparing EDNA with state-of-the-art algorithms on a popular protein-protein interaction network dataset, using four different evaluation metrics, we achieve (i) the most accurate alignments, (ii) increased robustness against noise, and (iii) superior scaling capacity. The EDNA algorithm is versatile in that other available network alignments/embeddings can be used as an initial baseline alignment, and then EDNA works as a wrapper around them by running the evolutionary diffusion on top of them. In conclusion, EDNA outperforms state-of-the-art methods for network alignment, thus setting the stage for large-scale comparison and integration of networks. △ Less

Submitted 10 July, 2020; originally announced July 2020.

Comments: 4 Pages, 2 figures

Journal ref: Presented at the ICML 2020 Workshop on Computational Biology (WCB)

arXiv:2002.00539 [pdf, other]

doi 10.1109/CEC48606.2020.9185648

Evolving Neural Networks through a Reverse Encoding Tree

Authors: Haoling Zhang, Chao-Han Huck Yang, Hector Zenil, Narsis A. Kiani, Yue Shen, Jesper N. Tegner

Abstract: NeuroEvolution is one of the most competitive evolutionary learning frameworks for designing novel neural networks for use in specific tasks, such as logic circuit design and digital gaming. However, the application of benchmark methods such as the NeuroEvolution of Augmenting Topologies (NEAT) remains a challenge, in terms of their computational cost and search time inefficiency. This paper advan… ▽ More NeuroEvolution is one of the most competitive evolutionary learning frameworks for designing novel neural networks for use in specific tasks, such as logic circuit design and digital gaming. However, the application of benchmark methods such as the NeuroEvolution of Augmenting Topologies (NEAT) remains a challenge, in terms of their computational cost and search time inefficiency. This paper advances a method which incorporates a type of topological edge coding, named Reverse Encoding Tree (RET), for evolving scalable neural networks efficiently. Using RET, two types of approaches -- NEAT with Binary search encoding (Bi-NEAT) and NEAT with Golden-Section search encoding (GS-NEAT) -- have been designed to solve problems in benchmark continuous learning environments such as logic gates, Cartpole, and Lunar Lander, and tested against classical NEAT and FS-NEAT as baselines. Additionally, we conduct a robustness test to evaluate the resilience of the proposed NEAT algorithms. The results show that the two proposed strategies deliver improved performance, characterized by (1) a higher accumulated reward within a finite number of time steps; (2) using fewer episodes to solve problems in targeted environments, and (3) maintaining adaptive robustness under noisy perturbations, which outperform the baselines in all tested cases. Our analysis also demonstrates that RET expends potential future research directions in dynamic environments. Code is available from https://github.com/HaolingZHANG/ReverseEncodingTree. △ Less

Submitted 31 March, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

Comments: Accepted to IEEE Congress on Evolutionary Computation (IEEE CEC) 2020. Lecture Presentation

Journal ref: 2020 IEEE Congress on Evolutionary Computation (CEC)

arXiv:1911.02172 [pdf]

doi 10.1109/ICASSP40776.2020.9053783

Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding

Authors: Yi-Chieh Liu, Yung-An Hsieh, Min-Hung Chen, Chao-Han Huck Yang, Jesper Tegner, Yi-Chang James Tsai

Abstract: Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models' performance visually. By examining the video attention saliency,… ▽ More Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models' performance visually. By examining the video attention saliency, we found that existing models could not precisely capture the causes (e.g., traffic light) of the specific action (e.g., stopping). Therefore, the Temporal Reasoning Block (TRB) was proposed and introduced to the models. With the TRB models, we achieved the accuracy of $\mathbf{86.3\%}$, which outperform the state-of-the-art 3D CNNs from previous works. The attention saliency also demonstrated that TRB helped models focus on the causes more precisely. With both numerical and visual evaluations, we concluded that our proposed TRB models were able to provide accurate driving behavior prediction by learning the causal reasoning of the behaviors. △ Less

Submitted 5 November, 2019; originally announced November 2019.

Comments: Submitted to IEEE ICASSP 2020; Pytorch code will be released soon

Journal ref: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:1910.02758 [pdf, other]

Algorithmic Probability-guided Supervised Machine Learning on Non-differentiable Spaces

Authors: Santiago Hernández-Orozco, Hector Zenil, Jürgen Riedel, Adam Uccello, Narsis A. Kiani, Jesper Tegnér

Abstract: We show how complexity theory can be introduced in machine learning to help bring together apparently disparate areas of current research. We show that this new approach requires less training data and is more generalizable as it shows greater resilience to random attacks. We investigate the shape of the discrete algorithmic space when performing regression or classification using a loss function… ▽ More We show how complexity theory can be introduced in machine learning to help bring together apparently disparate areas of current research. We show that this new approach requires less training data and is more generalizable as it shows greater resilience to random attacks. We investigate the shape of the discrete algorithmic space when performing regression or classification using a loss function parametrized by algorithmic complexity, demonstrating that the property of differentiation is not necessary to achieve results similar to those obtained using differentiable programming approaches such as deep learning. In doing so we use examples which enable the two approaches to be compared (small, given the computational power required for estimations of algorithmic complexity). We find and report that (i) machine learning can successfully be performed on a non-smooth surface using algorithmic complexity; (ii) that parameter solutions can be found using an algorithmic-probability classifier, establishing a bridge between a fundamentally discrete theory of computability and a fundamentally continuous mathematical theory of optimization methods; (iii) a formulation of an algorithmically directed search technique in non-smooth manifolds can be defined and conducted; (iv) exploitation techniques and numerical methods for algorithmic search to navigate these discrete non-differentiable spaces can be performed; in application of the (a) identification of generative rules from data observations; (b) solutions to image classification problems more resilient against pixel attacks compared to neural networks; (c) identification of equation parameters from a small data-set in the presence of noise in continuous ODE system problem, (d) classification of Boolean NK networks by (1) network topology, (2) underlying Boolean function, and (3) number of incoming edges. △ Less

Submitted 8 October, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

Comments: 33 pages including appendix

arXiv:1904.10393 [pdf, other]

Estimations of Integrated Information Based on Algorithmic Complexity and Dynamic Querying

Authors: Alberto Hernández-Espinosa, Héctor Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: The concept of information has emerged as a language in its own right, bridging several disciplines that analyze natural phenomena and man-made systems. Integrated information has been introduced as a metric to quantify the amount of information generated by a system beyond the information generated by its elements. Yet, this intriguing notion comes with the price of being prohibitively expensive… ▽ More The concept of information has emerged as a language in its own right, bridging several disciplines that analyze natural phenomena and man-made systems. Integrated information has been introduced as a metric to quantify the amount of information generated by a system beyond the information generated by its elements. Yet, this intriguing notion comes with the price of being prohibitively expensive to calculate, since the calculations require an exponential number of sub-divisions of a system. Here we introduce a novel framework to connect algorithmic randomness and integrated information and a numerical method for estimating integrated information using a perturbation test rooted in algorithmic information dynamics. This method quantifies the change in program size of a system when subjected to a perturbation. The intuition behind is that if an object is random then random perturbations have little to no effect to what happens when a shorter program but when an object has the ability to move in both directions (towards or away from randomness) it will be shown to be better integrated as a measure of sophistication telling apart randomness and simplicity from structure. We show that an object with a high integrated information value is also more compressible, and is, therefore, more sensitive to perturbations. We find that such a perturbation test quantifying compression sensitivity provides a system with a means to extract explanations--causal accounts--of its own behaviour. Our technique can reduce the number of calculations to arrive at some bounds or estimations, as the algorithmic perturbation test guides an efficient search for estimating integrated information. Our work sets the stage for a systematic exploration of connections between algorithmic complexity and integrated information at the level of both theory and practice. △ Less

Submitted 6 June, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

Comments: 33 pages + Appendix = 44 pages

Journal ref: Entropy, 2019

arXiv:1902.04147 [pdf, other]

Synthesizing New Retinal Symptom Images by Multiple Generative Models

Authors: Yi-Chieh Liu, Hao-Hsiang Yang, Chao-Han Huck Yang, Jia-Hong Huang, Meng Tian, Hiromasa Morikawa, Yi-Chang James Tsai, Jesper Tegner

Abstract: Age-Related Macular Degeneration (AMD) is an asymptomatic retinal disease which may result in loss of vision. There is limited access to high-quality relevant retinal images and poor understanding of the features defining sub-classes of this disease. Motivated by recent advances in machine learning we specifically explore the potential of generative modeling, using Generative Adversarial Networks… ▽ More Age-Related Macular Degeneration (AMD) is an asymptomatic retinal disease which may result in loss of vision. There is limited access to high-quality relevant retinal images and poor understanding of the features defining sub-classes of this disease. Motivated by recent advances in machine learning we specifically explore the potential of generative modeling, using Generative Adversarial Networks (GANs) and style transferring, to facilitate clinical diagnosis and disease understanding by feature extraction. We design an analytic pipeline which first generates synthetic retinal images from clinical images; a subsequent verification step is applied. In the synthesizing step we merge GANs (DCGANs and WGANs architectures) and style transferring for the image generation, whereas the verified step controls the accuracy of the generated images. We find that the generated images contain sufficient pathological details to facilitate ophthalmologists' task of disease classification and in discovery of disease relevant features. In particular, our system predicts the drusen and geographic atrophy sub-classes of AMD. Furthermore, the performance using CFP images for GANs outperforms the classification based on using only the original clinical dataset. Our results are evaluated using existing classifier of retinal diseases and class activated maps, supporting the predictive power of the synthetic images and their utility for feature extraction. Our code examples are available online. △ Less

Submitted 11 February, 2019; originally announced February 2019.

Journal ref: AI for Retinal Image Analysis Workshop ACCV 2018

arXiv:1811.05592 [pdf]

Controllability, Multiplexing, and Transfer Learning in Networks using Evolutionary Learning

Authors: Rise Ooi, Chao-Han Huck Yang, Pin-Yu Chen, Vìctor Eguìluz, Narsis Kiani, Hector Zenil, David Gomez-Cabrero, Jesper Tegnèr

Abstract: Networks are fundamental building blocks for representing data, and computations. Remarkable progress in learning in structurally defined (shallow or deep) networks has recently been achieved. Here we introduce evolutionary exploratory search and learning method of topologically flexible networks under the constraint of producing elementary computational steady-state input-output operations. Our… ▽ More Networks are fundamental building blocks for representing data, and computations. Remarkable progress in learning in structurally defined (shallow or deep) networks has recently been achieved. Here we introduce evolutionary exploratory search and learning method of topologically flexible networks under the constraint of producing elementary computational steady-state input-output operations. Our results include; (1) the identification of networks, over four orders of magnitude, implementing computation of steady-state input-output functions, such as a band-pass filter, a threshold function, and an inverse band-pass function. Next, (2) the learned networks are technically controllable as only a small number of driver nodes are required to move the system to a new state. Furthermore, we find that the fraction of required driver nodes is constant during evolutionary learning, suggesting a stable system design. (3), our framework allows multiplexing of different computations using the same network. For example, using a binary representation of the inputs, the network can readily compute three different input-output functions. Finally, (4) the proposed evolutionary learning demonstrates transfer learning. If the system learns one function A, then learning B requires on average less number of steps as compared to learning B from tabula rasa. We conclude that the constrained evolutionary learning produces large robust controllable circuits, capable of multiplexing and transfer learning. Our study suggests that network-based computations of steady-state functions, representing either cellular modules of cell-to-cell communication networks or internal molecular circuits communicating within a cell, could be a powerful model for biologically inspired computing. This complements conceptualizations such as attractor based models, or reservoir computing. △ Less

Submitted 3 November, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

Comments: A revised version. (word source code to pdf; owing to the algo package conflicts)

arXiv:1808.05754 [pdf, other]

Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model

Authors: C. -H. Huck Yang, Fangyu Liu, Jia-Hong Huang, Meng Tian, Hiromasa Morikawa, I-Hung Lin, Yi-Chieh Liu, Hao-Hsiang Yang, Jesper Tegner

Abstract: Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists. Based on the fact that fundus structure and vascular disorders are the main characteristics of retinal diseases, we propose a novel visual-assisted diagnosis hybrid model mixing the support vector machine (SVM) and deep neural networks (DNNs). Fu… ▽ More Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists. Based on the fact that fundus structure and vascular disorders are the main characteristics of retinal diseases, we propose a novel visual-assisted diagnosis hybrid model mixing the support vector machine (SVM) and deep neural networks (DNNs). Furthermore, we present a new clinical retina dataset, called EyeNet2, for ophthalmology incorporating 52 retina diseases classes. Using EyeNet2, our model achieves 90.43\% diagnosis accuracy, and the model performance is comparable to the professional ophthalmologists. △ Less

Submitted 1 November, 2018; v1 submitted 16 August, 2018; originally announced August 2018.

Comments: A extension work of a workshop paper arXiv admin note: substantial text overlap with arXiv:1806.06423

Journal ref: Asian Conference on Computer Vision (ACCV), Artificial Intelligence for Retinal Image Analysis Workshop, December 2-6, 2018

arXiv:1807.03001 [pdf, other]

Learning Functions in Large Networks requires Modularity and produces Multi-Agent Dynamics

Authors: C. H. Huck Yang, Rise Ooi, Tom Hiscock, Victor Eguiluz, Jesper Tegnér

Abstract: Networks are abundant in biological systems. Small sized over-represented network motifs have been discovered, and it has been suggested that these constitute functional building blocks. We ask whether larger dynamical network motifs exist in biological networks, thus contributing to the higher-order organization of a network. To end this, we introduce a gradient descent machine learning (ML) appr… ▽ More Networks are abundant in biological systems. Small sized over-represented network motifs have been discovered, and it has been suggested that these constitute functional building blocks. We ask whether larger dynamical network motifs exist in biological networks, thus contributing to the higher-order organization of a network. To end this, we introduce a gradient descent machine learning (ML) approach and genetic algorithms to learn larger functional motifs in contrast to an (unfeasible) exhaustive search. We use the French Flag (FF) and Switch functional motif as case studies motivated from biology. While our algorithm successfully learns large functional motifs, we identify a threshold size of approximately 20 nodes beyond which learning breaks down. Therefore we investigate the stability of the motifs. We find that the size of the real negative eigenvalues of the Jacobian decreases with increasing system size, thus conferring instability. Finally, without imposing learning an input-output for all the components of the network, we observe that unconstrained middle components of the network still learn the desired function, a form of homogeneous team learning. We conclude that the size limitation of learnability, most likely due to stability constraints, impose a definite requirement for modularity in networked systems while enabling team learning within unconstrained parts of the module. Thus, the observation that community structures and modularity are abundant in biological networks could be accounted for by a computational compositional network structure. △ Less

Submitted 21 August, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

Comments: Accepted at the Joint ICML and IJCAI Workshop on Computational Biology (ICML-IJCAI WCB) to be held in Stockholm SWEDEN, 2018. Referring to https://sites.google.com/view/wcb2018/accepted-papers?authuser=0 update the team-learning figure

ACM Class: J.3, I.2

arXiv:1806.06423 [pdf, other]

A Novel Hybrid Machine Learning Model for Auto-Classification of Retinal Diseases

Authors: C. -H. Huck Yang, Jia-Hong Huang, Fangyu Liu, Fang-Yi Chiu, Mengya Gao, Weifeng Lyu, I-Hung Lin M. D., Jesper Tegner

Abstract: Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists. We propose a novel visual-assisted diagnosis hybrid model based on the support vector machine (SVM) and deep neural networks (DNNs). The model incorporates complementary strengths of DNNs and SVM. Furthermore, we present a new clinical retina lab… ▽ More Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists. We propose a novel visual-assisted diagnosis hybrid model based on the support vector machine (SVM) and deep neural networks (DNNs). The model incorporates complementary strengths of DNNs and SVM. Furthermore, we present a new clinical retina label collection for ophthalmology incorporating 32 retina diseases classes. Using EyeNet, our model achieves 89.73% diagnosis accuracy and the model performance is comparable to the professional ophthalmologists. △ Less

Submitted 17 June, 2018; originally announced June 2018.

Comments: Accepted at the Joint ICML and IJCAI Workshop on Computational Biology (ICML-IJCAI WCB) to be held in Stockholm SWEDEN, 2018. Referring to https://sites.google.com/view/wcb2018/accepted-papers?authuser=0

Journal ref: ICML-IJCAI Workshop 2018

arXiv:1805.07166 [pdf, other]

doi 10.3390/e21060560

The Thermodynamics of Network Coding, and an Algorithmic Refinement of the Principle of Maximum Entropy

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: The principle of maximum entropy (Maxent) is often used to obtain prior probability distributions as a method to obtain a Gibbs measure under some restriction giving the probability that a system will be in a certain state compared to the rest of the elements in the distribution. Because classical entropy-based Maxent collapses cases confounding all distinct degrees of randomness and pseudo-random… ▽ More The principle of maximum entropy (Maxent) is often used to obtain prior probability distributions as a method to obtain a Gibbs measure under some restriction giving the probability that a system will be in a certain state compared to the rest of the elements in the distribution. Because classical entropy-based Maxent collapses cases confounding all distinct degrees of randomness and pseudo-randomness, here we take into consideration the generative mechanism of the systems considered in the ensemble to separate objects that may comply with the principle under some restriction and whose entropy is maximal but may be generated recursively from those that are actually algorithmically random offering a refinement to classical Maxent. We take advantage of a causal algorithmic calculus to derive a thermodynamic-like result based on how difficult it is to reprogram a computer code. Using the distinction between computable and algorithmic randomness we quantify the cost in information loss associated with reprogramming. To illustrate this we apply the algorithmic refinement to Maxent on graphs and introduce a Maximal Algorithmic Randomness Preferential Attachment (MARPA) Algorithm, a generalisation over previous approaches. We discuss practical implications of evaluation of network randomness. Our analysis provides insight in that the reprogrammability asymmetry appears to originate from a non-monotonic relationship to algorithmic probability. Our analysis motivates further analysis of the origin and consequences of the aforementioned asymmetries, reprogrammability, and computation. △ Less

Submitted 6 June, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

Comments: 30 pages

Journal ref: Entropy, 21(6), 560, 2019

arXiv:1803.02186 [pdf, other]

Symmetry and Algorithmic Complexity of Polyominoes and Polyhedral Graphs

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: We introduce a definition of algorithmic symmetry able to capture essential aspects of geometric symmetry. We review, study and apply a method for approximating the algorithmic complexity (also known as Kolmogorov-Chaitin complexity) of graphs and networks based on the concept of Algorithmic Probability (AP). AP is a concept (and method) capable of recursively enumeration all properties of computa… ▽ More We introduce a definition of algorithmic symmetry able to capture essential aspects of geometric symmetry. We review, study and apply a method for approximating the algorithmic complexity (also known as Kolmogorov-Chaitin complexity) of graphs and networks based on the concept of Algorithmic Probability (AP). AP is a concept (and method) capable of recursively enumeration all properties of computable (causal) nature beyond statistical regularities. We explore the connections of algorithmic complexity---both theoretical and numerical---with geometric properties mainly symmetry and topology from an (algorithmic) information-theoretic perspective. We show that approximations to algorithmic complexity by lossless compression and an Algorithmic Probability-based method can characterize properties of polyominoes, polytopes, regular and quasi-regular polyhedra as well as polyhedral networks, thereby demonstrating its profiling capabilities. △ Less

Submitted 24 February, 2018; originally announced March 2018.

Comments: 18 pages, 4 figures + Appendix (1 figure)

arXiv:1802.09904 [pdf, other]

Algorithmic Causal Deconvolution of Intertwined Programs and Networks by Generative Mechanism

Authors: Hector Zenil, Narsis A. Kiani, Allan A. Zea, Jesper Tegnér

Abstract: Complex data usually results from the interaction of objects produced by different generating mechanisms. Here we introduce a universal, unsupervised and parameter-free model-oriented approach, based upon the seminal concept of algorithmic probability, that decomposes an observation into its most likely algorithmic generative sources. Our approach uses a causal calculus to infer model representati… ▽ More Complex data usually results from the interaction of objects produced by different generating mechanisms. Here we introduce a universal, unsupervised and parameter-free model-oriented approach, based upon the seminal concept of algorithmic probability, that decomposes an observation into its most likely algorithmic generative sources. Our approach uses a causal calculus to infer model representations. We demonstrate its ability to deconvolve interacting mechanisms regardless of whether the resultant objects are strings, space-time evolution diagrams, images or networks. While this is mostly a conceptual contribution and a novel framework, we provide numerical evidence evaluating the ability of our methods to separate data from observations produced by discrete dynamical systems such as cellular automata and complex networks. We think that these separating techniques can contribute to tackling the challenge of causation, thus complementing other statistically oriented approaches. △ Less

Submitted 12 September, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

Comments: 29 pages + 7 Sup Inf. 9 figures in total

arXiv:1802.07181 [pdf, other]

Algorithmic Information Dynamics of Persistent Patterns and Colliding Particles in the Game of Life

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: Without loss of generalisation to other systems, including possibly non-deterministic ones, we demonstrate the application of methods drawn from algorithmic information dynamics to the characterisation and classification of emergent and persistent patterns, motifs and colliding particles in Conway's Game of Life (GoL), a cellular automaton serving as a case study illustrating the way in which such… ▽ More Without loss of generalisation to other systems, including possibly non-deterministic ones, we demonstrate the application of methods drawn from algorithmic information dynamics to the characterisation and classification of emergent and persistent patterns, motifs and colliding particles in Conway's Game of Life (GoL), a cellular automaton serving as a case study illustrating the way in which such ideas can be applied to a typical discrete dynamical system. We explore the issue of local observations of closed systems whose orbits may appear open because of inaccessibility to the global rules governing the overall system. We also investigate aspects of symmetry related to complexity in the distribution of patterns that occur with high frequency in GoL (which we thus call motifs) and analyse the distribution of these motifs with a view to tracking the changes in their algorithmic probability over time. We demonstrate how the tools introduced are an alternative to other computable measures that are unable to capture changes in emergent structures in evolving complex systems that are often too small or too subtle to be properly characterised by methods such as lossless compression and Shannon entropy. △ Less

Submitted 5 April, 2018; v1 submitted 17 February, 2018; originally announced February 2018.

Comments: 18 pages + 1 sup page, 8 figures in total. Online complexity calculator: http://complexitycalculator.com/

arXiv:1802.05856 [pdf, other]

Algorithmic Complexity and Reprogrammability of Chemical Structure Networks

Authors: Hector Zenil, Narsis A. Kiani, Ming-Mei Shang, Jesper Tegnér

Abstract: Here we address the challenge of profiling causal properties and tracking the transformation of chemical compounds from an algorithmic perspective. We explore the potential of applying a computational interventional calculus based on the principles of algorithmic probability to chemical structure networks. We profile the sensitivity of the elements and covalent bonds in a chemical structure networ… ▽ More Here we address the challenge of profiling causal properties and tracking the transformation of chemical compounds from an algorithmic perspective. We explore the potential of applying a computational interventional calculus based on the principles of algorithmic probability to chemical structure networks. We profile the sensitivity of the elements and covalent bonds in a chemical structure network algorithmically, asking whether reprogrammability affords information about thermodynamic and chemical processes involved in the transformation of different compound classes. We arrive at numerical results suggesting a correspondence between some physical, structural and functional properties. Our methods are capable of separating chemical classes that reflect functional and natural differences without considering any information about atomic and molecular properties. We conclude that these methods, with their links to chemoinformatics via algorithmic, probability hold promise for future research. △ Less

Submitted 18 March, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

Comments: 19 pages + Appendix

arXiv:1802.05843 [pdf, other]

Minimal Algorithmic Information Loss Methods for Dimension Reduction, Feature Selection and Network Sparsification

Authors: Hector Zenil, Narsis A. Kiani, Alyssa Adams, Felipe S. Abrahão, Antonio Rueda-Toicen, Allan A. Zea, Luan Ozelim, Jesper Tegnér

Abstract: We present a novel, domain-agnostic, model-independent, unsupervised, and universally applicable Machine Learning approach for dimensionality reduction based on the principles of algorithmic complexity. Specifically, but without loss of generality, we focus on addressing the challenge of reducing certain dimensionality aspects, such as the number of edges in a network, while retaining essential fe… ▽ More We present a novel, domain-agnostic, model-independent, unsupervised, and universally applicable Machine Learning approach for dimensionality reduction based on the principles of algorithmic complexity. Specifically, but without loss of generality, we focus on addressing the challenge of reducing certain dimensionality aspects, such as the number of edges in a network, while retaining essential features of interest. These features include preserving crucial network properties like degree distribution, clustering coefficient, edge betweenness, and degree and eigenvector centralities but can also go beyond edges to nodes and weights for network pruning and trimming. Our approach outperforms classical statistical Machine Learning techniques and state-of-the-art dimensionality reduction algorithms by preserving a greater number of data features that statistical algorithms would miss, particularly nonlinear patterns stemming from deterministic recursive processes that may look statistically random but are not. Moreover, previous approaches heavily rely on a priori feature selection, which requires constant supervision. Our findings demonstrate the effectiveness of the algorithms in overcoming some of these limitations while maintaining a time-efficient computational profile. Our approach not only matches, but also exceeds, the performance of established and state-of-the-art dimensionality reduction algorithms. We extend the applicability of our method to lossy compression tasks involving images and any multi-dimensional data. This highlights the versatility and broad utility of the approach in multiple domains. △ Less

Submitted 3 May, 2025; v1 submitted 16 February, 2018; originally announced February 2018.

Comments: Online implementation at http://complexity-calculator.com/MILS/

arXiv:1801.05058 [pdf]

Predictive Systems Toxicology

Authors: Narsis A. Kiani, Ming-Mei Shang, Hector Zenil, Jesper Tegnér

Abstract: In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxi… ▽ More In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models. △ Less

Submitted 15 January, 2018; originally announced January 2018.

Comments: 37 pages, 3 figures. As accepted for the volume in reference

Journal ref: Computational Toxicology - Methods and Protocols, series in Methods in Molecular Biology, Springer Nature, 2017

arXiv:1709.05429 [pdf]

An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems

Authors: Hector Zenil, Narsis A. Kiani, Francesco Marabita, Yue Deng, Szabolcs Elias, Angelika Schmidt, Gordon Ball, Jesper Tegnér

Abstract: We demonstrate that the algorithmic information content of a system is deeply connected to its potential dynamics, thus affording an avenue for moving systems in the information-theoretic space and controlling them in the phase space. To this end we performed experiments and validated the results on (1) a very large set of small graphs, (2) a number of larger networks with different topologies, an… ▽ More We demonstrate that the algorithmic information content of a system is deeply connected to its potential dynamics, thus affording an avenue for moving systems in the information-theoretic space and controlling them in the phase space. To this end we performed experiments and validated the results on (1) a very large set of small graphs, (2) a number of larger networks with different topologies, and (3) biological networks from a widely studied and validated genetic network (e.coli) as well as on a significant number of differentiating (Th17) and differentiated human cells from high quality databases (Harvard's CellNet) with results conforming to experimentally validated biological data. Based on these results we introduce a conceptual framework, a model-based interventional calculus and a reprogrammability measure with which to steer, manipulate, and reconstruct the dynamics of non- linear dynamical systems from partial and disordered observations. The method consists in finding and applying a series of controlled interventions to a dynamical system to estimate how its algorithmic information content is affected when every one of its elements are perturbed. The approach represents an alternative to numerical simulation and statistical approaches for inferring causal mechanistic/generative models and finding first principles. We demonstrate the framework's capabilities by reconstructing the phase space of some discrete dynamical systems (cellular automata) as case study and reconstructing their generating rules. We thus advance tools for reprogramming artificial and living systems without full knowledge or access to the system's actual kinetic equations or probability distributions yielding a suite of universal and parameter-free algorithms of wide applicability ranging from causation, dimension reduction, feature selection and model generation. △ Less

Submitted 5 April, 2018; v1 submitted 15 September, 2017; originally announced September 2017.

Comments: 50 pages with Supplementary Information and Extended Figures. The Online Algorithmic Complexity Calculator implements the methods in this paper: http://complexitycalculator.com/ Animated video available at: https://youtu.be/ufzq2p5tVLI

arXiv:1706.01241 [pdf, other]

HiDi: An efficient reverse engineering schema for large scale dynamic regulatory network reconstruction using adaptive differentiation

Authors: Yue Deng, Hector Zenil, Jesper Tégner, Narsis A. Kiani

Abstract: The use of differential equations (ODE) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited, due to the difficulty in estimating parameters and by their lack of scalability. Here we introduce a novel method and pipeline to reverse engineer gene regulatory networks from gene expression of time series and perturbation data base… ▽ More The use of differential equations (ODE) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited, due to the difficulty in estimating parameters and by their lack of scalability. Here we introduce a novel method and pipeline to reverse engineer gene regulatory networks from gene expression of time series and perturbation data based upon an improvement on the calculation scheme of the derivatives and a pre-filtration step to reduce the number of possible links. The method introduces a linear differential equation model with adaptive numerical differentiation that is scalable to extremely large regulatory networks. We demonstrate the ability of this method to outperform current state-of-the-art methods applied to experimental and synthetic data using test data from the DREAM4 and DREAM5 challenges. Our method displays greater accuracy and scalability. We benchmark the performance of the pipeline with respect to data set size and levels of noise. We show that the computation time is linear over various network sizes. △ Less

Submitted 7 June, 2017; v1 submitted 5 June, 2017; originally announced June 2017.

Comments: As accepted by the journal Bioinformatics (Oxford)

arXiv:1608.05972 [pdf, other]

doi 10.1103/PhysRevE.96.012308

Low Algorithmic Complexity Entropy-deceiving Graphs

Authors: Hector Zenil, Narsis Kiani, Jesper Tegnér

Abstract: In estimating the complexity of objects, in particular of graphs, it is common practice to rely on graph- and information-theoretic measures. Here, using integer sequences with properties such as Borel normality, we explain how these measures are not independent of the way in which an object, such as a graph, can be described or observed. From observations that can reconstruct the same graph and a… ▽ More In estimating the complexity of objects, in particular of graphs, it is common practice to rely on graph- and information-theoretic measures. Here, using integer sequences with properties such as Borel normality, we explain how these measures are not independent of the way in which an object, such as a graph, can be described or observed. From observations that can reconstruct the same graph and are therefore essentially translations of the same description, we will see that when applying a computable measure such as Shannon Entropy, not only is it necessary to pre-select a feature of interest where there is one, and to make an arbitrary selection where there is not, but also that more general properties, such as the causal likelihood of a graph as a measure (opposed to randomness), can be largely misrepresented by computable measures such as Entropy and Entropy rate. We introduce recursive and non-recursive (uncomputable) graphs and graph constructions based on these integer sequences, whose different lossless descriptions have disparate Entropy values, thereby enabling the study and exploration of a measure's range of applications and demonstrating the weaknesses of computable measures of complexity. △ Less

Submitted 10 May, 2017; v1 submitted 21 August, 2016; originally announced August 2016.

Comments: 28 pages

ACM Class: F.1.3

Journal ref: Phys. Rev. E 96, 012308 (2017)

arXiv:1607.05826 [pdf, other]

Characterization and reduction of variability in selection based on effect-size using association measures in cohort study of heterogeneous diseases

Authors: Venkateshan Kannan, Kristina Alexandersson, Jesper Tegner

Abstract: Cohort studies employ pairwise measures of association to quantify dependencies among conditions and exposures. To reliably use these measures to draw conclusions about the underlying association strengths requires that the measures be robust and unbiased. These considerations assume greater significance when applied to disease networks, where associations among heterogeneous pairs of diseases are… ▽ More Cohort studies employ pairwise measures of association to quantify dependencies among conditions and exposures. To reliably use these measures to draw conclusions about the underlying association strengths requires that the measures be robust and unbiased. These considerations assume greater significance when applied to disease networks, where associations among heterogeneous pairs of diseases are ranked. Using disease diagnoses data from a large cohort of 5.5 million individuals, we develop a comprehensive methodology to characterize the bias of standard association measures like relative risk and $φ$ correlation. To overcome these biases, we devise a novel measure based on a stochastic model for disease development. The new measure is demonstrated to have the least overall bias and hence would be most suitable for application to heterogeneous disease cohorts. △ Less

Submitted 27 May, 2017; v1 submitted 20 July, 2016; originally announced July 2016.

arXiv:1512.01088 [pdf, other]

Evaluating Network Inference Methods in Terms of Their Ability to Preserve the Topology and Complexity of Genetic Networks

Authors: Narsis A. Kiani, Hector Zenil, Jakub Olczak, Jesper Tegnér

Abstract: Network inference is a rapidly advancing field, with new methods being proposed on a regular basis. Understanding the advantages and limitations of different network inference methods is key to their effective application in different circumstances. The common structural properties shared by diverse networks naturally pose a challenge when it comes to devising accurate inference methods, but surpr… ▽ More Network inference is a rapidly advancing field, with new methods being proposed on a regular basis. Understanding the advantages and limitations of different network inference methods is key to their effective application in different circumstances. The common structural properties shared by diverse networks naturally pose a challenge when it comes to devising accurate inference methods, but surprisingly, there is a paucity of comparison and evaluation methods. Historically, every new methodology has only been tested against \textit{gold standard} (true values) purpose-designed synthetic and real-world (validated) biological networks. In this paper we aim to assess the impact of taking into consideration aspects of topological and information content in the evaluation of the final accuracy of an inference procedure. Specifically, we will compare the best inference methods, in both graph-theoretic and information-theoretic terms, for preserving topological properties and the original information content of synthetic and biological networks. New methods for performance comparison are introduced by borrowing ideas from gene set enrichment analysis and by applying concepts from algorithmic complexity. Experimental results show that no individual algorithm outperforms all others in all cases, and that the challenging and non-trivial nature of network inference is evident in the struggle of some of the algorithms to turn in a performance that is superior to random guesswork. Therefore special care should be taken to suit the method to the purpose at hand. Finally, we show that evaluations from data generated using different underlying topologies have different signatures that can be used to better choose a network reconstruction method. △ Less

Submitted 14 September, 2016; v1 submitted 3 December, 2015; originally announced December 2015.

Comments: main part: 18 pages. 21 pages with Sup Inf. Forthcoming in the journal of Seminars in Cell and Developmental Biology

arXiv:1509.06338 [pdf]

Approximations of Algorithmic and Structural Complexity Validate Cognitive-behavioural Experimental Results

Authors: Hector Zenil, James A. R. Marshall, Jesper Tegnér

Abstract: Being able to objectively characterise the intrinsic complexity of behavioural patterns resulting from human or animal decisions is fundamental for deconvolving cognition and designing autonomous artificial intelligence systems. Yet complexity is difficult in practice, particularly when strings are short. By numerically approximating algorithmic (Kolmogorov) complexity (K), we establish an objecti… ▽ More Being able to objectively characterise the intrinsic complexity of behavioural patterns resulting from human or animal decisions is fundamental for deconvolving cognition and designing autonomous artificial intelligence systems. Yet complexity is difficult in practice, particularly when strings are short. By numerically approximating algorithmic (Kolmogorov) complexity (K), we establish an objective tool to characterise behavioural complexity. Next, we approximate structural (Bennett's Logical Depth) complexity (LD) to assess the amount of computation required for generating a behavioural string. We apply our toolbox to three landmark studies of animal behaviour of increasing sophistication and degree of environmental influence, including studies of foraging communication by ants, flight patterns of fruit flies, and tactical deception and competition (e.g., predator-prey) strategies. We find that ants harness the environmental condition in their internal decision process, modulating their behavioural complexity accordingly. Our analysis of flight (fruit flies) invalidated the common hypothesis that animals navigating in an environment devoid of stimuli adopt a random strategy. Fruit flies exposed to a featureless environment deviated the most from Levy flight, suggesting an algorithmic bias in their attempt to devise a useful (navigation) strategy. Similarly, a logical depth analysis of rats revealed that the structural complexity of the rat always ends up matching the structural complexity of the competitor, with the rats' behaviour simulating algorithmic randomness. Finally, we discuss how experiments on how humans perceive randomness suggest the existence of an algorithmic bias in our reasoning and decision processes, in line with our analysis of the animal experiments. △ Less

Submitted 20 December, 2022; v1 submitted 21 September, 2015; originally announced September 2015.

Comments: 42 pages, 7 figures and 3 tables

arXiv:1508.06538 [pdf, other]

Causality, Information and Biological Computation: An algorithmic software approach to life, disease and the immune system

Authors: Hector Zenil, Angelika Schmidt, Jesper Tegnér

Abstract: Biology has taken strong steps towards becoming a computer science aiming at reprogramming nature after the realisation that nature herself has reprogrammed organisms by harnessing the power of natural selection and the digital prescriptive nature of replicating DNA. Here we further unpack ideas related to computability, algorithmic information theory and software engineering, in the context of th… ▽ More Biology has taken strong steps towards becoming a computer science aiming at reprogramming nature after the realisation that nature herself has reprogrammed organisms by harnessing the power of natural selection and the digital prescriptive nature of replicating DNA. Here we further unpack ideas related to computability, algorithmic information theory and software engineering, in the context of the extent to which biology can be (re)programmed, and with how we may go about doing so in a more systematic way with all the tools and concepts offered by theoretical computer science in a translation exercise from computing to molecular biology and back. These concepts provide a means to a hierarchical organization thereby blurring previously clear-cut lines between concepts like matter and life, or between tumour types that are otherwise taken as different and may not have however a different cause. This does not diminish the properties of life or make its components and functions less interesting. On the contrary, this approach makes for a more encompassing and integrated view of nature, one that subsumes observer and observed within the same system, and can generate new perspectives and tools with which to view complex diseases like cancer, approaching them afresh from a software-engineering viewpoint that casts evolution in the role of programmer, cells as computing machines, DNA and genes as instructions and computer programs, viruses as hacking devices, the immune system as a software debugging tool, and diseases as an information-theoretic battlefield where all these forces deploy. We show how information theory and algorithmic programming may explain fundamental mechanisms of life and death. △ Less

Submitted 19 January, 2016; v1 submitted 24 August, 2015; originally announced August 2015.

Comments: 30 pages, 8 figures. Invited chapter contribution to Information and Causality: From Matter to Life. Sara I. Walker, Paul C.W. Davies and George Ellis (eds.), Cambridge University Press

arXiv:1504.06249 [pdf, other]

Quantifying Loss of Information in Network-based Dimensionality Reduction Techniques

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: To cope with the complexity of large networks, a number of dimensionality reduction techniques for graphs have been developed. However, the extent to which information is lost or preserved when these techniques are employed has not yet been clear. Here we develop a framework, based on algorithmic information theory, to quantify the extent to which information is preserved when network motif analys… ▽ More To cope with the complexity of large networks, a number of dimensionality reduction techniques for graphs have been developed. However, the extent to which information is lost or preserved when these techniques are employed has not yet been clear. Here we develop a framework, based on algorithmic information theory, to quantify the extent to which information is preserved when network motif analysis, graph spectra and spectral sparsification methods are applied to over twenty different biological and artificial networks. We find that the spectral sparsification is highly sensitive to high number of edge deletion, leading to significant inconsistencies, and that graph spectral methods are the most irregular, capturing algebraic information in a condensed fashion but largely losing most of the information content of the original networks. However, the approach shows that network motif analysis excels at preserving the relative algorithmic information content of a network, hence validating and generalizing the remarkable fact that despite their inherent combinatorial possibilities, local regularities preserve information to such an extent that essential properties are fully recoverable across different networks to determine their family group to which they belong to (eg genetic vs social network). Our algorithmic information methodology thus provides a rigorous framework enabling a fundamental assessment and comparison between different data dimensionality reduction methods thereby facilitating the identification and evaluation of the capabilities of old and new methods. △ Less

Submitted 27 August, 2015; v1 submitted 23 April, 2015; originally announced April 2015.

Comments: 29 pages, 6 figures

arXiv:1501.06080 [pdf, other]

Numerical Investigation of Graph Spectra and Information Interpretability of Eigenvalues

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: We undertake an extensive numerical investigation of the graph spectra of thousands regular graphs, a set of random Erdös-Rényi graphs, the two most popular types of complex networks and an evolving genetic network by using novel conceptual and experimental tools. Our objective in so doing is to contribute to an understanding of the meaning of the Eigenvalues of a graph relative to its topological… ▽ More We undertake an extensive numerical investigation of the graph spectra of thousands regular graphs, a set of random Erdös-Rényi graphs, the two most popular types of complex networks and an evolving genetic network by using novel conceptual and experimental tools. Our objective in so doing is to contribute to an understanding of the meaning of the Eigenvalues of a graph relative to its topological and information-theoretic properties. We introduce a technique for identifying the most informative Eigenvalues of evolving networks by comparing graph spectra behavior to their algorithmic complexity. We suggest that extending techniques can be used to further investigate the behavior of evolving biological networks. In the extended version of this paper we apply these techniques to seven tissue specific regulatory networks as static example and network of a naïve pluripotent immune cell in the process of differentiating towards a Th17 cell as evolving example, finding the most and least informative Eigenvalues at every stage. △ Less

Submitted 24 January, 2015; originally announced January 2015.

Comments: Forthcoming in 3rd International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO), Lecture Notes in Bioinformatics, 2015

arXiv:1501.04242 [pdf, ps, other]

The Information-theoretic and Algorithmic Approach to Human, Animal and Artificial Cognition

Authors: Nicolas Gauvrit, Hector Zenil, Jesper Tegnér

Abstract: We survey concepts at the frontier of research connecting artificial, animal and human cognition to computation and information processing---from the Turing test to Searle's Chinese Room argument, from Integrated Information Theory to computational and algorithmic complexity. We start by arguing that passing the Turing test is a trivial computational problem and that its pragmatic difficulty sheds… ▽ More We survey concepts at the frontier of research connecting artificial, animal and human cognition to computation and information processing---from the Turing test to Searle's Chinese Room argument, from Integrated Information Theory to computational and algorithmic complexity. We start by arguing that passing the Turing test is a trivial computational problem and that its pragmatic difficulty sheds light on the computational nature of the human mind more than it does on the challenge of artificial intelligence. We then review our proposed algorithmic information-theoretic measures for quantifying and characterizing cognition in various forms. These are capable of accounting for known biases in human behavior, thus vindicating a computational algorithmic view of cognition as first suggested by Turing, but this time rooted in the concept of algorithmic probability, which in turn is based on computational universality while being independent of computational model, and which has the virtue of being predictive and testable as a model theory of cognitive behavior. △ Less

Submitted 24 December, 2015; v1 submitted 17 January, 2015; originally announced January 2015.

Comments: 22 pages. Forthcoming in Gordana Dodig-Crnkovic and Raffaela Giovagnoli (eds). Representation and Reality: Humans, Animals and Machines, Springer Verlag

arXiv:1406.5104 [pdf, other]

On the Theory and Algorithm for rigorous discretization in applications of Information Theory

Authors: Venkateshan Kannan, Jesper Tegnèr

Abstract: We identify fundamental issues with discretization when estimating information-theoretic quantities in the analysis of data. These difficulties are theoretical in nature and arise with discrete datasets carrying significant implications for the corresponding claims and results. Here we describe the origins of the methodological problems, and provide a clear illustration of their impact with the ex… ▽ More We identify fundamental issues with discretization when estimating information-theoretic quantities in the analysis of data. These difficulties are theoretical in nature and arise with discrete datasets carrying significant implications for the corresponding claims and results. Here we describe the origins of the methodological problems, and provide a clear illustration of their impact with the example of biological network reconstruction. We propose an algorithm (shared information metric) that corrects for the biases and the resulting improved performance of the algorithm demonstrates the need to take due consideration of this issue in different contexts. △ Less

Submitted 23 June, 2014; v1 submitted 19 June, 2014; originally announced June 2014.

arXiv:1401.3604 [pdf, other]

Methods of Information Theory and Algorithmic Complexity for Network Biology

Authors: Hector Zenil, Narsis A. Kiani, Jesper Tegnér

Abstract: We survey and introduce concepts and tools located at the intersection of information theory and network biology. We show that Shannon's information entropy, compressibility and algorithmic complexity quantify different local and global aspects of synthetic and biological data. We show examples such as the emergence of giant components in Erdos-Renyi random graphs, and the recovery of topological… ▽ More We survey and introduce concepts and tools located at the intersection of information theory and network biology. We show that Shannon's information entropy, compressibility and algorithmic complexity quantify different local and global aspects of synthetic and biological data. We show examples such as the emergence of giant components in Erdos-Renyi random graphs, and the recovery of topological properties from numerical kinetic properties simulating gene expression data. We provide exact theoretical calculations, numerical approximations and error estimations of entropy, algorithmic probability and Kolmogorov complexity for different types of graphs, characterizing their variant and invariant properties. We introduce formal definitions of complexity for both labeled and unlabeled graphs and prove that the Kolmogorov complexity of a labeled graph is a good approximation of its unlabeled Kolmogorov complexity and thus a robust definition of graph complexity. △ Less

Submitted 11 December, 2015; v1 submitted 15 January, 2014; originally announced January 2014.

Comments: 28 pages. Forthcoming in the journal Seminars in Cell and Developmental Biology

arXiv:1206.6847 [pdf]

Identifying the Relevant Nodes Without Learning the Model

Authors: Jose M. Pena, Roland Nilsson, Johan Björkegren, Jesper Tegnér

Abstract: We propose a method to identify all the nodes that are relevant to compute all the conditional probability distributions for a given set of nodes. Our method is simple, effcient, consistent, and does not require learning a Bayesian network first. Therefore, our method can be applied to high-dimensional databases, e.g. gene expression databases. We propose a method to identify all the nodes that are relevant to compute all the conditional probability distributions for a given set of nodes. Our method is simple, effcient, consistent, and does not require learning a Bayesian network first. Therefore, our method can be applied to high-dimensional databases, e.g. gene expression databases. △ Less

Submitted 27 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

Report number: UAI-P-2006-PG-367-374

Showing 1–42 of 42 results for author: Tegnér, J