Search | arXiv e-print repository

The Discovery Engine: A Framework for AI-Driven Synthesis and Navigation of Scientific Knowledge Landscapes

Authors: Vladimir Baulin, Austin Cook, Daniel Friedman, Janna Lumiruusu, Andrew Pashea, Shagor Rahman, Benedikt Waldeck

Abstract: The prevailing model for disseminating scientific knowledge relies on individual publications dispersed across numerous journals and archives. This legacy system is ill suited to the recent exponential proliferation of publications, contributing to insurmountable information overload, issues surrounding reproducibility and retractions. We introduce the Discovery Engine, a framework to address thes… ▽ More The prevailing model for disseminating scientific knowledge relies on individual publications dispersed across numerous journals and archives. This legacy system is ill suited to the recent exponential proliferation of publications, contributing to insurmountable information overload, issues surrounding reproducibility and retractions. We introduce the Discovery Engine, a framework to address these challenges by transforming an array of disconnected literature into a unified, computationally tractable representation of a scientific domain. Central to our approach is the LLM-driven distillation of publications into structured "knowledge artifacts," instances of a universal conceptual schema, complete with verifiable links to source evidence. These artifacts are then encoded into a high-dimensional Conceptual Tensor. This tensor serves as the primary, compressed representation of the synthesized field, where its labeled modes index scientific components (concepts, methods, parameters, relations) and its entries quantify their interdependencies. The Discovery Engine allows dynamic "unrolling" of this tensor into human-interpretable views, such as explicit knowledge graphs (the CNM graph) or semantic vector spaces, for targeted exploration. Crucially, AI agents operate directly on the graph using abstract mathematical and learned operations to navigate the knowledge landscape, identify non-obvious connections, pinpoint gaps, and assist researchers in generating novel knowledge artifacts (hypotheses, designs). By converting literature into a structured tensor and enabling agent-based interaction with this compact representation, the Discovery Engine offers a new paradigm for AI-augmented scientific inquiry and accelerated discovery. △ Less

Submitted 23 May, 2025; originally announced May 2025.

arXiv:2405.10243 [pdf, other]

DocuMint: Docstring Generation for Python using Small Language Models

Authors: Bibek Poudel, Adam Cook, Sekou Traore, Shelah Ameli

Abstract: Effective communication, specifically through documentation, is the beating heart of collaboration among contributors in software development. Recent advancements in language models (LMs) have enabled the introduction of a new type of actor in that ecosystem: LM-powered assistants capable of code generation, optimization, and maintenance. Our study investigates the efficacy of small language model… ▽ More Effective communication, specifically through documentation, is the beating heart of collaboration among contributors in software development. Recent advancements in language models (LMs) have enabled the introduction of a new type of actor in that ecosystem: LM-powered assistants capable of code generation, optimization, and maintenance. Our study investigates the efficacy of small language models (SLMs) for generating high-quality docstrings by assessing accuracy, conciseness, and clarity, benchmarking performance quantitatively through mathematical formulas and qualitatively through human evaluation using Likert scale. Further, we introduce DocuMint, as a large-scale supervised fine-tuning dataset with 100,000 samples. In quantitative experiments, Llama 3 8B achieved the best performance across all metrics, with conciseness and clarity scores of 0.605 and 64.88, respectively. However, under human evaluation, CodeGemma 7B achieved the highest overall score with an average of 8.3 out of 10 across all metrics. Fine-tuning the CodeGemma 2B model using the DocuMint dataset led to significant improvements in performance across all metrics, with gains of up to 22.5% in conciseness. The fine-tuned model and the dataset can be found in HuggingFace and the code can be found in the repository. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: 12 pages, 4 figures

arXiv:2311.00812 [pdf, other]

InfoGuard: A Design and Usability Study of User-Controlled Application-Independent Encryption for Privacy-Conscious Users

Authors: Tarun Yadav, Austin Cook, Justin Hales, Kent Seamons

Abstract: Billions of secure messaging users have adopted end-to-end encryption (E2EE). Nevertheless, challenges remain. Most communication applications do not provide E2EE, and application silos prevent interoperability. Our qualitative analysis of privacy-conscious users' discussions of E2EE on Reddit reveals concerns about trusting client applications with plaintext, lack of clear indicators about how en… ▽ More Billions of secure messaging users have adopted end-to-end encryption (E2EE). Nevertheless, challenges remain. Most communication applications do not provide E2EE, and application silos prevent interoperability. Our qualitative analysis of privacy-conscious users' discussions of E2EE on Reddit reveals concerns about trusting client applications with plaintext, lack of clear indicators about how encryption works, high cost to switch apps, and concerns that most apps are not open source. We propose InfoGuard, a system enabling E2EE for user-to-user communication in any application. InfoGuard allows users to trigger encryption on any textbox, even if the application does not support E2EE. InfoGuard encrypts text before it reaches the application, eliminating the client app's access to plaintext. InfoGuard also incorporates visible encryption to make it easier for users to understand that their data is being encrypted and give them greater confidence in the system's security. The design enables fine-grained encryption, allowing specific sensitive data items to be encrypted while the rest remains visible to the server. Participants in our user study found InfoGuard usable and trustworthy, expressing a willingness to adopt it. △ Less

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2308.03404 [pdf, other]

Applied metamodelling for ATM performance simulations

Authors: Christoffer Riis, Francisco N. Antunes, Tatjana Bolić, Gérald Gurtner, Andrew Cook, Carlos Lima Azevedo, Francisco Câmara Pereira

Abstract: The use of Air traffic management (ATM) simulators for planing and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels for supporting ATM decision-making. XALM efficiently uncovers hidden relatio… ▽ More The use of Air traffic management (ATM) simulators for planing and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels for supporting ATM decision-making. XALM efficiently uncovers hidden relationships among input and output variables in ATM simulators, those usually of interest in policy analysis. Our experiments show XALM's predictive performance comparable to the XGBoost metamodel with fewer simulations. Additionally, XALM exhibits superior explanatory capabilities compared to non-active learning metamodels. Using the `Mercury' (flight and passenger) ATM simulator, XALM is applied to a real-world scenario in Paris Charles de Gaulle airport, extending an arrival manager's range and scope by analysing six variables. This case study illustrates XALM's effectiveness in enhancing simulation interpretability and understanding variable interactions. By addressing computational challenges and improving explainability, XALM complements traditional simulation-based analyses. Lastly, we discuss two practical approaches for reducing the computational burden of the metamodelling further: we introduce a stopping criterion for active learning based on the inherent uncertainty of the metamodel, and we show how the simulations used for the metamodel can be reused across key performance indicators, thus decreasing the overall number of simulations needed. △ Less

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2303.18060 [pdf]

NOSTROMO: Lessons learned, conclusions and way forward

Authors: Mayte Cano, Andrés Perillo, Juan Antonio López, Faustino Tello, Javier Poveda, Francisco Câmara, Francisco Antunes, Christoffer Riis, Ian Crook, Abderrazak Tibichte, Sandrine Molton, David Mocholí, Ricardo Herranz, Gérald Gurtner, Tatjana Bolić, Andrew Cook, Jovana Kuljanin, Xavier Prats

Abstract: This White Paper sets out to explain the value that metamodelling can bring to air traffic management (ATM) research. It will define metamodelling and explore what it can, and cannot, do. The reader is assumed to have basic knowledge of SESAR: the Single European Sky ATM Research project. An important element of SESAR, as the technological pillar of the Single European Sky initiative, is to bring… ▽ More This White Paper sets out to explain the value that metamodelling can bring to air traffic management (ATM) research. It will define metamodelling and explore what it can, and cannot, do. The reader is assumed to have basic knowledge of SESAR: the Single European Sky ATM Research project. An important element of SESAR, as the technological pillar of the Single European Sky initiative, is to bring about improvements, as measured through specific key performance indicators (KPIs), and as implemented by a series of so-called SESAR 'Solutions'. These 'Solutions' are new or improved operational procedures or technologies, designed to meet operational and performance improvements described in the European ATM Master Plan. △ Less

Submitted 29 March, 2023; originally announced March 2023.

Comments: White Paper of the NOSTROMO, an exploratory research project funded by the SESAR Joint Undertaking (SJU) under the European Union's Horizon 2020 research and innovation programme

arXiv:2303.12814 [pdf, other]

Nowhere coexpanding functions

Authors: Andrew Cook, Andy Hammerlindl, Warwick Tucker

Abstract: We define a family of $C^1$ functions which we call "nowhere coexpanding functions" that is closed under composition and includes all $C^3$ functions with non-positive Schwarzian derivative. We establish results on the number and nature of the fixed points of these functions, including a generalisation of a classic result of Singer. We define a family of $C^1$ functions which we call "nowhere coexpanding functions" that is closed under composition and includes all $C^3$ functions with non-positive Schwarzian derivative. We establish results on the number and nature of the fixed points of these functions, including a generalisation of a classic result of Singer. △ Less

Submitted 13 September, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: 9 pages, 3 figures

MSC Class: 37E99 (Primary) 37N99 (Secondary)

arXiv:2107.07975 [pdf, other]

Joint Semi-supervised 3D Super-Resolution and Segmentation with Mixed Adversarial Gaussian Domain Adaptation

Authors: Nicolo Savioli, Antonio de Marvao, Wenjia Bai, Shuo Wang, Stuart A. Cook, Calvin W. L. Chin, Daniel Rueckert, Declan P. O'Regan

Abstract: Optimising the analysis of cardiac structure and function requires accurate 3D representations of shape and motion. However, techniques such as cardiac magnetic resonance imaging are conventionally limited to acquiring contiguous cross-sectional slices with low through-plane resolution and potential inter-slice spatial misalignment. Super-resolution in medical imaging aims to increase the resoluti… ▽ More Optimising the analysis of cardiac structure and function requires accurate 3D representations of shape and motion. However, techniques such as cardiac magnetic resonance imaging are conventionally limited to acquiring contiguous cross-sectional slices with low through-plane resolution and potential inter-slice spatial misalignment. Super-resolution in medical imaging aims to increase the resolution of images but is conventionally trained on features from low resolution datasets and does not super-resolve corresponding segmentations. Here we propose a semi-supervised multi-task generative adversarial network (Gemini-GAN) that performs joint super-resolution of the images and their labels using a ground truth of high resolution 3D cines and segmentations, while an unsupervised variational adversarial mixture autoencoder (V-AMA) is used for continuous domain adaptation. Our proposed approach is extensively evaluated on two transnational multi-ethnic populations of 1,331 and 205 adults respectively, delivering an improvement on state of the art methods in terms of Dice index, peak signal to noise ratio, and structural similarity index measure. This framework also exceeds the performance of state of the art generative domain adaptation models on external validation (Dice index 0.81 vs 0.74 for the left ventricle). This demonstrates how joint super-resolution and segmentation, trained on 3D ground-truth data with cross-domain generalization, enables robust precision phenotyping in diverse populations. △ Less

Submitted 16 July, 2021; originally announced July 2021.

arXiv:2004.10411 [pdf, other]

A NIS Directive compliant Cybersecurity Maturity Assessment Framework

Authors: George Drivas, Argyro Chatzopoulou, Leandros Maglaras, Costas Lambrinoudakis, Allan Cook, Helge Janicke

Abstract: The NIS Directive introduces obligations for the security of the network and information systems of operators of essential services and of digital service providers and require from the national competent authorities to assess their compliance to these obligations. This paper describes a novel cybersecurity maturity assessment framework (CMAF) that is tailored to the NIS Directive requirements and… ▽ More The NIS Directive introduces obligations for the security of the network and information systems of operators of essential services and of digital service providers and require from the national competent authorities to assess their compliance to these obligations. This paper describes a novel cybersecurity maturity assessment framework (CMAF) that is tailored to the NIS Directive requirements and can be used either as a self assessment tool from critical national infrastructures either as an audit tool from the National Competent Authorities for cybersecurity. △ Less

Submitted 22 April, 2020; originally announced April 2020.

Comments: 6 pages, 2 figures

arXiv:1907.12537

Towards Automatic Screening of Typical and Atypical Behaviors in Children With Autism

Authors: Andrew Cook, Bappaditya Mandal, Donna Berry, Matthew Johnson

Abstract: This paper has been withdrawn by the authors due to insufficient or definition error(s) in the ethics approval protocol. Autism spectrum disorders (ASD) impact the cognitive, social, communicative and behavioral abilities of an individual. The development of new clinical decision support systems is of importance in reducing the delay between presentation of symptoms and an accurate diagnosis. In… ▽ More This paper has been withdrawn by the authors due to insufficient or definition error(s) in the ethics approval protocol. Autism spectrum disorders (ASD) impact the cognitive, social, communicative and behavioral abilities of an individual. The development of new clinical decision support systems is of importance in reducing the delay between presentation of symptoms and an accurate diagnosis. In this work, we contribute a new database consisting of video clips of typical (normal) and atypical (such as hand flapping, spinning or rocking) behaviors, displayed in natural settings, which have been collected from the YouTube video website. We propose a preliminary non-intrusive approach based on skeleton keypoint identification using pretrained deep neural networks on human body video clips to extract features and perform body movement analysis that differentiates typical and atypical behaviors of children. Experimental results on the newly contributed database show that our platform performs best with decision tree as the classifier when compared to other popular methodologies and offers a baseline against which alternate approaches may developed and tested. △ Less

Submitted 17 September, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

Comments: This paper has been withdrawn by the authors due to insufficient or definition error(s) in the ethics approval protocol

arXiv:1907.00058 [pdf, other]

Explainable Anatomical Shape Analysis through Deep Hierarchical Generative Models

Authors: Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni, Wenjia Bai, Antonio de Marvao, Ozan Oktay, Christian Ledig, Loic Le Folgoc, Konstantinos Kamnitsas, Georgia Doumou, Jinming Duan, Sanjay K. Prasad, Stuart A. Cook, Declan P. O'Regan, Daniel Rueckert

Abstract: Quantification of anatomical shape changes currently relies on scalar global indexes which are largely insensitive to regional or asymmetric modifications. Accurate assessment of pathology-driven anatomical remodeling is a crucial step for the diagnosis and treatment of many conditions. Deep learning approaches have recently achieved wide success in the analysis of medical images, but they lack in… ▽ More Quantification of anatomical shape changes currently relies on scalar global indexes which are largely insensitive to regional or asymmetric modifications. Accurate assessment of pathology-driven anatomical remodeling is a crucial step for the diagnosis and treatment of many conditions. Deep learning approaches have recently achieved wide success in the analysis of medical images, but they lack interpretability in the feature extraction and decision processes. In this work, we propose a new interpretable deep learning model for shape analysis. In particular, we exploit deep generative networks to model a population of anatomical segmentations through a hierarchy of conditional latent variables. At the highest level of this hierarchy, a two-dimensional latent space is simultaneously optimised to discriminate distinct clinical conditions, enabling the direct visualisation of the classification space. Moreover, the anatomical variability encoded by this discriminative latent space can be visualised in the segmentation space thanks to the generative properties of the model, making the classification task transparent. This approach yielded high accuracy in the categorisation of healthy and remodelled left ventricles when tested on unseen segmentations from our own multi-centre dataset as well as in an external validation set, and on hippocampi from healthy controls and patients with Alzheimer's disease when tested on ADNI data. More importantly, it enabled the visualisation in three-dimensions of both global and regional anatomical features which better discriminate between the conditions under exam. The proposed approach scales effectively to large populations, facilitating high-throughput analysis of normal anatomy and pathology in large-scale studies of volumetric imaging. △ Less

Submitted 4 January, 2020; v1 submitted 28 June, 2019; originally announced July 2019.

Comments: Accepted for publication in IEEE Transactions on Medical Imaging (TMI)

arXiv:1906.10495 [pdf, ps, other]

Approximating Unitary Preparations of Orthogonal Black Box States

Authors: Joshua Alan Cook

Abstract: In this paper, I take a step toward answering the following question: for m different small circuits that compute m orthogonal n qubit states, is there a small circuit that will map m computational basis states to these m states without any input leaving any auxiliary bits changed. While this may seem simple, the constraint that auxiliary bits always be returned to 0 on any input (even ones beside… ▽ More In this paper, I take a step toward answering the following question: for m different small circuits that compute m orthogonal n qubit states, is there a small circuit that will map m computational basis states to these m states without any input leaving any auxiliary bits changed. While this may seem simple, the constraint that auxiliary bits always be returned to 0 on any input (even ones besides the m we care about) led me to use sophisticated techniques. I give an approximation of such a unitary in the m = 2 case that has size polynomial in the approximation error, and the number of qubits n. △ Less

Submitted 22 June, 2019; originally announced June 2019.

Comments: A Class project Paper for CS395T Quantum Complexity Theory at UT Austin in Spring 2019 under Scott Aaronson

arXiv:1904.01551 [pdf]

A Review of Critical Infrastructure Protection Approaches: Improving Security through Responsiveness to the Dynamic Modelling Landscape

Authors: Uchenna D Ani, Jeremy D McK. Watson, Jason R. C. Nurse, Al Cook, Carsten Maple

Abstract: As new technologies such as the Internet of Things (IoT) are integrated into Critical National Infrastructures (CNI), new cybersecurity threats emerge that require specific security solutions. Approaches used for analysis include the modelling and simulation of critical infrastructure systems using attributes, functionalities, operations, and behaviours to support various security analysis viewpoi… ▽ More As new technologies such as the Internet of Things (IoT) are integrated into Critical National Infrastructures (CNI), new cybersecurity threats emerge that require specific security solutions. Approaches used for analysis include the modelling and simulation of critical infrastructure systems using attributes, functionalities, operations, and behaviours to support various security analysis viewpoints, recognising and appropriately managing associated security risks. With several critical infrastructure protection approaches available, the question of how to effectively model the complex behaviour of interconnected CNI elements and to configure their protection as a system-of-systems remains a challenge. Using a systematic review approach, existing critical infrastructure protection approaches (tools and techniques) are examined to determine their suitability given trends like IoT, and effective security modelling and analysis issues. It is found that empirical-based, agent-based, system dynamics-based, and network-based modelling are more commonly applied than economic-based and equation-based techniques, and empirical-based modelling is the most widely used. The energy and transportation critical infrastructure sectors reflect the most responsive sectors, and no one Critical Infrastructure Protection (CIP) approach - tool, technique, methodology or framework -- provides a fit-for-all capacity for all-round attribute modelling and simulation of security risks. Typically, deciding factors for CIP choices to adopt are often dominated by trade-offs between complexity of use and popularity of approach, as well as between specificity and generality of application in sectors. △ Less

Submitted 2 April, 2019; originally announced April 2019.

Comments: PETRAS/IET Conference Living in the Internet of Things: Cybersecurity of the IoT 2019

arXiv:1902.11000 [pdf, other]

3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders

Authors: Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni, Antonio de Marvao, Stuart A. Cook, Declan P. O'Regan, Daniel Rueckert

Abstract: Accurate segmentation of heart structures imaged by cardiac MR is key for the quantitative analysis of pathology. High-resolution 3D MR sequences enable whole-heart structural imaging but are time-consuming, expensive to acquire and they often require long breath holds that are not suitable for patients. Consequently, multiplanar breath-hold 2D cine sequences are standard practice but are disadvan… ▽ More Accurate segmentation of heart structures imaged by cardiac MR is key for the quantitative analysis of pathology. High-resolution 3D MR sequences enable whole-heart structural imaging but are time-consuming, expensive to acquire and they often require long breath holds that are not suitable for patients. Consequently, multiplanar breath-hold 2D cine sequences are standard practice but are disadvantaged by lack of whole-heart coverage and low through-plane resolution. To address this, we propose a conditional variational autoencoder architecture able to learn a generative model of 3D high-resolution left ventricular (LV) segmentations which is conditioned on three 2D LV segmentations of one short-axis and two long-axis images. By only employing these three 2D segmentations, our model can efficiently reconstruct the 3D high-resolution LV segmentation of a subject. When evaluated on 400 unseen healthy volunteers, our model yielded an average Dice score of $87.92 \pm 0.15$ and outperformed competing architectures. △ Less

Submitted 28 February, 2019; originally announced February 2019.

Comments: Accepted in IEEE International Symposium on Biomedical Imaging (ISBI 2019)

arXiv:1811.04313 [pdf, ps, other]

Uniform, Integral and Feasible Proofs for the Determinant Identities

Authors: Iddo Tzameret, Stephen A. Cook

Abstract: Aiming to provide weak as possible axiomatic assumptions in which one can develop basic linear algebra, we give a uniform and integral version of the short propositional proofs for the determinant identities demonstrated over $GF(2)$ in Hrubes-Tzameret [SICOMP'15]. Specifically, we show that the multiplicativity of the determinant function and the Cayley-Hamilton theorem over the integers are prov… ▽ More Aiming to provide weak as possible axiomatic assumptions in which one can develop basic linear algebra, we give a uniform and integral version of the short propositional proofs for the determinant identities demonstrated over $GF(2)$ in Hrubes-Tzameret [SICOMP'15]. Specifically, we show that the multiplicativity of the determinant function and the Cayley-Hamilton theorem over the integers are provable in the bounded arithmetic theory $\mathbf{VNC}^2$; the latter is a first-order theory corresponding to the complexity class $\mathbf{NC}^2$ consisting of problems solvable by uniform families of polynomial-size circuits and $O(\log ^2 n)$-depth. This also establishes the existence of uniform polynomial-size $\mathbf{NC}^2$-Frege proofs of the basic determinant identities over the integers (previous propositional proofs hold only over the two element field). △ Less

Submitted 10 November, 2018; originally announced November 2018.

Comments: 76 pages

MSC Class: 03F20; 68Q17; 68Q15; 03F30 ACM Class: F.2.2; F.4.1

arXiv:1810.03382 [pdf, other]

doi 10.1038/s42256-019-0019-2

Deep learning cardiac motion analysis for human survival prediction

Authors: Ghalib A. Bello, Timothy J. W. Dawes, Jinming Duan, Carlo Biffi, Antonio de Marvao, Luke S. G. E. Howard, J. Simon R. Gibbs, Martin R. Wilkins, Stuart A. Cook, Daniel Rueckert, Declan P. O'Regan

Abstract: Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using… ▽ More Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model C=0.73 (95$\%$ CI: 0.68 - 0.78) than the human benchmark of C=0.59 (95$\%$ CI: 0.53 - 0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival. △ Less

Submitted 8 October, 2018; originally announced October 2018.

Journal ref: Nature Machine Intelligence, 1, 95-104 (2019)

arXiv:1711.00525 [pdf, other]

Internet of Cloud: Security and Privacy issues

Authors: Allan Cook, Michael Robinson, Mohamed Amine Ferrag, Leandros A. Maglaras, Ying He, Kevin Jones, Helge Janicke

Abstract: The synergy between the cloud and the IoT has emerged largely due to the cloud having attributes which directly benefit the IoT and enable its continued growth. IoT adopting Cloud services has brought new security challenges. In this book chapter, we pursue two main goals: 1) to analyse the different components of Cloud computing and the IoT and 2) to present security and privacy problems that the… ▽ More The synergy between the cloud and the IoT has emerged largely due to the cloud having attributes which directly benefit the IoT and enable its continued growth. IoT adopting Cloud services has brought new security challenges. In this book chapter, we pursue two main goals: 1) to analyse the different components of Cloud computing and the IoT and 2) to present security and privacy problems that these systems face. We thoroughly investigate current security and privacy preservation solutions that exist in this area, with an eye on the Industrial Internet of Things, discuss open issues and propose future directions △ Less

Submitted 1 November, 2017; originally announced November 2017.

Comments: 27 pages, 4 figures

arXiv:1701.05141 [pdf, other]

The Medial Axis of a Multi-Layered Environment and its Application as a Navigation Mesh

Authors: Wouter van Toll, Atlas F. Cook IV, Marc J. van Kreveld, Roland Geraerts

Abstract: Path planning for walking characters in complicated virtual environments is a fundamental task in simulations and games. A navigation mesh is a data structure that allows efficient path planning. The Explicit Corridor Map (ECM) is a navigation mesh based on the medial axis. It enables path planning for disk-shaped characters of any radius. In this paper, we formally extend the medial axis (and t… ▽ More Path planning for walking characters in complicated virtual environments is a fundamental task in simulations and games. A navigation mesh is a data structure that allows efficient path planning. The Explicit Corridor Map (ECM) is a navigation mesh based on the medial axis. It enables path planning for disk-shaped characters of any radius. In this paper, we formally extend the medial axis (and therefore the ECM) to 3D environments in which characters are constrained to walkable surfaces. Typical examples of such environments are multi-storey buildings, train stations, and sports stadiums. We give improved definitions of a walkable environment (WE: a description of walkable surfaces in 3D) and a multi-layered environment (MLE: a subdivision of a WE into connected layers). We define the medial axis of such environments based on projected distances on the ground plane. For an MLE with $n$ boundary vertices and $k$ connections, we show that the medial axis has size $O(n)$, and we present an improved algorithm that constructs the medial axis in $O(n \log n \log k)$ time. The medial axis can be annotated with nearest-obstacle information to obtain the ECM navigation mesh. Our implementations show that the ECM can be computed efficiently for large 2D and multi-layered environments, and that it can be used to compute paths within milliseconds. This enables simulations of large virtual crowds of heterogeneous characters in real-time. △ Less

Submitted 26 July, 2017; v1 submitted 18 January, 2017; originally announced January 2017.

Comments: 34 pages

arXiv:1208.2721 [pdf, ps, other]

The Complexity of the Comparator Circuit Value Problem

Authors: Stephen A. Cook, Yuval Filmus, Dai Tri Man Le

Abstract: In 1990 Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem (CCV). He and Mayr showed that NL \subseteq CC \subseteq P, and proved that in addition to CCV several other problems are complete for CC, including the stable marriage problem, and finding the lexicographically first maximal matching in a bipartite graph. We are i… ▽ More In 1990 Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem (CCV). He and Mayr showed that NL \subseteq CC \subseteq P, and proved that in addition to CCV several other problems are complete for CC, including the stable marriage problem, and finding the lexicographically first maximal matching in a bipartite graph. We are interested in CC because we conjecture that it is incomparable with the parallel class NC which also satisfies NL \subseteq NC \subseteq P, and note that this conjecture implies that none of the CC-complete problems has an efficient polylog time parallel algorithm. We provide evidence for our conjecture by giving oracle settings in which relativized CC and relativized NC are incomparable. We give several alternative definitions of CC, including (among others) the class of problems computed by uniform polynomial-size families of comparator circuits supplied with copies of the input and its negation, the class of problems AC^0-reducible to CCV, and the class of problems computed by uniform AC^0 circuits with CCV gates. We also give a machine model for CC, which corresponds to its characterization as log-space uniform polynomial-size families of comparator circuits. These various characterizations show that CC is a robust class. The main technical tool we employ is universal comparator circuits. Other results include a simpler proof of NL \subseteq CC, and an explanation of the relation between the Gale-Shapley algorithm and Subramanian's algorithm for stable marriage. This paper continues the previous work of Cook, Lê and Ye which focused on Cook-Nguyen style uniform proof complexity, answering several open questions raised in that paper. △ Less

Submitted 25 July, 2013; v1 submitted 13 August, 2012; originally announced August 2012.

Comments: continues the previous work of Cook, Le and Ye [arXiv:1106.4142]

arXiv:1106.4142 [pdf, ps, other]

Complexity Classes and Theories for the Comparator Circuit Value Problem

Authors: Stephen A. Cook, Dai Tri Man Le, Yuli Ye

Abstract: Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem. He proved that several other problems are complete for CC, including the stable marriage problem, and finding the lexicographical first maximal matching in a bipartite graph. We suggest alternative definitions of CC based on different reducibilities and introduce a two-so… ▽ More Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem. He proved that several other problems are complete for CC, including the stable marriage problem, and finding the lexicographical first maximal matching in a bipartite graph. We suggest alternative definitions of CC based on different reducibilities and introduce a two-sorted theory VCC* based on one of them. We sharpen and simplify Subramanian's completeness proofs for the above two problems and formalize them in VCC*. △ Less

Submitted 4 October, 2011; v1 submitted 21 June, 2011; originally announced June 2011.

Comments: This version was substantially rewritten, where several proofs were rewritten and many typos and mistakes were fixed. A shorter version of this paper appeared in CSL 2011

arXiv:1103.5215 [pdf, ps, other]

doi 10.2168/LMCS-8(3:5)2012

Formalizing Randomized Matching Algorithms

Authors: Dai Tri Man Le, Stephen A. Cook

Abstract: Using Jeřábek 's framework for probabilistic reasoning, we formalize the correctness of two fundamental RNC^2 algorithms for bipartite perfect matching within the theory VPV for polytime reasoning. The first algorithm is for testing if a bipartite graph has a perfect matching, and is based on the Schwartz-Zippel Lemma for polynomial identity testing applied to the Edmonds polynomial of the graph.… ▽ More Using Jeřábek 's framework for probabilistic reasoning, we formalize the correctness of two fundamental RNC^2 algorithms for bipartite perfect matching within the theory VPV for polytime reasoning. The first algorithm is for testing if a bipartite graph has a perfect matching, and is based on the Schwartz-Zippel Lemma for polynomial identity testing applied to the Edmonds polynomial of the graph. The second algorithm, due to Mulmuley, Vazirani and Vazirani, is for finding a perfect matching, where the key ingredient of this algorithm is the Isolating Lemma. △ Less

Submitted 9 August, 2012; v1 submitted 27 March, 2011; originally announced March 2011.

ACM Class: F.2.2, F.4.1

Journal ref: Logical Methods in Computer Science, Volume 8, Issue 3 (August 10, 2012) lmcs:973

arXiv:1103.2865 [pdf, ps, other]

Computing the Fréchet Distance Between Folded Polygons

Authors: Atlas F. Cook IV, Anne Driemel, Sariel Har-Peled, Jessica Sherette, Carola Wenk

Abstract: Computing the Fréchet distance for surfaces is a surprisingly hard problem and the only known algorithm is limited to computing it between flat surfaces. We adapt this algorithm to create one for computing the Fréchet distance for a class of non-flat surfaces which we call folded polygons. Unfortunately, the original algorithm cannot be extended directly. We present three different methods to adap… ▽ More Computing the Fréchet distance for surfaces is a surprisingly hard problem and the only known algorithm is limited to computing it between flat surfaces. We adapt this algorithm to create one for computing the Fréchet distance for a class of non-flat surfaces which we call folded polygons. Unfortunately, the original algorithm cannot be extended directly. We present three different methods to adapt it. The first of which is a fixed-parameter tractable algorithm. The second is a polynomial-time approximation algorithm. Finally, we present a restricted class of folded polygons for which we can compute the Fréchet distance in polynomial time. △ Less

Submitted 15 March, 2011; originally announced March 2011.

Comments: 19 pages, 10 figures. Submitted to WADS 2011

arXiv:1101.1449 [pdf, ps, other]

doi 10.2168/LMCS-8(1:25)2012

Formal Theories for Linear Algebra

Authors: Stephen A Cook, Lila A Fontes

Abstract: We introduce two-sorted theories in the style of [CN10] for the complexity classes \oplusL and DET, whose complete problems include determinants over Z2 and Z, respectively. We then describe interpretations of Soltys' linear algebra theory LAp over arbitrary integral domains, into each of our new theories. The result shows equivalences of standard theorems of linear algebra over Z2 and Z can be p… ▽ More We introduce two-sorted theories in the style of [CN10] for the complexity classes \oplusL and DET, whose complete problems include determinants over Z2 and Z, respectively. We then describe interpretations of Soltys' linear algebra theory LAp over arbitrary integral domains, into each of our new theories. The result shows equivalences of standard theorems of linear algebra over Z2 and Z can be proved in the corresponding theory, but leaves open the interesting question of whether the theorems themselves can be proved. △ Less

Submitted 16 March, 2012; v1 submitted 7 January, 2011; originally announced January 2011.

Comments: This is a revised journal version of the paper "Formal Theories for Linear Algebra" (Computer Science Logic) for the journal Logical Methods in Computer Science

ACM Class: F.4.0

Journal ref: Logical Methods in Computer Science, Volume 8, Issue 1 (March 16, 2012) lmcs:716

arXiv:0802.2846 [pdf, ps, other]

Geodesic Fréchet Distance Inside a Simple Polygon

Authors: Atlas F. Cook IV, Carola Wenk

Abstract: We unveil an alluring alternative to parametric search that applies to both the non-geodesic and geodesic Fréchet optimization problems. This randomized approach is based on a variant of red-blue intersections and is appealing due to its elegance and practical efficiency when compared to parametric search. We present the first algorithm for the geodesic Fréchet distance between two polygonal cur… ▽ More We unveil an alluring alternative to parametric search that applies to both the non-geodesic and geodesic Fréchet optimization problems. This randomized approach is based on a variant of red-blue intersections and is appealing due to its elegance and practical efficiency when compared to parametric search. We present the first algorithm for the geodesic Fréchet distance between two polygonal curves $A$ and $B$ inside a simple bounding polygon $P$. The geodesic Fréchet decision problem is solved almost as fast as its non-geodesic sibling and requires $O(N^{2\log k)$ time and $O(k+N)$ space after $O(k)$ preprocessing, where $N$ is the larger of the complexities of $A$ and $B$ and $k$ is the complexity of $P$. The geodesic Fréchet optimization problem is solved by a randomized approach in $O(k+N^{2\log kN\log N)$ expected time and $O(k+N^{2)$ space. This runtime is only a logarithmic factor larger than the standard non-geodesic Fréchet algorithm (Alt and Godau 1995). Results are also presented for the geodesic Fréchet distance in a polygonal domain with obstacles and the geodesic Hausdorff distance for sets of points or sets of line segments inside a simple polygon $P$. △ Less

Submitted 20 February, 2008; originally announced February 2008.

Journal ref: Dans Proceedings of the 25th Annual Symposium on the Theoretical Aspects of Computer Science - STACS 2008, Bordeaux : France (2008)

Showing 1–23 of 23 results for author: Cook, A