Search | arXiv e-print repository

TRIP: Terrain Traversability Mapping With Risk-Aware Prediction for Enhanced Online Quadrupedal Robot Navigation

Authors: Minho Oh, Byeongho Yu, I Made Aswin Nahrendra, Seoyeon Jang, Hyeonwoo Lee, Dongkyu Lee, Seungjae Lee, Yeeun Kim, Marsim Kevin Christiansen, Hyungtae Lim, Hyun Myung

Abstract: Accurate traversability estimation using an online dense terrain map is crucial for safe navigation in challenging environments like construction and disaster areas. However, traversability estimation for legged robots on rough terrains faces substantial challenges owing to limited terrain information caused by restricted field-of-view, and data occlusion and sparsity. To robustly map traversable regions, we introduce terrain traversability mapping with risk-aware prediction (TRIP). TRIP reconstructs the terrain maps while predicting multi-modal traversability risks, enhancing online autonomous navigation with the following contributions. Firstly, estimating steppability in a spherical projection space allows for addressing data sparsity while accommodating scalable terrain properties. Moreover, the proposed traversability-aware Bayesian generalized kernel (T-BGK)-based inference method enhances terrain completion accuracy and efficiency. Lastly, leveraging the steppability-based Mahalanobis distance contributes to robustness against outliers and dynamic elements, ultimately yielding a static terrain traversability map. As verified in both public and our in-house datasets, our TRIP shows significant performance increases in terms of terrain reconstruction and navigation map. A demo video that demonstrates its feasibility as an integral component within an onboard online autonomous navigation system for quadruped robots is available at https://youtu.be/d7HlqAP4l0c.

Submitted 26 November, 2024; originally announced November 2024.

arXiv:2308.15096 [pdf, ps, other]

How Faithful are Self-Explainable GNNs?

Authors: Marc Christiansen, Lea Villadsen, Zhiqiang Zhong, Stefano Teso, Davide Mottin

Abstract: Self-explainable deep neural networks are a recent class of models that can output ante-hoc local explanations that are faithful to the model's reasoning, and as such represent a step forward toward filling the gap between expressiveness and interpretability. Self-explainable graph neural networks (GNNs) aim at achieving the same in the context of graph data. This begs the question: do these models fulfill their implicit guarantees in terms of faithfulness? In this extended abstract, we analyze the faithfulness of several self-explainable GNNs using different measures of faithfulness, identify several limitations -- both in the models themselves and in the evaluation metrics -- and outline possible ways forward.

Submitted 29 August, 2023; originally announced August 2023.

arXiv:2208.09273 [pdf, other]

doi 10.1063/5.0121748

Atomistic structure search using local surrogate model

Authors: Nikolaj Rønne, Mads-Peter V. Christiansen, Andreas Møller Slavensky, Zeyuan Tang, Florian Brix, Mikkel Elkjær Pedersen, Malthe Kjær Bisbo, Bjørk Hammer

Abstract: We describe a local surrogate model for use in conjunction with global structure search methods. The model follows the Gaussian approximation potential (GAP) formalism and is based on the smooth overlap of atomic positions descriptor with sparsification in terms of a reduced number of local environments using mini-batch $k$-means. The model is implemented in the Atomistic Global Optimization X framework and used as a partial replacement of the local relaxations in basin hopping structure search. The approach is shown to be robust for a wide range of atomistic systems including molecules, nano-particles, surface supported clusters and surface thin films. The benefits in a structure search context of a local surrogate model are demonstrated. This includes the ability to transfer learning from smaller systems as well as the possibility to perform concurrent multi-stoichiometry searches.

Submitted 19 August, 2022; originally announced August 2022.

Comments: 12 pages, 11 figures

Journal ref: J. Chem. Phys. 157, 174115 (2022)

arXiv:2107.05007 [pdf, other]

Generating stable molecules using imitation and reinforcement learning

Authors: Søren Ager Meldgaard, Jonas Köhler, Henrik Lund Mortensen, Mads-Peter V. Christiansen, Frank Noé, Bjørk Hammer

Abstract: Chemical space is routinely explored by machine learning methods to discover interesting molecules, before time-consuming experimental synthesis is attempted. However, these methods often rely on a graph representation, ignoring 3D information necessary for determining the stability of the molecules. We propose a reinforcement learning approach for generating molecules in Cartesian coordinates allowing for quantum chemical prediction of the stability. To improve sample-efficiency we learn basic chemical rules from imitation learning on the GDB-11 database to create an initial model applicable for all stoichiometries. We then deploy multiple copies of the model conditioned on a specific stoichiometry in a reinforcement learning setting. The models correctly identify low energy molecules in the database and produce novel isomers not found in the training set. Finally, we apply the model to larger molecules to show how reinforcement learning further refines the imitation learning model in domains far from the training data.

Submitted 11 July, 2021; originally announced July 2021.

arXiv:2007.07523 [pdf, other]

doi 10.1103/PhysRevB.102.075427

Atomistic Structure Learning Algorithm with surrogate energy model relaxation

Authors: Henrik Lund Mortensen, Søren Ager Meldgaard, Malthe Kjær Bisbo, Mads-Peter V. Christiansen, Bjørk Hammer

Abstract: The recently proposed Atomistic Structure Learning Algorithm (ASLA) builds on neural network enabled image recognition and reinforcement learning. It enables fully autonomous structure determination when used in combination with a first-principles total energy calculator, e.g. a density functional theory (DFT) program. To save on the computational requirements, ASLA utilizes the DFT program in a single-point mode, i.e. without allowing for relaxation of the structural candidates according to the force information at the DFT level. In this work, we augment ASLA to establish a surrogate energy model concurrently with its structure search. This enables approximative but computationally cheap relaxation of the structural candidates before the single-point energy evaluation with the computationally expensive DFT program. We demonstrate a significantly increased performance of ASLA for building benzene while utilizing a surrogate energy landscape. Further we apply this model-enhanced ASLA in a thorough investigation of the c(4x8) phase of the Ag(111) surface oxide. ASLA successfully identifies a surface reconstruction which has previously only been guessed on the basis of scanning tunnelling microscopy images.

Submitted 15 July, 2020; originally announced July 2020.

Journal ref: Phys. Rev. B 102, 075427 (2020)

arXiv:2005.03521 [pdf, other]

The Danish Gigaword Project

Authors: Leon Strømberg-Derczynski, Manuel R. Ciosici, Rebekah Baglini, Morten H. Christiansen, Jacob Aarup Dalsgaard, Riccardo Fusaroli, Peter Juel Henrichsen, Rasmus Hvingelby, Andreas Kirkedal, Alex Speed Kjeldsen, Claus Ladefoged, Finn Årup Nielsen, Malte Lau Petersen, Jonathan Hvithamar Rystrøm, Daniel Varab

Abstract: Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion word corpus of Danish text. The Danish Gigaword corpus covers a wide array of time periods, domains, speakers' socio-economic status, and Danish dialects.

Submitted 12 May, 2021; v1 submitted 7 May, 2020; originally announced May 2020.

Comments: Identical to the NoDaLiDa 2021 version

arXiv:1908.06629 [pdf, other]

doi 10.53482/2022_52_397

Memory limitations are hidden in grammar

Authors: Carlos Gómez-Rodríguez, Morten H. Christiansen, Ramon Ferrer-i-Cancho

Abstract: The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent of non-linguistic cognitive constraints.

Submitted 5 April, 2022; v1 submitted 19 August, 2019; originally announced August 2019.

Comments: Improved with reviewer feedback once again. In press in Glottometrics

Journal ref: Glottometrics (2022) 52, 39-64

arXiv:1906.05111 [pdf, other]

doi 10.13140/RG.2.2.10989.03042/1

Co-modelling of Agricultural Robotic Systems

Authors: Martin Peter Christiansen

Abstract: Automated and robotic ground-vehicle solutions are gradually becoming part of the agricultural industry, where they are used for performing tasks such as feeding, herding, planting, harvesting, and weed spraying. Agricultural machinery operates in both indoor and outdoor farm environments, resulting in changing operational conditions. Variation in the load transported by ground-vehicles is a common occurrence in the agricultural domain, in tasks such as animal feeding and field spraying. The development of automated and robotic ground-vehicle solutions for conditions and scenarios in the agricultural domain is a complex task, which requires input from multiple engineering disciplines. This PhD thesis proposes modelling and simulation for the research and development of automated and robotic ground-vehicle solutions for purposes such as component development, virtual prototype testing, and scenario evaluation. The collaboration of multiple engineering disciplines is achieved by combining multiple modelling and simulation tools from different engineering disciplines. These combined models are known as co-models and their execution is referred to as co-simulation. The results of this thesis are a model-based development methodology for automated and robotic ground-vehicles utilised for a number of research and development cases. The co-models of the automated and robotic ground vehicles were created using the model-based development methodology, and they contribute to the future development support in this research domain.
The thesis presents four contributions toward the exploration of a chosen design space for an automated or robotic ground vehicle. Solutions obtained using co-modelling and co-simulation are deployed to their ground-vehicle realisations, which ensures that all stages of development are covered.

Submitted 17 June, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: PhD thesis, June 17th 2015, 203 pages, Aarhus University

arXiv:1809.01652 [pdf]

Current potentials and challenges using Sentinel-1 for broadacre field remote sensing

Authors: Martin Peter Christiansen, Morten Stigaard Laursen, Birgitte Feld Mikkelsen, Nima Teimouri, Rasmus Nyholm Jørgensen, Claus Aage Grøn Sørensen

Abstract: ESA operates the Sentinel-1 satellites, which provide Synthetic Aperture Radar (SAR) data of Earth. Recorded Sentinel-1 data have shown a potential for remotely observing and monitoring local conditions on broadacre fields. Remote sensing using Sentinel-1 has the potential to provide daily updates on the current conditions in the individual fields and at the same time give an overview of the agricultural areas in the region. Research depends on the ability to independently validate the presented results. In the case of the Sentinel-1 satellites, every researcher has access to the same base dataset, and therefore independent validation is possible. Well-documented research performed with Sentinel-1 allows other researchers to redo the experiments and either validate or falsify the presented findings. Based on current state-of-the-art research, we have chosen to provide a service for researchers in the agricultural domain. The service allows researchers to monitor local conditions by using the Sentinel-1 information combined with a priori knowledge from broadacre fields. Correlating processed Sentinel-1 data to the actual conditions is still a task the individual researchers must perform to benefit from the service. In this paper, we present our methodology for translating Sentinel-1 data to a level that is more accessible to researchers in the agricultural field. The goal here was to make the data more easily available, so the primary focus can be on correlating and comparing to measurements collected in the broadacre fields.
We illustrate the value of the service with three examples of the possible application areas. The presented application examples are all based on Denmark, where we have processed all Sentinel-1 scans since 2016.

Submitted 4 September, 2018; originally announced September 2018.

Comments: 9 pages, 5 figures, conference (AGENG2018)

Journal ref: EurAgEng 2018

arXiv:1805.01426 [pdf]

Ground vehicle mapping of fields using LiDAR to enable prediction of crop biomass

Authors: Martin Peter Christiansen, Morten Stigaard Laursen, Rasmus Nyholm Jørgensen, Søren Skovsen, René Gislum

Abstract: Mapping field environments into point clouds using a 3D LiDAR has the potential to become a new approach for online estimation of crop biomass in the field. The estimation of crop biomass in agriculture is expected to be closely correlated to canopy heights. The work presented in this paper contributes to the mapping and textural analysis of agricultural fields. Crop and environmental state information can be used to tailor treatments to the specific site. This paper presents the current results with our ground vehicle LiDAR mapping systems for broadacre crop fields. The proposed vehicle system and method facilitate LiDAR recordings in an experimental winter wheat field. LiDAR data are combined with data from Global Navigation Satellite System (GNSS) and Inertial Measurement Unit (IMU) sensors to conduct environment mapping for point clouds. The sensory data from the vehicle are recorded, mapped, and analyzed using the functionalities of the Robot Operating System (ROS) and the Point Cloud Library (PCL). In this experiment, winter wheat (Triticum aestivum L.) in field plots was mapped using 3D point clouds with a point density on the centimeter level. The purpose of the experiment was to create 3D LiDAR point clouds of the field plots enabling canopy volume and textural analysis to discriminate different crop treatments. Estimated crop volumes ranging from 3500 to 6200 m³ per hectare are correlated to the manually collected samples of cut biomass extracted from the experimental field.

Submitted 3 May, 2018; originally announced May 2018.

Comments: 9 pages, 6 figures, conference (ICPA 2018)

Journal ref: 14th International Conference on Precision Agriculture 2018

arXiv:1802.06299 [pdf]

Robotic design choice overview using co-simulation

Authors: Martin Peter Christiansen, Peter Gorm Larsen, Rasmus Nyholm Jørgensen

Abstract: Rapid robotic system development sets a demand for multi-disciplinary methods and tools to explore and compare design alternatives. In this paper, we present collaborative modeling that combines discrete-event models of controller software with continuous-time models of physical robot components. The presented co-modeling method utilized VDM for discrete-event and 20-sim for continuous-time modeling. The collaborative modeling method is illustrated with a concrete example of collaborative model development of a mobile robot animal feeding system. Simulations are used to evaluate the robot model output response in relation to operational demands. The result of the simulations provides the developers with an overview of the impacts of each solution instance in the chosen design space. Based on the solution overview the developers can select candidates that are deemed viable to be deployed and tested on an actual physical robot.

Submitted 17 February, 2018; originally announced February 2018.

Comments: 5 pages, 4 figures, conference

Report number: Agromek and NJF joint seminar (2014), 41-45

arXiv:1802.06296 [pdf]

Collaborative model based design of automated and robotic agricultural vehicles in the Crescendo Tool

Authors: Martin Peter Christiansen, Morten Stiggaard Laursen, Rasmus Nyholm Jørgensen, Ibrahim A. Hameed

Abstract: This paper describes a collaborative modelling approach to automated and robotic agricultural vehicle design. The Crescendo technology allows engineers from different disciplines to collaborate and produce system models. The combined models are called co-models and their execution co-simulation. To support future development efforts, a template library of different vehicle and controller types is provided. This paper describes a methodology for developing co-models from initial problem definition to deployment of the actual system. We illustrate the development methodology with an example development case from the agricultural domain. The case relates to an encountered speed controller problem on a differential driven vehicle, where we iterate through different candidate solutions and end up with an adaptive controller solution based on a combination of classical control and learning feedforward. The second case is an example of combining human control interface and co-simulation of agricultural robotic operation to illustrate collaborative development.

Submitted 17 February, 2018; originally announced February 2018.

Journal ref: Agromek and NJF joint seminar (2014), 36-41

arXiv:1801.09021 [pdf, other]

doi 10.1109/TIT.2018.2879477

A Characterization of Guesswork on Swiftly Tilting Curves

Authors: Ahmad Beirami, Robert Calderbank, Mark Christiansen, Ken Duffy, Muriel Médard

Abstract: Given a collection of strings, each with an associated probability of occurrence, the guesswork of each of them is their position in a list ordered from most likely to least likely, breaking ties arbitrarily. Guesswork is central to several applications in information theory: Average guesswork provides a lower bound on the expected computational cost of a sequential decoder to decode successfully the transmitted message; the complementary cumulative distribution function of guesswork gives the error probability in list decoding; the logarithm of guesswork is the number of bits needed in optimal lossless one-to-one source coding; and guesswork is the number of trials required of an adversary to breach a password protected system in a brute-force attack. In this paper, we consider memoryless string-sources that generate strings consisting of i.i.d. characters drawn from a finite alphabet, and characterize their corresponding guesswork. Our main tool is the tilt operation. We show that the tilt operation on a memoryless string-source parametrizes an exponential family of memoryless string-sources, which we refer to as the tilted family. We provide an operational meaning to the tilted families by proving that two memoryless string-sources result in the same guesswork on all strings of all lengths if and only if their respective categorical distributions belong to the same tilted family.
Establishing some general properties of the tilt operation, we generalize the notions of weakly typical set and asymptotic equipartition property to tilted weakly typical sets of different orders. We use this new definition to characterize the large deviations for all atypical strings and characterize the volume of weakly typical sets of different orders. We subsequently build on this characterization to prove large deviation bounds on guesswork and provide an accurate approximation of its PMF.

Submitted 31 October, 2018; v1 submitted 26 January, 2018; originally announced January 2018.

Comments: Accepted for publication in IEEE Trans. Inf. Theory

Journal ref: IEEE Transactions on Information Theory, 65 (5), 2850-2871, 2019
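
The definition of guesswork quoted in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function names `guesswork` and `memoryless_source`, the two-letter alphabet, and the character probabilities are all hypothetical choices made for illustration, and ties are broken by input order as one arbitrary tie-breaking rule.

```python
from itertools import product


def memoryless_source(char_probs, length):
    """Probability of every string of the given length with i.i.d. characters.

    char_probs is a hypothetical categorical distribution over a finite alphabet.
    """
    strings = {}
    for chars in product(char_probs, repeat=length):
        p = 1.0
        for c in chars:
            p *= char_probs[c]
        strings["".join(chars)] = p
    return strings


def guesswork(string_probs):
    """Rank each string (1-based) in a list ordered from most to least likely.

    sorted() is stable, so equally likely strings keep their input order,
    which is one arbitrary way of breaking ties.
    """
    ordered = sorted(string_probs, key=lambda s: -string_probs[s])
    return {s: rank for rank, s in enumerate(ordered, start=1)}


# Illustrative source: strings of length 2 over the alphabet {a, b}.
source = memoryless_source({"a": 0.7, "b": 0.3}, 2)
ranks = guesswork(source)

# Average guesswork: expected number of guesses when guessing in ranked order.
average_guesswork = sum(source[s] * ranks[s] for s in source)
```

Here `ranks` assigns the most likely string the value 1 and the least likely string the value 4, and `average_guesswork` is the expectation of the rank under the source distribution.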

arXiv:1704.00820 [pdf, ps, other]

Principal Inertia Components and Applications

Authors: Flavio P. Calmon, Ali Makhdoumi, Muriel Médard, Mayank Varia, Mark Christiansen, Ken R. Duffy

Abstract: We explore properties and applications of the Principal Inertia Components (PICs) between two discrete random variables $X$ and $Y$. The PICs lie in the intersection of information and estimation theory, and provide a fine-grained decomposition of the dependence between $X$ and $Y$. Moreover, the PICs describe which functions of $X$ can or cannot be reliably inferred (in terms of MMSE) given an observation of $Y$. We demonstrate that the PICs play an important role in information theory, and they can be used to characterize information-theoretic limits of certain estimation problems. In privacy settings, we prove that the PICs are related to fundamental limits of perfect privacy.

Submitted 3 April, 2017; originally announced April 2017.

Comments: Overlaps with arXiv:1405.1472 and arXiv:1310.1512

arXiv:1503.08513 [pdf, ps, other]

Hiding Symbols and Functions: New Metrics and Constructions for Information-Theoretic Security

Authors: Flavio du Pin Calmon, Muriel Médard, Mayank Varia, Ken R. Duffy, Mark M. Christiansen, Linda M. Zeger

Abstract: We present information-theoretic definitions and results for analyzing symmetric-key encryption schemes beyond the perfect secrecy regime, i.e. when perfect secrecy is not attained. We adopt two lines of analysis, one based on lossless source coding, and another akin to rate-distortion theory. We start by presenting a new information-theoretic metric for security, called symbol secrecy, and derive associated fundamental bounds. We then introduce list-source codes (LSCs), which are a general framework for mapping a key length (entropy) to a list size that an eavesdropper has to resolve in order to recover a secret message. We provide explicit constructions of LSCs, and demonstrate that, when the source is uniformly distributed, the highest level of symbol secrecy for a fixed key length can be achieved through a construction based on minimum-distance separable (MDS) codes. Using an analysis related to rate-distortion theory, we then show how symbol secrecy can be used to determine the probability that an eavesdropper correctly reconstructs functions of the original plaintext. We illustrate how these bounds can be applied to characterize security properties of symmetric-key encryption schemes, and, in particular, extend security claims based on symbol secrecy to a functional setting.

Submitted 29 March, 2015; originally announced March 2015.

Comments: Submitted to IEEE Transactions on Information Theory

arXiv:1405.5024 [pdf, other]

doi 10.1109/TIT.2015.2482972

Multi-user guesswork and brute force security

Authors: Mark M. Christiansen, Ken R. Duffy, Flavio du Pin Calmon, Muriel Medard

Abstract: The Guesswork problem was originally motivated by a desire to quantify computational security for single user systems. Leveraging recent results from its analysis, we extend the remit and utility of the framework to the quantification of the computational security for multi-user systems. In particular, assume that $V$ users independently select strings stochastically from a finite, but potentially large, list. An inquisitor who does not know which strings have been selected wishes to identify $U$ of them. The inquisitor knows the selection probabilities of each user and is equipped with a method that enables the testing of each (user, string) pair, one at a time, for whether that string had been selected by that user.
Here we establish that, unless $U=V$, there is no general strategy that minimizes the distribution of the number of guesses, but in the asymptote as the strings become long we prove the following: by construction, there is an asymptotically optimal class of strategies; the number of guesses required in an asymptotically optimal strategy satisfies a large deviation principle with a rate function, which is not necessarily convex, that can be determined from the rate functions of optimally guessing individual users' strings; if all users' selection statistics are identical, the exponential growth rate of the average guesswork as the string-length increases is determined by the specific Rényi entropy of the string-source with parameter $(V-U+1)/(V-U+2)$, generalizing the known $V=U=1$ case; and that the Shannon entropy of the source is a lower bound on the average guesswork growth rate for all $U$ and $V$, thus providing a bound on computational security for multi-user systems. Examples are presented to illustrate these results and their ramifications for systems design.

Submitted 3 August, 2017; v1 submitted 20 May, 2014; originally announced May 2014.

Journal ref: IEEE Transactions on Information Theory, 61 (12), 6876-6886 (2015)

arXiv:1311.1053 [pdf, other]

Guessing a password over a wireless channel (on the effect of noise non-uniformity)

Authors: Mark M. Christiansen, Ken R. Duffy, Flavio du Pin Calmon, Muriel Medard

Abstract: A string is sent over a noisy channel that erases some of its characters. Knowing the statistical properties of the string's source and which characters were erased, a listener that is equipped with an ability to test the veracity of a string, one string at a time, wishes to fill in the missing pieces. Here we characterize the influence of the stochastic properties of both the string's source and the noise on the channel on the distribution of the number of attempts required to identify the string, its guesswork. In particular, we establish that the average noise on the channel is not a determining factor for the average guesswork and illustrate simple settings where one recipient with, on average, a better channel than another recipient, has higher average guesswork. These results stand in contrast to those for the capacity of wiretap channels and suggest the use of techniques such as friendly jamming with pseudo-random sequences to exploit this guesswork behavior.

Submitted 26 November, 2013; v1 submitted 5 November, 2013; originally announced November 2013.

Comments: Asilomar Conference on Signals, Systems & Computers, 2013

arXiv:1310.1512 [pdf, ps, other]

Bounds on inference

Authors: Flavio du Pin Calmon, Mayank Varia, Muriel Médard, Mark M. Christiansen, Ken R. Duffy, Stefano Tessaro

Abstract: Lower bounds for the average probability of error of estimating a hidden variable X given an observation of a correlated random variable Y, and Fano's inequality in particular, play a central role in information theory. In this paper, we present a lower bound for the average estimation error based on the marginal distribution of X and the principal inertias of the joint distribution matrix of X and Y. Furthermore, we discuss an information measure based on the sum of the largest principal inertias, called k-correlation, which generalizes maximal correlation. We show that k-correlation satisfies the Data Processing Inequality and is convex in the conditional distribution of Y given X. Finally, we investigate how to answer a fundamental question in inference and privacy: given an observation Y, can we estimate a function f(X) of the hidden random variable X with an average error below a certain threshold? We provide a general method for answering this question using an approach based on rate-distortion theory.

Submitted 5 October, 2013; originally announced October 2013.

Comments: Allerton 2013 with extended proof, 10 pages

arXiv:1304.6736 [pdf]

doi 10.1016/j.tics.2013.04.010

Networks in Cognitive Science

Authors: Andrea Baronchelli, Ramon Ferrer-i-Cancho, Romualdo Pastor-Satorras, Nick Chater, Morten H. Christiansen

Abstract: Networks of interconnected nodes have long played a key role in Cognitive Science, from artificial neural networks to spreading activation models of semantic memory. Recently, however, a new Network Science has been developed, providing insights into the emergence of global, system-scale properties in contexts as diverse as the Internet, metabolic reactions, and collaborations among scientists. Today, the inclusion of network theory into Cognitive Sciences, and the expansion of complex-systems science, promises to significantly change the way in which the organization and dynamics of cognitive and behavioral processes are understood. In this paper, we review recent contributions of network theory at different levels and domains within the Cognitive Sciences.

Submitted 5 July, 2013; v1 submitted 24 April, 2013; originally announced April 2013.

Journal ref: Trends in Cognitive Sciences 17, 348-360 (2013)

arXiv:1302.2937 [pdf]

doi 10.1371/journal.pone.0048029

The Biological Origin of Linguistic Diversity

Authors: Andrea Baronchelli, Nick Chater, Romualdo Pastor-Satorras, Morten H. Christiansen

Abstract: In contrast with animal communication systems, diversity is characteristic of almost every aspect of human language. Languages variously employ tones, clicks, or manual signs to signal differences in meaning; some languages lack the noun-verb distinction (e.g., Straits Salish), whereas others have a proliferation of fine-grained syntactic categories (e.g., Tzeltal); and some languages do without morphology (e.g., Mandarin), while others pack a whole sentence into a single word (e.g., Cayuga). A challenge for evolutionary biology is to reconcile the diversity of languages with the high degree of biological uniformity of their speakers. Here, we model processes of language change and geographical dispersion and find a consistent pressure for flexible learning, irrespective of the language being spoken. This pressure arises because flexible learners can best cope with the observed high rates of linguistic change associated with divergent cultural evolution following human migration. Thus, rather than genetic adaptations for specific aspects of language, such as recursion, the coevolution of genes and fast-changing linguistic structure provides the biological basis for linguistic diversity. Only biological adaptations for flexible learning combined with cultural evolution can explain how each child has the potential to learn any human language.

Submitted 12 February, 2013; originally announced February 2013.

Journal ref: PLoS ONE 7(10): e48029 (2012)

arXiv:1301.6356 [pdf, other]

Brute force searching, the typical set and Guesswork

Authors: Mark M. Christiansen, Ken R. Duffy, Flavio du Pin Calmon, Muriel Medard

Abstract: Consider the situation where a word is chosen probabilistically from a finite list. If an attacker knows the list and can inquire about each word in turn, then selecting the word via the uniform distribution maximizes the attacker's difficulty, its Guesswork, in identifying the chosen word. It is tempting to use this property in cryptanalysis of computationally secure ciphers by assuming coded words are drawn from a source's typical set and so, for all intents and purposes, uniformly distributed within it. By applying recent results on Guesswork, for i.i.d. sources it is this equipartition ansatz that we investigate here. In particular, we demonstrate that the expected Guesswork for a source conditioned to create words in the typical set grows, with word length, at a lower exponential rate than that of the uniform approximation, suggesting use of the approximation is ill-advised.

Submitted 13 May, 2013; v1 submitted 27 January, 2013; originally announced January 2013.

Comments: ISIT 2013, with extended proof

arXiv:1210.2126 [pdf, ps, other]

Lists that are smaller than their parts: A coding approach to tunable secrecy

Authors: Flavio du Pin Calmon, Muriel Médard, Linda M. Zeger, João Barros, Mark M. Christiansen, Ken. R. Duffy

Abstract: We present a new information-theoretic definition and associated results, based on list decoding in a source coding setting. We begin by presenting list-source codes, which naturally map a key length (entropy) to list size. We then show that such codes can be analyzed in the context of a novel information-theoretic metric, ε-symbol secrecy, that encompasses both the one-time pad and traditional rate-based asymptotic metrics, but, like most cryptographic constructs, can be applied in non-asymptotic settings. We derive fundamental bounds for ε-symbol secrecy and demonstrate how these bounds can be achieved with MDS codes when the source is uniformly distributed. We discuss applications and implementation issues of our codes.

Submitted 7 October, 2012; originally announced October 2012.

Comments: Allerton 2012, 8 pages

arXiv:1205.4135 [pdf, ps, other]

doi 10.1109/TIT.2012.2219036

Guesswork, large deviations and Shannon entropy

Authors: Mark M. Christiansen, Ken R. Duffy

Abstract: How hard is it to guess a password? Massey showed that the Shannon entropy of the distribution from which the password is selected is a lower bound on the expected number of guesses, but one which is not tight in general. In a series of subsequent papers under ever less restrictive stochastic assumptions, an asymptotic relationship as password length grows between scaled moments of the guesswork and specific Rényi entropy was identified. Here we show that, when appropriately scaled, as the password length grows the logarithm of the guesswork satisfies a Large Deviation Principle (LDP), providing direct estimates of the guesswork distribution when passwords are long. The rate function governing the LDP possesses a specific, restrictive form that encapsulates underlying structure in the nature of guesswork. Returning to Massey's original observation, a corollary to the LDP shows that the expectation of the logarithm of the guesswork is the specific Shannon entropy of the password selection process.

Submitted 21 June, 2012; v1 submitted 18 May, 2012; originally announced May 2012.

MSC Class: 94A17

Journal ref: IEEE Transactions on Information Theory, 59 (2), 796-802 (2013)

Showing 1–23 of 23 results for author: Christiansen, M