-
Discovery of Endianness and Instruction Size Characteristics in Binary Programs from Unknown Instruction Set Architectures
Authors:
Joachim Andreassen,
Donn Morrison
Abstract:
We study the problem of streamlining reverse engineering (RE) of binary programs from unknown instruction set architectures (ISAs). We focus on two ISA characteristics that are fundamental to beginning the RE process: identification of endianness and whether the instruction width is fixed or variable. For ISAs with a fixed instruction width, we also present methods for estimating the width. In addition to advancing research in software RE, our work can also be seen as a first step in hardware reverse engineering, because endianness and instruction format describe intrinsic characteristics of the underlying ISA. We detail our efforts at feature engineering and perform experiments using a variety of machine learning models on two datasets of architectures using Leave-One-Group-Out-Cross-Validation to simulate conditions where the tested ISA is unknown during model training. We use bigram-based features for endianness detection and the autocorrelation function, commonly used in signal processing applications, for differentiation between fixed- and variable-width instruction sizes. A collection of classifiers from the machine learning library scikit-learn is used in the experiments to investigate these features. Initial results are promising, with accuracy of endianness detection at 99.4%, fixed- versus variable-width instruction size at 86.0%, and detection of fixed instruction sizes at 88.0%.
Submitted 28 October, 2024;
originally announced October 2024.
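The abstract above names the autocorrelation function as the feature for separating fixed- from variable-width instruction sizes. As a rough illustration only, and not the authors' implementation, the sketch below computes a normalized autocorrelation over raw bytes and reports the smallest lag whose value comes close to the highest peak. The function names, the `max_lag` and `tolerance` parameters, and the peak-picking rule are all assumptions made for this example.

```python
from statistics import fmean
from typing import List, Optional


def autocorrelation(data: bytes, max_lag: int) -> List[float]:
    """Normalized autocorrelation of a non-empty byte string for lags 1..max_lag."""
    mean = fmean(data)
    centered = [b - mean for b in data]
    energy = sum(c * c for c in centered)
    if energy == 0:
        # Constant input carries no periodic structure.
        return [0.0] * max_lag
    n = len(centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(1, max_lag + 1)
    ]


def estimate_width(data: bytes, max_lag: int = 16,
                   tolerance: float = 0.9) -> Optional[int]:
    """Guess a fixed instruction width in bytes, or None if no peak stands out.

    Hypothetical rule: return the smallest lag whose autocorrelation reaches
    `tolerance` times the highest peak, so that a width of 4 is preferred
    over its multiples 8, 12, 16.
    """
    acf = autocorrelation(data, max_lag)
    peak = max(acf)
    if peak <= 0:
        return None
    for lag, value in enumerate(acf, start=1):
        if value >= tolerance * peak:
            return lag
    return None
```

On code whose instructions share a recurring byte pattern at a fixed stride, the autocorrelation would be expected to peak at multiples of that stride; a flat profile gives no such hint. A real pipeline would presumably feed the full autocorrelation vector to a classifier rather than apply a fixed threshold.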
-
Cryptic Bytes: WebAssembly Obfuscation for Evading Cryptojacking Detection
Authors:
Håkon Harnes,
Donn Morrison
Abstract:
WebAssembly has gained significant traction as a high-performance, secure, and portable compilation target for the Web and beyond. However, its growing adoption has also introduced new security challenges. One such threat is cryptojacking, where websites mine cryptocurrencies on visitors' devices without their knowledge or consent, often through the use of WebAssembly. While detection methods have been proposed, research on circumventing them remains limited. In this paper, we present the most comprehensive evaluation of code obfuscation techniques for WebAssembly to date, assessing their effectiveness, detectability, and overhead across multiple abstraction levels. We obfuscate a diverse set of applications, including utilities, games, and crypto miners, using state-of-the-art obfuscation tools like Tigress and wasm-mutate, as well as our novel tool, emcc-obf. Our findings suggest that obfuscation can effectively produce dissimilar WebAssembly binaries, with Tigress proving most effective, followed by emcc-obf and wasm-mutate. The impact on the resulting native code is also significant, although the V8 engine's TurboFan optimizer can reduce native code size by 30% on average. Notably, we find that obfuscation can successfully evade state-of-the-art cryptojacking detectors. Although obfuscation can introduce substantial performance overheads, we demonstrate how obfuscation can be used for evading detection with minimal overhead in real-world scenarios by strategically applying transformations. These insights are valuable for researchers, providing a foundation for developing more robust detection methods. Additionally, we make our dataset of over 20,000 obfuscated WebAssembly binaries and the emcc-obf tool publicly available to stimulate further research.
Submitted 22 March, 2024;
originally announced March 2024.
-
Call graph discovery in binary programs from unknown instruction set architectures
Authors:
Håvard Pettersen,
Donn Morrison
Abstract:
This study addresses the challenge of reverse engineering binaries from unknown instruction set architectures, a complex task with potential implications for software maintenance and cyber-security. We focus on the tasks of detecting candidate call and return opcodes for automatic extraction of call graphs in order to simplify the reverse engineering process. Empirical testing on a small dataset of binary files from different architectures demonstrates that the approach can accurately detect specific opcodes under conditions of noisy data. The method lays the groundwork for a valuable tool for reverse engineering where the reverse engineer has minimal a priori knowledge of the underlying instruction set architecture.
Submitted 15 January, 2024;
originally announced January 2024.
-
SoK: Analysis techniques for WebAssembly
Authors:
Håkon Harnes,
Donn Morrison
Abstract:
WebAssembly is a low-level bytecode language that allows high-level languages like C, C++, and Rust to be executed in the browser at near-native performance. In recent years, WebAssembly has gained widespread adoption and is now natively supported by all modern browsers. However, vulnerabilities in memory-unsafe languages, like C and C++, can translate into vulnerabilities in WebAssembly binaries. Unfortunately, most WebAssembly binaries are compiled from such memory-unsafe languages, and these vulnerabilities have been shown to be practical in real-world scenarios. WebAssembly smart contracts have also been found to be vulnerable, causing significant financial loss. Additionally, WebAssembly has been used for malicious purposes like cryptojacking. To address these issues, several analysis techniques for WebAssembly binaries have been proposed. In this paper, we conduct a comprehensive literature review of these techniques and categorize them based on their analysis strategy and objectives. Furthermore, we compare and evaluate the techniques using quantitative data, highlighting their strengths and weaknesses. In addition, one of the main contributions of this paper is the identification of future research directions based on the thorough literature review conducted.
Submitted 22 March, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
AI-assisted Optimization of the ECCE Tracking System at the Electron Ion Collider
Authors:
C. Fanelli,
Z. Papandreou,
K. Suresh,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann
, et al. (258 additional authors not shown)
Abstract:
The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to leverage Artificial Intelligence (AI) starting from the design and R&D phases. The EIC Comprehensive Chromodynamics Experiment (ECCE) is a consortium that proposed a detector design based on a 1.5T solenoid. The EIC detector proposal review concluded that the ECCE design will serve as the reference design for an EIC detector. Herein we describe a comprehensive optimization of the ECCE tracker using AI. The work required a complex parametrization of the simulated detector system. Our approach dealt with an optimization problem in a multidimensional design space driven by multiple objectives that encode the detector performance, while satisfying several mechanical constraints. We describe our strategy and show results obtained for the ECCE tracking system. The AI-assisted design is agnostic to the simulation framework and can be extended to other sub-detectors or to a system of sub-detectors to further optimize the performance of the EIC detector.
Submitted 19 May, 2022; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Semantics for Robotic Mapping, Perception and Interaction: A Survey
Authors:
Sourav Garg,
Niko Sünderhauf,
Feras Dayoub,
Douglas Morrison,
Akansel Cosgun,
Gustavo Carneiro,
Qi Wu,
Tat-Jun Chin,
Ian Reid,
Stephen Gould,
Peter Corke,
Michael Milford
Abstract:
For robots to navigate and interact more richly with the world around them, they will likely require a deeper understanding of the world in which they operate. In robotics and related research fields, the study of understanding is often referred to as semantics, which dictates what the world "means" to a robot, and is strongly tied to the question of how to represent that meaning. With humans and robots increasingly operating in the same world, the prospects of human-robot interaction also bring semantics and ontology of natural language into the picture. Driven by need, as well as by enablers like increasing availability of training data and computational resources, semantics is a rapidly growing research area in robotics. The field has received significant attention in the research literature to date, but most reviews and surveys have focused on particular aspects of the topic: the technical research issues regarding its use in specific robotic topics like mapping or segmentation, or its relevance to one particular application domain like autonomous driving. A new treatment is therefore required, and is also timely because so much relevant research has occurred since many of the key surveys were published. This survey therefore provides an overarching snapshot of where semantics in robotics stands today. We establish a taxonomy for semantics research in or relevant to robotics, split into four broad categories of activity, in which semantics are extracted, used, or both. Within these broad categories we survey dozens of major topics including fundamentals from the computer vision field and key robotics research areas utilizing semantics, including mapping, navigation and interaction with the world. The survey also covers key practical considerations, including enablers like increased data availability and improved computational hardware, and major application areas where...
Submitted 2 January, 2021;
originally announced January 2021.
-
EGAD! an Evolved Grasping Analysis Dataset for diversity and reproducibility in robotic manipulation
Authors:
Douglas Morrison,
Peter Corke,
Jürgen Leitner
Abstract:
We present the Evolved Grasping Analysis Dataset (EGAD), comprising over 2000 generated objects aimed at training and evaluating robotic visual grasp detection algorithms. The objects in EGAD are geometrically diverse, filling a space ranging from simple to complex shapes and from easy to difficult to grasp, compared to other datasets for robotic grasping, which may be limited in size or contain only a small number of object classes. Additionally, we specify a set of 49 diverse 3D-printable evaluation objects to encourage reproducible testing of robotic grasping systems across a range of complexity and difficulty. The dataset, code and videos can be found at https://dougsm.github.io/egad/
Submitted 23 April, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
The Minimum Hybrid Contract (MHC): Combining legal and blockchain smart contracts
Authors:
Jørgen Svennevik Notland,
Jakob Svennevik Notland,
Donn Morrison
Abstract:
Corruption is a major global financial problem with billions of dollars rendered lost or unaccountable annually. Corruption through contract fraud is often conducted by withholding and/or altering financial information. When such scandals are investigated by authorities, financial and legal documents are usually altered to conceal the paper trail.
Smart contracts have emerged in recent years and appear promising for applications such as legal contracts where transparency is critical and of public interest. Transparency and auditability are inherent because smart contracts execute operations on the blockchain, a distributed public ledger.
In this paper, we propose the Minimum Hybrid Contract (MHC), with the aim of introducing 1) auditability, 2) transparency, and 3) immutability to the contract's financial transactions. The MHC comprises an online smart contract and an offline traditional legal contract, where the two are immutably linked.
Secure peer-to-peer financial transactions, transparency, and cost accounting are automated by the smart contract, and legal issues or disputes are carried out by civil courts. The reliance on established legal processes facilitates an appropriate adoption of smart contracts in traditional contracts.
Submitted 17 February, 2020;
originally announced February 2020.
-
Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter
Authors:
Douglas Morrison,
Peter Corke,
Jürgen Leitner
Abstract:
Camera viewpoint selection is an important aspect of visual grasp detection, especially in clutter where many occlusions are present. Where other approaches use a static camera position or fixed data collection routines, our Multi-View Picking (MVP) controller uses an active perception approach to choose informative viewpoints based directly on a distribution of grasp pose estimates in real time, reducing uncertainty in the grasp poses caused by clutter and occlusions. In trials of grasping 20 objects from clutter, our MVP controller achieves 80% grasp success, outperforming a single-viewpoint grasp detector by 12%. We also show that our approach is both more accurate and more efficient than approaches which consider multiple fixed viewpoints.
Submitted 10 May, 2019; v1 submitted 23 September, 2018;
originally announced September 2018.
-
Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach
Authors:
Douglas Morrison,
Peter Corke,
Jürgen Leitner
Abstract:
This paper presents a real-time, object-independent grasp synthesis method which can be used for closed-loop grasping. Our proposed Generative Grasping Convolutional Neural Network (GG-CNN) predicts the quality and pose of grasps at every pixel. This one-to-one mapping from a depth image overcomes limitations of current deep-learning grasping techniques by avoiding discrete sampling of grasp candidates and long computation times. Additionally, our GG-CNN is orders of magnitude smaller while detecting stable grasps with equivalent performance to current state-of-the-art techniques. The light-weight and single-pass generative nature of our GG-CNN allows for closed-loop control at up to 50Hz, enabling accurate grasping in non-static environments where objects move and in the presence of robot control inaccuracies. In our real-world tests, we achieve an 83% grasp success rate on a set of previously unseen objects with adversarial geometry and 88% on a set of household objects that are moved during the grasp attempt. We also achieve 81% accuracy when grasping in dynamic clutter.
Submitted 15 May, 2018; v1 submitted 14 April, 2018;
originally announced April 2018.
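The abstract above describes a network that predicts grasp quality and pose at every pixel, which avoids sampling discrete grasp candidates. As a loose illustration of how such per-pixel output could be consumed, and not the authors' code, the sketch below scans a quality map and returns the pixel with the highest score together with the angle predicted at that pixel. The `Grasp` type, the `best_grasp` function, and the use of plain nested lists for the two maps are assumptions made for this example.

```python
from typing import NamedTuple, Optional, Sequence


class Grasp(NamedTuple):
    row: int
    col: int
    quality: float
    angle: float  # hypothetical: gripper rotation in radians at this pixel


def best_grasp(quality: Sequence[Sequence[float]],
               angle: Sequence[Sequence[float]]) -> Grasp:
    """Return the grasp at the pixel with the highest predicted quality.

    Both maps are assumed to have the same shape as the input depth image.
    """
    best: Optional[Grasp] = None
    for r, quality_row in enumerate(quality):
        for c, q in enumerate(quality_row):
            if best is None or q > best.quality:
                best = Grasp(r, c, q, angle[r][c])
    if best is None:
        raise ValueError("quality map is empty")
    return best
```

Because selection is a single pass over the output maps, it can be repeated for every new depth frame, which is in keeping with the closed-loop use described in the abstract.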
-
Design of a Multi-Modal End-Effector and Grasping System: How Integrated Design helped win the Amazon Robotics Challenge
Authors:
S. Wade-McCue,
N. Kelly-Boxall,
M. McTaggart,
D. Morrison,
A. W. Tow,
J. Erskine,
R. Grinover,
A. Gurman,
T. Hunn,
D. Lee,
A. Milan,
T. Pham,
G. Rallos,
A. Razjigaev,
T. Rowntree,
R. Smith,
K. Vijay,
Z. Zhuang,
C. Lehnert,
I. Reid,
P. Corke,
J. Leitner
Abstract:
We present the grasping system and design approach behind Cartman, the winning entrant in the 2017 Amazon Robotics Challenge. We investigate the design processes leading up to the final iteration of the system and describe the emergent solution by comparing it with key robotics design aspects. Following our experience, we propose a new design aspect, precision vs. redundancy, that should be considered alongside the previously proposed design aspects of modularity vs. integration, generality vs. assumptions, computation vs. embodiment and planning vs. feedback. We present the grasping system behind Cartman, the winning robot in the 2017 Amazon Robotics Challenge. The system makes strong use of redundancy in design by implementing complementary tools, a suction gripper and a parallel gripper. This multi-modal end-effector is combined with three grasp synthesis algorithms to accommodate the range of objects provided by Amazon during the challenge. We provide a detailed system description and an evaluation of its performance before discussing the broader nature of the system with respect to the key aspects of robotic design as initially proposed by the winners of the first Amazon Picking Challenge. To address the principal nature of our grasping system and the reason for its success, we propose an additional robotic design aspect, 'precision vs. redundancy'. The full design of our robotic system, including the end-effector, is open sourced and available at http://juxi.net/projects/AmazonRoboticsChallenge/
Submitted 19 June, 2018; v1 submitted 3 October, 2017;
originally announced October 2017.
-
Mechanical Design of a Cartesian Manipulator for Warehouse Pick and Place
Authors:
M. McTaggart,
D. Morrison,
A. W. Tow,
R. Smith,
Norton Kelly-Boxall,
Anton Milan,
T. Pham,
Zheyu Zhuang,
J. Leitner,
I. Reid,
P. Corke,
C. Lehnert
Abstract:
Robotic manipulation and grasping in cluttered and unstructured environments is a current challenge for robotics. Enabling robots to operate in these challenging environments has direct applications from automating warehouses to harvesting fruit in agriculture. One of the main challenges associated with these difficult robotic manipulation tasks is the motion planning and control problem for multi-DoF (Degree of Freedom) manipulators. This paper presents the design and performance evaluation of a low-cost Cartesian manipulator, Cartman, which took first place in the Amazon Robotics Challenge 2017. It can perform pick and place tasks of household items in a cluttered environment. The robot is capable of linear speeds of 1 m/s and angular speeds of 1.5 rad/s, with sub-millimetre static accuracy and a safe payload capacity of 2 kg. Cartman can be produced for under 10 000 AUD. The complete design is open sourced and can be found at http://juxi.net/projects/AmazonRoboticsChallenge.
Submitted 18 June, 2018; v1 submitted 2 October, 2017;
originally announced October 2017.
-
Semantic Segmentation from Limited Training Data
Authors:
A. Milan,
T. Pham,
K. Vijay,
D. Morrison,
A. W. Tow,
L. Liu,
J. Erskine,
R. Grinover,
A. Gurman,
T. Hunn,
N. Kelly-Boxall,
D. Lee,
M. McTaggart,
G. Rallos,
A. Razjigaev,
T. Rowntree,
T. Shen,
R. Smith,
S. Wade-McCue,
Z. Zhuang,
C. Lehnert,
G. Lin,
I. Reid,
P. Corke,
J. Leitner
Abstract:
We present our approach for robotic perception in cluttered scenes that led to winning the recent Amazon Robotics Challenge (ARC) 2017. Next to small objects with shiny and transparent surfaces, the biggest challenge of the 2017 competition was the introduction of unseen categories. In contrast to traditional approaches which require large collections of annotated data and many hours of training, the task here was to obtain a robust perception pipeline with only a few minutes of data acquisition and training time. To that end, we present two strategies that we explored. One is a deep metric learning approach that works in three separate steps: semantic-agnostic boundary detection, patch classification and pixel-wise voting. The other is a fully-supervised semantic segmentation approach with efficient dataset collection. We conduct an extensive analysis of the two methods on our ARC 2017 dataset. Interestingly, only a few examples of each class are sufficient to fine-tune even very deep convolutional neural networks for this specific task.
Submitted 22 September, 2017;
originally announced September 2017.
-
Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge
Authors:
D. Morrison,
A. W. Tow,
M. McTaggart,
R. Smith,
N. Kelly-Boxall,
S. Wade-McCue,
J. Erskine,
R. Grinover,
A. Gurman,
T. Hunn,
D. Lee,
A. Milan,
T. Pham,
G. Rallos,
A. Razjigaev,
T. Rowntree,
K. Vijay,
Z. Zhuang,
C. Lehnert,
I. Reid,
P. Corke,
J. Leitner
Abstract:
The Amazon Robotics Challenge enlisted sixteen teams to each design a pick-and-place robot for autonomous warehousing, addressing development in robotic vision and manipulation. This paper presents the design of our custom-built, cost-effective, Cartesian robot system Cartman, which won first place in the competition finals by stowing 14 (out of 16) and picking all 9 items in 27 minutes, scoring a total of 272 points. We highlight our experience-centred design methodology and key aspects of our system that contributed to our competitiveness. We believe these aspects are crucial to building robust and effective robotic systems.
Submitted 25 February, 2018; v1 submitted 19 September, 2017;
originally announced September 2017.
-
Who is Who in Phylogenetic Networks: Articles, Authors and Programs
Authors:
Tushar Agarwal,
Philippe Gambette,
David Morrison
Abstract:
The phylogenetic network emerged in the 1990s as a new model to represent the evolution of species in the case where coexisting species transfer genetic information through hybridization, recombination, lateral gene transfer, etc. As is true for many rapidly evolving fields, there is considerable fragmentation and diversity in methodologies, standards and vocabulary in phylogenetic network research, thus creating the need for an integrated database of articles, authors, techniques, keywords and software. We describe such a database, "Who is Who in Phylogenetic Networks", available at http://phylnet.univ-mlv.fr. "Who is Who in Phylogenetic Networks" comprises more than 600 publications and 500 authors interlinked with a rich set of more than 200 keywords related to phylogenetic networks. The database is integrated with web-based tools to visualize authorship and collaboration networks and analyze these networks using common graph and social network metrics such as centrality (betweenness, eigenvector, degree and closeness) and clustering. We provide downloads of raw information about entries in the database, and a facility to suggest modifications and contribute new information to the database. We also present in this article common use cases of the database and identify trends in the research on phylogenetic networks using the information in the database and textual analysis.
Submitted 5 October, 2016;
originally announced October 2016.
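The abstract above mentions analysing authorship and collaboration networks with graph metrics such as degree centrality. As a small illustration of that one metric, and not the database's own tooling, the sketch below builds a co-authorship graph from per-paper author lists and computes normalized degree centrality. The function names and the input layout (one list of author names per paper) are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Sequence, Set


def coauthor_graph(papers: Iterable[Sequence[str]]) -> Dict[str, Set[str]]:
    """Link every pair of authors who share at least one paper."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for name in unique:
            graph[name]  # register single-author papers as isolated nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the other authors that each author has collaborated with."""
    n = len(graph)
    if n < 2:
        return {name: 0.0 for name in graph}
    return {name: len(neighbours) / (n - 1)
            for name, neighbours in graph.items()}
```

Betweenness, eigenvector and closeness centrality, also named in the abstract, need shortest-path or spectral computations and are better taken from an established graph library than written by hand.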
-
Syntax and Semantics of Abstract Binding Trees
Authors:
Jonathan Sterling,
Darin Morrison
Abstract:
The contribution of this paper is the development of the syntax and semantics of multi-sorted nominal abstract binding trees (abts), an extension of second order universal algebra to support symbol-indexed families of operators. Nominal abts are essential for correctly treating the syntax of languages with generative phenomena, including exceptions and mutable state. Additionally we have developed the categorical semantics for abstract binding trees formally in Constructive Type Theory using the Agda proof assistant. Multi-sorted nominal abts also form the syntactic basis for the upcoming version of the JonPRL proof assistant, an implementation of an extensional constructive type theory in the Nuprl tradition.
Submitted 23 January, 2016;
originally announced January 2016.
-
Phantom cascades: The effect of hidden nodes on information diffusion
Authors:
Václav Belák,
Afra Mashhadi,
Alessandra Sala,
Donn Morrison
Abstract:
Research on information diffusion generally assumes complete knowledge of the underlying network. However, in the presence of factors such as increasing privacy awareness, restrictions on application programming interfaces (APIs) and sampling strategies, this assumption rarely holds in the real world, which in turn leads to an underestimation of the size of information cascades. In this work we study the effect of hidden network structure on information diffusion processes. We characterise information cascades through activation paths traversing visible and hidden parts of the network. We quantify diffusion estimation error while varying the amount of hidden structure in five empirical and synthetic network datasets and demonstrate the effect of topological properties on this error. Finally, we offer recommendations for practitioners and propose a model to predict the cascade size with minimal information regarding the underlying network.
Submitted 21 May, 2015; v1 submitted 5 February, 2015;
originally announced February 2015.
-
Toward automatic censorship detection in microblogs
Authors:
Donn Morrison
Abstract:
Social media is an area where users often experience censorship through a variety of means such as the restriction of search terms or active and retroactive deletion of messages. In this paper we examine the feasibility of automatically detecting censorship of microblogs. We use a network growing model to simulate discussion over a microblog follow network and compare two censorship strategies to simulate varying levels of message deletion. Using topological features extracted from the resulting graphs, a classifier is trained to detect whether or not a given communication graph has been censored. The results show that censorship detection is feasible under empirically measured levels of message deletion. The proposed framework can enable automated censorship measurement and tracking, which, when combined with aggregated citizen reports of censorship, can allow users to make informed decisions about online communication habits.
Submitted 27 February, 2014; v1 submitted 21 February, 2014;
originally announced February 2014.
-
Solving the Pricing Problem in a Branch-and-Price Algorithm for Graph Coloring using Zero-Suppressed Binary Decision Diagrams
Authors:
David R. Morrison,
Edward C. Sewell,
Sheldon H. Jacobson
Abstract:
Branch-and-price algorithms combine a branch-and-bound search with an exponentially-sized LP formulation that must be solved via column generation. Unfortunately, the standard branching rules used in branch-and-bound for integer programming interfere with the structure of the column generation routine; therefore, most such algorithms employ alternate branching rules to circumvent this difficulty. This paper shows how a zero-suppressed binary decision diagram (ZDD) can be used to solve the pricing problem in a branch-and-price algorithm for the graph coloring problem, even in the presence of constraints imposed by branching decisions. This approach facilitates a much more direct solution method, and can improve convergence of the column generation subroutine.
Submitted 8 July, 2015; v1 submitted 22 January, 2014;
originally announced January 2014.
-
Characteristics of Optimal Solutions to the Sensor Location Problem
Authors:
David R. Morrison,
Susan E. Martonosi
Abstract:
In [Bianco, L., Giuseppe C., and P. Reverberi. 2001. "A network based model for traffic sensor location with implications on O/D matrix estimates". Transportation Science 35(1):50-60.], the authors present the Sensor Location Problem: that of locating the minimum number of traffic sensors at intersections of a road network such that the traffic flow on the entire network can be determined. They offer a necessary and sufficient condition on the set of monitored nodes in order for the flow everywhere to be determined. In this paper, we present a counterexample that demonstrates that the condition is not actually sufficient (though it is still necessary). We present a stronger necessary condition for flow calculability, and show that it is a sufficient condition in a large class of graphs in which a particular subgraph is a tree. Many typical road networks are included in this category, and we show how our condition can be used to inform traffic sensor placement.
Submitted 1 June, 2011;
originally announced June 2011.
-
Data Management for Physics Analysis in Phenix (BNL, RHIC)
Authors:
Barbara Jacak,
Roy Lacey,
Dave Morrison,
Irina Sourikova,
Andrey Shevel,
Qiu Zhiping
Abstract:
Every year the PHENIX collaboration deals with an increasing volume of data (now about 1/4 PB/year). The more data there is, the more questions arise about how to process it all in the most efficient way. In the recent past, many developments in HEP computing were dedicated to the production environment. Now we need more tools to help obtain physics results from the analysis of distributed simulated and experimental data. Developments in Grid architectures have given many examples of how distributed computing facilities can be organized to meet physics analysis needs. We feel that our main task in this area is to try to use already developed systems or system components in the PHENIX environment.
We concentrate here on the following problems: a file/replica catalog which keeps the names of our files, data movement over the WAN, and job submission in a multicluster environment.
PHENIX is a running experiment, and this fact narrows our ability to test new software on the collaboration computing facilities. We are experimenting with system prototypes at the State University of New York at Stony Brook (SUNYSB), where we run a midrange computing cluster for physics analysis. The talk discusses some experience with Grid software and the results achieved.
Submitted 13 June, 2003;
originally announced June 2003.
-
Relational databases for data management in PHENIX
Authors:
I. Sourikova,
D. Morrison
Abstract:
PHENIX is one of the two large experiments at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL) and archives roughly 100TB of experimental data per year. In addition, large volumes of simulated data are produced at multiple off-site computing centers. For any file catalog to play a central role in data management it has to face problems associated with the need for distributed access and updates. To be used effectively by the hundreds of PHENIX collaborators in 12 countries the catalog must satisfy the following requirements: 1) contain up-to-date data, 2) provide fast and reliable access to the data, 3) have write permissions for the sites that store portions of data. We present an analysis of several available Relational Database Management Systems (RDBMS) to support a catalog meeting the above requirements and discuss the PHENIX experience with building and using the distributed file catalog.
Submitted 4 June, 2003;
originally announced June 2003.