-
The Topological Structures of the Orders of Hypergraphs
Authors:
Robert E. Green,
Cliff A. Joslyn,
Audun Myers,
Michael G. Rawson,
Michael Robinson
Abstract:
We first provide a categorical exploration of, and then complete the mapping of the relationships among, three fundamental perspectives on binary relations: as the incidence matrices of hypergraphs, as the formal contexts of concept lattices, and as specifying topological cosheaves of simplicial (Dowker) complexes on simplicial (Dowker) complexes. We provide an integrative, functorial framework combining previously known results with three new ones: 1) given a binary relation, there are order isomorphisms among the bounded edge order of the intersection complexes of its dual hypergraphs and its concept lattice; 2) the concept lattice of a context is an isomorphism invariant of the Dowker cosheaf (of abstract simplicial complexes) of that context; and 3) a novel Dowker cosheaf (of chain complexes) of a relation is an isomorphism invariant of the concept lattice of the context that generalizes Dowker's original homological result. We illustrate these concepts throughout with a running example, and demonstrate relationships to past results.
Submitted 18 April, 2025; v1 submitted 16 April, 2025;
originally announced April 2025.
-
Chromosome-scale shotgun assembly using an in vitro method for long-range linkage
Authors:
Nicholas H. Putnam,
Brendan O'Connell,
Jonathan C. Stites,
Brandon J. Rice,
Andrew Fields,
Paul D. Hartley,
Charles W. Sugnet,
David Haussler,
Daniel S. Rokhsar,
Richard E. Green
Abstract:
Long-range and highly accurate de novo assembly from short-read data is one of the most pressing challenges in genomics. Recently, it has been shown that read pairs generated by proximity ligation of DNA in chromatin of living tissue can address this problem. These data dramatically increase the scaffold contiguity of assemblies and provide haplotype phasing information. Here, we describe a simpler approach ("Chicago") based on in vitro reconstituted chromatin. We generated two Chicago datasets with human DNA and used a new software pipeline ("HiRise") to construct a highly accurate de novo assembly and scaffolding of a human genome with scaffold N50 of 30 Mb. We also demonstrated the utility of Chicago for improving existing assemblies by re-assembling and scaffolding the genome of the American alligator. With a single library and one lane of Illumina HiSeq sequencing, we increased the scaffold N50 of the American alligator from 508 kb to 10 Mb. Our method uses established molecular biology procedures and can be used to analyze any genome, as it requires only about 5 micrograms of DNA as the starting material.
Submitted 18 February, 2015;
originally announced February 2015.
-
The Chemical and Ionization Conditions in Weak Mg II Absorbers
Authors:
Anand Narayanan,
Jane C. Charlton,
Toru Misawa,
Rebecca E. Green,
Tae-Sun Kim
Abstract:
We present an analysis of the chemical and ionization conditions in a sample of 100 weak Mg II absorbers identified in the VLT/UVES archive of quasar spectra. Using a host of low ionization lines associated with each absorber in this sample, and on the basis of ionization models, we infer that the metallicity in a significant fraction of weak Mg II clouds is constrained to values of solar or higher, if they are sub-Lyman limit systems. Based on the observed constraints, we present a physical picture in which weak Mg II absorbers are predominantly tracing two different astrophysical processes/structures. A significant population of weak Mg II clouds, those in which N(Fe II) is much less than N(Mg II), identified at both low (z ~ 1) and high (z ~ 2) redshift, are potentially tracing gas in the extended halos of galaxies, analogous to the Galactic high velocity clouds. These absorbers might correspond to alpha-enhanced interstellar gas expelled from star-forming galaxies, in correlated supernova events. On the other hand, N(Fe II) approximately equal to N(Mg II) clouds, which are prevalent only at lower redshifts (z < 1.5), must be tracing Type Ia enriched gas in small, high metallicity pockets in dwarf galaxies, tidal debris, or other intergalactic structures.
Submitted 19 August, 2008;
originally announced August 2008.
-
Pairwise alignment incorporating dipeptide covariation
Authors:
Gavin E. Crooks,
Richard E. Green,
Steven E. Brenner
Abstract:
Motivation: Standard algorithms for pairwise protein sequence alignment make the simplifying assumption that amino acid substitutions at neighboring sites are uncorrelated. This assumption allows implementation of fast algorithms for pairwise sequence alignment, but it ignores information that could conceivably increase the power of remote homolog detection. We examine the validity of this assumption by constructing extended substitution matrices that encapsulate the observed correlations between neighboring sites, by developing an efficient and rigorous algorithm for pairwise protein sequence alignment that incorporates these local substitution correlations, and by assessing the ability of this algorithm to detect remote homologies. Results: Our analysis indicates that local correlations between substitutions are not strong on average. Furthermore, incorporating local substitution correlations into pairwise alignment did not lead to a statistically significant improvement in remote homology detection. Therefore, the standard assumption that individual residues within protein sequences evolve independently of neighboring positions appears to be an efficient and appropriate approximation.
Submitted 28 July, 2005; v1 submitted 19 February, 2005;
originally announced February 2005.