-
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
Authors:
Neel Guha,
Julian Nyarko,
Daniel E. Ho,
Christopher Ré,
Adam Chilton,
Aditya Narayana,
Alex Chohlas-Wood,
Austin Peters,
Brandon Waldon,
Daniel N. Rockmore,
Diego Zambrano,
Dmitry Talisman,
Enam Hoque,
Faiz Surani,
Frank Fagan,
Galit Sarfaty,
Gregory M. Dickinson,
Haggai Porat,
Jason Hegland,
Jessica Wu,
Joe Nudell,
Joel Niklaus,
John Nay,
Jonathan H. Choi,
Kevin Tobia,
et al. (15 additional authors not shown)
Abstract:
The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning -- which distinguish between its many forms -- correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.
Submitted 20 August, 2023;
originally announced August 2023.
-
Complex Systems of Secrecy: The Offshore Networks of Oligarchs
Authors:
Ho-Chun Herbert Chang,
Brooke Harrington,
Feng Fu,
Daniel Rockmore
Abstract:
Following the invasion of Ukraine, the US, UK, and EU governments--among others--sanctioned oligarchs close to Putin. This approach has come under scrutiny, as evidence has emerged of the oligarchs' successful evasion of these punishments. To address this problem, we analyze the role of an overlooked but highly influential group: the secretive professional intermediaries who create and administer the oligarchs' offshore financial empires. Drawing on the Offshore Leaks Database provided by the International Consortium of Investigative Journalists (ICIJ), we examine the ties linking offshore expert advisors (lawyers, accountants, and other wealth management professionals) to ultra-high-net-worth individuals from four countries: Russia, China, the United States, and Hong Kong. We find that resulting nation-level "oligarch networks" share a scale-free structure characterized by heterogeneity of heavy-tailed degree distributions of wealth managers; however, network topologies diverge across clients from democratic versus autocratic regimes. While generally robust, scale-free networks are fragile when targeted by attacks on highly-connected nodes. Our "knock-out" experiments pinpoint this vulnerability to the small group of wealth managers themselves, suggesting that sanctioning these professional intermediaries may be more effective and efficient in disrupting dark finance flows than sanctions on their wealthy clients. This vulnerability is especially pronounced amongst Russian oligarchs, who concentrate their offshore business in a handful of boutique wealth management firms. The distinctive patterns we identify suggest a new approach to sanctions, focused on expert intermediaries to disrupt the finances and alliances of their wealthy clients. More generally, our research contributes to the larger body of work on complexity science and the structures of secrecy.
Submitted 6 March, 2023;
originally announced March 2023.
-
On the Spectrum of Finite, Rooted Homogeneous Trees
Authors:
Daryl R. DeFord,
Daniel N. Rockmore
Abstract:
In this paper we study the adjacency spectrum of families of finite rooted trees with regular branching properties. In particular, we show that in the case of constant branching, the eigenvalues are realized as the roots of a family of generalized Fibonacci polynomials and produce a limiting distribution for the eigenvalues as the tree depth goes to infinity. We indicate how these results can be extended to periodic branching patterns and also provide a generalization to higher order simplicial complexes.
Submitted 29 March, 2020; v1 submitted 17 March, 2019;
originally announced March 2019.
-
The Cultural Evolution of National Constitutions
Authors:
Daniel N. Rockmore,
Chen Fang,
Nicholas J. Foti,
Tom Ginsburg,
David C. Krakauer
Abstract:
We explore how ideas from infectious disease and genetics can be used to uncover patterns of cultural inheritance and innovation in a corpus of 591 national constitutions spanning 1789 - 2008. Legal "Ideas" are encoded as "topics" - words statistically linked in documents - derived from topic modeling the corpus of constitutions. Using these topics we derive a diffusion network for borrowing from ancestral constitutions back to the US Constitution of 1789 and reveal that constitutions are complex cultural recombinants. We find systematic variation in patterns of borrowing from ancestral texts and "biological"-like behavior in patterns of inheritance with the distribution of "offspring" arising through a bounded preferential-attachment process. This process leads to a small number of highly innovative (influential) constitutions, some of which have yet to be identified as such in the current literature. Our findings thus shed new light on the critical nodes of the constitution-making network. The constitutional network structure reflects periods of intense constitution creation, and systematic patterns of variation in constitutional life-span and temporal influence.
Submitted 18 November, 2017;
originally announced November 2017.
-
Evaluating prose style transfer with the Bible
Authors:
Keith Carlson,
Allen Riddell,
Daniel Rockmore
Abstract:
In the prose style transfer task a system, provided with text input and a target prose style, produces output which preserves the meaning of the input text but alters the style. These systems require parallel data for evaluation of results and usually make use of parallel data for training. Currently, there are few publicly available corpora for this task. In this work, we identify a high-quality source of aligned, stylistically distinct text in different versions of the Bible. We provide a standardized split, into training, development and testing data, of the public domain versions in our corpus. This corpus is highly parallel since many Bible versions are included. Sentences are aligned due to the presence of chapter and verse numbers within all versions of the text. In addition to the corpus, we present the results, as measured by the BLEU and PINC metrics, of several models trained on our data which can serve as baselines for future research. While we present these data as a style transfer corpus, we believe that it is of unmatched quality and may be useful for other natural language tasks as well.
Submitted 14 December, 2018; v1 submitted 13 November, 2017;
originally announced November 2017.
-
Analysis of the U.S. Patient Referral Network
Authors:
Chuankai An,
A. James O'Malley,
Daniel N. Rockmore,
Corey D. Stock
Abstract:
In this paper we analyze the US Patient Referral Network (also called the Shared Patient Network) and various subnetworks for the years 2009--2015. In these networks two physicians are linked if a patient encounters both of them within a specified time-interval, according to the data made available by the Centers for Medicare and Medicaid Services. We find power law distributions on most state-level data as well as a core-periphery structure. On a national and state level, we discover a so-called small-world structure as well as a "gravity law" of the type found in some large-scale economic networks. Some physicians play the role of hubs for interstate referral. Strong correlations between certain network statistics and healthcare system statistics at both the state and national levels are discovered. The patterns in the referral network evinced using several statistical analyses involving key metrics derived from the network illustrate the potential for using network analysis to provide new insights into the healthcare system and opportunities or mechanisms for catalyzing improvements.
Submitted 8 November, 2017;
originally announced November 2017.
-
A Random Dot Product Model for Weighted Networks
Authors:
Daryl R. DeFord,
Daniel N. Rockmore
Abstract:
This paper presents a generalization of the random dot product model for networks whose edge weights are drawn from a parametrized probability distribution. We focus on the case of integer weight edges and show that many previously studied models can be recovered as special cases of this generalization. Our model also determines a dimension-reducing embedding process that gives geometric interpretations of community structure and centrality. The dimension of the embedding has consequences for the derived community structure and we exhibit a stress function for determining appropriate dimensions. We use this approach to analyze a coauthorship network and voting data from the U.S. Senate.
Submitted 8 November, 2016;
originally announced November 2016.
-
Measuring Verifiability in Online Information
Authors:
Reed H. Harder,
Alfredo J. Velasco,
Michael S. Evans,
Daniel N. Rockmore
Abstract:
The verifiability of online information is important, but difficult to assess systematically. We examine verifiability in the case of Wikipedia, one of the world's largest and most consulted online information sources. We extend prior work about quality of Wikipedia articles, knowledge production, and sources to consider the quality of Wikipedia references. We propose a multidimensional measure of verifiability that takes into account technical accuracy and practical accessibility of sources. We calculate article verifiability scores for a sample of 5,000 articles and 295,800 citations, and compare differently weighted models to illustrate effects of emphasizing particular elements of verifiability over others. We find that, while the quality of references in the overall sample is reasonably high, verifiability varies significantly by article, particularly when emphasizing the use of standard digital identifiers and taking into account the practical availability of referenced sources. We discuss the implications of these findings for measuring verifiability in online information more generally.
Submitted 16 November, 2015; v1 submitted 18 September, 2015;
originally announced September 2015.
-
The Intrafirm Complexity of Systemically Important Financial Institutions
Authors:
Robin L. Lumsdaine,
Daniel N. Rockmore,
Nicholas Foti,
Gregory Leibon,
J. Doyne Farmer
Abstract:
In November, 2011, the Financial Stability Board, in collaboration with the International Monetary Fund, published a list of 29 "systemically important financial institutions" (SIFIs). This designation reflects a concern that the failure of any one of them could have dramatic negative consequences for the global economy and is based on "their size, complexity, and systemic interconnectedness". While the characteristics of "size" and "systemic interconnectedness" have been the subject of a good deal of quantitative analysis, less attention has been paid to measures of a firm's "complexity." In this paper we take on the challenges of measuring the complexity of a financial institution and to that end explore the use of the structure of an individual firm's control hierarchy as a proxy for institutional complexity. The control hierarchy is a network representation of the institution and its subsidiaries. We show that this mathematical representation (and various associated metrics) provides a consistent way to compare the complexity of firms with often very disparate business models and as such may provide the foundation for determining a SIFI designation. By quantifying the level of complexity of a firm, our approach also may prove useful should firms need to reduce their level of complexity either in response to business or regulatory needs. Using a data set containing the control hierarchies of many of the designated SIFIs, we find that in the past two years, these firms have decreased their level of complexity, perhaps in response to regulatory requirements.
Submitted 9 May, 2015;
originally announced May 2015.
-
Multi-Task Metric Learning on Network Data
Authors:
Chen Fang,
Daniel N. Rockmore
Abstract:
Multi-task learning (MTL) improves prediction performance in different contexts by learning models jointly on multiple different, but related tasks. Network data, which are a priori data with a rich relational structure, provide an important context for applying MTL. In particular, the explicit relational structure implies that network data is not i.i.d. data. Network data also often comes with significant metadata (i.e., attributes) associated with each entity (node). Moreover, due to the diversity and variation in network data (e.g., multi-relational links or multi-category entities), various tasks can be performed and often a rich correlation exists between them. Learning algorithms should exploit all of these additional sources of information for better performance. In this work we take a metric-learning point of view for the MTL problem in the network context. Our approach builds on structure preserving metric learning (SPML). In particular, SPML learns a Mahalanobis distance metric for node attributes using network structure as supervision, so that the learned distance function encodes the structure and can be used to predict link patterns from attributes. SPML is described for single-task learning on a single network. Herein, we propose a multi-task version of SPML, abbreviated as MT-SPML, which is able to learn across multiple related tasks on multiple networks via shared intermediate parametrization. MT-SPML learns a specific metric for each task and a common metric for all tasks. The task correlation is carried through the common metric and the individual metrics encode task-specific information. When combined together, they are structure-preserving with respect to individual tasks. MT-SPML works on general networks and is thus suitable for a wide variety of problems. In experiments, we challenge MT-SPML on two real-world problems, where MT-SPML achieves significant improvement.
Submitted 10 November, 2014;
originally announced November 2014.
-
A unifying representation for a class of dependent random measures
Authors:
Nicholas J. Foti,
Joseph D. Futoma,
Daniel N. Rockmore,
Sinead Williamson
Abstract:
We present a general construction for dependent random measures based on thinning Poisson processes on an augmented space. The framework is not restricted to dependent versions of a specific nonparametric model, but can be applied to all models that can be represented using completely random measures. Several existing dependent random measures can be seen as specific cases of this framework. Interesting properties of the resulting measures are derived and the efficacy of the framework is demonstrated by constructing a covariate-dependent latent feature model and topic model that obtain superior predictive performance.
Submitted 20 November, 2012;
originally announced November 2012.
-
Partition Decomposition for Roll Call Data
Authors:
Greg Leibon,
Scott Pauls,
Daniel N. Rockmore,
Robert Savell
Abstract:
In this paper we bring to bear some new tools from statistical learning on the analysis of roll call data. We present a new data-driven model for roll call voting that is geometric in nature. We construct the model by adapting the "Partition Decoupling Method," an unsupervised learning technique originally developed for the analysis of families of time series, to produce a multiscale geometric description of a weighted network associated to a set of roll call votes. Central to this approach is the quantitative notion of a "motivation," a cluster-based and learned basis element that serves as a building block in the representation of roll call data. Motivations enable the formulation of a quantitative description of ideology and their data-dependent nature makes possible a quantitative analysis of the evolution of ideological factors. This approach is generally applicable to roll call data and we apply it in particular to the historical roll call voting of the U.S. House and Senate. This methodology provides a mechanism for estimating the dimension of the underlying action space. We determine that the dominant factors form a low- (one- or two-) dimensional representation with secondary factors adding higher-dimensional features. In this way our work supports and extends the findings of both Poole-Rosenthal and Heckman-Snyder concerning the dimensionality of the action space. We give a detailed analysis of several individual Senates and use the AdaBoost technique from statistical learning to determine those votes with the most powerful discriminatory value. When used as a predictive model, this geometric view significantly outperforms spatial models such as the Poole-Rosenthal DW-NOMINATE model and the Heckman-Snyder 6-factor model, both in raw accuracy and in Aggregate Proportional Reduced Error (APRE).
Submitted 13 August, 2011;
originally announced August 2011.
-
Characteristic Characteristics
Authors:
Sean Brocklebank,
Scott Pauls,
Daniel Rockmore,
Timothy C. Bates
Abstract:
While five-factor models of personality are widespread, there is still not universal agreement on this as a structural framework. Part of the reason for the lingering debate is its dependence on factor analysis. In particular, derivation or refutation of the model via other statistical means is a worthwhile project. In this paper we use the methodology of spectral clustering to articulate the structure in the dataset of responses of 20,993 subjects on a 300-item version of the IPIP NEO personality questionnaire, and we compare our results to those obtained from a factor analytic solution. We found support for five- and six-cluster solutions. The five-cluster solution was similar to a conventional five-factor solution, but the six-cluster and six-factor solutions differed significantly, and only the six-cluster solution was readily interpretable: it gave a model similar to the HEXACO model. We suggest that spectral clustering provides a robust alternative view of personality data.
Submitted 6 July, 2011;
originally announced July 2011.
-
Robustness and Contagion in the International Financial Network
Authors:
Tilman Dette,
Scott Pauls,
Daniel N. Rockmore
Abstract:
The recent financial crisis of 2008 and the 2011 indebtedness of Greece highlight the importance of understanding the structure of the global financial network. In this paper we set out to analyze and characterize this network, as captured by the IMF Coordinated Portfolio Investment Survey (CPIS), in two ways. First, through an adaptation of the "error and attack" methodology [1], we show that the network is of the "robust-yet-fragile" type, a topology found in a wide variety of evolved networks. We compare these results against four common null-models, generated only from first-order statistics of the empirical data. In addition, we suggest a fifth, log-normal model, which generates networks that seem to match the empirical one more closely. Still, this model does not account for several higher-order network statistics, which reinforces the added value of the higher-order analysis. Second, using loss-given-default dynamics [2], we model financial interdependence and potential cascading of financial distress through the network. Preliminary simulations indicate that default by a single relatively small country like Greece can be absorbed by the network, but that default in combination with defaults of other PIGS countries (Portugal, Ireland, and Spain) could lead to a massive extinction cascade in the global economy.
Submitted 7 July, 2011; v1 submitted 21 April, 2011;
originally announced April 2011.