
Showing 1–7 of 7 results for author: Pickering, M

Searching in archive cs.
  1. arXiv:2412.09587  [pdf, ps, other]

    cs.CL

    OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages

    Authors: Chester Palen-Michel, Maxwell Pickering, Maya Kruse, Jonne Sälevä, Constantine Lignos

    Abstract: We present OpenNER 1.0, a standardized collection of openly-available named entity recognition (NER) datasets. OpenNER contains 36 NER corpora that span 52 languages, human-annotated in varying named entity ontologies. We correct annotation format issues, standardize the original datasets into a uniform representation with consistent entity type names across corpora, and provide the collection in…

    Submitted 26 June, 2025; v1 submitted 12 December, 2024; originally announced December 2024.

    Comments: Under review

  2. Automating Governing Knowledge Commons and Contextual Integrity (GKC-CI) Privacy Policy Annotations with Large Language Models

    Authors: Jake Chanenson, Madison Pickering, Noah Apthorpe

    Abstract: Identifying contextual integrity (CI) and governing knowledge commons (GKC) parameters in privacy policy texts can facilitate normative privacy analysis. However, GKC-CI annotation has heretofore required manual or crowdsourced effort. This paper demonstrates that high-accuracy GKC-CI parameter annotation of privacy policies can be performed automatically using large language models. We fine-tune…

    Submitted 9 December, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 29 pages, 18 figures, 11 tables; camera-ready version

  3. arXiv:2303.08014  [pdf]

    cs.CL

    Do large language models resemble humans in language use?

    Authors: Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering

    Abstract: Large language models (LLMs) such as ChatGPT and Vicuna have shown remarkable capacities in comprehending and producing language. However, their internal workings remain a black box, and it is unclear whether LLMs and chatbots can develop humanlike characteristics in language use. Cognitive scientists have devised many experiments that probe, and have made great progress in explaining, how people…

    Submitted 25 March, 2024; v1 submitted 10 March, 2023; originally announced March 2023.

  4. arXiv:2112.03653  [pdf, ps, other]

    cs.PL

    A Specification for Typed Template Haskell

    Authors: Matthew Pickering, Andres Löh, Nicolas Wu

    Abstract: Multi-stage programming is a proven technique that provides predictable performance characteristics by controlling code generation. We propose a core semantics for Typed Template Haskell, an extension of Haskell that supports multi-stage programming that interacts well with polymorphism and qualified types. Our semantics relates a declarative source language with qualified types to a core languag…

    Submitted 7 December, 2021; originally announced December 2021.

  5. arXiv:1805.06798  [pdf, other]

    cs.PL

    Generic Deriving of Generic Traversals

    Authors: Csongor Kiss, Matthew Pickering, Nicolas Wu

    Abstract: Functional programmers have an established tradition of using traversals as a design pattern to work with recursive data structures. The technique is so prolific that a whole host of libraries have been designed to help in the task of automatically providing traversals by analysing the generic structure of data types. More recently, lenses have entered the functional scene and have proved themselv…

    Submitted 17 May, 2018; originally announced May 2018.

    Comments: 28 pages, ICFP

  6. Profunctor Optics: Modular Data Accessors

    Authors: Matthew Pickering, Jeremy Gibbons, Nicolas Wu

    Abstract: CONTEXT: Data accessors allow one to read and write components of a data structure, such as the fields of a record, the variants of a union, or the elements of a container. These data accessors are collectively known as optics; they are fundamental to programs that manipulate complex data. INQUIRY: Individual data accessors for simple data structures are easy to write, for example as pairs of "g…

    Submitted 31 March, 2017; originally announced March 2017.

    Journal ref: The Art, Science, and Engineering of Programming, 2017, Vol. 1, Issue 2, Article 7

  7. Modeling and performance evaluation of stealthy false data injection attacks on smart grid in the presence of corrupted measurements

    Authors: Adnan Anwar, Abdun Naser Mahmood, Mark Pickering

    Abstract: The false data injection (FDI) attack cannot be detected by the traditional anomaly detection techniques used in the energy system state estimators. In this paper, we demonstrate how FDI attacks can be constructed blindly, i.e., without system knowledge, including topological connectivity and line reactance information. Our analysis reveals that existing FDI attacks become detectable (consequently…

    Submitted 19 May, 2016; originally announced May 2016.

    Comments: Keywords: Smart grid, False data injection, Blind attack, Principal component analysis (PCA), Journal of Computer and System Sciences, Elsevier, 2016