Showing 1–6 of 6 results for author: Mattmann, C

Searching in archive cs.
  1. Many-to-English Machine Translation Tools, Data, and Pretrained Models

    Authors: Thamme Gowda, Zhao Zhang, Chris A Mattmann, Jonathan May

    Abstract: While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrat…

    Submitted 1 July, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: To-appear: ACL 2021 System Demonstrations

  2. Technology Readiness Levels for Machine Learning Systems

    Authors: Alexander Lavin, Ciarán M. Gilligan-Lee, Alessya Visnjic, Siddha Ganju, Dava Newman, Atılım Güneş Baydin, Sujoy Ganguly, Danny Lange, Amit Sharma, Stephan Zheng, Eric P. Xing, Adam Gibson, James Parr, Chris Mattmann, Yarin Gal

    Abstract: The development and deployment of machine learning (ML) systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. The lack of diligence can lead to technical debt, scope creep and misaligned objectives, model misuse and failures, and expensive consequences. Engineering systems, on the other hand, follow well-defined processes and testing standards t…

    Submitted 29 November, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

  3. arXiv:1808.03753  [pdf, other]

    cs.LG stat.ML

    MARVIN: An Open Machine Learning Corpus and Environment for Automated Machine Learning Primitive Annotation and Execution

    Authors: Chris A. Mattmann, Sujen Shah, Brian Wilson

    Abstract: In this demo paper, we introduce the DARPA D3M program for automatic machine learning (ML) and JPL's MARVIN tool that provides an environment to locate, annotate, and execute machine learning primitives for use in ML pipelines. MARVIN is a web-based application and associated back-end interface written in Python that enables composition of ML pipelines from hundreds of primitives from the world of…

    Submitted 11 August, 2018; originally announced August 2018.

  4. arXiv:1710.04312  [pdf, other]

    cs.IR cs.AI cs.CL

    Measurement Context Extraction from Text: Discovering Opportunities and Gaps in Earth Science

    Authors: Kyle Hundman, Chris A. Mattmann

    Abstract: We propose Marve, a system for extracting measurement values, units, and related words from natural language text. Marve uses conditional random fields (CRF) to identify measurement values and units, followed by a rule-based system to find related entities, descriptors and modifiers within a sentence. Sentence tokens are represented by an undirected graphical model, and rules are based on part-of-…

    Submitted 11 October, 2017; originally announced October 2017.

    Journal ref: 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Data-Driven Discovery Workshop, Halifax, Canada, August 2017

  5. arXiv:1610.06669  [pdf, other]

    cs.CV

    Scalable Pooled Time Series of Big Video Data from the Deep Web

    Authors: Chris Mattmann, Madhav Sharan

    Abstract: We contribute a scalable implementation of Ryoo et al.'s Pooled Time Series algorithm from CVPR 2015. The updated algorithm has been evaluated on a large and diverse dataset of approximately 6800 videos collected from a crawl of the deep web related to human trafficking on DARPA's MEMEX effort. We describe the properties of Pooled Time Series and the motivation for using it to relate videos collect…

    Submitted 21 October, 2016; originally announced October 2016.

    Comments: 7 pages, 5 figures

  6. Ensemble Maximum Entropy Classification and Linear Regression for Author Age Prediction

    Authors: Joey Hong, Chris Mattmann, Paul Ramirez

    Abstract: The evolution of the internet has created an abundance of unstructured data on the web, a significant part of which is textual. The task of author profiling seeks to find the demographics of people solely from their linguistic and content-based features in text. The ability to describe traits of authors clearly has applications in fields such as security and forensics, as well as marketing. Instea…

    Submitted 4 October, 2016; originally announced October 2016.

    Comments: 6 pages, 4 figures

    Journal ref: 2017 IEEE International Conference on Information Reuse and Integration (IRI)