Showing 1–8 of 8 results for author: Strzyz, M

  1. arXiv:2108.07556

    cs.CL

    Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing

    Authors: Alberto Muñoz-Ortiz, Michalina Strzyz, David Vilares

    Abstract: Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words. Yet, there is little understanding about how these linearizations behave in low-resource setups. H…

    Submitted 17 August, 2021; originally announced August 2021.

    Comments: Accepted at RANLP 2021 (https://ranlp.org/ranlp2021)

  2. arXiv:2011.00596

    cs.CL

    Bracketing Encodings for 2-Planar Dependency Parsing

    Authors: Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez

    Abstract: We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels, hence providing almost total coverage of crossing arcs in sequence labeling parsing. First, we show that existing bracketing encodings for parsing as labeling can only handle a very mild extension of projective trees. Second, we overcome this limi…

    Submitted 22 March, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: COLING 2020 (long papers), 13 pages (incl. appendix), with corrected parsing speeds for Danish and Gothic

    MSC Class: 68T50 ACM Class: I.2.7

  3. arXiv:2011.00584

    cs.CL cs.FL

    A Unifying Theory of Transition-based and Sequence Labeling Parsing

    Authors: Carlos Gómez-Rodríguez, Michalina Strzyz, David Vilares

    Abstract: We define a mapping from transition-based parsing algorithms that read sentences from left to right to sequence labeling encodings of syntactic trees. This not only establishes a theoretical relation between transition-based parsing and sequence-labeling parsing, but also provides a method to obtain new encodings for fast and simple sequence labeling parsing from the many existing transition-based…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Camera-ready version (final peer-reviewed manuscript) to appear at proceedings of COLING 2020. 18 pages (incl. appendices)

    MSC Class: 68T50; 68Q45 ACM Class: F.4.3; I.2.7

  4. arXiv:2002.01685

    cs.CL cs.LG

    Parsing as Pretraining

    Authors: David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez

    Abstract: Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and does full parsing (on English) relying only on pretraining architectures -- and no decoding. We first cast cons…

    Submitted 5 February, 2020; originally announced February 2020.

    Comments: AAAI 2020 - The Thirty-Fourth AAAI Conference on Artificial Intelligence

  5. arXiv:1909.01053

    cs.CL cs.LG

    Towards Making a Dependency Parser See

    Authors: Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez

    Abstract: We explore whether it is possible to leverage eye-tracking data in an RNN dependency parser (for English) when such information is only available during training, i.e., no aggregated or token-level gaze features are used at inference time. To do so, we train a multitask learning model that parses sentences as sequence labeling and leverages gaze features as auxiliary tasks. Our method also learns…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Camera-ready version to appear at EMNLP 2019 (final peer-reviewed manuscript). 8 pages (incl. appendix)

    MSC Class: 68T50 ACM Class: I.2.7

  6. Sequence Labeling Parsing by Learning Across Representations

    Authors: Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez

    Abstract: We use parsing as sequence labeling as a common framework to learn across constituency and dependency syntactic abstractions. To do so, we cast the problem as multitask learning (MTL). First, we show that adding a parsing paradigm as an auxiliary loss consistently improves the performance on the other paradigm. Secondly, we explore an MTL sequence labeling model that parses both representations, a…

    Submitted 7 January, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: Proc. of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019). Revised version after fixing evaluation bug

  7. arXiv:1904.03417

    cs.CL

    Speeding Up Natural Language Parsing by Reusing Partial Results

    Authors: Michalina Strzyz, Carlos Gómez-Rodríguez

    Abstract: This paper proposes a novel technique that applies case-based reasoning in order to generate templates for reusable parse tree fragments, based on PoS tags of bigrams and trigrams that demonstrate low variability in their syntactic analyses from prior data. The aim of this approach is to improve the speed of dependency parsers by avoiding redundant calculations. This can be resolved by applying th…

    Submitted 6 April, 2019; originally announced April 2019.

    Comments: Accepted manuscript for CICLing 2019. 10 pages

    MSC Class: 68T50 ACM Class: I.2.7

  8. arXiv:1902.10505

    cs.CL cs.LG

    Viable Dependency Parsing as Sequence Labeling

    Authors: Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez

    Abstract: We recast dependency parsing as a sequence labeling problem, exploring several encodings of dependency trees as labels. While dependency parsing by means of sequence labeling had been attempted in existing work, results suggested that the technique was impractical. We show instead that with a conventional BiLSTM-based model it is possible to obtain fast and accurate parsers. These parsers are conc…

    Submitted 29 March, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: Camera-ready version to appear at NAACL 2019 (final peer-reviewed manuscript). 8 pages (incl. appendix)

    MSC Class: 68T50 ACM Class: I.2.7