Search | arXiv e-print repository

CLaC at SemEval-2025 Task 6: A Multi-Architecture Approach for Corporate Environmental Promise Verification

Authors: Nawar Turk, Eeham Khan, Leila Kosseim

Abstract: This paper presents our approach to the SemEval-2025 Task 6 (PromiseEval), which focuses on verifying promises in corporate ESG (Environmental, Social, and Governance) reports. We explore three model architectures to address the four subtasks of promise identification, supporting evidence assessment, clarity evaluation, and verification timing. Our first model utilizes ESG-BERT with task-specific classifier heads, while our second model enhances this architecture with linguistic features tailored for each subtask. Our third approach implements a combined subtask model with attention-based sequence pooling, transformer representations augmented with document metadata, and multi-objective learning. Experiments on the English portion of the ML-Promise dataset demonstrate progressive improvement across our models, with our combined subtask approach achieving a leaderboard score of 0.5268, outperforming the provided baseline of 0.5227. Our work highlights the effectiveness of linguistic feature extraction, attention pooling, and multi-objective learning in promise verification tasks, despite challenges posed by class imbalance and limited training data.

Submitted 29 May, 2025; originally announced May 2025.

Comments: Accepted to SemEval-2025 Task 6 (ACL 2025)

arXiv:2504.05521 [pdf, other]

Deep Reinforcement Learning Algorithms for Option Hedging

Authors: Andrei Neagu, Frédéric Godin, Leila Kosseim

Abstract: Dynamic hedging is a financial strategy that consists in periodically transacting one or multiple financial assets to offset the risk associated with a correlated liability. Deep Reinforcement Learning (DRL) algorithms have been used to find optimal solutions to dynamic hedging problems by framing them as sequential decision-making problems. However, most previous work assesses the performance of only one or two DRL algorithms, making an objective comparison across algorithms difficult. In this paper, we compare the performance of eight DRL algorithms in the context of dynamic hedging; Monte Carlo Policy Gradient (MCPG), Proximal Policy Optimization (PPO), along with four variants of Deep Q-Learning (DQL) and two variants of Deep Deterministic Policy Gradient (DDPG). Two of these variants represent a novel application to the task of dynamic hedging. In our experiments, we use the Black-Scholes delta hedge as a baseline and simulate the dataset using a GJR-GARCH(1,1) model. Results show that MCPG, followed by PPO, obtain the best performance in terms of the root semi-quadratic penalty. Moreover, MCPG is the only algorithm to outperform the Black-Scholes delta hedge baseline with the allotted computational budget, possibly due to the sparsity of rewards in our environment.

Submitted 16 April, 2025; v1 submitted 7 April, 2025; originally announced April 2025.
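
The abstract above names the Black-Scholes delta hedge as its baseline. As a rough illustration only, the sketch below computes the textbook Black-Scholes delta of a European call, which is the number of shares a delta hedger would hold at each rebalancing date. The option type, the function names, and the parameter values are assumptions made for this example and are not taken from the paper.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_delta(spot: float, strike: float, rate: float,
                  vol: float, tau: float) -> float:
    """Black-Scholes delta of a European call (assumed option type).

    spot   : current price of the underlying
    strike : strike price
    rate   : continuously compounded risk-free rate
    vol    : annualized volatility
    tau    : time to maturity in years
    """
    if tau <= 0.0:
        # At expiry the delta degenerates to an indicator of moneyness.
        return 1.0 if spot > strike else 0.0
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * tau) / (
        vol * math.sqrt(tau)
    )
    return norm_cdf(d1)


if __name__ == "__main__":
    # Hypothetical at-the-money option: hold roughly 0.54 shares per call sold.
    print(round(bs_call_delta(100.0, 100.0, 0.0, 0.2, 1.0), 4))
```

A learned hedging policy would replace `bs_call_delta` with the output of a trained agent; the baseline simply applies this closed-form position at every step.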

arXiv:2408.08971 [pdf, ps, other]

A Multi-Task and Multi-Label Classification Model for Implicit Discourse Relation Recognition

Authors: Nelson Filipe Costa, Leila Kosseim

Abstract: We propose a novel multi-label classification approach to implicit discourse relation recognition (IDRR). Our approach features a multi-task model that jointly learns multi-label representations of implicit discourse relations across all three sense levels in the PDTB 3.0 framework. The model can also be adapted to the traditional single-label IDRR setting by selecting the sense with the highest probability in the multi-label representation. We conduct extensive experiments to identify optimal model configurations and loss functions in both settings. Our approach establishes the first benchmark for multi-label IDRR and achieves SOTA results on single-label IDRR using DiscoGeM. Finally, we evaluate our model on the PDTB 3.0 corpus in the single-label setting, presenting the first analysis of transfer learning between the DiscoGeM and PDTB 3.0 corpora for IDRR.

Submitted 8 July, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

Comments: Accepted at SIGDIAL 2025
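
The abstract above describes adapting a multi-label output to the single-label setting by selecting the sense with the highest probability. A minimal sketch of that reduction follows. The four labels are the PDTB 3.0 level-1 senses, but the probabilities, the function names, and the 0.5 threshold used for the multi-label reading are illustrative assumptions, not values from the paper.

```python
def to_single_label(probs: dict[str, float]) -> str:
    """Return the sense with the highest predicted probability."""
    return max(probs, key=probs.get)


def to_multi_label(probs: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every sense whose probability reaches the (assumed) threshold."""
    return sorted(label for label, p in probs.items() if p >= threshold)


if __name__ == "__main__":
    # Hypothetical per-sense probabilities for one implicit relation.
    predicted = {
        "Temporal": 0.10,
        "Contingency": 0.60,
        "Comparison": 0.55,
        "Expansion": 0.20,
    }
    print(to_multi_label(predicted))   # both senses above the threshold
    print(to_single_label(predicted))  # only the most probable sense
```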

arXiv:2407.01784 [pdf, other]

Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment

Authors: Kota Shamanth Ramanath Nayak, Leila Kosseim

Abstract: This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. Our model, developed as a part of the recent SemEval task, is based on fine-tuning individual language models (BERT, XLM-RoBERTa, and mBERT) and leveraging a mean-based ensemble model in addition to dataset augmentation through paraphrase generation from ChatGPT. The scope of the study encompasses enhancing model performance through innovative training techniques and data augmentation strategies. The problem addressed is the effective identification and classification of multiple persuasive techniques in meme texts, a task complicated by the diversity and complexity of such content. The objective of the paper is to improve detection accuracy by refining model training methods and examining the impact of balanced versus unbalanced training datasets. Novelty in the results and discussion lies in the finding that training with paraphrases enhances model performance, yet a balanced training set proves more advantageous than a larger unbalanced one. Additionally, the analysis reveals the potential pitfalls of indiscriminate incorporation of paraphrases from diverse distributions, which can introduce substantial noise. Results with the SemEval 2024 data confirm these insights, demonstrating improved model efficacy with the proposed methods.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 15 pages, 8 figures, 1 table, Proceedings of 5th International Conference on Natural Language Processing and Applications (NLPA 2024)

Journal ref: Computer Science & Information Technology (CS & IT), ISSN : 2231 - 5403, Volume 14, Number 11, June 2024
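
The abstract above mentions a mean-based ensemble over three fine-tuned language models. One plausible reading, sketched below, averages the per-label probabilities produced by each model and then thresholds the averages. The probabilities, the function names, and the 0.5 threshold are hypothetical; the paper may combine its models differently.

```python
def mean_ensemble(model_probs: list[list[float]]) -> list[float]:
    """Average per-label probabilities across models.

    model_probs[i][j] is the probability model i assigns to label j.
    """
    n_models = len(model_probs)
    n_labels = len(model_probs[0])
    return [
        sum(probs[j] for probs in model_probs) / n_models
        for j in range(n_labels)
    ]


def predict_labels(avg_probs: list[float], threshold: float = 0.5) -> list[int]:
    """Indices of labels whose averaged probability reaches the threshold."""
    return [j for j, p in enumerate(avg_probs) if p >= threshold]


if __name__ == "__main__":
    # Hypothetical outputs of three models over two persuasion-technique labels.
    outputs = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.6]]
    averaged = mean_ensemble(outputs)
    print(predict_labels(averaged))
```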

arXiv:2402.13326 [pdf, other]

Deep Hedging with Market Impact

Authors: Andrei Neagu, Frédéric Godin, Clarence Simard, Leila Kosseim

Abstract: Dynamic hedging is the practice of periodically transacting financial instruments to offset the risk caused by an investment or a liability. Dynamic hedging optimization can be framed as a sequential decision problem; thus, Reinforcement Learning (RL) models were recently proposed to tackle this task. However, existing RL works for hedging do not consider market impact caused by the finite liquidity of traded instruments. Integrating such feature can be crucial to achieve optimal performance when hedging options on stocks with limited liquidity. In this paper, we propose a novel general market impact dynamic hedging model based on Deep Reinforcement Learning (DRL) that considers several realistic features such as convex market impacts, and impact persistence through time. The optimal policy obtained from the DRL model is analysed using several option hedging simulations and compared to commonly used procedures such as delta hedging. Results show our DRL model behaves better in contexts of low liquidity by, among others: 1) learning the extent to which portfolio rebalancing actions should be dampened or delayed to avoid high costs, 2) factoring in the impact of features not considered by conventional approaches, such as previous hedging errors through the portfolio value, and the underlying asset's drift (i.e. the magnitude of its expected return).

Submitted 22 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: 13 pages, 5 figures

arXiv:1904.11018 [pdf, other]

Toponym Identification in Epidemiology Articles - A Deep Learning Approach

Authors: MohammadReza Davari, Leila Kosseim, Tien D. Bui

Abstract: When analyzing the spread of viruses, epidemiologists often need to identify the location of infected hosts. This information can be found in public databases, such as GenBank; however, the information provided in these databases is usually limited to the country or state level. More fine-grained localization information requires phylogeographers to manually read relevant scientific articles. In this work we propose an approach to automate the process of place name identification from medical (epidemiology) articles. The focus of this paper is to propose a deep learning based model for toponym detection and experiment with the use of external linguistic features and domain specific information. The model was evaluated using a collection of 105 epidemiology articles from PubMed Central provided by the recent SemEval task 12. Our best detection model achieves an F1 score of 80.13%, a significant improvement compared to the state of the art of 69.84%. These results underline the importance of domain specific embedding as well as specific linguistic features in toponym detection in medical journals.

Submitted 27 April, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

Comments: 12 pages. pre-print from Proceedings of CICLing 2019: 20th International Conference on Computational Linguistics and Intelligent Text Processing

arXiv:1709.02843 [pdf, ps, other]

CLaC at SemEval-2016 Task 11: Exploring linguistic and psycho-linguistic Features for Complex Word Identification

Authors: Elnaz Davoodi, Leila Kosseim

Abstract: This paper describes the system deployed by the CLaC-EDLK team to the "SemEval 2016, Complex Word Identification task". The goal of the task is to identify if a given word in a given context is "simple" or "complex". Our system relies on linguistic features and cognitive complexity. We used several supervised models, however the Random Forest model outperformed the others. Overall our best configuration achieved a G-score of 68.8% in the task, ranking our system 21 out of 45.

Submitted 8 September, 2017; originally announced September 2017.

Comments: In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2016), a workshop of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-2016) pp 982-985. June 16-17, San Diego, California
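
The abstract above reports a G-score without defining it. To our understanding, the SemEval-2016 Complex Word Identification task defined the G-score as the harmonic mean of accuracy and recall; the sketch below assumes that definition, and the input values are illustrative rather than the system's actual accuracy and recall.

```python
def g_score(accuracy: float, recall: float) -> float:
    """Harmonic mean of accuracy and recall (assumed G-score definition)."""
    if accuracy + recall == 0.0:
        return 0.0
    return 2.0 * accuracy * recall / (accuracy + recall)


if __name__ == "__main__":
    # Hypothetical values: a high accuracy cannot compensate for a low recall.
    print(round(g_score(0.9, 0.6), 2))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system that labels almost every word "simple" scores poorly on this metric even when its accuracy is high.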

arXiv:1708.05857 [pdf, other]

The CLaC Discourse Parser at CoNLL-2015

Authors: Majid Laali, Elnaz Davoodi, Leila Kosseim

Abstract: This paper describes our submission (kosseim15) to the CoNLL-2015 shared task on shallow discourse parsing. We used the UIMA framework to develop our parser and used ClearTK to add machine learning functionality to the UIMA framework. Overall, our parser achieves a result of 17.3 F1 on the identification of discourse relations on the blind CoNLL-2015 test set, ranking in sixth place.

Submitted 19 August, 2017; originally announced August 2017.

Comments: Proceedings of the Nineteenth Conference on Computational Natural Language Learning Shared Task (CoNLL 2015). Beijing, China

arXiv:1708.05803 [pdf, ps, other]

Measuring the Effect of Discourse Relations on Blog Summarization

Authors: Shamima Mithun, Leila Kosseim

Abstract: The work presented in this paper attempts to evaluate and quantify the use of discourse relations in the context of blog summarization and compare their use to more traditional and factual texts. Specifically, we measured the usefulness of 6 discourse relations - namely comparison, contingency, illustration, attribution, topic-opinion, and attributive for the task of text summarization from blogs. We have evaluated the effect of each relation using the TAC 2008 opinion summarization dataset and compared them with the results with the DUC 2007 dataset. The results show that in both textual genres, contingency, comparison, and illustration relations provide a significant improvement on summarization content; while attribution, topic-opinion, and attributive relations do not provide a consistent and significant improvement. These results indicate that, at least for summarization, discourse relations are just as useful for informal and affective texts as for more traditional news articles.

Submitted 19 August, 2017; originally announced August 2017.

Comments: In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP 2013), pages 1401-1409, October 2013, Nagoya, Japan

arXiv:1708.05801 [pdf, other]

ClaC: Semantic Relatedness of Words and Phrases

Authors: Reda Siblini, Leila Kosseim

Abstract: The measurement of phrasal semantic relatedness is an important metric for many natural language processing applications. In this paper, we present three approaches for measuring phrasal semantics, one based on a semantic network model, another on a distributional similarity model, and a hybrid between the two. Our hybrid approach achieved an F-measure of 77.4% on the task of evaluating the semantic similarity of words and compositional phrases.

Submitted 18 August, 2017; originally announced August 2017.

Comments: In Proceedings of the Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013),June, Atlanta, Georgia, USA, pp. 108-113

arXiv:1708.05800 [pdf, ps, other]

On the Contribution of Discourse Structure on Text Complexity Assessment

Authors: Elnaz Davoodi, Leila Kosseim

Abstract: This paper investigates the influence of discourse features on text complexity assessment. To do so, we created two data sets based on the Penn Discourse Treebank and the Simple English Wikipedia corpora and compared the influence of coherence, cohesion, surface, lexical and syntactic features to assess text complexity. Results show that with both data sets coherence features are more correlated to text complexity than the other types of features. In addition, feature selection revealed that with both data sets the top most discriminating feature is a coherence feature.

Submitted 18 August, 2017; originally announced August 2017.

Comments: In Proceedings of the 17th Annual SigDial Meeting on Discourse and Dialogue (SigDial 2016). pp 166-174. September 13-15. Los Angeles, USA

arXiv:1708.05798 [pdf, ps, other]

The CLaC Discourse Parser at CoNLL-2016

Authors: Majid Laali, Andre Cianflone, Leila Kosseim

Abstract: This paper describes our submission "CLaC" to the CoNLL-2016 shared task on shallow discourse parsing. We used two complementary approaches for the task. A standard machine learning approach for the parsing of explicit relations, and a deep learning approach for non-explicit relations. Overall, our parser achieves an F1-score of 0.2106 on the identification of discourse relations (0.3110 for explicit relations and 0.1219 for non-explicit relations) on the blind CoNLL-2016 test set.

Submitted 18 August, 2017; originally announced August 2017.

Comments: In Proceedings of the Twentieth Conference on Computational Natural Language Learning: Shared Task. pp 92-99. July 7-12, 2016. Berlin, Germany

arXiv:1708.05797 [pdf, other]

CLaC @ QATS: Quality Assessment for Text Simplification

Authors: Elnaz Davoodi, Leila Kosseim

Abstract: This paper describes our approach to the 2016 QATS quality assessment shared task. We trained three independent Random Forest classifiers in order to assess the quality of the simplified texts in terms of grammaticality, meaning preservation and simplicity. We used the language model of Google-Ngram as feature to predict the grammaticality. Meaning preservation is predicted using two complementary approaches based on word embedding and WordNet synonyms. A wider range of features including TF-IDF, sentence length and frequency of cue phrases are used to evaluate the simplicity aspect. Overall, the accuracy of the system ranges from 33.33% for the overall aspect to 58.73% for grammaticality.

Submitted 18 August, 2017; originally announced August 2017.

Comments: In Proceedings of the Workshop Shared task on Quality Assessment for Text Simplification (QATS-2016), a workshop of the 10th Language Resources and Evaluation Conference (LREC-2016), pp. 53-56, May 23-28, Portoroz, Slovenia

arXiv:1708.03541 [pdf, ps, other]

Automatic Identification of AltLexes using Monolingual Parallel Corpora

Authors: Elnaz Davoodi, Leila Kosseim

Abstract: The automatic identification of discourse relations is still a challenging task in natural language processing. Discourse connectives, such as "since" or "but", are the most informative cues to identify explicit relations; however discourse parsers typically use a closed inventory of such connectives. As a result, discourse relations signaled by markers outside these inventories (i.e. AltLexes) are not detected as effectively. In this paper, we propose a novel method to leverage parallel corpora in text simplification and lexical resources to automatically identify alternative lexicalizations that signal discourse relation. When applied to the Simple Wikipedia and Newsela corpora along with WordNet and the PPDB, the method allowed the automatic discovery of 91 AltLexes.

Submitted 11 August, 2017; originally announced August 2017.

Comments: 6 pages, Proceedings of Recent Advances in Natural Language Processing (RANLP 2017)

arXiv:1708.03425 [pdf, other]

Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks

Authors: Sohail Hooda, Leila Kosseim

Abstract: Argument labeling of explicit discourse relations is a challenging task. The state of the art systems achieve slightly above 55% F-measure but require hand-crafted features. In this paper, we propose a Long Short Term Memory (LSTM) based model for argument labeling. We experimented with multiple configurations of our model. Using the PDTB dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state of the art RNN approach, but significantly lower than the feature based state of the art systems. On the other hand, because our approach learns only from the raw dataset, it is more widely applicable to multiple textual genres and languages.

Submitted 7 September, 2017; v1 submitted 10 August, 2017; originally announced August 2017.

Comments: Proceedings of Recent Advances in Natural Language Processing (RANLP 2017), pp. 309-315, 4-6 September, Varna, Bulgaria

arXiv:1708.03421 [pdf, other]

N-gram and Neural Language Models for Discriminating Similar Languages

Authors: Andre Cianflone, Leila Kosseim

Abstract: This paper describes our submission (named clac) to the 2016 Discriminating Similar Languages (DSL) shared task. We participated in the closed Sub-task 1 (Set A) with two separate machine learning techniques. The first approach is a character based Convolution Neural Network with a bidirectional long short term memory (BiLSTM) layer (CLSTM), which achieved an accuracy of 78.45% with minimal tuning. The second approach is a character-based n-gram model. This last approach achieved an accuracy of 88.45% which is close to the accuracy of 89.38% achieved by the best submission, and allowed us to rank #7 overall.

Submitted 10 August, 2017; originally announced August 2017.

Comments: 8 pages

Journal ref: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3). A workshop of the 26th International Conference on Computational Linguistics (COLING 2016, Osaka, Japan), pp 243-250 (2016)

arXiv:1707.06357 [pdf, ps, other]

Improving Discourse Relation Projection to Build Discourse Annotated Corpora

Authors: Majid Laali, Leila Kosseim

Abstract: The naive approach to annotation projection is not effective to project discourse annotations from one language to another because implicit discourse relations are often changed to explicit ones and vice-versa in the translation. In this paper, we propose a novel approach based on the intersection between statistical word-alignment models to identify unsupported discourse annotations. This approach identified 65% of the unsupported annotations in the English-French parallel sentences from Europarl. By filtering out these unsupported annotations, we induced the first PDTB-style discourse annotated corpus for French from Europarl. We then used this corpus to train a classifier to identify the discourse-usage of French discourse connectives and show a 15% improvement of F1-score compared to the classifier trained on the non-filtered annotations.

Submitted 19 July, 2017; originally announced July 2017.

arXiv:1706.09856 [pdf, ps, other]

Automatic Mapping of French Discourse Connectives to PDTB Discourse Relations

Authors: Majid Laali, Leila Kosseim

Abstract: In this paper, we present an approach to exploit phrase tables generated by statistical machine translation in order to map French discourse connectives to discourse relations. Using this approach, we created ConcoLeDisCo, a lexicon of French discourse connectives and their PDTB relations. When evaluated against LEXCONN, ConcoLeDisCo achieves a recall of 0.81 and an Average Precision of 0.68 for the Concession and Condition relations.

Submitted 29 June, 2017; originally announced June 2017.

arXiv:1704.05162 [pdf, other]

Automatic Disambiguation of French Discourse Connectives

Authors: Majid Laali, Leila Kosseim

Abstract: Discourse connectives (e.g. however, because) are terms that can explicitly convey a discourse relation within a text. While discourse connectives have been shown to be an effective clue to automatically identify discourse relations, they are not always used to convey such relations, thus they should first be disambiguated between discourse-usage and non-discourse-usage. In this paper, we investigate the applicability of features proposed for the disambiguation of English discourse connectives for French. Our results with the French Discourse Treebank (FDTB) show that syntactic and lexical features developed for English texts are as effective for French and allow the disambiguation of French discourse connectives with an accuracy of 94.2%.

Submitted 17 April, 2017; originally announced April 2017.

Journal ref: International Journal of Computational Linguistics and Applications, vol. 7, no. 1, 2016, pp. 11-30

Showing 1–19 of 19 results for author: Kosseim, L