
Showing 1–13 of 13 results for author: Sekine, S

  1. arXiv:2506.02372  [pdf, ps, other]

    cs.CL

    AnswerCarefully: A Dataset for Improving the Safety of Japanese LLM Output

    Authors: Hisami Suzuki, Satoru Katsumata, Takashi Kodama, Tetsuro Takahashi, Kouta Nakayama, Satoshi Sekine

    Abstract: In this paper we present AnswerCarefully, a dataset for promoting the safety and appropriateness of Japanese LLM outputs. The dataset consists of 1,800 pairs of questions and reference answers, where the questions require special attention in answering. It covers a wide range of risk categories established in prior English-language datasets, but the data samples are original in that they are manua…

    Submitted 2 June, 2025; originally announced June 2025.

  2. arXiv:2409.08609  [pdf, other]

    cs.LG

    Optimizing Item-based Marketing Promotion Efficiency in C2C Marketplace with Dynamic Sequential Coupon Allocation Framework

    Authors: Jie Yang, Padunna Valappil Krishnaraj Sekhar, Sho Sekine, Yilin Li

    Abstract: In e-commerce platforms, coupons play a crucial role in boosting transactions. In the customer-to-customer (C2C) marketplace, ensuring the satisfaction of both buyers and sellers is essential. While buyer-focused marketing strategies often receive more attention, addressing the needs of sellers is equally important. Additionally, the existing strategies tend to optimize each promotion independentl…

    Submitted 13 September, 2024; originally announced September 2024.

    Journal ref: ACM SIGKDD 3rd Workshop on End-to-End Customer Journey Optimization, 2024

  3. arXiv:2407.14895  [pdf, other]

    cs.IR

    Strategic Coupon Allocation for Increasing Providers' Sales Experiences in Two-sided Marketplaces

    Authors: Koya Ohashi, Sho Sekine, Deddy Jobson, Jie Yang, Naoki Nishimura, Noriyoshi Sukegawa, Yuichi Takano

    Abstract: In a two-sided marketplace, network effects are crucial for competitiveness, and platforms need to retain users through advanced customer relationship management as much as possible. Maintaining numerous providers' stable and active presence on the platform is highly important to enhance the marketplace's scale and diversity. The strongest motivation for providers to continue using the platform is…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 8 pages, 10 figures, KDD 2024 Workshop on Two-sided Marketplace Optimization: Search, Pricing, Matching & Growth

  4. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (58 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 30 December, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  5. arXiv:2402.14531  [pdf]

    cs.CL

    Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance

    Authors: Ziqi Yin, Hao Wang, Kaito Horio, Daisuke Kawahara, Satoshi Sekine

    Abstract: We investigate the impact of politeness levels in prompts on the performance of large language models (LLMs). Polite language in human communications often garners more compliance and effectiveness, while rudeness can cause aversion, impacting response quality. We consider that LLMs mirror human communication traits, suggesting they align with human cultural norms. We assess the impact of politene…

    Submitted 14 October, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: SICon 2024

  6. arXiv:2305.05928  [pdf, ps, other]

    cs.CL

    WikiSQE: A Large-Scale Dataset for Sentence Quality Estimation in Wikipedia

    Authors: Kenichiro Ando, Satoshi Sekine, Mamoru Komachi

    Abstract: Wikipedia can be edited by anyone and thus contains various quality sentences. Therefore, Wikipedia includes some poor-quality edits, which are often marked up by other editors. While editors' reviews enhance the credibility of Wikipedia, it is hard to check all edited text. Assisting in this process is very important, but a large and comprehensive dataset for studying it does not currently exist.…

    Submitted 29 December, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: AAAI 2024 Main Track Accepted

  7. arXiv:2001.07558  [pdf, other]

    cs.LG stat.ML

    Classifying Wikipedia in a fine-grained hierarchy: what graphs can contribute

    Authors: Tiphaine Viard, Thomas McLachlan, Hamidreza Ghader, Satoshi Sekine

    Abstract: Wikipedia is a huge opportunity for machine learning, being the largest semi-structured base of knowledge available. Because of this, many works examine its contents, and focus on structuring it in order to make it usable in learning tasks, for example by classifying it into an ontology. Beyond its textual contents, Wikipedia also displays a typical graph structure, where pages are linked together…

    Submitted 22 January, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: 7 pages

  8. arXiv:1909.06502  [pdf, ps, other]

    cs.CL

    Multi-class Multilingual Classification of Wikipedia Articles Using Extended Named Entity Tag Set

    Authors: Hassan S. Shavarani, Satoshi Sekine

    Abstract: Wikipedia is a great source of general world knowledge which can guide NLP models better understand their motivation to make predictions. Structuring Wikipedia is the initial step towards this goal which can facilitate fine-grain classification of articles. In this work, we introduce the Shinra 5-Language Categorization Dataset (SHINRA-5LDS), a large multi-lingual and multi-labeled set of annotate…

    Submitted 5 March, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

  9. arXiv:1909.04453  [pdf, other]

    cs.CL

    Select and Attend: Towards Controllable Content Selection in Text Generation

    Authors: Xiaoyu Shen, Jun Suzuki, Kentaro Inui, Hui Su, Dietrich Klakow, Satoshi Sekine

    Abstract: Many text generation tasks naturally contain two steps: content selection and surface realization. Current neural encoder-decoder models conflate both steps into a black-box architecture. As a result, the content to be described in the text cannot be explicitly controlled. This paper tackles this problem by decoupling content selection from the decoder. The decoupled content selection is human int…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  10. arXiv:1906.06448  [pdf, other]

    cs.CL

    Can neural networks understand monotonicity reasoning?

    Authors: Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui, Satoshi Sekine, Lasha Abzianidze, Johan Bos

    Abstract: Monotonicity reasoning is one of the important reasoning skills for any intelligent natural language inference (NLI) model in that it requires the ability to capture the interaction between lexical and syntactic structures. Since no test set has been developed for monotonicity reasoning with wide coverage, it is still unclear whether neural models can perform monotonicity reasoning in a proper way…

    Submitted 27 June, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: accepted by ACL2019 BlackboxNLP (long paper)

  11. arXiv:1904.12166  [pdf, ps, other]

    cs.CL

    HELP: A Dataset for Identifying Shortcomings of Neural Models in Monotonicity Reasoning

    Authors: Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui, Satoshi Sekine, Lasha Abzianidze, Johan Bos

    Abstract: Large crowdsourced datasets are widely used for training and evaluating neural models on natural language inference (NLI). Despite these efforts, neural models have a hard time capturing logical inferences, including those licensed by phrase replacements, so-called monotonicity reasoning. Since no large dataset has been developed for monotonicity reasoning, it is still unclear whether the main obs…

    Submitted 27 April, 2019; originally announced April 2019.

    Comments: 6 pages, 1 figure, accepted as *SEM 2019

  12. arXiv:1902.10118  [pdf, other]

    cs.CL

    Multi-Task Learning with Contextualized Word Representations for Extented Named Entity Recognition

    Authors: Thai-Hoang Pham, Khai Mai, Nguyen Minh Trung, Nguyen Tuan Duc, Danushka Bolegala, Ryohei Sasano, Satoshi Sekine

    Abstract: Fine-Grained Named Entity Recognition (FG-NER) is critical for many NLP applications. While classical named entity recognition (NER) has attracted a substantial amount of research, FG-NER is still an open research domain. The current state-of-the-art (SOTA) model for FG-NER relies heavily on manual efforts for building a dictionary and designing hand-crafted features. The end-to-end framework whic…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: 7 pages, 2 figures, 4 tables

  13. arXiv:1808.09384  [pdf, ps, other]

    cs.CL cs.AI

    What Makes Reading Comprehension Questions Easier?

    Authors: Saku Sugawara, Kentaro Inui, Satoshi Sekine, Akiko Aizawa

    Abstract: A challenge in creating a dataset for machine reading comprehension (MRC) is to collect questions that require a sophisticated understanding of language to answer beyond using superficial cues. In this work, we investigate what makes questions easier across recent 12 MRC datasets with three question styles (answer extraction, description, and multiple choice). We propose to employ simple heuristic…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: 12 pages, EMNLP2018