Showing 1–2 of 2 results for author: Lijiang, C
-
A Geometric Method to Obtain the Generation Probability of a Sentence
Authors:
Chen Lijiang
Abstract:
"How to generate a sentence" is the most critical and difficult problem in natural language processing. In this paper, we present a new approach that explains the generation process of a sentence from a mathematical perspective. Our method rests on the premise that, in the brain, a sentence is part of a word network formed by many word nodes. Experiments show that the probability of an entire sentence can be obtained from the probabilities of single words and the co-occurrence probabilities of word pairs, which indicates that humans use a synthesis method to generate sentences.
Submitted 4 June, 2014;
originally announced June 2014.
-
Coordinate System Selection for Minimum Error Rate Training in Statistical Machine Translation
Authors:
Chen Lijiang
Abstract:
Minimum error rate training (MERT) is a widely used training procedure for statistical machine translation. A general problem with this approach is that the search easily converges to a local optimum, and the resulting weight set does not accord with the real distribution of the feature functions. This paper introduces coordinate system selection (RSS) into the search algorithm for MERT. In contrast to previous approaches, in which each dimension corresponds to a single independent feature function, we create several coordinate systems by moving one of the dimensions to a new direction. The basic idea is simple but critical: the training procedure of MERT should be based on a coordinate system formed by search directions rather than directly on feature functions. Experiments show that selecting coordinate systems according to tuning-set results yields better results without any additional linguistic knowledge.
Submitted 10 May, 2014;
originally announced May 2014.