Word differences in news media of lower and higher peace countries revealed by natural language processing and machine learning
Authors:
Larry S. Liebovitch,
William Powers,
Lin Shi,
Allegra Chen-Carrel,
Philippe Loustaunau,
Peter T. Coleman
Abstract:
Language is both a cause and a consequence of the social processes that lead to conflict or peace. Hate speech can mobilize violence and destruction. What are the characteristics of peace speech that reflect and support the social processes that maintain peace? This study used existing peace indices, machine learning, and online news media sources to identify the words most associated with lower-peace versus higher-peace countries. As each peace index measures different social properties, there is little consensus on the numerical values of these indices. There is, however, greater consensus among these indices for countries at the lower-peace and higher-peace extremes. Therefore, a data-driven approach was used to find the words most important in distinguishing lower-peace from higher-peace countries. Rather than assuming a theoretical framework that predicts which words are more likely in lower-peace and higher-peace countries and then searching for those words in news media, this study used natural language processing and machine learning to identify the words that most accurately classified a country as lower-peace or higher-peace. Once the machine learning model was trained on the word frequencies from the extreme lower-peace and higher-peace countries, that model was also used to compute a quantitative peace index for these and other intermediate-peace countries. The model successfully yielded a quantitative peace index for intermediate-peace countries that fell between those of the lower-peace and higher-peace countries, even though the intermediate-peace countries were not in the training set. This study demonstrates how natural language processing and machine learning can help to generate new quantitative measures of social systems; in this study, linguistic differences yielded a quantitative index of peace for countries at different levels of peacefulness.
Submitted 21 May, 2023;
originally announced May 2023.
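The workflow described in the abstract above (train a classifier on word frequencies from the extreme lower-peace and higher-peace countries, then reuse its output as a continuous index for countries outside the training set) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not name the model, so a small logistic regression written in plain Python stands in for it, and the vocabulary tokens `alpha` and `beta`, the toy texts, and the helper names `word_frequencies`, `train`, and `peace_index` are all hypothetical and carry no findings from the study.

```python
import math
from collections import Counter

# Hypothetical placeholder tokens; these are NOT words reported by the study.
VOCAB = ["alpha", "beta"]

def word_frequencies(text, vocab=VOCAB):
    """Relative frequency of each vocabulary word among all tokens in a text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=1.0, epochs=2000):
    """Fit a logistic regression by stochastic gradient descent (assumed stand-in model)."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def peace_index(text, weights, bias):
    """Model output in (0, 1): closer to 0 is lower-peace, closer to 1 is higher-peace."""
    x = word_frequencies(text)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Toy training set: only the two extremes are labelled (0 = lower, 1 = higher).
lower_texts = ["alpha alpha alpha beta", "alpha alpha beta alpha alpha"]
higher_texts = ["beta beta beta alpha", "beta beta alpha beta beta"]
X = [word_frequencies(t) for t in lower_texts + higher_texts]
y = [0, 0, 1, 1]
weights, bias = train(X, y)

# A text that was never in the training set, with a mixed word profile.
intermediate_text = "alpha beta alpha beta"
for name, text in [("lower", lower_texts[0]),
                   ("intermediate", intermediate_text),
                   ("higher", higher_texts[0])]:
    print(name, round(peace_index(text, weights, bias), 3))
```

Because the index is the classifier's continuous output rather than its hard label, a text with an intermediate word profile receives an intermediate score even though it was never labelled, which mirrors the behaviour the abstract reports for intermediate-peace countries.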
RheFrameDetect: A Text Classification System for Automatic Detection of Rhetorical Frames in AI from Open Sources
Authors:
Saurav Ghosh,
Philippe Loustaunau
Abstract:
Rhetorical Frames in AI can be thought of as expressions that describe AI development as a competition between two or more actors, such as governments or companies. Examples of such Frames include robotic arms race, AI rivalry, technological supremacy, cyberwarfare dominance and 5G race. Detection of Rhetorical Frames from open sources can help us track the attitudes of governments or companies towards AI, specifically whether attitudes are becoming more cooperative or competitive over time. Given the rapidly increasing volumes of open sources (online news media, Twitter, blogs), it is difficult for subject matter experts to identify Rhetorical Frames in (near) real-time. Moreover, these sources are generally unstructured (noisy), so detecting Frames in them requires state-of-the-art text classification techniques. In this paper, we develop RheFrameDetect, a text classification system for (near) real-time capture of Rhetorical Frames from open sources. Given an input document, RheFrameDetect employs text classification techniques at multiple levels (document level and paragraph level) to identify all occurrences of Frames used in the discussion of AI. We performed an extensive evaluation of the text classification techniques used in RheFrameDetect against human-annotated Frames from multiple news sources. To further demonstrate the effectiveness of RheFrameDetect, we show multiple case studies comparing the Frames identified by RheFrameDetect against human-annotated Frames.
Submitted 30 December, 2021;
originally announced December 2021.
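The two-level design described in the abstract above (a document-level decision followed by paragraph-level detection) can be sketched roughly as below. This is not RheFrameDetect's implementation: the abstract does not specify its classifiers, so a simple phrase match on the example Frames quoted in the abstract is used as a clearly labelled stand-in, and the names `FRAME_PHRASES`, `keyword_classifier`, and `detect_frames` are hypothetical.

```python
from typing import Callable, List

# Example Frames quoted in the abstract. Matching on them is only a stand-in
# for the trained text classifiers, which the abstract does not describe.
FRAME_PHRASES = [
    "robotic arms race",
    "ai rivalry",
    "technological supremacy",
    "cyberwarfare dominance",
    "5g race",
]

def keyword_classifier(text: str) -> bool:
    """Stand-in classifier: True if any example Frame phrase occurs in the text."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in FRAME_PHRASES)

def detect_frames(
    document: str,
    doc_clf: Callable[[str], bool] = keyword_classifier,
    para_clf: Callable[[str], bool] = keyword_classifier,
) -> List[str]:
    """Return the paragraphs flagged as containing a Frame.

    Level 1 screens the whole document; level 2 runs only on documents
    that pass, and checks each blank-line-separated paragraph.
    """
    if not doc_clf(document):
        return []
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    return [p for p in paragraphs if para_clf(p)]

doc = (
    "Officials warned of an AI rivalry between the two states.\n\n"
    "The weather was mild this week."
)
print(detect_frames(doc))
```

Passing the classifiers in as callables keeps the two levels independent, so either stand-in could be swapped for a trained model without changing the pipeline.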