-
Large Language Models as 'Hidden Persuaders': Fake Product Reviews are Indistinguishable to Humans and Machines
Authors:
Weiyao Meng,
John Harvey,
James Goulding,
Chris James Carter,
Evgeniya Lukinova,
Andrew Smith,
Paul Frobisher,
Mina Forrest,
Georgiana Nica-Avram
Abstract:
Reading and evaluating product reviews is central to how most people decide what to buy and consume online. However, the recent emergence of Large Language Models and Generative Artificial Intelligence now means writing fraudulent or fake reviews is potentially easier than ever. Through three studies we demonstrate that (1) humans are no longer able to distinguish between real and fake product reviews generated by machines, averaging only 50.8% accuracy overall, essentially what would be expected by chance alone; (2) LLMs are likewise unable to distinguish between fake and real reviews and perform as badly as, or even worse than, humans; and (3) humans and LLMs pursue different strategies for evaluating authenticity, which lead to equally poor accuracy but different precision, recall and F1 scores, indicating that they perform worse at different aspects of judgment. The results reveal that review systems everywhere are now susceptible to mechanised fraud if they do not depend on trustworthy purchase verification to guarantee the authenticity of reviewers. Furthermore, the results provide insight into the consumer psychology of how humans judge authenticity, demonstrating there is an inherent 'scepticism bias' towards positive reviews and a special vulnerability to misjudging the authenticity of fake negative reviews. Additionally, the results provide a first insight into the 'machine psychology' of judging fake reviews, revealing that the strategies LLMs take to evaluate authenticity differ radically from those of humans, in ways that are equally wrong in terms of accuracy but different in their misjudgments.
Submitted 16 June, 2025;
originally announced June 2025.
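The abstract's point that two judges can share the same accuracy yet differ in precision, recall and F1 can be illustrated with a small sketch. The labels and the two judges below are invented toy data (not the paper's results), and the helper name `scores` is hypothetical; here 1 means "fake" and 0 means "real".

```python
def scores(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = fake)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy ground truth: 12 fake reviews followed by 8 real ones (made-up data).
y_true = [1] * 12 + [0] * 8

# Hypothetical judge A rarely calls a review fake.
judge_a = [1] * 3 + [0] * 9 + [1] * 1 + [0] * 7
# Hypothetical judge B calls most reviews fake.
judge_b = [1] * 9 + [0] * 3 + [1] * 7 + [0] * 1

scores_a = scores(y_true, judge_a)
scores_b = scores(y_true, judge_b)
print("judge A:", scores_a)
print("judge B:", scores_b)
```

Both toy judges are right on exactly half of the reviews, but judge A misses most fakes (low recall) while judge B flags many real reviews as fake (lower precision), which is the kind of difference that accuracy alone hides.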
-
Urban mapping in Dar es Salaam using AJIVE
Authors:
Rachel J. Carrington,
Ian L. Dryden,
Madeleine Ellis,
James O. Goulding,
Simon P. Preston,
David J. Sirl
Abstract:
Mapping deprivation in urban areas is important, for example for identifying areas of greatest need and planning interventions. Traditional ways of obtaining deprivation estimates are based on either census or household survey data, which in many areas is unavailable or difficult to collect. However, there has been a huge rise in the amount of new, non-traditional forms of data, such as satellite imagery and cell-phone call-record data, which may contain information useful for identifying deprivation. We use Angle-Based Joint and Individual Variation Explained (AJIVE) to jointly model satellite imagery data, cell-phone data, and survey data for the city of Dar es Salaam, Tanzania. We first identify interpretable low-dimensional structure from the imagery and cell-phone data, and find that we can use these to identify deprivation. We then consider what is gained from further incorporating the more traditional and costly survey data. We also introduce a scalar measure of deprivation as a response variable to be predicted, and consider various approaches to multiview regression, including using AJIVE scores as predictors.
Submitted 12 June, 2025; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Exploring the Unexplored: Understanding the Impact of Layer Adjustments on Image Classification
Authors:
Haixia Liu,
Tim Brailsford,
James Goulding,
Gavin Smith,
Larry Bull
Abstract:
This paper investigates how adjustments to deep learning architectures impact model performance in image classification. Small-scale experiments generate initial insights although the trends observed are not consistent with the entire dataset. Filtering operations in the image processing pipeline are crucial, with image filtering before pre-processing yielding better results. The choice and order of layers as well as filter placement significantly impact model performance. This study provides valuable insights into optimizing deep learning models, with potential avenues for future research including collaborative platforms.
Submitted 25 January, 2024;
originally announced January 2024.
-
CCC/Code 8.7: Applying AI in the Fight Against Modern Slavery
Authors:
Nadya Bliss,
Mark Briers,
Alice Eckstein,
James Goulding,
Daniel P. Lopresti,
Anjali Mazumder,
Gavin Smith
Abstract:
On any given day, tens of millions of people find themselves trapped in instances of modern slavery. The terms "human trafficking," "trafficking in persons," and "modern slavery" are sometimes used interchangeably to refer to both sex trafficking and forced labor. Human trafficking occurs when a trafficker compels someone to provide labor or services through the use of force, fraud, and/or coercion. The wide range of stakeholders in human trafficking presents major challenges. Direct stakeholders are law enforcement, NGOs and INGOs, businesses, local/planning government authorities, and survivors. Viewed from a very high level, all stakeholders share in a rich network of interactions that produce and consume enormous amounts of information. The problems of making efficient use of such information for the purposes of fighting trafficking while at the same time adhering to community standards of privacy and ethics are formidable. Even as they help us, technologies that increase surveillance of populations can also undermine basic human rights.
In early March 2020, the Computing Community Consortium (CCC), in collaboration with the Code 8.7 Initiative, brought together over fifty members of the computing research community along with anti-slavery practitioners and survivors to lay out a research roadmap. The primary goal was to explore ways in which long-range research in artificial intelligence (AI) could be applied to the fight against human trafficking. Building on the kickoff Code 8.7 conference held at the headquarters of the United Nations in February 2019, the focus for this workshop was to link the ambitious goals outlined in A 20-Year Community Roadmap for Artificial Intelligence Research in the US (AI Roadmap) to challenges vital to achieving the UN's Sustainable Development Goal Target 8.7, the elimination of modern slavery.
Submitted 24 June, 2021;
originally announced June 2021.
-
The Bayesian Spatial Bradley–Terry Model: Urban Deprivation Modeling in Tanzania
Authors:
R. G. Seymour,
D. Sirl,
S. Preston,
I. L. Dryden,
M. J. A. Ellis,
B. Perrat,
J. Goulding
Abstract:
Identifying the most deprived regions of any country or city is key if policy makers are to design successful interventions. However, locating areas with the greatest need is often surprisingly challenging in developing countries. Due to the logistical challenges of traditional household surveying, official statistics can be slow to be updated; estimates that exist can be coarse, a consequence of prohibitive costs and poor infrastructure; and mass urbanisation can render manually surveyed figures rapidly out-of-date. Comparative judgement models, such as the Bradley–Terry model, offer a promising solution. Leveraging local knowledge, elicited via comparisons of different areas' affluence, such models can both simplify logistics and circumvent biases inherent to household surveys. Yet widespread adoption remains limited, due to the large amount of data existing approaches still require. We address this via development of a novel Bayesian Spatial Bradley–Terry model, which substantially decreases the number of comparisons required for effective inference. This model integrates a network representation of the city or country, along with assumptions of spatial smoothness that allow deprivation in one area to be informed by neighbouring areas. We demonstrate the practical effectiveness of this method through a novel comparative judgement data set collected in Dar es Salaam, Tanzania.
Submitted 28 October, 2021; v1 submitted 27 October, 2020;
originally announced October 2020.
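For readers unfamiliar with comparative judgement models, the standard (non-spatial) Bradley–Terry likelihood that the abstract builds on can be sketched as follows. This is only the textbook model under illustrative assumptions: the function names, the affluence scores `lam` and the comparison data are hypothetical, and the paper's Bayesian spatial prior is not reproduced here.

```python
import math

def win_probability(lam_i, lam_j):
    """Standard Bradley-Terry probability that area i is judged
    more affluent than area j, given latent scores lam_i and lam_j."""
    return 1.0 / (1.0 + math.exp(-(lam_i - lam_j)))

def log_likelihood(lam, comparisons):
    """Log-likelihood of a list of (winner, loser) index pairs
    under latent scores lam (one score per area)."""
    return sum(math.log(win_probability(lam[w], lam[l]))
               for w, l in comparisons)

# Made-up example: three areas and four pairwise judgements,
# each recorded as (area judged more affluent, area judged less affluent).
comparisons = [(0, 1), (0, 2), (1, 2), (2, 1)]

lam_plausible = [1.0, 0.0, -1.0]   # area 0 scored as most affluent
lam_reversed = [-1.0, 0.0, 1.0]    # same scores in the opposite order

print(log_likelihood(lam_plausible, comparisons))
print(log_likelihood(lam_reversed, comparisons))
```

Scores that agree with most of the recorded judgements yield a higher log-likelihood. A spatial variant would additionally tie the scores of neighbouring areas together, which is how fewer comparisons can suffice.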
-
On Bayesian inferential tasks with infinite-state jump processes: efficient data augmentation
Authors:
Iker Perez,
Lax Chan,
Mercedes Torres Torres,
James Goulding,
Theodore Kypraios
Abstract:
Advances in sampling schemes for Markov jump processes have recently enabled multiple inferential tasks. However, in statistical and machine learning applications, we often require that these continuous-time models find support on structured and infinite state spaces. In these cases, exact sampling may only be achieved by often inefficient particle filtering procedures, and rapidly augmenting observed datasets remains a significant challenge. Here, we build on the principles of uniformization and present a tractable framework to address this problem, which greatly improves the efficiency of existing state-of-the-art methods commonly used in small finite-state systems, and further scales their use to infinite-state scenarios. We capitalize on the marginal role of variable subsets in a model hierarchy during the process jumps, and describe an algorithm that relies on measurable mappings between pairs of states and carefully designed sets of synthetic jump observations. The proposed method enables the efficient integration of slice sampling techniques and it can overcome the existing computational bottleneck. We offer evidence by means of experiments addressing inference and clustering tasks on both simulated and real data sets.
Submitted 6 June, 2018;
originally announced June 2018.
-
AMP: a new time-frequency feature extraction method for intermittent time-series data
Authors:
Duncan Barrack,
James Goulding,
Keith Hopcraft,
Simon Preston,
Gavin Smith
Abstract:
The characterisation of time-series data via their most salient features is extremely important in a range of machine learning tasks, not least with regard to classification and clustering. While there exist many feature extraction techniques suitable for non-intermittent time-series data, these approaches are not always appropriate for intermittent time-series data, where intermittency is characterized by constant values for large periods of time punctuated by sharp and transient increases or decreases in value.
Motivated by this, we present aggregation, mode decomposition and projection (AMP), a feature extraction technique particularly suited to intermittent time-series data which contain time-frequency patterns. In our method, all individual time-series within a set are combined to form a non-intermittent aggregate. This is decomposed into a set of components which represent the intrinsic time-frequency signals within the data set. Individual time-series can then be fit to these components to obtain a set of numerical features that represent their intrinsic time-frequency patterns. To demonstrate the effectiveness of AMP, we evaluate it against the real-world task of clustering intermittent time-series data. Using synthetically generated data, we show that a clustering approach which uses the features derived from AMP significantly outperforms traditional clustering methods. Our technique is further exemplified on a real-world data set, where AMP can be used to discover groupings of individuals which correspond to real-world sub-populations.
Submitted 27 July, 2015; v1 submitted 20 July, 2015;
originally announced July 2015.
-
Analytical Parameterization of Self-Consistent Polycrystal Mechanics: Fast Calculation of Upper Mantle Anisotropy
Authors:
Neil J. Goulding,
Neil M. Ribe,
Olivier Castelnau,
Andrew M. Walker,
James Wookey
Abstract:
Progressive deformation of upper mantle rocks via dislocation creep causes their constituent crystals to take on a non-random orientation distribution (crystallographic preferred orientation or CPO) whose observable signatures include shear-wave splitting and azimuthal dependence of surface wave speeds. Comparison of these signatures with mantle flow models thus allows mantle dynamics to be unraveled on global and regional scales. However, existing self-consistent models of CPO evolution are computationally expensive when used in 3-D and/or time-dependent convection models. Here we propose a new method, called ANPAR, which is based on an analytical parameterisation of the crystallographic spin predicted by the second-order (SO) self-consistent theory. Our parameterisation runs approximately 2-3 x 10^4 times faster than the SO model and fits its predictions for CPO and crystallographic spin with a variance reduction > 99%. We illustrate the ANPAR model predictions for three uniform deformations (uniaxial compression, pure shear, simple shear) and for a corner-flow model of a spreading ridge.
Submitted 10 December, 2014;
originally announced December 2014.