-
Bottleneck-based Encoder-decoder ARchitecture (BEAR) for Learning Unbiased Consumer-to-Consumer Image Representations
Authors:
Pablo Rivas,
Gisela Bichler,
Tomas Cerny,
Laurie Giddens,
Stacie Petter
Abstract:
Unbiased representation learning is still an object of study under specific applications and contexts. Novel architectures are usually crafted to resolve particular problems using mixtures of fundamental pieces. This paper presents different image feature extraction mechanisms that work together with residual connections to encode perceptual image information in an autoencoder configuration. We us…
▽ More
Unbiased representation learning is still an object of study under specific applications and contexts. Novel architectures are usually crafted to resolve particular problems using mixtures of fundamental pieces. This paper presents different image feature extraction mechanisms that work together with residual connections to encode perceptual image information in an autoencoder configuration. We use image data that aims to support a larger research agenda dealing with issues regarding criminal activity in consumer-to-consumer online platforms. Preliminary results suggest that the proposed architecture can learn rich spaces using ours and other image datasets resolving important challenges that are identified.
△ Less
Submitted 9 September, 2024;
originally announced September 2024.
-
On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity
Authors:
Pablo Rivas,
Tomas Cerny,
Alejandro Rodriguez Perez,
Javier Turek,
Laurie Giddens,
Gisela Bichler,
Stacie Petter
Abstract:
Our study addresses the challenges of building datasets to understand the risks associated with organized activities and human trafficking through commercial sex advertisements. These challenges include data scarcity, rapid obsolescence, and privacy concerns. Traditional approaches, which are not automated and are difficult to reproduce, fall short in addressing these issues. We have developed a r…
▽ More
Our study addresses the challenges of building datasets to understand the risks associated with organized activities and human trafficking through commercial sex advertisements. These challenges include data scarcity, rapid obsolescence, and privacy concerns. Traditional approaches, which are not automated and are difficult to reproduce, fall short in addressing these issues. We have developed a reproducible and automated methodology to analyze five million advertisements. In the process, we identified further challenges in dataset creation within this sensitive domain. This paper presents a streamlined methodology to assist researchers in constructing effective datasets for combating organized crime, allowing them to focus on advancing detection technologies.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Reconstructing Sparse Multiplex Networks with Application to Covert Networks
Authors:
Jin-Zhu Yu,
Mincheng Wu,
Gisela Bichler,
Felipe Aros-Vera,
Jianxi Gao
Abstract:
Network structure provides critical information for understanding the dynamic behavior of networks. However, the complete structure of real-world networks is often unavailable, thus it is crucially important to develop approaches to infer a more complete structure of networks. In this paper, we integrate the configuration model for generating random networks into an Expectation-Maximization-Aggreg…
▽ More
Network structure provides critical information for understanding the dynamic behavior of networks. However, the complete structure of real-world networks is often unavailable, thus it is crucially important to develop approaches to infer a more complete structure of networks. In this paper, we integrate the configuration model for generating random networks into an Expectation-Maximization-Aggregation (EMA) framework to reconstruct the complete structure of multiplex networks. We validate the proposed EMA framework against the random model on several real-world multiplex networks, including both covert and overt ones. It is found that the EMA framework generally achieves the best predictive accuracy compared to the EM framework and the random model. As the number of layers increases, the performance improvement of EMA over EM decreases. The inferred multiplex networks can be leveraged to inform the decision-making on monitoring covert networks as well as allocating limited resources for collecting additional information to improve reconstruction accuracy. For law enforcement agencies, the inferred complete network structure can be used to develop more effective strategies for covert network interdiction.
△ Less
Submitted 10 January, 2023; v1 submitted 29 July, 2022;
originally announced August 2022.