-
RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification
Authors:
Terry Yi Zhong,
Cristian Tejedor-Garcia,
Martha Larson,
Bastiaan R. Bloem
Abstract:
Parkinson's Disease (PD) affects over 10 million people globally, with speech impairments often preceding motor symptoms by years, making speech a valuable modality for early, non-invasive detection. While recent deep-learning models achieve high accuracy, they typically lack the explainability required for clinical use. To address this, we propose RECA-PD, a novel, robust, and explainable cross-a…
▽ More
Parkinson's Disease (PD) affects over 10 million people globally, with speech impairments often preceding motor symptoms by years, making speech a valuable modality for early, non-invasive detection. While recent deep-learning models achieve high accuracy, they typically lack the explainability required for clinical use. To address this, we propose RECA-PD, a novel, robust, and explainable cross-attention architecture that combines interpretable speech features with self-supervised representations. RECA-PD matches state-of-the-art performance in Speech-based PD detection while providing explanations that are more consistent and more clinically meaningful. Additionally, we demonstrate that performance degradation in certain speech tasks (e.g., monologue) can be mitigated by segmenting long recordings. Our findings indicate that performance and explainability are not necessarily mutually exclusive. Future work will enhance the usability of explanations for non-experts and explore severity estimation to increase the real-world clinical relevance.
△ Less
Submitted 4 July, 2025;
originally announced July 2025.
-
Evaluating the Usefulness of Non-Diagnostic Speech Data for Developing Parkinson's Disease Classifiers
Authors:
Terry Yi Zhong,
Esther Janse,
Cristian Tejedor-Garcia,
Louis ten Bosch,
Martha Larson
Abstract:
Speech-based Parkinson's disease (PD) detection has gained attention for its automated, cost-effective, and non-intrusive nature. As research studies usually rely on data from diagnostic-oriented speech tasks, this work explores the feasibility of diagnosing PD on the basis of speech data not originally intended for diagnostic purposes, using the Turn-Taking (TT) dataset. Our findings indicate tha…
▽ More
Speech-based Parkinson's disease (PD) detection has gained attention for its automated, cost-effective, and non-intrusive nature. As research studies usually rely on data from diagnostic-oriented speech tasks, this work explores the feasibility of diagnosing PD on the basis of speech data not originally intended for diagnostic purposes, using the Turn-Taking (TT) dataset. Our findings indicate that TT can be as useful as diagnostic-oriented PD datasets like PC-GITA. We also investigate which specific dataset characteristics impact PD classification performance. The results show that concatenating audio recordings and balancing participants' gender and status distributions can be beneficial. Cross-dataset evaluation reveals that models trained on PC-GITA generalize poorly to TT, whereas models trained on TT perform better on PC-GITA. Furthermore, we provide insights into the high variability across folds, which is mainly due to large differences in individual speaker performance.
△ Less
Submitted 24 May, 2025;
originally announced May 2025.
-
Rotational Multi-material 3D Printing of Soft Robotic Matter with Asymmetrical Embedded Pneumatics
Authors:
Jackson K. Wilt,
Natalie M. Larson,
Jennifer A. Lewis
Abstract:
The rapid design and fabrication of soft robotic matter is of growing interest for shape morphing, actuation, and wearable devices. Here, we report a facile fabrication method for creating soft robotic materials with embedded pneumatics that exhibit programmable shape morphing behavior. Using rotational multi-material 3D printing, asymmetrical core-shell filaments composed of elastomeric shells an…
▽ More
The rapid design and fabrication of soft robotic matter is of growing interest for shape morphing, actuation, and wearable devices. Here, we report a facile fabrication method for creating soft robotic materials with embedded pneumatics that exhibit programmable shape morphing behavior. Using rotational multi-material 3D printing, asymmetrical core-shell filaments composed of elastomeric shells and fugitive cores are patterned in 1D and 2D motifs. By precisely controlling the nozzle design, rotation rate, and print path, one can control the local orientation, shape, and cross-sectional area of the patterned fugitive core along each printed filament. Once the elastomeric matrix is cured, the fugitive cores are removed, leaving behind embedded conduits that facilitate pneumatic actuation. Using a connected Fermat spirals pathing approach, one can automatically generate desired print paths required for more complex soft robots, such as hand-inspired grippers. Our integrated design and printing approach enables one to rapidly build soft robotic matter that exhibits myriad shape morphing transitions on demand.
△ Less
Submitted 23 May, 2025;
originally announced May 2025.
-
Rapid patient-specific neural networks for intraoperative X-ray to volume registration
Authors:
Vivek Gopalakrishnan,
Neel Dey,
David-Dimitris Chlorogiannis,
Andrew Abumoussa,
Anna M. Larson,
Darren B. Orbach,
Sarah Frisken,
Polina Golland
Abstract:
The integration of artificial intelligence in image-guided interventions holds transformative potential, promising to extract 3D geometric and quantitative information from conventional 2D imaging modalities during complex procedures. Achieving this requires the rapid and precise alignment of 2D intraoperative images (e.g., X-ray) with 3D preoperative volumes (e.g., CT, MRI). However, current 2D/3…
▽ More
The integration of artificial intelligence in image-guided interventions holds transformative potential, promising to extract 3D geometric and quantitative information from conventional 2D imaging modalities during complex procedures. Achieving this requires the rapid and precise alignment of 2D intraoperative images (e.g., X-ray) with 3D preoperative volumes (e.g., CT, MRI). However, current 2D/3D registration methods fail across the broad spectrum of procedures dependent on X-ray guidance: traditional optimization techniques require custom parameter tuning for each subject, whereas neural networks trained on small datasets do not generalize to new patients or require labor-intensive manual annotations, increasing clinical burden and precluding application to new anatomical targets. To address these challenges, we present xvr, a fully automated framework for training patient-specific neural networks for 2D/3D registration. xvr uses physics-based simulation to generate abundant high-quality training data from a patient's own preoperative volumetric imaging, thereby overcoming the inherently limited ability of supervised models to generalize to new patients and procedures. Furthermore, xvr requires only 5 minutes of training per patient, making it suitable for emergency interventions as well as planned procedures. We perform the largest evaluation of a 2D/3D registration algorithm on real X-ray data to date and find that xvr robustly generalizes across a diverse dataset comprising multiple anatomical structures, imaging modalities, and hospitals. Across surgical tasks, xvr achieves submillimeter-accurate registration at intraoperative speeds, improving upon existing methods by an order of magnitude. xvr is released as open-source software freely available at https://github.com/eigenvivek/xvr.
△ Less
Submitted 20 March, 2025;
originally announced March 2025.
-
On Fact and Frequency: LLM Responses to Misinformation Expressed with Uncertainty
Authors:
Yana van de Sande,
Gunes Açar,
Thabo van Woudenberg,
Martha Larson
Abstract:
We study LLM judgments of misinformation expressed with uncertainty. Our experiments study the response of three widely used LLMs (GPT-4o, LlaMA3, DeepSeek-v2) to misinformation propositions that have been verified false and then are transformed into uncertain statements according to an uncertainty typology. Our results show that after transformation, LLMs change their factchecking classification…
▽ More
We study LLM judgments of misinformation expressed with uncertainty. Our experiments study the response of three widely used LLMs (GPT-4o, LlaMA3, DeepSeek-v2) to misinformation propositions that have been verified false and then are transformed into uncertain statements according to an uncertainty typology. Our results show that after transformation, LLMs change their factchecking classification from false to not-false in 25% of the cases. Analysis reveals that the change cannot be explained by predictors to which humans are expected to be sensitive, i.e., modality, linguistic cues, or argumentation strategy. The exception is doxastic transformations, which use linguistic cue phrases such as "It is believed ...".To gain further insight, we prompt the LLM to make another judgment about the transformed misinformation statements that is not related to truth value. Specifically, we study LLM estimates of the frequency with which people make the uncertain statement. We find a small but significant correlation between judgment of fact and estimation of frequency.
△ Less
Submitted 6 March, 2025;
originally announced March 2025.
-
An optimal Petrov-Galerkin framework for operator networks
Authors:
Philip Charles,
Deep Ray,
Yue Yu,
Joost Prins,
Hugo Melchers,
Michael R. A. Abdelmalik,
Jeffrey Cochran,
Assad A. Oberai,
Thomas J. R. Hughes,
Mats G. Larson
Abstract:
The optimal Petrov-Galerkin formulation to solve partial differential equations (PDEs) recovers the best approximation in a specified finite-dimensional (trial) space with respect to a suitable norm. However, the recovery of this optimal solution is contingent on being able to construct the optimal weighting functions associated with the trial basis. While explicit constructions are available for…
▽ More
The optimal Petrov-Galerkin formulation to solve partial differential equations (PDEs) recovers the best approximation in a specified finite-dimensional (trial) space with respect to a suitable norm. However, the recovery of this optimal solution is contingent on being able to construct the optimal weighting functions associated with the trial basis. While explicit constructions are available for simple one- and two-dimensional problems, such constructions for a general multidimensional problem remain elusive. In the present work, we revisit the optimal Petrov-Galerkin formulation through the lens of deep learning. We propose an operator network framework called Petrov-Galerkin Variationally Mimetic Operator Network (PG-VarMiON), which emulates the optimal Petrov-Galerkin weak form of the underlying PDE. The PG-VarMiON is trained in a supervised manner using a labeled dataset comprising the PDE data and the corresponding PDE solution, with the training loss depending on the choice of the optimal norm. The special architecture of the PG-VarMiON allows it to implicitly learn the optimal weighting functions, thus endowing the proposed operator network with the ability to generalize well beyond the training set. We derive approximation error estimates for PG-VarMiON, highlighting the contributions of various error sources, particularly the error in learning the true weighting functions. Several numerical results are presented for the advection-diffusion equation to demonstrate the efficacy of the proposed method. By embedding the Petrov-Galerkin structure into the network architecture, PG-VarMiON exhibits greater robustness and improved generalization compared to other popular deep operator frameworks, particularly when the training data is limited.
△ Less
Submitted 5 March, 2025;
originally announced March 2025.
-
Typographic Attacks in a Multi-Image Setting
Authors:
Xiaomeng Wang,
Zhengyu Zhao,
Martha Larson
Abstract:
Large Vision-Language Models (LVLMs) are susceptible to typographic attacks, which are misclassifications caused by an attack text that is added to an image. In this paper, we introduce a multi-image setting for studying typographic attacks, broadening the current emphasis of the literature on attacking individual images. Specifically, our focus is on attacking image sets without repeating the att…
▽ More
Large Vision-Language Models (LVLMs) are susceptible to typographic attacks, which are misclassifications caused by an attack text that is added to an image. In this paper, we introduce a multi-image setting for studying typographic attacks, broadening the current emphasis of the literature on attacking individual images. Specifically, our focus is on attacking image sets without repeating the attack query. Such non-repeating attacks are stealthier, as they are more likely to evade a gatekeeper than attacks that repeat the same attack text. We introduce two attack strategies for the multi-image setting, leveraging the difficulty of the target image, the strength of the attack text, and text-image similarity. Our text-image similarity approach improves attack success rates by 21% over random, non-specific methods on the CLIP model using ImageNet while maintaining stealth in a multi-image scenario. An additional experiment demonstrates transferability, i.e., text-image similarity calculated using CLIP transfers when attacking InstructBLIP.
△ Less
Submitted 12 February, 2025;
originally announced February 2025.
-
Frequency Is What You Need: Word-frequency Masking Benefits Vision-Language Model Pre-training
Authors:
Mingliang Liang,
Martha Larson
Abstract:
Vision Language Models (VLMs) can be trained more efficiently if training sets can be reduced in size. Recent work has shown the benefits of masking text during VLM training using a variety of approaches: truncation, random masking, block masking and syntax masking. In this paper, we show that the best masking strategy changes over training epochs and that, given sufficient training epochs. We ana…
▽ More
Vision Language Models (VLMs) can be trained more efficiently if training sets can be reduced in size. Recent work has shown the benefits of masking text during VLM training using a variety of approaches: truncation, random masking, block masking and syntax masking. In this paper, we show that the best masking strategy changes over training epochs and that, given sufficient training epochs. We analyze existing text masking approaches including syntax masking, which is currently the state of the art, and identify the word frequency distribution as important in determining their success. Experiments on a large range of data sets demonstrate that syntax masking is outperformed by other approaches, given sufficient epochs, and that our proposed frequency-based approach, called Contrastive Language-Image Pre-training with Word Frequency Masking (CLIPF) has numerous advantages. The benefits are particularly evident as the number of input tokens decreases.
△ Less
Submitted 14 April, 2025; v1 submitted 20 December, 2024;
originally announced December 2024.
-
Nonlinear Operator Learning Using Energy Minimization and MLPs
Authors:
Mats G. Larson,
Carl Lundholm,
Anna Persson
Abstract:
We develop and evaluate a method for learning solution operators to nonlinear problems governed by partial differential equations. The approach is based on a finite element discretization and aims at representing the solution operator by an MLP that takes latent variables as input. The latent variables will typically correspond to parameters in a parametrization of input data such as boundary cond…
▽ More
We develop and evaluate a method for learning solution operators to nonlinear problems governed by partial differential equations. The approach is based on a finite element discretization and aims at representing the solution operator by an MLP that takes latent variables as input. The latent variables will typically correspond to parameters in a parametrization of input data such as boundary conditions, coefficients, and right-hand sides. The loss function is most often an energy functional and we formulate efficient parallelizable training algorithms based on assembling the energy locally on each element. For large problems, the learning process can be made more efficient by using only a small fraction of randomly chosen elements in the mesh in each iteration. The approach is evaluated on several relevant test cases, where learning the solution operator turns out to be beneficial compared to classical numerical methods.
△ Less
Submitted 5 December, 2024;
originally announced December 2024.
-
Stabilizing and Solving Unique Continuation Problems by Parameterizing Data and Learning Finite Element Solution Operators
Authors:
Erik Burman,
Mats G. Larson,
Karl Larsson,
Carl Lundholm
Abstract:
We consider an inverse problem involving the reconstruction of the solution to a nonlinear partial differential equation (PDE) with unknown boundary conditions. Instead of direct boundary data, we are provided with a large dataset of boundary observations for typical solutions (collective data) and a bulk measurement of a specific realization. To leverage this collective data, we first compress th…
▽ More
We consider an inverse problem involving the reconstruction of the solution to a nonlinear partial differential equation (PDE) with unknown boundary conditions. Instead of direct boundary data, we are provided with a large dataset of boundary observations for typical solutions (collective data) and a bulk measurement of a specific realization. To leverage this collective data, we first compress the boundary data using proper orthogonal decomposition (POD) in a linear expansion. Next, we identify a possible nonlinear low-dimensional structure in the expansion coefficients using an autoencoder, which provides a parametrization of the dataset in a lower-dimensional latent space. We then train an operator network to map the expansion coefficients representing the boundary data to the finite element (FE) solution of the PDE. Finally, we connect the autoencoder's decoder to the operator network which enables us to solve the inverse problem by optimizing a data-fitting term over the latent space. We analyze the underlying stabilized finite element method (FEM) in the linear setting and establish an optimal error estimate in the $H^1$-norm. The nonlinear problem is then studied numerically, demonstrating the effectiveness of our approach.
△ Less
Submitted 5 May, 2025; v1 submitted 5 December, 2024;
originally announced December 2024.
-
Enhancing Vision-Language Model Pre-training with Image-text Pair Pruning Based on Word Frequency
Authors:
Mingliang Liang,
Martha Larson
Abstract:
We propose Word-Frequency-based Image-Text Pair Pruning (WFPP), a novel data pruning method that improves the efficiency of VLMs. Unlike MetaCLIP, our method does not need metadata for pruning, but selects text-image pairs to prune based on the content of the text. Specifically, WFPP prunes text-image pairs containing high-frequency words across the entire training dataset. The effect of WFPP is t…
▽ More
We propose Word-Frequency-based Image-Text Pair Pruning (WFPP), a novel data pruning method that improves the efficiency of VLMs. Unlike MetaCLIP, our method does not need metadata for pruning, but selects text-image pairs to prune based on the content of the text. Specifically, WFPP prunes text-image pairs containing high-frequency words across the entire training dataset. The effect of WFPP is to reduce the dominance of frequent words. The result a better balanced word-frequency distribution in the dataset, which is known to improve the training of word embedding models. After pre-training on the pruned subset, we fine-tuned the model on the entire dataset for one additional epoch to achieve better performance. Our experiments demonstrate that applying WFPP when training a CLIP model improves performance on a wide range of downstream tasks. WFPP also provides the advantage of speeding up pre-training by using fewer samples. Additionally, we analyze the training data before and after pruning to visualize how WFPP changes the balance of word frequencies. We hope our work encourages researchers to consider the distribution of words in the training data when pre-training VLMs, not limited to CLIP.
△ Less
Submitted 10 December, 2024; v1 submitted 9 October, 2024;
originally announced October 2024.
-
Scenario of Use Scheme: Threat Model Specification for Speaker Privacy Protection in the Medical Domain
Authors:
Mehtab Ur Rahman,
Martha Larson,
Louis ten Bosch,
Cristian Tejedor-García
Abstract:
Speech recordings are being more frequently used to detect and monitor disease, leading to privacy concerns. Beyond cryptography, protection of speech can be addressed by approaches, such as perturbation, disentanglement, and re-synthesis, that eliminate sensitive information of the speaker, leaving the information necessary for medical analysis purposes. In order for such privacy protective appro…
▽ More
Speech recordings are being more frequently used to detect and monitor disease, leading to privacy concerns. Beyond cryptography, protection of speech can be addressed by approaches, such as perturbation, disentanglement, and re-synthesis, that eliminate sensitive information of the speaker, leaving the information necessary for medical analysis purposes. In order for such privacy protective approaches to be developed, clear and systematic specifications of assumptions concerning medical settings and the needs of medical professionals are necessary. In this paper, we propose a Scenario of Use Scheme that incorporates an Attacker Model, which characterizes the adversary against whom the speaker's privacy must be defended, and a Protector Model, which specifies the defense. We discuss the connection of the scheme with previous work on speech privacy. Finally, we present a concrete example of a specified Scenario of Use and a set of experiments about protecting speaker data against gender inference attacks while maintaining utility for Parkinson's detection.
△ Less
Submitted 26 September, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
Implicit Hypersurface Approximation Capacity in Deep ReLU Networks
Authors:
Jonatan Vallin,
Karl Larsson,
Mats G. Larson
Abstract:
We develop a geometric approximation theory for deep feed-forward neural networks with ReLU activations. Given a $d$-dimensional hypersurface in $\mathbb{R}^{d+1}$ represented as the graph of a $C^2$-function $φ$, we show that a deep fully-connected ReLU network of width $d+1$ can implicitly construct an approximation as its zero contour with a precision bound depending on the number of layers. Th…
▽ More
We develop a geometric approximation theory for deep feed-forward neural networks with ReLU activations. Given a $d$-dimensional hypersurface in $\mathbb{R}^{d+1}$ represented as the graph of a $C^2$-function $φ$, we show that a deep fully-connected ReLU network of width $d+1$ can implicitly construct an approximation as its zero contour with a precision bound depending on the number of layers. This result is directly applicable to the binary classification setting where the sign of the network is trained as a classifier, with the network's zero contour as a decision boundary. Our proof is constructive and relies on the geometrical structure of ReLU layers provided in [doi:10.48550/arXiv.2310.03482]. Inspired by this geometrical description, we define a new equivalent network architecture that is easier to interpret geometrically, where the action of each hidden layer is a projection onto a polyhedral cone derived from the layer's parameters. By repeatedly adding such layers, with parameters chosen such that we project small parts of the graph of $φ$ from the outside in, we, in a controlled way, construct a network that implicitly approximates the graph over a ball of radius $R$. The accuracy of this construction is controlled by a discretization parameter $δ$ and we show that the tolerance in the resulting error bound scales as $(d-1)R^{3/2}δ^{1/2}$ and the required number of layers is of order $d\big(\frac{32R}δ\big)^{\frac{d+1}{2}}$.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
Rigidity matroids and linear algebraic matroids with applications to matrix completion and tensor codes
Authors:
Joshua Brakensiek,
Manik Dhar,
Jiyang Gao,
Sivakanth Gopi,
Matt Larson
Abstract:
We establish a connection between problems studied in rigidity theory and matroids arising from linear algebraic constructions like tensor products and symmetric products. A special case of this correspondence identifies the problem of giving a description of the correctable erasure patterns in a maximally recoverable tensor code with the problem of describing bipartite rigid graphs or low-rank co…
▽ More
We establish a connection between problems studied in rigidity theory and matroids arising from linear algebraic constructions like tensor products and symmetric products. A special case of this correspondence identifies the problem of giving a description of the correctable erasure patterns in a maximally recoverable tensor code with the problem of describing bipartite rigid graphs or low-rank completable matrix patterns. Additionally, we relate dependencies among symmetric products of generic vectors to graph rigidity and symmetric matrix completion. With an eye toward applications to computer science, we study the dependency of these matroids on the characteristic by giving new combinatorial descriptions in several cases, including the first description of the correctable patterns in an (m, n, a=2, b=2) maximally recoverable tensor code.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Centered Masking for Language-Image Pre-Training
Authors:
Mingliang Liang,
Martha Larson
Abstract:
We introduce Gaussian masking for Language-Image Pre-Training (GLIP) a novel, straightforward, and effective technique for masking image patches during pre-training of a vision-language model. GLIP builds on Fast Language-Image Pre-Training (FLIP), which randomly masks image patches while training a CLIP model. GLIP replaces random masking with centered masking, that uses a Gaussian distribution a…
▽ More
We introduce Gaussian masking for Language-Image Pre-Training (GLIP) a novel, straightforward, and effective technique for masking image patches during pre-training of a vision-language model. GLIP builds on Fast Language-Image Pre-Training (FLIP), which randomly masks image patches while training a CLIP model. GLIP replaces random masking with centered masking, that uses a Gaussian distribution and is inspired by the importance of image patches at the center of the image. GLIP retains the same computational savings as FLIP, while improving performance across a range of downstream datasets and tasks, as demonstrated by our experimental results. We show the benefits of GLIP to be easy to obtain, requiring no delicate tuning of the Gaussian, and also applicable to data sets containing images without an obvious center focus.
△ Less
Submitted 27 March, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data
Authors:
Bart Pleiter,
Behrad Tajalli,
Stefanos Koffas,
Gorka Abad,
Jing Xu,
Martha Larson,
Stjepan Picek
Abstract:
Deep Neural Networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions. More recently, DNNs for tabular data have gained increasing attention due to the r…
▽ More
Deep Neural Networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions. More recently, DNNs for tabular data have gained increasing attention due to the rise of transformer models. Our research presents a comprehensive analysis of backdoor attacks on tabular data using DNNs, mainly focusing on transformers. We also propose a novel approach for trigger construction: an in-bounds attack, which provides excellent attack performance while maintaining stealthiness. Through systematic experimentation across benchmark datasets, we uncover that transformer-based DNNs for tabular data are highly susceptible to backdoor attacks, even with minimal feature value alterations. We also verify that our attack can be generalized to other models, like XGBoost and DeepFM. Our results demonstrate up to 100% attack success rate with negligible clean accuracy drop. Furthermore, we evaluate several defenses against these attacks, identifying Spectral Signatures as the most effective. Nevertheless, our findings highlight the need to develop tabular data-specific countermeasures to defend against backdoor attacks.
△ Less
Submitted 25 April, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
When Machine Learning Models Leak: An Exploration of Synthetic Training Data
Authors:
Manel Slokom,
Peter-Paul de Wolf,
Martha Larson
Abstract:
We investigate an attack on a machine learning model that predicts whether a person or household will relocate in the next two years, i.e., a propensity-to-move classifier. The attack assumes that the attacker can query the model to obtain predictions and that the marginal distribution of the data on which the model was trained is publicly available. The attack also assumes that the attacker has o…
▽ More
We investigate an attack on a machine learning model that predicts whether a person or household will relocate in the next two years, i.e., a propensity-to-move classifier. The attack assumes that the attacker can query the model to obtain predictions and that the marginal distribution of the data on which the model was trained is publicly available. The attack also assumes that the attacker has obtained the values of non-sensitive attributes for a certain number of target individuals. The objective of the attack is to infer the values of sensitive attributes for these target individuals. We explore how replacing the original data with synthetic data when training the model impacts how successfully the attacker can infer sensitive attributes.
△ Less
Submitted 19 May, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
The Geometric Structure of Fully-Connected ReLU Layers
Authors:
Jonatan Vallin,
Karl Larsson,
Mats G. Larson
Abstract:
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks. The parameters of a ReLU layer induce a natural partition of the input domain, such that the ReLU layer can be significantly simplified in each sector of the partition. This leads to a geometric interpretation of a ReLU layer as a projection onto a polyhedral cone followed by an af…
▽ More
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks. The parameters of a ReLU layer induce a natural partition of the input domain, such that the ReLU layer can be significantly simplified in each sector of the partition. This leads to a geometric interpretation of a ReLU layer as a projection onto a polyhedral cone followed by an affine transformation, in line with the description in [doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU activations. Further, this structure facilitates simplified expressions for preimages of the intersection between partition sectors and hyperplanes, which is useful when describing decision boundaries in a classification setting. We investigate this in detail for a feed-forward network with one hidden ReLU-layer, where we provide results on the geometric complexity of the decision boundary generated by such networks, as well as proving that modulo an affine transformation, such a network can only generate $d$ different decision boundaries. Finally, the effect of adding more layers to the network is discussed.
△ Less
Submitted 8 November, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Observation of high-energy neutrinos from the Galactic plane
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
S. W. Barwick,
V. Basu,
S. Baur,
R. Bay,
J. J. Beatty,
K. -H. Becker,
J. Becker Tjus
, et al. (364 additional authors not shown)
Abstract:
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin…
▽ More
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$σ$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources.
△ Less
Submitted 10 July, 2023;
originally announced July 2023.
-
Beyond Neural-on-Neural Approaches to Speaker Gender Protection
Authors:
Loes van Bemmel,
Zhuoran Liu,
Nik Vaessen,
Martha Larson
Abstract:
Recent research has proposed approaches that modify speech to defend against gender inference attacks. The goal of these protection algorithms is to control the availability of information about a speaker's gender, a privacy-sensitive attribute. Currently, the common practice for developing and testing gender protection algorithms is "neural-on-neural", i.e., perturbations are generated and tested…
▽ More
Recent research has proposed approaches that modify speech to defend against gender inference attacks. The goal of these protection algorithms is to control the availability of information about a speaker's gender, a privacy-sensitive attribute. Currently, the common practice for developing and testing gender protection algorithms is "neural-on-neural", i.e., perturbations are generated and tested with a neural network. In this paper, we propose to go beyond this practice to strengthen the study of gender protection. First, we demonstrate the importance of testing gender inference attacks that are based on speech features historically developed by speech scientists, alongside the conventionally used neural classifiers. Next, we argue that researchers should use speech features to gain insight into how protective modifications change the speech signal. Finally, we point out that gender-protection algorithms should be compared with novel "vocal adversaries", human-executed voice adaptations, in order to improve interpretability and enable before-the-mic protection.
△ Less
Submitted 30 June, 2023;
originally announced June 2023.
-
The XPRESS Challenge: Xray Projectomic Reconstruction -- Extracting Segmentation with Skeletons
Authors:
Tri Nguyen,
Mukul Narwani,
Mark Larson,
Yicong Li,
Shuhan Xie,
Hanspeter Pfister,
Donglai Wei,
Nir Shavit,
Lu Mi,
Alexandra Pacureanu,
Wei-Chung Lee,
Aaron T. Kuan
Abstract:
The wiring and connectivity of neurons form a structural basis for the function of the nervous system. Advances in volume electron microscopy (EM) and image segmentation have enabled mapping of circuit diagrams (connectomics) within local regions of the mouse brain. However, applying volume EM over the whole brain is not currently feasible due to technological challenges. As a result, comprehensiv…
▽ More
The wiring and connectivity of neurons form a structural basis for the function of the nervous system. Advances in volume electron microscopy (EM) and image segmentation have enabled mapping of circuit diagrams (connectomics) within local regions of the mouse brain. However, applying volume EM over the whole brain is not currently feasible due to technological challenges. As a result, comprehensive maps of long-range connections between brain regions are lacking. Recently, we demonstrated that X-ray holographic nanotomography (XNH) can provide high-resolution images of brain tissue at a much larger scale than EM. In particular, XNH is wellsuited to resolve large, myelinated axon tracts (white matter) that make up the bulk of long-range connections (projections) and are critical for inter-region communication. Thus, XNH provides an imaging solution for brain-wide projectomics. However, because XNH data is typically collected at lower resolutions and larger fields-of-view than EM, accurate segmentation of XNH images remains an important challenge that we present here. In this task, we provide volumetric XNH images of cortical white matter axons from the mouse brain along with ground truth annotations for axon trajectories. Manual voxel-wise annotation of ground truth is a time-consuming bottleneck for training segmentation networks. On the other hand, skeleton-based ground truth is much faster to annotate, and sufficient to determine connectivity. Therefore, we encourage participants to develop methods to leverage skeleton-based training. To this end, we provide two types of ground-truth annotations: a small volume of voxel-wise annotations and a larger volume with skeleton-based annotations. Entries will be evaluated on how accurately the submitted segmentations agree with the ground-truth skeleton annotations.
△ Less
Submitted 24 February, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression
Authors:
Zhuoran Liu,
Zhengyu Zhao,
Martha Larson
Abstract:
Perturbative availability poisons (PAPs) add small changes to images to prevent their use for model training. Current research adopts the belief that practical and effective approaches to countering PAPs do not exist. In this paper, we argue that it is time to abandon this belief. We present extensive experiments showing that 12 state-of-the-art PAP methods are vulnerable to Image Shortcut Squeezi…
▽ More
Perturbative availability poisons (PAPs) add small changes to images to prevent their use for model training. Current research adopts the belief that practical and effective approaches to countering PAPs do not exist. In this paper, we argue that it is time to abandon this belief. We present extensive experiments showing that 12 state-of-the-art PAP methods are vulnerable to Image Shortcut Squeezing (ISS), which is based on simple compression. For example, on average, ISS restores the CIFAR-10 model accuracy to $81.73\%$, surpassing the previous best preprocessing-based countermeasures by $37.97\%$ absolute. ISS also (slightly) outperforms adversarial training and has higher generalizability to unseen perturbation norms and also higher efficiency. Our investigation reveals that the property of PAP perturbations depends on the type of surrogate model used for poison generation, and it explains why a specific ISS compression yields the best performance for a specific type of PAP perturbation. We further test stronger, adaptive poisoning, and show it falls short of being an ideal defense against ISS. Overall, our results demonstrate the importance of considering various (simple) countermeasures to ensure the meaningfulness of analysis carried out during the development of PAP methods.
△ Less
Submitted 23 June, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Expanding Accurate Person Recognition to New Altitudes and Ranges: The BRIAR Dataset
Authors:
David Cornett III,
Joel Brogan,
Nell Barber,
Deniz Aykac,
Seth Baird,
Nick Burchfield,
Carl Dukes,
Andrew Duncan,
Regina Ferrell,
Jim Goddard,
Gavin Jager,
Matt Larson,
Bart Murphy,
Christi Johnson,
Ian Shelley,
Nisha Srinivas,
Brandon Stockwell,
Leanne Thompson,
Matt Yohe,
Robert Zhang,
Scott Dolvin,
Hector J. Santos-Villalobos,
David S. Bolme
Abstract:
Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These app…
▽ More
Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These applications require lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-rage R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.
△ Less
Submitted 3 November, 2022;
originally announced November 2022.
-
Generative Poisoning Using Random Discriminators
Authors:
Dirren van Vlijmen,
Alex Kolmus,
Zhuoran Liu,
Zhengyu Zhao,
Martha Larson
Abstract:
We introduce ShortcutGen, a new data poisoning attack that generates sample-dependent, error-minimizing perturbations by learning a generator. The key novelty of ShortcutGen is the use of a randomly-initialized discriminator, which provides spurious shortcuts needed for generating poisons. Different from recent, iterative methods, our ShortcutGen can generate perturbations with only one forward pa…
▽ More
We introduce ShortcutGen, a new data poisoning attack that generates sample-dependent, error-minimizing perturbations by learning a generator. The key novelty of ShortcutGen is the use of a randomly-initialized discriminator, which provides spurious shortcuts needed for generating poisons. Different from recent, iterative methods, our ShortcutGen can generate perturbations with only one forward pass in a label-free manner, and compared to the only existing generative method, DeepConfuse, our ShortcutGen is faster and simpler to train while remaining competitive. We also demonstrate that integrating a simple augmentation strategy can further boost the robustness of ShortcutGen against early stopping, and combining augmentation and non-augmentation leads to new state-of-the-art results in terms of final validation accuracy, especially in the challenging, transfer scenario. Lastly, we speculate, through uncovering its working mechanism, that learning a more general representation space could allow ShortcutGen to work for unseen data.
△ Less
Submitted 2 November, 2022;
originally announced November 2022.
-
Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
N. Aggarwal,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
V. Basu,
R. Bay,
J. J. Beatty,
K. -H. Becker
, et al. (359 additional authors not shown)
Abstract:
IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challen…
▽ More
IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double of the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low energy neutrinos in online searches for transient events.
△ Less
Submitted 11 October, 2022; v1 submitted 7 September, 2022;
originally announced September 2022.
-
Minimizing Mindless Mentions: Recommendation with Minimal Necessary User Reviews
Authors:
Danny Stax,
Manel Slokom,
Martha Larson
Abstract:
Recently, researchers have turned their attention to recommender systems that use only minimal necessary data. This trend is informed by the idea that recommender systems should use no more user interactions than are needed in order to provide users with useful recommendations. In this position paper, we make the case for applying the idea of minimal necessary data to recommender systems that use…
▽ More
Recently, researchers have turned their attention to recommender systems that use only minimal necessary data. This trend is informed by the idea that recommender systems should use no more user interactions than are needed in order to provide users with useful recommendations. In this position paper, we make the case for applying the idea of minimal necessary data to recommender systems that use user reviews. We argue that the content of individual user reviews should be subject to minimization. Specifically, reviews used as training data to generate recommendations or reviews used to help users decide on purchases or consumption should be automatically edited to contain only the information that is needed.
△ Less
Submitted 9 September, 2022; v1 submitted 5 August, 2022;
originally announced August 2022.
-
Gender In Gender Out: A Closer Look at User Attributes in Context-Aware Recommendation
Authors:
Manel Slokom,
Özlem Özgöbek,
Martha Larson
Abstract:
This paper studies user attributes in light of current concerns in the recommender system community: diversity, coverage, calibration, and data minimization. In experiments with a conventional context-aware recommender system that leverages side information, we show that user attributes do not always improve recommendation. Then, we demonstrate that user attributes can negatively impact diversity…
▽ More
This paper studies user attributes in light of current concerns in the recommender system community: diversity, coverage, calibration, and data minimization. In experiments with a conventional context-aware recommender system that leverages side information, we show that user attributes do not always improve recommendation. Then, we demonstrate that user attributes can negatively impact diversity and coverage. Finally, we investigate the amount of information about users that ``survives'' from the training data into the recommendation lists produced by the recommender. This information is a weak signal that could in the future be exploited for calibration or studied further as a privacy leak.
△ Less
Submitted 28 July, 2022;
originally announced July 2022.
-
The Importance of Image Interpretation: Patterns of Semantic Misclassification in Real-World Adversarial Images
Authors:
Zhengyu Zhao,
Nga Dang,
Martha Larson
Abstract:
Adversarial images are created with the intention of causing an image classifier to produce a misclassification. In this paper, we propose that adversarial images should be evaluated based on semantic mismatch, rather than label mismatch, as used in current work. In other words, we propose that an image of a "mug" would be considered adversarial if classified as "turnip", but not as "cup", as curr…
▽ More
Adversarial images are created with the intention of causing an image classifier to produce a misclassification. In this paper, we propose that adversarial images should be evaluated based on semantic mismatch, rather than label mismatch, as used in current work. In other words, we propose that an image of a "mug" would be considered adversarial if classified as "turnip", but not as "cup", as current systems would assume. Our novel idea of taking semantic misclassification into account in the evaluation of adversarial images offers two benefits. First, it is a more realistic conceptualization of what makes an image adversarial, which is important in order to fully understand the implications of adversarial images for security and privacy. Second, it makes it possible to evaluate the transferability of adversarial images to a real-world classifier, without requiring the classifier's label set to have been available during the creation of the images. The paper carries out an evaluation of a transfer attack on a real-world image classifier that is made possible by our semantic misclassification approach. The attack reveals patterns in the semantics of adversarial misclassifications that could not be investigated using conventional label mismatch.
△ Less
Submitted 13 December, 2022; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Regex in a Time of Deep Learning: The Role of an Old Technology in Age Discrimination Detection in Job Advertisements
Authors:
Anna Pillar,
Kyrill Poelmans,
Martha Larson
Abstract:
Deep learning holds great promise for detecting discriminatory language in the public sphere. However, for the detection of illegal age discrimination in job advertisements, regex approaches are still strong performers. In this paper, we investigate job advertisements in the Netherlands. We present a qualitative analysis of the benefits of the 'old' approach based on regexes and investigate how ne…
▽ More
Deep learning holds great promise for detecting discriminatory language in the public sphere. However, for the detection of illegal age discrimination in job advertisements, regex approaches are still strong performers. In this paper, we investigate job advertisements in the Netherlands. We present a qualitative analysis of the benefits of the 'old' approach based on regexes and investigate how neural embeddings could address its limitations.
△ Less
Submitted 18 May, 2022;
originally announced May 2022.
-
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer
, et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train acc…
▽ More
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
△ Less
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Augmented Lagrangian approach to deriving discontinuous Galerkin methods for nonlinear elasticity problems
Authors:
Peter Hansbo,
Mats G. Larson
Abstract:
We use the augmented Lagrangian formalism to derive discontinuous Galerkin formulations for problems in nonlinear elasticity. In elasticity stress is typically a symmetric function of strain, leading to symmetric tangent stiffness matrices in Newtons method when conforming finite elements are used for discretization. By use of the augmented Lagrangian framework, we can also obtain symmetric tangen…
▽ More
We use the augmented Lagrangian formalism to derive discontinuous Galerkin formulations for problems in nonlinear elasticity. In elasticity stress is typically a symmetric function of strain, leading to symmetric tangent stiffness matrices in Newtons method when conforming finite elements are used for discretization. By use of the augmented Lagrangian framework, we can also obtain symmetric tangent stiffness matrices in discontinuous Galerkin methods. We suggest two different approaches and give examples from plasticity and from large deformation hyperelasticity.
△ Less
Submitted 17 February, 2022;
originally announced February 2022.
-
Going Grayscale: The Road to Understanding and Improving Unlearnable Examples
Authors:
Zhuoran Liu,
Zhengyu Zhao,
Alex Kolmus,
Tijn Berns,
Twan van Laarhoven,
Tom Heskes,
Martha Larson
Abstract:
Recent work has shown that imperceptible perturbations can be applied to craft unlearnable examples (ULEs), i.e. images whose content cannot be used to improve a classifier during training. In this paper, we reveal the road that researchers should follow for understanding ULEs and improving ULEs as they were originally formulated (ULEOs). The paper makes four contributions. First, we show that ULE…
▽ More
Recent work has shown that imperceptible perturbations can be applied to craft unlearnable examples (ULEs), i.e. images whose content cannot be used to improve a classifier during training. In this paper, we reveal the road that researchers should follow for understanding ULEs and improving ULEs as they were originally formulated (ULEOs). The paper makes four contributions. First, we show that ULEOs exploit color and, consequently, their effects can be mitigated by simple grayscale pre-filtering, without resorting to adversarial training. Second, we propose an extension to ULEOs, which is called ULEO-GrayAugs, that forces the generated ULEs away from channel-wise color perturbations by making use of grayscale knowledge and data augmentations during optimization. Third, we show that ULEOs generated using Multi-Layer Perceptrons (MLPs) are effective in the case of complex Convolutional Neural Network (CNN) classifiers, suggesting that CNNs suffer specific vulnerability to ULEs. Fourth, we demonstrate that when a classifier is trained on ULEOs, adversarial training will prevent a drop in accuracy measured both on clean images and on adversarial images. Taken together, our contributions represent a substantial advance in the state of art of unlearnable examples, but also reveal important characteristics of their behavior that must be better understood in order to achieve further improvements.
△ Less
Submitted 25 November, 2021;
originally announced November 2021.
-
Doing Data Right: How Lessons Learned Working with Conventional Data should Inform the Future of Synthetic Data for Recommender Systems
Authors:
Manel Slokom,
Martha Larson
Abstract:
We present a case that the newly emerging field of synthetic data in the area of recommender systems should prioritize `doing data right'. We consider this catchphrase to have two aspects: First, we should not repeat the mistakes of the past, and, second, we should explore the full scope of opportunities presented by synthetic data as we move into the future. We argue that explicit attention to da…
▽ More
We present a case that the newly emerging field of synthetic data in the area of recommender systems should prioritize `doing data right'. We consider this catchphrase to have two aspects: First, we should not repeat the mistakes of the past, and, second, we should explore the full scope of opportunities presented by synthetic data as we move into the future. We argue that explicit attention to dataset design and description will help to avoid past mistakes with dataset bias and evaluation. In order to fully exploit the opportunities of synthetic data, we point out that researchers can investigate new areas such as using data synthesize to support reproducibility by making data open, as well as FAIR, and to push forward our understanding of data minimization.
△ Less
Submitted 7 October, 2021;
originally announced October 2021.
-
A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
C. Alispach,
A. A. Alves Jr.,
N. M. Amin,
R. An,
K. Andeen,
T. Anderson,
I. Ansseau,
G. Anton,
C. Argüelles,
S. Axani,
X. Bai,
A. Balagopal V.,
A. Barbano,
S. W. Barwick,
B. Bastian,
V. Basu,
V. Baum,
S. Baur,
R. Bay
, et al. (343 additional authors not shown)
Abstract:
Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful an…
▽ More
Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude.
△ Less
Submitted 26 July, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
On Success and Simplicity: A Second Look at Transferable Targeted Attacks
Authors:
Zhengyu Zhao,
Zhuoran Liu,
Martha Larson
Abstract:
Achieving transferability of targeted attacks is reputed to be remarkably difficult. Currently, state-of-the-art approaches are resource-intensive because they necessitate training model(s) for each target class with additional data. In our investigation, we find, however, that simple transferable attacks which require neither additional data nor model training can achieve surprisingly high target…
▽ More
Achieving transferability of targeted attacks is reputed to be remarkably difficult. Currently, state-of-the-art approaches are resource-intensive because they necessitate training model(s) for each target class with additional data. In our investigation, we find, however, that simple transferable attacks which require neither additional data nor model training can achieve surprisingly high targeted transferability. This insight has been overlooked until now, mainly due to the widespread practice of unreasonably restricting attack optimization to a limited number of iterations. In particular, we, for the first time, identify that a simple logit loss can yield competitive results with the state of the arts. Our analysis spans a variety of transfer settings, especially including three new, realistic settings: an ensemble transfer setting with little model similarity, a worse-case setting with low-ranked target classes, and also a real-world attack against the Google Cloud Vision API. Results in these new settings demonstrate that the commonly adopted, easy settings cannot fully reveal the actual properties of different attacks and may cause misleading comparisons. We also show the usefulness of the simple logit loss for generating targeted universal adversarial perturbations in a data-free and training-free manner. Overall, the aim of our analysis is to inspire a more meaningful evaluation on targeted transferability. Code is available at https://github.com/ZhengyuZhao/Targeted-Tansfer
△ Less
Submitted 26 October, 2021; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Screen Gleaning: A Screen Reading TEMPEST Attack on Mobile Devices Exploiting an Electromagnetic Side Channel
Authors:
Zhuoran Liu,
Niels Samwel,
Léo Weissbart,
Zhengyu Zhao,
Dirk Lauret,
Lejla Batina,
Martha Larson
Abstract:
We introduce screen gleaning, a TEMPEST attack in which the screen of a mobile device is read without a visual line of sight, revealing sensitive information displayed on the phone screen. The screen gleaning attack uses an antenna and a software-defined radio (SDR) to pick up the electromagnetic signal that the device sends to the screen to display, e.g., a message with a security code. This spec…
▽ More
We introduce screen gleaning, a TEMPEST attack in which the screen of a mobile device is read without a visual line of sight, revealing sensitive information displayed on the phone screen. The screen gleaning attack uses an antenna and a software-defined radio (SDR) to pick up the electromagnetic signal that the device sends to the screen to display, e.g., a message with a security code. This special equipment makes it possible to recreate the signal as a gray-scale image, which we refer to as an emage. Here, we show that it can be used to read a security code. The screen gleaning attack is challenging because it is often impossible for a human viewer to interpret the emage directly. We show that this challenge can be addressed with machine learning, specifically, a deep learning classifier. Screen gleaning will become increasingly serious as SDRs and deep learning continue to rapidly advance. In this paper, we demonstrate the security code attack and we propose a testbed that provides a standard setup in which screen gleaning could be tested with different attacker models. Finally, we analyze the dimensions of screen gleaning attacker models and discuss possible countermeasures with the potential to address them.
△ Less
Submitted 19 November, 2020;
originally announced November 2020.
-
Adversarial Image Color Transformations in Explicit Color Filter Space
Authors:
Zhengyu Zhao,
Zhuoran Liu,
Martha Larson
Abstract:
Deep Neural Networks have been shown to be vulnerable to adversarial images. Conventional attacks strive for indistinguishable adversarial images with strictly restricted perturbations. Recently, researchers have moved to explore distinguishable yet non-suspicious adversarial images and demonstrated that color transformation attacks are effective. In this work, we propose Adversarial Color Filter…
▽ More
Deep Neural Networks have been shown to be vulnerable to adversarial images. Conventional attacks strive for indistinguishable adversarial images with strictly restricted perturbations. Recently, researchers have moved to explore distinguishable yet non-suspicious adversarial images and demonstrated that color transformation attacks are effective. In this work, we propose Adversarial Color Filter (AdvCF), a novel color transformation attack that is optimized with gradient information in the parameter space of a simple color filter. In particular, our color filter space is explicitly specified so that we are able to provide a systematic analysis of model robustness against adversarial color transformations, from both the attack and defense perspectives. In contrast, existing color transformation attacks do not offer the opportunity for systematic analysis due to the lack of such an explicit space. We further demonstrate the effectiveness of our AdvCF in fooling image classifiers and also compare it with other color transformation attacks regarding their robustness to defenses and image acceptability through an extensive user study. We also highlight the human-interpretability of AdvCF and show its superiority over the state-of-the-art human-interpretable color transformation attack on both image acceptability and efficiency. Additional results provide interesting new insights into model robustness against AdvCF in another three visual tasks.
△ Less
Submitted 16 June, 2023; v1 submitted 12 November, 2020;
originally announced November 2020.
-
Partially Synthetic Data for Recommender Systems: Prediction Performance and Preference Hiding
Authors:
Manel Slokom,
Martha Larson,
Alan Hanjalic
Abstract:
This paper demonstrates the potential of statistical disclosure control for protecting the data used to train recommender systems. Specifically, we use a synthetic data generation approach to hide specific information in the user-item matrix. We apply a transformation to the original data that changes some values, but leaves others the same. The result is a partially synthetic data set that can be…
▽ More
This paper demonstrates the potential of statistical disclosure control for protecting the data used to train recommender systems. Specifically, we use a synthetic data generation approach to hide specific information in the user-item matrix. We apply a transformation to the original data that changes some values, but leaves others the same. The result is a partially synthetic data set that can be used for recommendation but contains less specific information about individual user preferences. Synthetic data has the potential to be useful for companies, who are interested in releasing data to allow outside parties to develop new recommender algorithms, i.e., in the case of a recommender system challenge, and also reducing the risks associated with data misappropriation. Our experiments run a set of recommender system algorithms on our partially synthetic data sets as well as on the original data. The results show that the relative performance of the algorithms on the partially synthetic data reflects the relative performance on the original data. Further analysis demonstrates that properties of the original data are preserved under synthesis, but that for certain examples of attributes accessible in the original data are hidden in the synthesized data.
△ Less
Submitted 9 August, 2020;
originally announced August 2020.
-
Adversarial Item Promotion: Vulnerabilities at the Core of Top-N Recommenders that Use Images to Address Cold Start
Authors:
Zhuoran Liu,
Martha Larson
Abstract:
E-commerce platforms provide their customers with ranked lists of recommended items matching the customers' preferences. Merchants on e-commerce platforms would like their items to appear as high as possible in the top-N of these ranked lists. In this paper, we demonstrate how unscrupulous merchants can create item images that artificially promote their products, improving their rankings. Recommen…
▽ More
E-commerce platforms provide their customers with ranked lists of recommended items matching the customers' preferences. Merchants on e-commerce platforms would like their items to appear as high as possible in the top-N of these ranked lists. In this paper, we demonstrate how unscrupulous merchants can create item images that artificially promote their products, improving their rankings. Recommender systems that use images to address the cold start problem are vulnerable to this security risk. We describe a new type of attack, Adversarial Item Promotion (AIP), that strikes directly at the core of Top-N recommenders: the ranking mechanism itself. Existing work on adversarial images in recommender systems investigates the implications of conventional attacks, which target deep learning classifiers. In contrast, our AIP attacks are embedding attacks that seek to push features representations in a way that fools the ranker (not a classifier) and directly lead to item promotion. We introduce three AIP attacks insider attack, expert attack, and semantic attack, which are defined with respect to three successively more realistic attack models. Our experiments evaluate the danger of these attacks when mounted against three representative visually-aware recommender algorithms in a framework that uses images to address cold start. We also evaluate potential defenses, including adversarial training and find that common, currently-existing, techniques do not eliminate the danger of AIP attacks. In sum, we show that using images to address cold start opens recommender systems to potential threats with clear practical implications.
△ Less
Submitted 20 October, 2020; v1 submitted 2 June, 2020;
originally announced June 2020.
-
Adversarial Color Enhancement: Generating Unrestricted Adversarial Images by Optimizing a Color Filter
Authors:
Zhengyu Zhao,
Zhuoran Liu,
Martha Larson
Abstract:
We introduce an approach that enhances images using a color filter in order to create adversarial effects, which fool neural networks into misclassification. Our approach, Adversarial Color Enhancement (ACE), generates unrestricted adversarial images by optimizing the color filter via gradient descent. The novelty of ACE is its incorporation of established practice for image enhancement in a trans…
▽ More
We introduce an approach that enhances images using a color filter in order to create adversarial effects, which fool neural networks into misclassification. Our approach, Adversarial Color Enhancement (ACE), generates unrestricted adversarial images by optimizing the color filter via gradient descent. The novelty of ACE is its incorporation of established practice for image enhancement in a transparent manner. Experimental results validate the white-box adversarial strength and black-box transferability of ACE. A range of examples demonstrates the perceptual quality of images that ACE produces. ACE makes an important contribution to recent work that moves beyond $L_p$ imperceptibility and focuses on unrestricted adversarial modifications that yield large perceptible perturbations, but remain non-suspicious, to the human eye. The future potential of filter-based adversaries is also explored in two directions: guiding ACE with common enhancement practices (e.g., Instagram filters) towards specific attractive image styles and adapting ACE to image semantics. Code is available at https://github.com/ZhengyuZhao/ACE.
△ Less
Submitted 9 August, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
-
Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance
Authors:
Zhengyu Zhao,
Zhuoran Liu,
Martha Larson
Abstract:
The success of image perturbations that are designed to fool image classifier is assessed in terms of both adversarial effect and visual imperceptibility. The conventional assumption on imperceptibility is that perturbations should strive for tight $L_p$-norm bounds in RGB space. In this work, we drop this assumption by pursuing an approach that exploits human color perception, and more specifical…
▽ More
The success of image perturbations that are designed to fool image classifier is assessed in terms of both adversarial effect and visual imperceptibility. The conventional assumption on imperceptibility is that perturbations should strive for tight $L_p$-norm bounds in RGB space. In this work, we drop this assumption by pursuing an approach that exploits human color perception, and more specifically, minimizing perturbation size with respect to perceptual color distance. Our first approach, Perceptual Color distance C&W (PerC-C&W), extends the widely-used C&W approach and produces larger RGB perturbations. PerC-C&W is able to maintain adversarial strength, while contributing to imperceptibility. Our second approach, Perceptual Color distance Alternating Loss (PerC-AL), achieves the same outcome, but does so more efficiently by alternating between the classification loss and perceptual color difference when updating perturbations. Experimental evaluation shows PerC approaches outperform conventional $L_p$ approaches in terms of robustness and transferability, and also demonstrates that the PerC distance can provide added value on top of existing structure-based methods to creating image perturbations.
△ Less
Submitted 31 March, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Remembering Winter Was Coming: Character-Oriented Video Summaries of TV Series
Authors:
Xavier Bost,
Serigne Gueye,
Vincent Labatut,
Martha Larson,
Georges Linarès,
Damien Malinas,
Raphaël Roth
Abstract:
Today's popular TV series tend to develop continuous, complex plots spanning several seasons, but are often viewed in controlled and discontinuous conditions. Consequently, most viewers need to be re-immersed in the story before watching a new season. Although discussions with friends and family can help, we observe that most viewers make extensive use of summaries to re-engage with the plot. Auto…
▽ More
Today's popular TV series tend to develop continuous, complex plots spanning several seasons, but are often viewed in controlled and discontinuous conditions. Consequently, most viewers need to be re-immersed in the story before watching a new season. Although discussions with friends and family can help, we observe that most viewers make extensive use of summaries to re-engage with the plot. Automatic generation of video summaries of TV series' complex stories requires, first, modeling the dynamics of the plot and, second, extracting relevant sequences. In this paper, we tackle plot modeling by considering the social network of interactions between the characters involved in the narrative: substantial, durable changes in a major character's social environment suggest a new development relevant for the summary. Once identified, these major stages in each character's storyline can be used as a basis for completing the summary with related sequences. Our algorithm combines such social network analysis with filmmaking grammar to automatically generate character-oriented video summaries of TV series from partially annotated data. We carry out evaluation with a user study in a real-world scenario: a large sample of viewers were asked to rank video summaries centered on five characters of the popular TV series Game of Thrones, a few weeks before the new, sixth season was released. Our results reveal the ability of character-oriented summaries to re-engage viewers in television series and confirm the contributions of modeling the plot content and exploiting stylistic patterns to identify salient sequences.
△ Less
Submitted 18 March, 2020; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Who's Afraid of Adversarial Queries? The Impact of Image Modifications on Content-based Image Retrieval
Authors:
Zhuoran Liu,
Zhengyu Zhao,
Martha Larson
Abstract:
An adversarial query is an image that has been modified to disrupt content-based image retrieval (CBIR) while appearing nearly untouched to the human eye. This paper presents an analysis of adversarial queries for CBIR based on neural, local, and global features. We introduce an innovative neural image perturbation approach, called Perturbations for Image Retrieval Error (PIRE), that is capable of…
▽ More
An adversarial query is an image that has been modified to disrupt content-based image retrieval (CBIR) while appearing nearly untouched to the human eye. This paper presents an analysis of adversarial queries for CBIR based on neural, local, and global features. We introduce an innovative neural image perturbation approach, called Perturbations for Image Retrieval Error (PIRE), that is capable of blocking neural-feature-based CBIR. PIRE differs significantly from existing approaches that create images adversarial with respect to CNN classifiers because it is unsupervised, i.e., it needs no labelled data from the data set to which it is applied. Our experimental analysis demonstrates the surprising effectiveness of PIRE in blocking CBIR, and also covers aspects of PIRE that must be taken into account in practical settings, including saving images, image quality and leaking adversarial queries into the background collection. Our experiments also compare PIRE (a neural approach) with existing keypoint removal and injection approaches (which modify local features). Finally, we discuss the challenges that face multimedia researchers in the future study of adversarial queries.
△ Less
Submitted 2 May, 2019; v1 submitted 29 January, 2019;
originally announced January 2019.
-
Factorization Machines for Data with Implicit Feedback
Authors:
Babak Loni,
Martha Larson,
Alan Hanjalic
Abstract:
In this work, we propose FM-Pair, an adaptation of Factorization Machines with a pairwise loss function, making them effective for datasets with implicit feedback. The optimization model in FM-Pair is based on the BPR (Bayesian Personalized Ranking) criterion, which is a well-established pairwise optimization model. FM-Pair retains the advantages of FMs on generality, expressiveness and performanc…
▽ More
In this work, we propose FM-Pair, an adaptation of Factorization Machines with a pairwise loss function, making them effective for datasets with implicit feedback. The optimization model in FM-Pair is based on the BPR (Bayesian Personalized Ranking) criterion, which is a well-established pairwise optimization model. FM-Pair retains the advantages of FMs on generality, expressiveness and performance and yet it can be used for datasets with implicit feedback. We also propose how to apply FM-Pair effectively on two collaborative filtering problems, namely, context-aware recommendation and cross-domain collaborative filtering. By performing experiments on different datasets with explicit or implicit feedback we empirically show that in most of the tested datasets, FM-Pair beats state-of-the-art learning-to-rank methods such as BPR-MF (BPR with Matrix Factorization model). We also show that FM-Pair is significantly more effective for ranking, compared to the standard FMs model. Moreover, we show that FM-Pair can utilize context or cross-domain information effectively as the accuracy of recommendations would always improve with the right auxiliary features. Finally we show that FM-Pair has a linear time complexity and scales linearly by exploiting additional features.
△ Less
Submitted 19 December, 2018;
originally announced December 2018.
-
From Volcano to Toyshop: Adaptive Discriminative Region Discovery for Scene Recognition
Authors:
Zhengyu Zhao,
Martha Larson
Abstract:
As deep learning approaches to scene recognition emerge, they have continued to leverage discriminative regions at multiple scales, building on practices established by conventional image classification research. However, approaches remain largely generic, and do not carefully consider the special properties of scenes. In this paper, inspired by the intuitive differences between scenes and objects…
▽ More
As deep learning approaches to scene recognition emerge, they have continued to leverage discriminative regions at multiple scales, building on practices established by conventional image classification research. However, approaches remain largely generic, and do not carefully consider the special properties of scenes. In this paper, inspired by the intuitive differences between scenes and objects, we propose Adi-Red, an adaptive approach to discriminative region discovery for scene recognition. Adi-Red uses a CNN classifier, which was pre-trained using only image-level scene labels, to discover discriminative image regions directly. These regions are then used as a source of features to perform scene recognition. The use of the CNN classifier makes it possible to adapt the number of discriminative regions per image using a simple, yet elegant, threshold, at relatively low computational cost. Experimental results on the scene recognition benchmark dataset SUN397 demonstrate the ability of Adi-Red to outperform the state of the art. Additional experimental analysis on the Places dataset reveals the advantages of Adi-Red, and highlight how they are specific to scenes. We attribute the effectiveness of Adi-Red to the ability of adaptive region discovery to avoid introducing noise, while also not missing out on important information.
△ Less
Submitted 15 August, 2018; v1 submitted 23 July, 2018;
originally announced July 2018.
-
A Stabilized Cut Streamline Diffusion Finite Element Method for Convection-Diffusion Problems on Surfaces
Authors:
Erik Burman,
Peter Hansbo,
Mats G. Larson,
Andre Massing,
Sara Zahedi
Abstract:
We develop a stabilized cut finite element method for the stationary convection diffusion problem on a surface embedded in ${\mathbb{R}}^d$. The cut finite element method is based on using an embedding of the surface into a three dimensional mesh consisting of tetrahedra and then using the restriction of the standard piecewise linear continuous elements to a piecewise linear approximation of the s…
▽ More
We develop a stabilized cut finite element method for the stationary convection diffusion problem on a surface embedded in ${\mathbb{R}}^d$. The cut finite element method is based on using an embedding of the surface into a three dimensional mesh consisting of tetrahedra and then using the restriction of the standard piecewise linear continuous elements to a piecewise linear approximation of the surface. The stabilization consists of a standard streamline diffusion stabilization term on the discrete surface and a so called normal gradient stabilization term on the full tetrahedral elements in the active mesh. We prove optimal order a priori error estimates in the standard norm associated with the streamline diffusion method and bounds for the condition number of the resulting stiffness matrix. The condition number is of optimal order $O(h^{-1})$ for a specific choice of method parameters. Numerical example supporting our theoretical results are also included.
△ Less
Submitted 4 July, 2018;
originally announced July 2018.
-
Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour
Authors:
Emilia Gómez,
Carlos Castillo,
Vicky Charisi,
Verónica Dahl,
Gustavo Deco,
Blagoj Delipetrev,
Nicole Dewandre,
Miguel Ángel González-Ballester,
Fabien Gouyon,
José Hernández-Orallo,
Perfecto Herrera,
Anders Jonsson,
Ansgar Koene,
Martha Larson,
Ramón López de Mántaras,
Bertin Martens,
Marius Miron,
Rubén Moreno-Bote,
Nuria Oliver,
Antonio Puertas Gallardo,
Heike Schweitzer,
Nuria Sebastian,
Xavier Serra,
Joan Serrà,
Songül Tolan
, et al. (1 additional authors not shown)
Abstract:
This document contains the outcome of the first Human behaviour and machine intelligence (HUMAINT) workshop that took place 5-6 March 2018 in Barcelona, Spain. The workshop was organized in the context of a new research programme at the Centre for Advanced Studies, Joint Research Centre of the European Commission, which focuses on studying the potential impact of artificial intelligence on human b…
▽ More
This document contains the outcome of the first Human behaviour and machine intelligence (HUMAINT) workshop that took place 5-6 March 2018 in Barcelona, Spain. The workshop was organized in the context of a new research programme at the Centre for Advanced Studies, Joint Research Centre of the European Commission, which focuses on studying the potential impact of artificial intelligence on human behaviour. The workshop gathered an interdisciplinary group of experts to establish the state of the art research in the field and a list of future research challenges to be addressed on the topic of human and machine intelligence, algorithm's potential impact on human cognitive capabilities and decision making, and evaluation and regulation needs. The document is made of short position statements and identification of challenges provided by each expert, and incorporates the result of the discussions carried out during the workshop. In the conclusion section, we provide a list of emerging research topics and strategies to be addressed in the near future.
△ Less
Submitted 7 June, 2018;
originally announced June 2018.
-
Investigating the Effect of Music and Lyrics on Spoken-Word Recognition
Authors:
Odette Scharenborg,
Martha Larson
Abstract:
Background music in social interaction settings can hinder conversation. Yet, little is known of how specific properties of music impact speech processing. This paper addresses this knowledge gap by investigating 1) whether the masking effect of background music with lyrics is larger than that of music without lyrics, and 2) whether the masking effect is larger for more complex music. To answer th…
▽ More
Background music in social interaction settings can hinder conversation. Yet, little is known of how specific properties of music impact speech processing. This paper addresses this knowledge gap by investigating 1) whether the masking effect of background music with lyrics is larger than that of music without lyrics, and 2) whether the masking effect is larger for more complex music. To answer these questions, a word identification experiment was run in which Dutch participants listened to Dutch CVC words embedded in stretches of background music in two conditions, with and without lyrics, and at three SNRs. Three songs were used of different genres and complexities. Music stretches with and without lyrics were sampled from the same song in order to control for factors beyond the presence of lyrics. The results showed a clear negative impact of the presence of lyrics in background music on spoken-word recognition. This impact is independent of complexity. The results suggest that social spaces (e.g., restaurants, cafés and bars) should make careful choices of music to promote conversation, and open a path for future work.
△ Less
Submitted 13 March, 2018;
originally announced March 2018.
-
Exploring Deep Space: Learning Personalized Ranking in a Semantic Space
Authors:
Jeroen B. P. Vuurens,
Martha Larson,
Arjen P. de Vries
Abstract:
Recommender systems leverage both content and user interactions to generate recommendations that fit users' preferences. The recent surge of interest in deep learning presents new opportunities for exploiting these two sources of information. To recommend items we propose to first learn a user-independent high-dimensional semantic space in which items are positioned according to their substitutabi…
▽ More
Recommender systems leverage both content and user interactions to generate recommendations that fit users' preferences. The recent surge of interest in deep learning presents new opportunities for exploiting these two sources of information. To recommend items we propose to first learn a user-independent high-dimensional semantic space in which items are positioned according to their substitutability, and then learn a user-specific transformation function to transform this space into a ranking according to the user's past preferences. An advantage of the proposed architecture is that it can be used to effectively recommend items using either content that describes the items or user-item ratings. We show that this approach significantly outperforms state-of-the-art recommender systems on the MovieLens 1M dataset.
△ Less
Submitted 22 August, 2016; v1 submitted 31 July, 2016;
originally announced August 2016.
-
Where to be wary: The impact of widespread photo-taking and image enhancement practices on users' geo-privacy
Authors:
Jaeyoung Choi,
Martha Larson,
Xinchao Li,
Gerald Friedland,
Alan Hanjalic
Abstract:
Today's geo-location estimation approaches are able to infer the location of a target image using its visual content alone. These approaches exploit visual matching techniques, applied to a large collection of background images with known geo-locations. Users who are unaware that visual retrieval approaches can compromise their geo-privacy, unwittingly open themselves to risks of crime or other un…
▽ More
Today's geo-location estimation approaches are able to infer the location of a target image using its visual content alone. These approaches exploit visual matching techniques, applied to a large collection of background images with known geo-locations. Users who are unaware that visual retrieval approaches can compromise their geo-privacy, unwittingly open themselves to risks of crime or other unintended consequences. Private photo sharing is not able to protect users effectively, since its inconvenience is a barrier to consistent use, and photos can still fall into the wrong hands if they are re-shared. This paper lays the groundwork for a new approach to geo-privacy of social images: Instead of requiring a complete change of user behavior, we investigate the protection potential latent in users existing practices. We carry out a series of retrieval experiments using a large collection of social images (8.5M) to systematically analyze where users should be wary, and how both photo taking and editing practices impact the performance of geo-location estimation. We find that practices that are currently widespread are already sufficient to protect single-handedly the geo-location ('geo-cloak') up to more than 50% of images whose location would otherwise be automatically predictable. Our conclusion is that protecting users against the unwanted effects of visual retrieval is a viable research field, and should take as its starting point existing user practices.
△ Less
Submitted 3 March, 2016;
originally announced March 2016.