Self-Calibrating BCIs: Ranking and Recovery of Mental Targets Without Labels

Authors: Jonathan Grizou, Carlos de la Torre-Ortiz, Tuukka Ruotsalo

Abstract: We consider the problem of recovering a mental target (e.g., an image of a face) that a participant has in mind from paired EEG (i.e., brain responses) and image (i.e., perceived faces) data collected during interactive sessions without access to labeled information. The problem has been previously explored with labeled data but not via self-calibration, where labeled data is unavailable. Here, we present the first framework and an algorithm, CURSOR, that learns to recover unknown mental targets without access to labeled data or pre-trained decoders. Our experiments on naturalistic images of faces demonstrate that CURSOR can (1) predict image similarity scores that correlate with human perceptual judgments without any label information, (2) use these scores to rank stimuli against an unknown mental target, and (3) generate new stimuli indistinguishable from the unknown mental target (validated via a user study, N=53).

Submitted 11 June, 2025; originally announced June 2025.

Comments: 10 pages, 4 figures, 11 appendix pages, 7 appendix figures

arXiv:2505.17630 [pdf, other]

GIM: Improved Interpretability for Large Language Models

Authors: Joakim Edin, Róbert Csordás, Tuukka Ruotsalo, Zhengxuan Wu, Maria Maistro, Jing Huang, Lars Maaløe

Abstract: Ensuring faithful interpretability in large language models is imperative for trustworthy and reliable AI. A key obstacle is self-repair, a phenomenon where networks compensate for reduced signal in one component by amplifying others, masking the true importance of the ablated component. While prior work attributes self-repair to layer normalization and back-up components that compensate for ablated components, we identify a novel form occurring within the attention mechanism, where softmax redistribution conceals the influence of important attention scores. This leads traditional ablation and gradient-based methods to underestimate the significance of all components contributing to these attention scores. We introduce Gradient Interaction Modifications (GIM), a technique that accounts for self-repair during backpropagation. Extensive experiments across multiple large language models (Gemma 2B/9B, LLAMA 1B/3B/8B, Qwen 1.5B/3B) and diverse tasks demonstrate that GIM significantly improves faithfulness over existing circuit identification and feature attribution methods. Our work is a significant step toward better understanding the inner mechanisms of LLMs, which is crucial for improving them and ensuring their safety. Our code is available at https://github.com/JoakimEdin/gim.

Submitted 23 May, 2025; originally announced May 2025.

MSC Class: 68T07 ACM Class: I.2.0; I.2.7

arXiv:2503.21714 [pdf, other]

As easy as PIE: understanding when pruning causes language models to disagree

Authors: Pietro Tropeano, Maria Maistro, Tuukka Ruotsalo, Christina Lioma

Abstract: Language Model (LM) pruning compresses the model by removing weights, nodes, or other parts of its architecture. Typically, pruning focuses on the resulting efficiency gains at the cost of effectiveness. However, when looking at how individual data points are affected by pruning, it turns out that a particular subset of data points always bears most of the brunt (in terms of reduced accuracy) when pruning, but this effect goes unnoticed when reporting the mean accuracy of all data points. These data points are called PIEs and have been studied in image processing, but not in NLP. In a study of various NLP datasets, pruning methods, and levels of compression, we find that PIEs impact inference quality considerably, regardless of class frequency, and that BERT is more prone to this than BiLSTM. We also find that PIEs contain a high amount of data points that have the largest influence on how well the model generalises to unseen data. This means that when pruning, with seemingly moderate loss to accuracy across all data points, we in fact hurt tremendously those data points that matter the most. We trace what makes PIEs both hard and impactful to inference to their overall longer and more semantically complex text. These findings are novel and contribute to understanding how LMs are affected by pruning. The code is available at: https://github.com/pietrotrope/AsEasyAsPIE

Submitted 27 March, 2025; originally announced March 2025.

Comments: Accepted to NAACL 2025 (Findings)

arXiv:2502.11921 [pdf, other]

doi 10.1145/3696410.3714589

Joint Evaluation of Fairness and Relevance in Recommender Systems with Pareto Frontier

Authors: Theresia Veronika Rampisela, Tuukka Ruotsalo, Maria Maistro, Christina Lioma

Abstract: Fairness and relevance are two important aspects of recommender systems (RSs). Typically, they are evaluated either (i) separately by individual measures of fairness and relevance, or (ii) jointly using a single measure that accounts for fairness with respect to relevance. However, approach (i) often does not provide a reliable joint estimate of the goodness of the models, as it has two different best models: one for fairness and another for relevance. Approach (ii) is also problematic because these measures tend to be ad-hoc and do not relate well to traditional relevance measures, like NDCG. Motivated by this, we present a new approach for jointly evaluating fairness and relevance in RSs: Distance to Pareto Frontier (DPFR). Given some user-item interaction data, we compute their Pareto frontier for a pair of existing relevance and fairness measures, and then use the distance from the frontier as a measure of the jointly achievable fairness and relevance. Our approach is modular and intuitive as it can be computed with existing measures. Experiments with 4 RS models, 3 re-ranking strategies, and 6 datasets show that existing metrics have inconsistent associations with our Pareto-optimal solution, making DPFR a more robust and theoretically well-founded joint measure for assessing fairness and relevance. Our code: https://github.com/theresiavr/DPFR-recsys-evaluation

Submitted 17 February, 2025; originally announced February 2025.

Comments: Accepted to TheWebConf/WWW 2025 (Oral)
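
Illustrative note: the abstract above describes measuring the distance from a model's (relevance, fairness) scores to a Pareto frontier. The following is only a minimal sketch of that general idea, not the authors' DPFR procedure. It assumes both measures are higher-is-better, that the frontier is taken over a small set of hypothetical candidate score pairs, and that the distance is Euclidean to the nearest frontier point. The names `pareto_frontier`, `distance_to_frontier`, and `candidates` are made up for illustration.

```python
import math

def pareto_frontier(points):
    """Return the points that no other point dominates.

    Assumption: both coordinates (relevance, fairness) are higher-is-better.
    """
    frontier = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1]
            for q in points
        )
        if not dominated:
            frontier.append(p)
    return sorted(frontier)

def distance_to_frontier(point, frontier):
    """Euclidean distance from a score pair to its nearest frontier point."""
    return min(math.dist(point, f) for f in frontier)

# Hypothetical (relevance, fairness) pairs, not taken from the paper.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.3, 0.3)]
frontier = pareto_frontier(candidates)
print(frontier)
print(distance_to_frontier((0.5, 0.5), frontier))
```

A point on the frontier has distance zero, so smaller values indicate a model closer to the best jointly achievable trade-off under these assumptions.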

arXiv:2501.18805 [pdf, ps, other]

Are Representation Disentanglement and Interpretability Linked in Recommendation Models? A Critical Review and Reproducibility Study

Authors: Ervin Dervishaj, Tuukka Ruotsalo, Maria Maistro, Christina Lioma

Abstract: Unsupervised learning of disentangled representations has been closely tied to enhancing the representation interpretability of Recommender Systems (RSs). This has been achieved by making the representation of individual features more distinctly separated, so that it is easier to attribute the contribution of features to the model's predictions. However, such advantages in interpretability and feature attribution have mainly been explored qualitatively. Moreover, the effect of disentanglement on the model's recommendation performance has been largely overlooked. In this work, we reproduce the recommendation performance, representation disentanglement and representation interpretability of five well-known recommendation models on four RS datasets. We quantify disentanglement and investigate the link of disentanglement with recommendation effectiveness and representation interpretability. While several existing works in RSs have proposed disentangled representations as a gateway to improved effectiveness and interpretability, our findings show that disentanglement is not necessarily related to effectiveness but is closely related to representation interpretability. Our code and results are publicly available at https://github.com/edervishaj/disentanglement-interpretability-recsys.

Submitted 30 January, 2025; originally announced January 2025.

Comments: Accepted at the 47th European Conference on Information Retrieval (ECIR 2025)

arXiv:2408.08137 [pdf, other]

Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability

Authors: Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro

Abstract: Deep neural network predictions are notoriously difficult to interpret. Feature attribution methods aim to explain these predictions by identifying the contribution of each input feature. Faithfulness, often evaluated using the area over the perturbation curve (AOPC), reflects feature attributions' accuracy in describing the internal mechanisms of deep neural networks. However, many studies rely on AOPC to compare faithfulness across different models, which we show can lead to false conclusions about models' faithfulness. Specifically, we find that AOPC is sensitive to variations in the model, resulting in unreliable cross-model comparisons. Moreover, AOPC scores are difficult to interpret in isolation without knowing the model-specific lower and upper limits. To address these issues, we propose a normalization approach, Normalized AOPC (NAOPC), enabling consistent cross-model evaluations and more meaningful interpretation of individual scores. Our experiments demonstrate that this normalization can radically change AOPC results, questioning the conclusions of earlier studies and offering a more robust framework for assessing feature attribution faithfulness. Our code is available at https://github.com/JoakimEdin/naopc.

Submitted 23 May, 2025; v1 submitted 15 August, 2024; originally announced August 2024.

Comments: Accepted to ACL 2025 Main

ACM Class: I.2.0
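
Illustrative note: the abstract above refers to normalizing AOPC by model-specific lower and upper limits. The sketch below shows one way such a normalization could look on a toy model. It assumes min-max normalization, assumes the limits are found by brute force over all feature-removal orderings (only feasible for very small inputs), and uses a hypothetical additive `toy_model`. It is not the authors' implementation; the function names are made up for illustration.

```python
from itertools import permutations

def aopc(model, n_features, order):
    """Mean drop in model output as features are removed in the given order."""
    present = [True] * n_features
    base = model(present)
    drops = []
    for idx in order:
        present[idx] = False
        drops.append(base - model(present))
    return sum(drops) / len(drops)

def normalized_aopc(model, n_features, order):
    """Min-max normalize AOPC using the best and worst possible orderings.

    Assumption: limits come from exhaustive search over all orderings.
    """
    scores = [aopc(model, n_features, p)
              for p in permutations(range(n_features))]
    lower, upper = min(scores), max(scores)
    if upper == lower:
        return 0.0  # every ordering scores the same; nothing to normalize
    return (aopc(model, n_features, order) - lower) / (upper - lower)

# Hypothetical toy model: output is the sum of weights of the kept features.
WEIGHTS = [3, 1, 2]

def toy_model(mask):
    return sum(w for w, keep in zip(WEIGHTS, mask) if keep)

# An attribution that ranks feature 0 first, then feature 1, then feature 2.
print(normalized_aopc(toy_model, 3, [0, 1, 2]))
```

Under these assumptions a score of 1 means the attribution ordering matches the best possible ordering for that model and 0 the worst, which is what makes scores comparable across models.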

arXiv:2406.08958 [pdf, other]

An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records

Authors: Joakim Edin, Maria Maistro, Lars Maaløe, Lasse Borgholt, Jakob D. Havtorn, Tuukka Ruotsalo

Abstract: Electronic healthcare records are vital for patient safety as they document conditions, plans, and procedures in both free text and medical codes. Language models have significantly enhanced the processing of such records, streamlining workflows and reducing manual data entry, thereby saving healthcare providers significant resources. However, the black-box nature of these models often leaves healthcare professionals hesitant to trust them. State-of-the-art explainability methods increase model transparency but rely on human-annotated evidence spans, which are costly. In this study, we propose an approach to produce plausible and faithful explanations without needing such annotations. We demonstrate on the automated medical coding task that adversarial robustness training improves explanation plausibility and introduce AttInGrad, a new explanation method superior to previous ones. By combining both contributions in a fully unsupervised setup, we produce explanations of comparable quality, or better, to that of a supervised approach. We release our code and model weights.

Submitted 28 September, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted to EMNLP 2024 Main

arXiv:2405.18276 [pdf, other]

doi 10.1145/3626772.3657832

Can We Trust Recommender System Fairness Evaluation? The Role of Fairness and Relevance

Authors: Theresia Veronika Rampisela, Tuukka Ruotsalo, Maria Maistro, Christina Lioma

Abstract: Relevance and fairness are two major objectives of recommender systems (RSs). Recent work proposes measures of RS fairness that are either independent from relevance (fairness-only) or conditioned on relevance (joint measures). While fairness-only measures have been studied extensively, we look into whether joint measures can be trusted. We collect all joint evaluation measures of RS relevance and fairness, and ask: How much do they agree with each other? To what extent do they agree with relevance/fairness measures? How sensitive are they to changes in rank position, or to increasingly fair and relevant recommendations? We empirically study for the first time the behaviour of these measures across 4 real-world datasets and 4 recommenders. We find that most of these measures: i) correlate weakly with one another and even contradict each other at times; ii) are less sensitive to rank position changes than relevance- and fairness-only measures, meaning that they are less granular than traditional RS measures; and iii) tend to compress scores at the low end of their range, meaning that they are not very expressive. We counter the above limitations with a set of guidelines on the appropriate usage of such measures, i.e., they should be used with caution due to their tendency to contradict each other and of having a very small empirical range.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Accepted to SIGIR 2024 as full paper

arXiv:2405.15272 [pdf, other]

doi 10.1109/MC.2024.3404994

Physiological Data: Challenges for Privacy and Ethics

Authors: Keith Davis, Tuukka Ruotsalo

Abstract: Wearable devices that measure and record physiological signals are now becoming widely available to the general public with ever-increasing affordability and signal quality. The data from these devices introduce serious ethical challenges that remain largely unaddressed. Users do not always understand how these data can be leveraged to reveal private information about them and developers of these devices may not fully grasp how physiological data collected today could be used in the future for completely different purposes. We discuss the potential for wearable devices, initially designed to help users improve their well-being or enhance the experience of some digital application, to be appropriated in ways that extend far beyond their original intended purpose. We identify how the currently available technology can be misused, discuss how pairing physiological data with non-physiological data can radically expand the predictive capacity of physiological wearables, and explore the implications of these expanded capacities for a variety of stakeholders.

Submitted 24 May, 2024; originally announced May 2024.

ACM Class: K.4.0; K.4.1; K.4.2

arXiv:2405.09691 [pdf, ps, other]

Modeling User Preferences via Brain-Computer Interfacing

Authors: Luis A. Leiva, V. Javier Traver, Alexandra Kawala-Sterniuk, Tuukka Ruotsalo

Abstract: Present Brain-Computer Interfacing (BCI) technology allows inference and detection of cognitive and affective states, but fairly little has been done to study scenarios in which such information can facilitate new applications that rely on modeling human cognition. One state that can be quantified from various physiological signals is attention. Estimates of human attention can be used to reveal preferences and novel dimensions of user experience. Previous approaches have tackled these incredibly challenging tasks using a variety of behavioral signals, from dwell-time to click-through data, and computational models of visual correspondence to these behavioral signals. However, behavioral signals are only rough estimations of the real underlying attention and affective preferences of the users. Indeed, users may attend to some content simply because it is salient, but not because it is really interesting, or simply because it is outrageous. With this paper, we put forward a research agenda and example work using BCI to infer users' preferences, their attentional correlates towards visual content, and their associations with affective experience. Subsequently, we link these to relevant applications, such as information retrieval, personalized steering of generative models, and crowdsourcing population estimates of affective experiences.

Submitted 31 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2402.15708 [pdf, other]

Query Augmentation by Decoding Semantics from Brain Signals

Authors: Ziyi Ye, Jingtao Zhan, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Christina Lioma, Tuukka Ruotsalo

Abstract: Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, then the effectiveness of query augmentation would be limited as well. We propose Brain-Aug, which enhances a query by incorporating semantic information decoded from brain signals. Brain-Aug generates the continuation of the original query with a prompt constructed with brain signal information and a ranking-oriented inference approach. Experimental results on fMRI (functional magnetic resonance imaging) datasets show that Brain-Aug produces semantically more accurate queries, leading to improved document ranking performance. Such improvement brought by brain signals is particularly notable for ambiguous queries.

Submitted 3 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

arXiv:2312.09803 [pdf, other]

doi 10.1109/TAFFC.2022.322588510

Contradicted by the Brain: Predicting Individual and Group Preferences via Brain-Computer Interfacing

Authors: Keith M. Davis III, Michiel Spapé, Tuukka Ruotsalo

Abstract: We investigate inferring individual preferences and the contradiction of individual preferences with group preferences through direct measurement of the brain. We report an experiment where brain activity collected from 31 participants produced in response to viewing images is associated with their self-reported preferences. First, we show that brain responses present a graded response to preferences, and that brain responses alone can be used to train classifiers that reliably estimate preferences. Second, we show that brain responses reveal additional preference information that correlates with group preference, even when participants self-reported having no such preference. Our analysis of brain responses carries significant implications for researchers in general, as it suggests an individual's explicit preferences are not always aligned with the preferences inferred from their brain responses. These findings call into question the reliability of explicit and behavioral signals. They also imply that additional, multimodal sources of information may be necessary to infer reliable preference information.

Submitted 15 December, 2023; originally announced December 2023.

Comments: 12 pages, 7 figures, published in TAFFC. IEEE Transactions on Affective Computing (2022)

arXiv:2311.09889 [pdf, other]

Language Generation from Brain Recordings

Authors: Ziyi Ye, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Min Zhang, Christina Lioma, Tuukka Ruotsalo

Abstract: Generating human language through non-invasive brain-computer interfaces (BCIs) has the potential to unlock many applications, such as serving disabled patients and improving communication. Currently, however, generating language via BCIs has been previously successful only within a classification setup for selecting pre-generated sentence continuation candidates with the most likely cortical semantic representation. Inspired by recent research that revealed associations between the brain and the large computational language models, we propose a generative language BCI that utilizes the capacity of a large language model (LLM) jointly with a semantic brain decoder to directly generate language from functional magnetic resonance imaging (fMRI) input. The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli perceived, without prior knowledge of any pre-generated candidates. We compare the language generated from the presented model with a random control, pre-generated language selection approach, and a standard LLM, which generates common coherent text solely based on the next word likelihood according to statistical language training data. The proposed model is found to generate language that is more aligned with semantic stimulus in response to which brain input is sampled. Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.

Submitted 11 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Preprint. Under Submission

arXiv:2311.01013 [pdf, other]

doi 10.1145/3631943

Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study

Authors: Theresia Veronika Rampisela, Maria Maistro, Tuukka Ruotsalo, Christina Lioma

Abstract: Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in recommender systems. Specifically, we focus solely on exposure-based fairness measures of individual items that aim to quantify the disparity in how individual items are recommended to users, separate from item relevance to users. We gather all such measures and we critically analyse their theoretical properties. We identify a series of limitations in each of them, which collectively may render the affected measures hard or impossible to interpret, to compute, or to use for comparing recommendations. We resolve these limitations by redefining or correcting the affected measures, or we argue why certain limitations cannot be resolved. We further perform a comprehensive empirical analysis of both the original and our corrected versions of these fairness measures, using real-world and synthetic datasets. Our analysis provides novel insights into the relationship between measures based on different fairness concepts, and different levels of measure sensitivity and strictness. We conclude with practical suggestions of which fairness measures should be used and when. Our code is publicly available. To our knowledge, this is the first critical comparison of individual item fairness measures in recommender systems.

Submitted 2 November, 2023; originally announced November 2023.

Comments: Accepted to ACM Transactions on Recommender Systems (TORS)

arXiv:2304.10909 [pdf, other]

doi 10.1145/3539618.3591918

Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study

Authors: Joakim Edin, Alexander Junge, Jakob D. Havtorn, Lasse Borgholt, Maria Maistro, Tuukka Ruotsalo, Lars Maaløe

Abstract: Medical coding is the task of assigning medical codes to clinical free-text documentation. Healthcare professionals manually assign such codes to track patient diagnoses and treatments. Automated medical coding can considerably alleviate this administrative burden. In this paper, we reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models. We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation. In previous work, the macro F1 score has been calculated sub-optimally, and our correction doubles it. We contribute a revised model comparison using stratified sampling and identical experimental setups, including hyperparameters and decision boundary tuning. We analyze prediction errors to validate and falsify assumptions of previous works. The analysis confirms that all models struggle with rare codes, while long documents only have a negligible impact. Finally, we present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models. We release our code, model parameters, and new MIMIC-III and MIMIC-IV training and evaluation pipelines to accommodate fair future comparisons.

Submitted 21 April, 2023; originally announced April 2023.

Comments: 11 pages, 6 figures, to be published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23), July 23-27, 2023, Taipei, Taiwan

ACM Class: H.3.0

arXiv:2108.00920

Interactive Visual Facets to Support Fluid Exploratory Search

Authors: Chen He, Luana Micallef, Barış Serim, Tung Vuong, Tuukka Ruotsalo, Giulio Jacucci

Abstract: Exploratory search starts with ill-defined goals and involves browsing, learning, and formulating new targets for search. To fluidly support such dynamic search behaviours, we focus on devising interactive visual facets (IVF), visualising information facets to support user comprehension and control of the information space. To do this, we reviewed existing faceted search interfaces and derived two design requirements (DR) that have not been fully addressed to support fluid interactions in exploratory search. We then exemplified the requirements through devising an IVF tool, which coordinates a linear and a categorical facet representing the distribution and summarisation of items, respectively, and providing context for faceted exploration (DR1). To support rapid transitions between search criteria (DR2), the tool introduces a novel design concept of using facets to select items without filtering the item space. Particularly, we propose a filter-swipe technique that enables users to drag a categorical facet value sequentially over linear facet bars to view the items in the intersection of the two facets along with the categorical facet dynamically summarizing the items in the interaction. Three applications demonstrate how the features support information discovery with ease. A user study of 11 participants with realistic email search tasks shows that dynamic suggestions through the timeline navigation can help discover useful suggestions for search; the novel design concept was favoured over using facet values as filters. Based on these practices, we derive IVF design implications for fluid, exploratory searches.

Submitted 11 September, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

Comments: The paper is incomplete

arXiv:1607.03502 [pdf, other]

doi 10.1038/srep38580

Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals

Authors: Manuel J. A. Eugster, Tuukka Ruotsalo, Michiel M. Spapé, Oswald Barral, Niklas Ravaja, Giulio Jacucci, Samuel Kaski

Abstract: Finding relevant information from large document collections such as the World Wide Web is a common task in our daily lives. Estimation of a user's interest or search intention is necessary to recommend and retrieve relevant information from these collections. We introduce a brain-information interface used for recommending information by relevance inferred directly from brain signals. In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual's search intent was modeled and successfully used for retrieving new, relevant documents from the whole English Wikipedia corpus. The results show that the users' interests towards digital content can be modeled from the brain signals evoked by reading. The introduced brain-relevance paradigm enables the recommendation of information without any explicit user interaction, and may be applied across diverse information-intensive applications.

Submitted 12 July, 2016; originally announced July 2016.

Journal ref: Scientific Reports 6, Article number: 38580 (2016)

Showing 1–17 of 17 results for author: Ruotsalo, T