Showing 1–11 of 11 results for author: Slezak, D F

Searching in archive cs.
  1. arXiv:2501.03381  [pdf, other]

    cs.SC

    THOI: An efficient and accessible library for computing higher-order interactions enhanced by batch-processing

    Authors: Laouen Belloli, Pedro Mediano, Rodrigo Cofré, Diego Fernandez Slezak, Rubén Herzog

    Abstract: Complex systems are characterized by nonlinear dynamics, multi-level interactions, and emergent collective behaviors. Traditional analyses that focus solely on pairwise interactions often oversimplify these systems, neglecting the higher-order interactions critical for understanding their full collective dynamics. Recent advances in multivariate information theory provide a principled framework fo…

    Submitted 6 January, 2025; originally announced January 2025.

    Comments: 22 pages, 6 figures

  2. arXiv:2301.00792  [pdf, other]

    cs.CL cs.AI

    The Undesirable Dependence on Frequency of Gender Bias Metrics Based on Word Embeddings

    Authors: Francisco Valentini, Germán Rosati, Diego Fernandez Slezak, Edgar Altszyler

    Abstract: Numerous works use word embedding-based metrics to quantify societal biases and stereotypes in texts. Recent studies have found that word embeddings can capture semantic similarity but may be affected by word frequency. In this work we study the effect of frequency when measuring female vs. male gender bias with word embedding-based bias quantification methods. We find that Skip-gram with negative…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: Camera Ready for EMNLP 2022 (Findings)

  3. arXiv:2211.08203  [pdf, other]

    cs.CL

    Investigating the Frequency Distortion of Word Embeddings and Its Impact on Bias Metrics

    Authors: Francisco Valentini, Juan Cruz Sosa, Diego Fernandez Slezak, Edgar Altszyler

    Abstract: Recent research has shown that static word embeddings can encode word frequency information. However, little has been studied about this phenomenon and its effects on downstream tasks. In the present work, we systematically study the association between frequency and semantic similarity in several static word embeddings. We find that Skip-gram, GloVe and FastText embeddings tend to produce higher…

    Submitted 19 October, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Camera Ready for EMNLP 2023 (Findings)

  4. arXiv:2105.01570  [pdf, other]

    q-bio.NC cs.SD eess.AS

    Simple and Cheap Setup for Timing Tapping Responses Synchronized to Auditory Stimuli

    Authors: Martin Miguel, Pablo Riera, Diego Fernandez Slezak

    Abstract: Measuring human capabilities to synchronize in time, adapt to perturbations to timing sequences or reproduce time intervals often requires experimental setups that allow recording response times with millisecond precision. Most setups present auditory stimuli using either MIDI devices or specialized hardware such as Arduino and are often expensive or require calibration and advanced programming ski…

    Submitted 16 July, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

  5. arXiv:2104.06474  [pdf, other]

    cs.CL

    On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach

    Authors: Francisco Valentini, Germán Rosati, Damián Blasi, Diego Fernandez Slezak, Edgar Altszyler

    Abstract: In recent years, word embeddings have been widely used to measure biases in texts. Even if they have proven to be effective in detecting a wide variety of biases, metrics based on word embeddings lack transparency and interpretability. We analyze an alternative PMI-based metric to quantify biases in texts. It can be expressed as a function of conditional probabilities, which provides a simple inte…

    Submitted 18 July, 2023; v1 submitted 13 April, 2021; originally announced April 2021.

    Comments: Camera Ready for ACL 2023 (main conference)

  6. arXiv:2009.04985  [pdf, other]

    eess.IV cs.CV cs.LG

    Unsupervised Domain Adaptation via CycleGAN for White Matter Hyperintensity Segmentation in Multicenter MR Images

    Authors: Julian Alberto Palladino, Diego Fernandez Slezak, Enzo Ferrante

    Abstract: Automatic segmentation of white matter hyperintensities in magnetic resonance images is of paramount clinical and research importance. Quantification of these lesions serves as a predictor for risk of stroke, dementia and mortality. During the last years, convolutional neural networks (CNN) specifically tailored for biomedical image segmentation have outperformed all previous techniques in this tas…

    Submitted 10 September, 2020; originally announced September 2020.

    Comments: Accepted for publication in the International Seminar on Medical Information Processing and Analysis (SIPAIM 2020)

  7. arXiv:1903.03445  [pdf, other]

    cs.CV cs.AI cs.LG

    Joint Learning of Brain Lesion and Anatomy Segmentation from Heterogeneous Datasets

    Authors: Nicolas Roulet, Diego Fernandez Slezak, Enzo Ferrante

    Abstract: Brain lesion and anatomy segmentation in magnetic resonance images are fundamental tasks in neuroimaging research and clinical practice. Given enough training data, convolutional neural networks (CNN) proved to outperform all existing techniques in both tasks independently. However, to date, little work has been done regarding simultaneous learning of brain lesion and anatomy segmentation from d…

    Submitted 15 April, 2019; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: Accepted for publication at MIDL 2019. Open reviews available at: https://openreview.net/forum?id=Syest0rxlN

  8. arXiv:1712.10054  [pdf, ps, other]

    cs.CL cs.AI

    Corpus specificity in LSA and Word2vec: the role of out-of-domain documents

    Authors: Edgar Altszyler, Mariano Sigman, Diego Fernandez Slezak

    Abstract: Latent Semantic Analysis (LSA) and Word2vec are some of the most widely used word embeddings. Despite the popularity of these techniques, the precise mechanisms by which they acquire new semantic relations between words remain unclear. In the present article we investigate whether the capacity of LSA and Word2vec to identify relevant semantic dimensions increases with corpus size. One intuitive hypoth…

    Submitted 28 December, 2017; originally announced December 2017.

    Journal ref: Proceedings of the 3rd Workshop on Representation Learning for NLP, pages 1-10, 2018, ACL

  9. arXiv:1612.09268  [pdf]

    q-bio.NC cs.CL physics.soc-ph

    The ontogeny of discourse structure mimics the development of literature

    Authors: Natalia Bezerra Mota, Sylvia Pinheiro, Mariano Sigman, Diego Fernandez Slezak, Guillermo Cecchi, Mauro Copelli, Sidarta Ribeiro

    Abstract: Discourse varies with age, education, psychiatric state and historical epoch, but the ontogenetic and cultural dynamics of discourse structure remain to be quantitatively characterized. To this end we investigated word graphs obtained from verbal reports of 200 subjects ages 2-58, and 676 literary texts spanning ~5,000 years. In healthy subjects, lexical diversity, graph size, and long-range recur…

    Submitted 27 December, 2016; originally announced December 2016.

    Comments: Natalia Bezerra Mota and Sylvia Pinheiro: Equal contribution. Sidarta Ribeiro and Mauro Copelli: Corresponding authors.

  10. Comparative study of LSA vs Word2vec embeddings in small corpora: a case study in dreams database

    Authors: Edgar Altszyler, Mariano Sigman, Sidarta Ribeiro, Diego Fernández Slezak

    Abstract: Word embeddings have been extensively studied in large text datasets. However, only a few studies analyze semantic representations of small corpora, particularly relevant in single-person text production studies. In the present paper, we compare Skip-gram and LSA capabilities in this scenario, and we test both techniques to extract relevant semantic patterns in single-series dreams reports. LSA sh…

    Submitted 11 April, 2017; v1 submitted 5 October, 2016; originally announced October 2016.

    Journal ref: Conscious Cogn. 2017 Nov;56:178-187

  11. arXiv:1606.02231  [pdf, other]

    cs.AI stat.AP

    Emotional Intensity analysis in Bipolar subjects

    Authors: Facundo Carrillo, Natalia Mota, Mauro Copelli, Sidarta Ribeiro, Mariano Sigman, Guillermo Cecchi, Diego Fernandez Slezak

    Abstract: The massive availability of digital repositories of human thought opens a radically novel way of studying the human mind. Natural language processing tools and computational models have evolved such that many mental conditions are predicted by analysing speech. Transcriptions of interviews and discourses are analyzed using syntactic, grammatical or sentiment analysis to infer the mental state. Here we…

    Submitted 7 June, 2016; originally announced June 2016.

    Comments: Presented at MLINI-2015 workshop, 2015 (arXiv:cs/0101200)

    Report number: MLINI/2015/19