-
Transformer-based classification of user queries for medical consultancy with respect to expert specialization
Abstract: The need for skilled medical support is growing in the era of digital healthcare. This research presents an innovative strategy, utilizing the RuBERT model, for categorizing user inquiries in the field of medical consultation with a focus on expert specialization. By harnessing the capabilities of transformers, we fine-tuned the pre-trained RuBERT model on a varied dataset, which facilitates preci…
Submitted 2 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.
Comments: 16 pages, 5 figures
-
arXiv:2209.12256
On the Cryptomorphism between Davis' Subset Lattices, Atomic Lattices, and Closure Systems under T1 Separation Axiom
Abstract: In this paper we count set closure systems (also known as Moore families) for the case when all single element sets are closed. In particular, we give the numbers of such strict (empty set included) and non-strict families for the base set of size $n=6$. We also provide the number of such inequivalent Moore families with respect to all permutations of the base set up to $n=6$. The search in OEIS a…
Submitted 25 September, 2022; originally announced September 2022.
MSC Class: 06-04; 06A07; 06A15 ACM Class: G.2.1
-
On Interpretability and Similarity in Concept-Based Machine Learning
Abstract: Machine Learning (ML) provides important techniques for classification and predictions. Most of these are black-box models for users and do not provide decision-makers with an explanation. For the sake of transparency or greater validity of decisions, the need to develop explainable/interpretable ML methods is gaining more and more importance. Certain questions need to be addressed: How does an ML…
Submitted 25 February, 2021; originally announced February 2021.
Comments: Invited Talk at AIST 2020
MSC Class: 06A15; 06B99; 68T05; 91A80 ACM Class: I.2.6; I.2.4; I.5.3
-
Triclustering in Big Data Setting
Abstract: In this paper, we describe versions of triclustering algorithms adapted for efficient calculations in distributed environments with the MapReduce model or the parallelisation mechanisms provided by modern programming languages. The OAC family of triclustering algorithms shows good parallelisation capabilities due to the independent processing of triples of a triadic formal context. We provide the time and spac…
Submitted 24 October, 2020; originally announced October 2020.
Comments: The paper contains an extended version of the prior work presented at the workshop on FCA in the Big Data Era held on June 25, 2019 at Frankfurt University of Applied Sciences, Frankfurt, Germany
MSC Class: 68T09; 05C65; 62H30 ACM Class: I.5.3; G.2.2; I.2.6; H.2.8; D.1.3
Journal ref: LNCS (2020)
-
Object-Attribute Biclustering for Elimination of Missing Genotypes in Ischemic Stroke Genome-Wide Data
Abstract: Missing genotypes can affect the efficacy of machine learning approaches to identify the risk genetic variants of common diseases and traits. The problem occurs when genotypic data are collected from different experiments with different DNA microarrays, each being characterised by its pattern of uncalled (missing) genotypes. This can prevent the machine learning classifier from assigning the class…
Submitted 25 October, 2020; v1 submitted 22 October, 2020; originally announced October 2020.
Comments: Accepted to AIST 2020
MSC Class: 92D20; 62H30; 68T10 ACM Class: I.2.6; I.5.3; I.2.1; J.3
Journal ref: AIST 2020 (CCIS series)
-
DaNetQA: a yes/no Question Answering Dataset for the Russian Language
Abstract: DaNetQA, a new question-answering corpus, follows the design of Clark et al. (2019): it comprises natural yes/no questions. Each question is paired with a paragraph from Wikipedia and an answer, derived from the paragraph. The task is to take both the question and a paragraph as input and come up with a yes/no answer, i.e. to produce a binary output. In this paper, we present a reproducible approach to…
Submitted 15 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.
Comments: Analysis of Images, Social Networks and Texts - 9th International Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised Selected Papers. Lecture Notes in Computer Science (https://dblp.org/db/series/lncs/index.html), Springer 2020
-
Mixed Integer Programming for Searching Maximum Quasi-Bicliques
Abstract: This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its "almost" complete subgraph. The relaxation of completeness can be understood variously; here, we assume that the subgraph is a $\gamma$-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least $\gamma \in (0,1]$. Fo…
Submitted 23 February, 2020; originally announced February 2020.
Comments: This paper draft is stored here for self-archiving purposes
MSC Class: 68R10; 90C11 ACM Class: G.2.2; I.5.3
Journal ref: Springer Proceedings in Mathematics & Statistics, vol 315. Springer, Cham (2020)
-
arXiv:1902.02380
Compression of Recurrent Neural Networks for Efficient Language Modeling
Abstract: Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks including Long Short-Term Memory models. We pay particular attention to t…
Submitted 6 February, 2019; originally announced February 2019.
Comments: 25 pages, 3 tables, 4 figures
-
arXiv:1708.05963
Neural Networks Compression for Language Modeling
Abstract: In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g., LSTM-based networks in language modeling, are characterized by either high space complexity or substantial inference time. This problem is especially crucial for mobile applications, in which the constant interaction with…
Submitted 20 August, 2017; originally announced August 2017.
Comments: Keywords: LSTM, RNN, language modeling, low-rank factorization, pruning, quantization. Published by Springer in the LNCS series, 7th International Conference on Pattern Recognition and Machine Intelligence, 2017
MSC Class: 62M45; 68T50 ACM Class: I.2.7, I.2.6, I.5.1, I.5.4
-
Introduction to Formal Concept Analysis and Its Applications in Information Retrieval and Related Fields
Abstract: This paper is a tutorial on Formal Concept Analysis (FCA) and its applications. FCA is an applied branch of Lattice Theory, a mathematical discipline which enables formalisation of concepts as basic units of human thinking and analysing data in the object-attribute form. Originating in the early 1980s, it has become over the last three decades a popular human-centred tool for knowledge representation and…
Submitted 8 March, 2017; originally announced March 2017.
MSC Class: 68P20; 06B99; 68T30 ACM Class: H.3.3; G.2; I.2
Journal ref: RuSSIR 2014, Nizhniy Novgorod, Russia, CCIS vol. 505, Springer 42-141
-
Multimodal Clustering for Community Detection
Abstract: Multimodal clustering is an unsupervised technique for mining interesting patterns in $n$-adic binary relations or $n$-mode networks. Among different types of such generalized patterns one can find biclusters and formal concepts (maximal bicliques) for 2-mode case, triclusters and triconcepts for 3-mode case, closed $n$-sets for $n$-mode case, etc. Object-attribute biclustering (OA-biclustering) f…
Submitted 27 February, 2017; originally announced February 2017.
MSC Class: 62H30; 91C20 ACM Class: I.5.3; J.4
Journal ref: Lecture Notes in Social Networks. Formal Concept Analysis of Social Networks. Eds.: Kuznetsov, Missaoui, Obiedkov, Springer, 2017
-
Towards a Unified Taxonomy of Biclustering Methods
Abstract: Being an unsupervised machine learning and data mining technique, biclustering and its multimodal extensions are becoming popular tools for analysing object-attribute data in different domains. Unlike conventional clustering techniques, biclustering searches for homogeneous groups of objects while keeping their common description, e.g., in the binary setting, their shared attributes. In bioinf…
Submitted 17 February, 2017; originally announced February 2017.
Comments: http://ceur-ws.org/Vol-1552/
MSC Class: 06B99; 62H30 ACM Class: I.5.3; H.2.8; I.2.6; I.2.4
Journal ref: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015), November 30 - December 5, 2015, Stellenbosch, South Africa, In CEUR Workshop Proceedings, Vol. 1552, p. 23-39
-
Bayesian Learning of Consumer Preferences for Residential Demand Response
Abstract: In coming years residential consumers will face real-time electricity tariffs with energy prices varying day to day, and effective energy saving will require automation: a recommender system which learns a consumer's preferences from her actions. A consumer chooses a scenario of home appliance use to balance her comfort level and the energy bill. We propose a Bayesian learning algorithm to estimat…
Submitted 27 January, 2017; originally announced January 2017.
MSC Class: 68T05 Learning and adaptive systems ACM Class: I.2.6
Journal ref: IFAC-PapersOnLine, 49(32), 2016, p. 24-29, ISSN 2405-8963
-
On closure operators related to maximal tricliques in tripartite hypergraphs
Abstract: Triadic Formal Concept Analysis (3FCA) was introduced by Lehmann and Wille almost two decades ago. Many researchers in Data Mining and Formal Concept Analysis use the notions of closed sets, Galois and closure operators, and closure systems; yet to date, even though different researchers actively work on mining triadic and n-ary relations, a proper closure operator for enumeration of…
Submitted 26 February, 2017; v1 submitted 23 February, 2016; originally announced February 2016.
Comments: Draft for spec. issue of DAM (2015)
MSC Class: 06B99; 06A15 ACM Class: G.2.2; H.2.8
-
arXiv:1508.03856
Two-stage Cascaded Classifier for Purchase Prediction
Abstract: In this paper we describe our machine learning solution for the RecSys Challenge 2015. We have proposed a time-efficient two-stage cascaded classifier for the prediction of buy sessions and purchased items within such sessions. Based on the model, several interesting features we found, and our own test bed, we have achieved a reasonable score. Usage of Random Forests helps us to cope wi…
Submitted 16 August, 2015; originally announced August 2015.
-
arXiv:1507.05497
RAPS: A Recommender Algorithm Based on Pattern Structures
Abstract: We propose a new algorithm for recommender systems with numeric ratings which is based on Pattern Structures (RAPS). As input, the algorithm takes a rating matrix, e.g., one containing movies rated by users. For a target user, the algorithm returns a rated list of items (movies) based on the user's previous ratings and the ratings of other users. We compare the results of the proposed algorithm in te…
Submitted 20 July, 2015; originally announced July 2015.
Comments: The paper presented at FCA4AI 2015 in conjunction with IJCAI 2015
MSC Class: 06F99 ACM Class: H.3.3; H.2.8; I.5.4
-
Can FCA-based Recommender System Suggest a Proper Classifier?
Abstract: The paper briefly introduces multiple classifier systems and describes a new algorithm, which improves classification accuracy by recommending a proper algorithm for classifying a given object. This recommendation is made on the assumption that a classifier is likely to predict the label of the object correctly if it has correctly classified its neighbors. The process of assigning a classifier t…
Submitted 21 April, 2015; originally announced April 2015.
Comments: 10 pages, 1 figure, 4 tables, ECAI 2014, workshop "What FCA can do for Artificial Intelligence"
MSC Class: 62-07
Journal ref: CEUR Workshop Proceedings, 1257, pp. 17-26 (2014)
-
arXiv:1412.4726
Experimental economics for web mining
Abstract: This paper offers a step towards a research infrastructure which makes data from experimental economics efficiently usable for the analysis of web data. We believe that regularities of human behavior found in experimental data also emerge in real-world web data. A format for data from experiments is suggested, which enables its publication as open data. Once standardized datasets of experiments are ava…
Submitted 15 December, 2014; originally announced December 2014.
Comments: 3 pages, 2 tables
-
arXiv:1402.5593
Reciprocity in Gift-Exchange-Games
Abstract: This paper presents an analysis of data from a gift-exchange-game experiment. The experiment was described in 'The Impact of Social Comparisons on Reciprocity' by Gächter et al. (2012). Since this paper uses state-of-the-art data science techniques, the results provide a different point of view on the problem. As already shown in relevant literature from experimental economics, human decisions deviate f…
Submitted 23 February, 2014; originally announced February 2014.
Comments: 6 pages, 2 figures, 5 tables
Journal ref: Experimental Economics and Machine Learning 2016, CEUR-WS Vol-1627, urn:nbn:de:0074-1627-1
-
A Typology of Collaboration Platform Users
Abstract: In this paper we present a review of the existing typologies of Internet service users. We zoom in on social networking services including blogs and crowdsourcing websites. Based on the results of the analysis of the considered typologies obtained by means of FCA, we developed a new user typology of a certain class of Internet services, namely a collaboration innovation platform. Cluster analysis o…
Submitted 30 November, 2013; originally announced December 2013.
MSC Class: 68U35; 91D30 ACM Class: K.4.3
Journal ref: R. Tagiew et al. (Eds.) Proc. of Int. Workshop on Experimental Economics in Machine Learning 2012. Published by KU-Leuven, ISBN 978-9-08-140992-6, pp. 9-19
-
An FCA-based Boolean Matrix Factorisation for Collaborative Filtering
Abstract: We propose a new approach for Collaborative Filtering which is based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (Movielens dataset) we compare the approach with the SVD- and NMF-based algorithms in terms of Mean Absolute Error (MAE). One of the experimental consequences is that it is enough to have binary-scaled rating data to obtain…
Submitted 16 October, 2013; originally announced October 2013.
Comments: http://ceur-ws.org/Vol-977/paper8.pdf
MSC Class: 06B99; 03G10; 15B34 ACM Class: H.2.8; H.2.3
Journal ref: In: C. Carpineto, A. Napoli, S.O. Kuznetsov (eds), FCA Meets IR 2013, Vol. 977, CEUR Workshop Proceedings, 2013. P. 57-73
-
Recommender System Based on Algorithm of Bicluster Analysis RecBi
Abstract: In this paper we propose two new algorithms based on biclustering analysis, which can be used as the basis of a recommender system for the educational orientation of Russian school graduates. The first algorithm was designed to help students make a choice between different university faculties when some of their preferences are known. The second algorithm was developed for the special situation when n…
Submitted 13 February, 2012; originally announced February 2012.
MSC Class: 68T05 ACM Class: H.2.8
Journal ref: CEUR Workshop proceedings Vol-757, CDUD'11 - Concept Discovery in Unstructured Data, pp. 122-126, 2011
-
Concept-based Recommendations for Internet Advertisement
Abstract: The problem of detecting terms that can be interesting to the advertiser is considered. If a company has already bought some advertising terms which describe certain services, it is reasonable to find out the terms bought by competing companies. Some of them can be recommended as future advertising terms to the company. The goal of this work is to propose more interpretable recommendations b…
Submitted 26 June, 2009; originally announced June 2009.
Comments: D.I. Ignatov, S.O. Kuznetsov. Concept-based Recommendations for Internet Advertisement. In: Proceedings of The Sixth International Conference Concept Lattices and Their Applications (CLA'08), Olomouc, Czech Republic, 2008. ISBN 978-80-244-2111-7
ACM Class: I.2.1; H.2.8
-
arXiv:0905.1424
Concept Stability for Constructing Taxonomies of Web-site Users
Abstract: Owners of a web-site are often interested in analysis of groups of users of their site. Information on these groups can help optimize the structure and contents of the site. In this paper we use an approach based on formal concepts for constructing taxonomies of user groups. To reduce the huge number of concepts that arise in applications, we employ the stability index of a concept, which descr…
Submitted 24 November, 2016; v1 submitted 9 May, 2009; originally announced May 2009.
Comments: Sergei O. Kuznetsov, D.I. Ignatov, Concept Stability for Constructing Taxonomies of Web-site users, in Proc. Social Network Analysis and Conceptual Structures: Exploring Opportunities, S. Obiedkov, C. Roth (Eds.), Clermont-Ferrand (France), February 16, 2007
ACM Class: H.2.8; J.4