-
VisQualdex -- the comprehensive guide to good data visualization
Authors:
Jan Sawicki,
Michał Burdukiewicz
Abstract:
The rapid influx of low-quality data visualisations is one of the main challenges in today's communication. Misleading, unreadable, or confusing visualisations spread misinformation and fail to fulfill their purpose. The lack of proper tooling further compounds the problem of quality assessment. We therefore propose VisQualdex, a systematic set of guidelines inspired by the Grammar of Graphics for evaluating the quality of data visualisations. To increase the practical impact of VisQualdex, we make these guidelines available as a web server, visqual.info.
Submitted 14 April, 2023; v1 submitted 21 January, 2022;
originally announced January 2022.
-
Exploring usability of Reddit in data science and knowledge processing
Authors:
Jan Sawicki,
Maria Ganzha,
Marcin Paprzycki,
Amelia Bădică
Abstract:
This contribution argues that Reddit, as a massive, categorized, open-access dataset, is a useful data source for "almost any topic". Hence, it can be used in data science, e.g. for knowledge exploration. This claim is backed up by an analysis based on 180 manually annotated papers related to Reddit itself and on data acquired from popular databases of scientific papers. Finally, an open-source tool is introduced that provides easy access to Reddit resources and an exploratory data analysis of how Reddit covers selected topics. These functions can serve as a preliminary analysis before a broader exploration of Reddit's applicability.
Submitted 14 April, 2023; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Approximation of the objective insensitivity regions using Hierarchic Memetic Strategy coupled with Covariance Matrix Adaptation Evolutionary Strategy
Authors:
Jakub Sawicki,
Maciej Smołka,
Marcin Łoś,
Robert Schaefer
Abstract:
One of the most challenging types of ill-posedness in global optimization is the presence of insensitivity regions in the design parameter space, so identifying their shape is crucial if the ill-posedness is irrecoverable. Such problems may be solved using a global stochastic search followed by post-processing of a local sample and a local objective approximation. We propose a new approach of this type, composed of the Hierarchic Memetic Strategy (HMS) powered by the Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES), well known as an effective, self-adaptable stochastic optimization algorithm, and we leverage the distribution density knowledge it accumulates to better identify and separate insensitivity regions. The benchmark results show that the improved HMS-CMA-ES strategy is effective in terms of both total computational cost and accuracy of insensitivity region approximation. The reference data for the tests were obtained by means of a well-known, effective strategy of multimodal stochastic optimization called the Niching Evolutionary Algorithm 2 (NEA2), which also uses CMA-ES as a component.
Submitted 17 May, 2019;
originally announced May 2019.