-
FA*IR: A Fair Top-k Ranking Algorithm
Authors:
Meike Zehlike,
Francesco Bonchi,
Carlos Castillo,
Sara Hajian,
Mohamed Megahed,
Ricardo Baeza-Yates
Abstract:
In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n >> k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum.
Utility is operationalized in two ways: (i) every candidate included in the top-k should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above. An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria.
To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.
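The ranked group fairness idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' released implementation: it assumes the statistical test is a one-sided binomial test with a target minimum proportion `p` and significance level `alpha` (both names are assumptions), it ignores any correction for testing many prefixes, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(m, n, p):
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1))


def min_protected_table(k, p, alpha):
    """For each prefix length 1..k, the smallest number of protected
    candidates whose binomial CDF value exceeds alpha (assumed test)."""
    table = []
    for i in range(1, k + 1):
        m = 0
        while binom_cdf(m, i, p) <= alpha:
            m += 1
        table.append(m)
    return table


def fair_top_k(protected, unprotected, k, p, alpha):
    """Greedy merge of two score lists into a ranking of (score, is_protected).

    At each position a protected candidate is forced in if the prefix would
    otherwise fall below the table; otherwise the best remaining candidate
    from either group is taken.
    """
    table = min_protected_table(k, p, alpha)
    prot = sorted(protected, reverse=True)
    unprot = sorted(unprotected, reverse=True)
    ranking, ip, iu, count = [], 0, 0, 0
    for i in range(k):
        have_p, have_u = ip < len(prot), iu < len(unprot)
        if not have_p and not have_u:
            break  # both pools exhausted
        need = count < table[i]
        # If protected candidates run out, the constraint cannot be met and
        # the sketch simply falls back to unprotected candidates.
        take_p = have_p and (need or not have_u or prot[ip] >= unprot[iu])
        if take_p:
            ranking.append((prot[ip], True))
            ip += 1
            count += 1
        else:
            ranking.append((unprot[iu], False))
            iu += 1
    return ranking


if __name__ == "__main__":
    print(min_protected_table(10, 0.5, 0.1))
    print(fair_top_k([0.5, 0.4, 0.3], [0.9, 0.8, 0.7, 0.6, 0.55, 0.52], 6, 0.5, 0.1))
```

Under these assumptions, `p = 0.5` and `alpha = 0.1` give the table `[0, 0, 0, 1, 1, 1, 2, 2, 3, 3]` for `k = 10`, so the first protected candidate is forced in at position 4 at the latest.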
Submitted 2 July, 2018; v1 submitted 20 June, 2017;
originally announced June 2017.
-
A New Model of Thermal Propagation in Human Tissue by Using HIFU Application
Authors:
Saeed Reza Hajian,
Ali Abbaspour Tehrani Fard,
Majid Pouladian,
Gholam Reza Hemmasi
Abstract:
In extracorporeal HIFU treatment, focused ultrasound beams strike the cancerous tissue layer, especially soft tissue, with high intensity. As the beams pass through the different layers of the body on their way to the tumor, they subject the tissue along their path to mechanical and even thermal effects, and they can cause skin lesions. To reduce this effect, body tissue can be treated as a mechanical model: it is affected as mechanical sound waves pass through it, and each layer has an average heat. The sound intensity decreases gradually with each layer it passes, so along any one direction a sound wave of reduced intensity reaches the tumor tissue. If the number of propagation directions is increased, many waves of reduced intensity converge on the tumor tissue, concentrating a large amount of heat there. Depending on the type and mechanical properties of the tissue, the intensity of each sound wave as it passes through the tissue can be controlled to reduce damage outside the tumor tissue.
Submitted 17 November, 2016;
originally announced November 2016.
-
Modeling pressure distribution and heat in the body tissue and extract the relationship between them in order to improve treatment planning in HIFU
Authors:
Saeed Reza Hajian,
Ali Abbaspour Tehrani Fard,
Majid Pouladian,
Gholam Reza Hemmasi
Abstract:
In high intensity focused ultrasound (HIFU) systems, which use non-ionizing methods for cancer treatment, an externally applied HIFU beam can damage nearby healthy tissue and burn the skin when the viscoelastic properties of the patient's tissue are unknown and the physical properties of tissue are not considered in treatment planning. Elastography, using methods such as MRI or ultrasound, can effectively measure the viscoelastic properties of tissue and fits within the pattern of stimulation and overall treatment planning. In this paper, we consider a linear path of HIFU propagation and model the smallest part of the path, a voxel, with three mechanical elements (mass, spring, and damper) that represent the viscoelastic properties of tissue. By creating HIFU waves in the wire environment of MATLAB mechanics and stimulating these elements, the pressure and heat transfer due to stimulation in the hypothetical voxel were obtained. Tissue is built up by repeating these three-dimensional elements. The measurement was performed on three layers. The values of these elements were measured for sheep liver and kidney tissue in a practical example outside the body, and pressure and heat for three layers of liver and kidney tissue of an organism were obtained by applying ultrasound signals to the designed model. This procedure was repeated in three different directions, and the results were compared with simulation software for ultrasound, used as a reference for U.S. Food and Drug Administration (FDA) measures for HIFU, as well as with an operational method for an HIFU cell.
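The mass, spring, and damper voxel described above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the authors' MATLAB model: it integrates the standard single-degree-of-freedom equation m x'' + c x' + k x = F(t) with a semi-implicit Euler step, and it uses the energy dissipated in the damper as a stand-in for heat. The function name, the parameter values, and the heat proxy are all hypothetical.

```python
def simulate_voxel(mass, damping, stiffness, force, dt, steps):
    """Integrate m*x'' + c*x' + k*x = F(t) with semi-implicit Euler.

    Returns the final displacement and the energy dissipated in the damper,
    used here as an assumed proxy for heat deposited in the voxel.
    """
    x, v, heat = 0.0, 0.0, 0.0
    for n in range(steps):
        t = n * dt
        accel = (force(t) - damping * v - stiffness * x) / mass
        v += dt * accel            # update velocity first
        x += dt * v                # then position, using the new velocity
        heat += damping * v * v * dt  # damper power c*v^2 integrated over time
    return x, heat


if __name__ == "__main__":
    # Hypothetical values: a constant unit load applied for 10 s.
    x_final, heat = simulate_voxel(
        mass=1.0, damping=2.0, stiffness=100.0,
        force=lambda t: 1.0, dt=1e-3, steps=10_000,
    )
    print(x_final, heat)
```

For a constant load the displacement settles near F/k, and the dissipated energy approaches the difference between the work done by the load and the energy stored in the spring. A layered tissue model would chain several such voxels, which this sketch does not attempt.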
Submitted 25 October, 2016;
originally announced October 2016.
-
Analysis of Schwarz methods for a hybridizable discontinuous Galerkin discretization: the many subdomain case
Authors:
Martin J. Gander,
Soheil Hajian
Abstract:
Schwarz methods are attractive parallel solution techniques for solving large-scale linear systems obtained from discretizations of partial differential equations (PDEs). Due to the iterative nature of Schwarz methods, convergence rates are an important criterion to quantify their performance. Optimized Schwarz methods (OSM) form a class of Schwarz methods that are designed to achieve faster convergence rates by employing optimized transmission conditions between subdomains. It has been shown recently that for a two-subdomain case, OSM is a natural solver for hybridizable discontinuous Galerkin (HDG) discretizations of elliptic PDEs. In this paper, we generalize the preceding result to the many-subdomain case and obtain sharp convergence rates with respect to the mesh size and polynomial degree, the subdomain diameter, and the zeroth-order term of the underlying PDE, which allows us for the first time to give precise convergence estimates for OSM used to solve parabolic problems by implicit time stepping. We illustrate our theoretical results with numerical experiments.
Submitted 11 May, 2017; v1 submitted 13 March, 2016;
originally announced March 2016.
-
Exposing the Probabilistic Causal Structure of Discrimination
Authors:
Francesco Bonchi,
Sara Hajian,
Bud Mishra,
Daniele Ramazzotti
Abstract:
Discrimination discovery from data is an important task aiming at identifying patterns of illegal and unethical discriminatory activities against protected-by-law groups, e.g., ethnic minorities. While any legally-valid proof of discrimination requires evidence of causality, the state-of-the-art methods are essentially correlation-based, albeit, as it is well known, correlation does not imply causation.
In this paper we take a principled causal approach to the data mining problem of discrimination detection in databases. Following Suppes' probabilistic causation theory, we define a method to extract, from a dataset of historical decision records, the causal structures existing among the attributes in the data. The result is a type of constrained Bayesian network, which we dub Suppes-Bayes Causal Network (SBCN). Next, we develop a toolkit of methods based on random walks on top of the SBCN, addressing different anti-discrimination legal concepts, such as direct and indirect discrimination, group and individual discrimination, genuine requirement, and favoritism. Our experiments on real-world datasets confirm the inferential power of our approach in all these different tasks.
Submitted 8 March, 2017; v1 submitted 2 October, 2015;
originally announced October 2015.
-
Analysis of Schwarz methods for a hybridizable discontinuous Galerkin discretization
Authors:
Martin J. Gander,
Soheil Hajian
Abstract:
Schwarz methods are attractive parallel solvers for large scale linear systems obtained when partial differential equations are discretized. For hybridizable discontinuous Galerkin (HDG) methods, this is a relatively new field of research, because HDG methods impose continuity across elements using a Robin condition, while classical Schwarz solvers use Dirichlet transmission conditions. Robin conditions are used in optimized Schwarz methods to get faster convergence compared to classical Schwarz methods, and this even without overlap, when the Robin parameter is well chosen. We present in this paper a rigorous convergence analysis of Schwarz methods for the concrete case of hybridizable interior penalty (IPH) method. We show that the penalization parameter needed for convergence of IPH leads to slow convergence of the classical additive Schwarz method, and propose a modified solver which leads to much faster convergence. Our analysis is entirely at the discrete level, and thus holds for arbitrary interfaces between two subdomains. We then generalize the method to the case of many subdomains, including cross points, and obtain a new class of preconditioners for Krylov subspace methods which exhibit better convergence properties than the classical additive Schwarz preconditioner. We illustrate our results with numerical experiments.
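To make the classical Schwarz iteration with Dirichlet transmission conditions concrete, here is a minimal sketch. It is not the HDG or IPH setting analyzed in the paper: it assumes a 1D model problem, -u'' = 1 on (0, 1) with zero boundary values, a standard finite-difference discretization, and two overlapping subdomains. All function names and the choice of overlap are hypothetical.

```python
def solve_dirichlet(f, left, right, h):
    """Solve -u'' = f on a run of interior nodes with Dirichlet data
    `left` and `right`, using the Thomas algorithm on tridiag(-1, 2, -1)."""
    n = len(f)
    rhs = [h * h * fi for fi in f]
    rhs[0] += left
    rhs[-1] += right
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def alternating_schwarz(n, a, b, sweeps):
    """Alternating Schwarz on n interior nodes of (0, 1).

    Subdomain 1 holds nodes 0..b, subdomain 2 holds nodes a..n-1, with
    1 <= a <= b <= n-2 so that the subdomains overlap.
    """
    h = 1.0 / (n + 1)
    f = [1.0] * n
    u = [0.0] * n
    for _ in range(sweeps):
        # Left solve: Dirichlet value taken from the current iterate at node b+1.
        u[: b + 1] = solve_dirichlet(f[: b + 1], 0.0, u[b + 1], h)
        # Right solve: Dirichlet value taken from the updated iterate at node a-1.
        u[a:] = solve_dirichlet(f[a:], u[a - 1], 0.0, h)
    return u


def max_error(u):
    """Maximum nodal error against the exact solution x(1 - x)/2."""
    h = 1.0 / (len(u) + 1)
    return max(abs(ui - (i + 1) * h * (1 - (i + 1) * h) / 2) for i, ui in enumerate(u))


if __name__ == "__main__":
    for sweeps in (1, 5, 30):
        print(sweeps, max_error(alternating_schwarz(49, 20, 29, sweeps)))
```

In this toy setting the error shrinks by a fixed factor per sweep that depends on the overlap width, which is the kind of convergence rate the abstract refers to; shrinking the overlap slows the iteration.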
Submitted 10 December, 2014; v1 submitted 17 April, 2014;
originally announced April 2014.
-
Simultaneous Discrimination Prevention and Privacy Protection in Data Publishing and Mining
Authors:
Sara Hajian
Abstract:
Data mining is an increasingly important technology for extracting useful knowledge hidden in large collections of data. There are, however, negative social perceptions about data mining, among them potential privacy violation and potential discrimination. Automated data collection and data mining techniques such as classification have paved the way to making automated decisions, like loan granting/denial and insurance premium computation. If the training datasets are biased with respect to discriminatory attributes like gender, race, or religion, discriminatory decisions may ensue. In the first part of this thesis, we tackle discrimination prevention in data mining and propose new techniques applicable for direct or indirect discrimination prevention individually or both at the same time. We discuss how to clean training datasets and outsourced datasets in such a way that direct and/or indirect discriminatory decision rules are converted to legitimate (non-discriminatory) classification rules. In the second part of this thesis, we argue that privacy and discrimination risks should be tackled together. We explore the relationship between privacy preserving data mining and discrimination prevention in data mining to design holistic approaches capable of addressing both threats simultaneously during the knowledge discovery process. As part of this effort, we have investigated for the first time the problem of discrimination and privacy aware frequent pattern discovery, i.e. the sanitization of the collection of patterns mined from a transaction database in such a way that neither privacy-violating nor discriminatory inferences can be drawn from the released patterns. Moreover, we investigate the problem of discrimination and privacy aware data publishing, i.e. transforming the data, instead of patterns, in order to simultaneously fulfill privacy preservation and discrimination prevention.
Submitted 28 June, 2013;
originally announced June 2013.
-
High order and energy preserving discontinuous Galerkin methods for the Vlasov-Poisson system
Authors:
Blanca Ayuso de Dios,
Soheil Hajian
Abstract:
We present a computational study for a family of discontinuous Galerkin methods for the one dimensional Vlasov-Poisson system that has been recently introduced. We introduce a slight modification of the methods to allow for feasible computations while preserving the properties of the original methods. We verify the theoretical and convergence analysis numerically and also discuss the conservation properties of the schemes. The methods are validated through their application to some of the benchmarks in the simulation of plasma physics.
Submitted 30 September, 2012; v1 submitted 18 September, 2012;
originally announced September 2012.
-
Multifractal Detrended Cross-Correlation Analysis of Sunspot Numbers and River Flow Fluctuations
Authors:
S. Hajian,
M. Sadegh Movahed
Abstract:
We use the Detrended Cross-Correlation Analysis (DCCA) to investigate the influence of sun activity represented by sunspot numbers on one of the climate indicators, specifically rivers, represented by river flow fluctuation for Daugava, Holston, Nolichucky and French Broad rivers. The Multifractal Detrended Cross-Correlation Analysis (MF-DXA) shows that there exist some crossovers in the cross-correlation fluctuation function versus time scale of the river flow and sunspot series. One of these crossovers corresponds to the well-known cycle of solar activity, demonstrating a universal property of the mentioned rivers. The scaling exponent given by DCCA for the original series at intermediate time scales, $(12-24)\leq s\leq 130$ months, is $\lambda = 1.17\pm 0.04$, which is nearly the same for all underlying rivers at the $1\sigma$ confidence interval, showing the second universal behavior of river runoffs. To remove the sinusoidal trends embedded in data sets, we apply the Singular Value Decomposition (SVD) method. Our results show that there exists a long-range cross-correlation between the sunspot numbers and the underlying streamflow records. The magnitude of the scaling exponent and the corresponding cross-correlation exponent are $\lambda\in (0.76, 0.85)$ and $\gamma_{\times}\in(0.30, 0.48)$, respectively. Different values for scaling and cross-correlation exponents may be related to local and external factors such as topography, drainage network morphology, human activity and so on. Multifractal cross-correlation analysis demonstrates that all underlying fluctuations have a rather weak multifractal nature, which is also a universal property for data series. In addition, the empirical relation between the scaling exponent derived by DCCA and Detrended Fluctuation Analysis (DFA), $\lambda\approx(h_{\rm sun} + h_{\rm river})/2$, is confirmed.
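As a rough illustration of the detrended cross-correlation idea used in this abstract, the sketch below computes a detrended covariance of two series at a single scale `s`. It is a simplified variant and an assumption on our part, not the paper's exact procedure: it uses non-overlapping windows, linear detrending, and only the second-order (q = 2) case, and all function names are hypothetical.

```python
def profile(x):
    """Cumulative sum of deviations from the mean."""
    mean = sum(x) / len(x)
    out, acc = [], 0.0
    for v in x:
        acc += v - mean
        out.append(acc)
    return out


def linear_residuals(seg):
    """Residuals of a least-squares straight-line fit to one window
    (the window must contain at least two points)."""
    n = len(seg)
    t_mean = (n - 1) / 2.0
    y_mean = sum(seg) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(seg)]


def dcca_cov(x, y, s):
    """Detrended covariance of two equal-length series at window size s,
    averaged over non-overlapping windows of the two profiles."""
    px, py = profile(x), profile(y)
    n_windows = len(px) // s
    total = 0.0
    for k in range(n_windows):
        rx = linear_residuals(px[k * s:(k + 1) * s])
        ry = linear_residuals(py[k * s:(k + 1) * s])
        total += sum(a * b for a, b in zip(rx, ry)) / s
    return total / n_windows


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(400)]
    y = [xi + rng.gauss(0.0, 1.0) for xi in x]  # synthetic correlated pair
    for s in (8, 16, 32, 64):
        print(s, dcca_cov(x, y, s))
```

A scaling exponent such as $\lambda$ would then be estimated from the slope of the square root of this quantity against `s` on a log-log plot, provided the covariance stays positive over the fitted range.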
Submitted 24 July, 2010; v1 submitted 2 August, 2009;
originally announced August 2009.