-
Data-driven Super-Resolution of Flood Inundation Maps using Synthetic Simulations
Authors:
Akshay Aravamudan,
Zimeena Rasheed,
Xi Zhang,
Kira E. Scarpignato,
Efthymios I. Nikolopoulos,
Witold F. Krajewski,
Georgios C. Anagnostopoulos
Abstract:
The frequency of extreme flood events is increasing throughout the world. Daily, high-resolution (30m) Flood Inundation Maps (FIM) observed from space play a key role in informing mitigation and preparedness efforts to counter these extreme events. However, the temporal frequency of publicly available high-resolution FIMs, e.g., from Landsat, is on the order of two weeks, thus limiting the effective monitoring of flood inundation dynamics. Conversely, global, low-resolution (~300m) Water Fraction Maps (WFM) are publicly available from NOAA VIIRS daily. Motivated by the recent successes of deep learning methods for single image super-resolution, we explore the effectiveness and limitations of similar data-driven approaches to downscaling low-resolution WFMs to high-resolution FIMs. To overcome the scarcity of high-resolution FIMs, we train our models with high-quality synthetic data obtained through physics-based simulations. We evaluate our models on real-world data from flood events in the state of Iowa. The study indicates that data-driven approaches exhibit superior reconstruction accuracy over non-data-driven alternatives and that the use of synthetic data is a viable proxy for training purposes. Additionally, we show that our trained models can exhibit superior zero-shot performance when transferred to regions with hydroclimatological similarity to the U.S. Midwest.
Submitted 14 February, 2025;
originally announced February 2025.
-
Dataflow Optimized Reconfigurable Acceleration for FEM-based CFD Simulations
Authors:
Anastassis Kapetanakis,
Aggelos Ferikoglou,
George Anagnostopoulos,
Sotirios Xydis
Abstract:
Computational Fluid Dynamics (CFD) simulations are essential for analyzing and optimizing fluid flows in a wide range of real-world applications. These simulations involve approximating the solutions of the Navier-Stokes differential equations using numerical methods, which are highly compute- and memory-intensive due to their need for high-precision iterations. In this work, we introduce a high-performance FPGA accelerator specifically designed for numerically solving the Navier-Stokes equations. We focus on the Finite Element Method (FEM) due to its ability to accurately model complex geometries and intricate setups typical of real-world applications. Our accelerator is implemented using High-Level Synthesis (HLS) on an AMD Alveo U200 FPGA, leveraging the reconfigurability of FPGAs to offer a flexible and adaptable solution. The proposed solution achieves 7.9x higher performance than optimized Vitis-HLS implementations and 45% lower latency with 3.64x less power compared to a software implementation on a high-end server CPU. This highlights the potential of our approach to solve Navier-Stokes equations more effectively, paving the way for tackling even more challenging CFD simulations in the future.
Submitted 7 April, 2025; v1 submitted 25 November, 2024;
originally announced November 2024.
-
A Hybrid Microscopic Model for Multimodal Traffic with Empirical Observations from Aerial Footage
Authors:
Georg Anagnostopoulos,
Nikolas Geroliminis
Abstract:
Microscopic traffic flow models can be distinguished in lane-based or lane-free depending on the degree of lane-discipline. This distinction holds true only if motorcycles are neglected in lane-based traffic. In cities, as opposed to highways, this is an oversimplification and it would be more accurate to speak of hybrid situations, where lane discipline can be made mode-dependent. Empirical evidence shows that cars follow the lanes as defined by the infrastructure, while motorcycles do not necessarily adhere to predefined norms and may participate in self-organized formation of virtual lanes. This phenomenon is the result of complex interactions between different traffic participants competing for limited space. In order to better understand the dynamics of modal interaction microscopically, we first analyze empirical data from detailed trajectories obtained by the pNEUMA experiment and observe patterns of mixed traffic. Then, we propose a hybrid model for multimodal vehicular traffic. The hybrid model is inspired by the pedestrian flow literature, featuring collision-free and anticipatory properties, and we demonstrate that it is able to reproduce empirical observations from aerial footage.
Submitted 25 October, 2022;
originally announced October 2022.
-
Influence Dynamics Among Narratives: A Case Study of the Venezuelan Presidential Crisis
Authors:
Akshay Aravamudan,
Xi Zhang,
Jihye Song,
Stephen M. Fiore,
Georgios C. Anagnostopoulos
Abstract:
It is widely understood that diffusion of and simultaneous interactions between narratives -- defined here as persistent point-of-view messaging -- significantly contribute to the shaping of political discourse and public opinion. In this work, we propose a methodology based on Multi-Variate Hawkes Processes and our newly-introduced Process Influence Measures for quantifying and assessing how such narratives influence (Granger-cause) each other. Such an approach may help social scientists enhance their understanding of socio-geopolitical phenomena as they manifest themselves and evolve in the realm of social media. In order to show its merits, we apply our methodology to Twitter narratives during the 2019 Venezuelan presidential crisis. Our analysis indicates a nuanced, evolving influence structure between 8 distinct narratives, part of which could be explained by landmark historical events.
Submitted 5 November, 2021;
originally announced November 2021.
-
ProxyFAUG: Proximity-based Fingerprint Augmentation
Authors:
Grigorios G. Anagnostopoulos,
Alexandros Kalousis
Abstract:
The proliferation of data-demanding machine learning methods has brought to light the necessity for methodologies which can enlarge the size of training datasets, with simple, rule-based methods. In line with this concept, the fingerprint augmentation scheme proposed in this work aims to augment fingerprint datasets which are used to train positioning models. The proposed method utilizes fingerprints which are recorded in spatial proximity, in order to perform fingerprint augmentation, creating new fingerprints which combine the features of the original ones. The proposed method of composing the new, augmented fingerprints is inspired by the crossover and mutation operators of genetic algorithms. The ProxyFAUG method aims to improve the achievable positioning accuracy of fingerprint datasets, by introducing a rule-based, stochastic, proximity-based method of fingerprint augmentation. The performance of ProxyFAUG is evaluated in an outdoor Sigfox setting using a public dataset. The best performing published positioning method on this dataset is improved by 40% in terms of median error and 6% in terms of mean error, with the use of the augmented dataset. The analysis of the results indicates a systematic and significant performance improvement at the lower error quartiles, as indicated by the impressive improvement of the median error.
Submitted 12 January, 2022; v1 submitted 4 February, 2021;
originally announced February 2021.
-
Analysing the Data-Driven Approach of Dynamically Estimating Positioning Accuracy
Authors:
Grigorios G. Anagnostopoulos,
Alexandros Kalousis
Abstract:
The primary expectation from positioning systems is for them to provide the users with reliable estimates of their position. An additional piece of information that can greatly help the users utilize position estimates is the level of uncertainty that a positioning system assigns to the position estimate it produced. The concept of dynamically estimating the accuracy of position estimates of fingerprinting positioning systems has been sporadically discussed over the last decade in the literature of the field, where mainly handcrafted rules based on domain knowledge have been proposed. The emergence of IoT devices and the proliferation of data from Low Power Wide Area Networks (LPWANs) have facilitated the conceptualization of data-driven methods of determining the estimated certainty over position estimates. In this work, we analyze the data-driven approach of determining the Dynamic Accuracy Estimation (DAE), considering it in the broader context of a positioning system. More specifically, with the use of a public LoRaWAN dataset, the current work analyses: the repartition of the available training set between the tasks of determining the location estimates and the DAE, the concept of selecting a subset of the most reliable estimates, and the impact that the spatial distribution of the data has to the accuracy of the DAE. The work provides a wide overview of the data-driven approach of DAE determination in the context of the overall design of a positioning system.
Submitted 24 February, 2021; v1 submitted 20 November, 2020;
originally announced November 2020.
-
StationRank: Aggregate dynamics of the Swiss railway
Authors:
Georg Anagnostopoulos,
Vahid Moosavi
Abstract:
Increasing availability and quality of actual, as opposed to scheduled, open transport data offers new possibilities for capturing the spatiotemporal dynamics of the railway and other networks of social infrastructure. One way to describe such complex phenomena is in terms of stochastic processes. At its core, a stochastic model is domain-agnostic and algorithms discussed here have been successfully used in other applications, including Google's PageRank citation ranking. Our key assumption is that train routes constitute meaningful sequences analogous to sentences of literary text. A corpus of routes is thus susceptible to the same analytic tool-set as a corpus of sentences. With our experiment in Switzerland, we introduce a method for building Markov Chains from aggregated daily streams of railway traffic data. The stationary distributions under normal and perturbed conditions are used to define systemic risk measures with non-evident, valuable information about railway infrastructure.
Submitted 17 October, 2020; v1 submitted 4 June, 2020;
originally announced June 2020.
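The StationRank abstract above describes building Markov chains from train routes and reading off their stationary distributions. A minimal sketch of that general idea follows; it is not the authors' implementation. The function names, the uniform jump for stations with no observed departures, and the PageRank-style `damping` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def transition_matrix(routes):
    """Estimate row-stochastic transition probabilities from consecutive stops."""
    counts = defaultdict(lambda: defaultdict(int))
    states = sorted({stop for route in routes for stop in route})
    for route in routes:
        for a, b in zip(route, route[1:]):
            counts[a][b] += 1
    P = {}
    for a in states:
        total = sum(counts[a].values())
        if total == 0:
            # Assumption: a station with no observed departures jumps uniformly.
            P[a] = {b: 1.0 / len(states) for b in states}
        else:
            P[a] = {b: counts[a][b] / total for b in states}
    return states, P


def stationary_distribution(states, P, damping=0.85, iters=200):
    """Power iteration with a PageRank-style damping term (an assumption here)."""
    n = len(states)
    pi = {s: 1.0 / n for s in states}
    for _ in range(iters):
        new = {s: (1.0 - damping) / n for s in states}
        for a in states:
            for b in states:
                new[b] += damping * pi[a] * P[a][b]
        pi = new
    return pi
```

Comparing the distribution computed from a normal day's routes with one computed after removing a station or link is one plausible way to approximate the "perturbed conditions" the abstract mentions.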
-
Deep Agent: Studying the Dynamics of Information Spread and Evolution in Social Networks
Authors:
Ivan Garibay,
Toktam A. Oghaz,
Niloofar Yousefi,
Ece C. Mutlu,
Madeline Schiappa,
Steven Scheinert,
Georgios C. Anagnostopoulos,
Christina Bouwens,
Stephen M. Fiore,
Alexander Mantzaris,
John T. Murphy,
William Rand,
Anastasia Salter,
Mel Stanfill,
Gita Sukthankar,
Nisha Baral,
Gabriel Fair,
Chathika Gunaratne,
Neda B. Hajiakhoond,
Jasser Jasser,
Chathura Jayalath,
Olivia Newton,
Samaneh Saadat,
Chathurani Senevirathna,
Rachel Winter
, et al. (1 additional author not shown)
Abstract:
This paper explains the design of a social network analysis framework, developed under DARPA's SocialSim program, with novel architecture that models human emotional, cognitive and social factors. Our framework is both theory and data-driven, and utilizes domain expertise. Our simulation effort helps in understanding how information flows and evolves in social media platforms. We focused on modeling three information domains: cryptocurrencies, cyber threats, and software vulnerabilities for the three interrelated social environments: GitHub, Reddit, and Twitter. We participated in the SocialSim DARPA Challenge in December 2018, in which our models were subjected to extensive performance evaluation for accuracy, generalizability, explainability, and experimental power. This paper reports the main concepts and models, utilized in our social media modeling effort in developing a multi-resolution simulation at the user, community, population, and content levels.
Submitted 29 May, 2021; v1 submitted 25 March, 2020;
originally announced March 2020.
-
A Reproducible Analysis of RSSI Fingerprinting for Outdoor Localization Using Sigfox: Preprocessing and Hyperparameter Tuning
Authors:
Grigorios G. Anagnostopoulos,
Alexandros Kalousis
Abstract:
Fingerprinting techniques, which are a common method for indoor localization, have been recently applied with success into outdoor settings. Particularly, the communication signals of Low Power Wide Area Networks (LPWAN) such as Sigfox, have been used for localization. In this rather recent field of study, not many publicly available datasets, which would facilitate the consistent comparison of different positioning systems, exist so far. In the current study, a published dataset of RSSI measurements on a Sigfox network deployed in Antwerp, Belgium is used to analyse the appropriate selection of preprocessing steps and to tune the hyperparameters of a kNN fingerprinting method. Initially, the tuning of hyperparameter k for a variety of distance metrics, and the selection of efficient data transformation schemes, proposed by relevant works, is presented. In addition, accuracy improvements are achieved in this study, by a detailed examination of the appropriate adjustment of the parameters of the data transformation schemes tested, and of the handling of out of range values. With the appropriate tuning of these factors, the achieved mean localization error was 298 meters, and the median error was 109 meters. To facilitate the reproducibility of tests and comparability of results, the code and train/validation/test split used in this study are available.
Submitted 14 August, 2019;
originally announced August 2019.
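The Sigfox entry above tunes a kNN fingerprinting method and discusses the handling of out-of-range values. The sketch below illustrates the generic technique only, not the published code: the function names, the Euclidean metric, the plain coordinate average, and the `-200.0` floor for missing readings are all hypothetical choices.

```python
import math


def fill_missing(fingerprint, floor=-200.0):
    """Replace missing RSSI readings (None) with a floor value (assumed constant)."""
    return [floor if rssi is None else rssi for rssi in fingerprint]


def knn_locate(train_fingerprints, train_positions, query, k=3):
    """Average the positions of the k training fingerprints closest to `query`."""
    ranked = sorted(
        (math.dist(fp, query), idx) for idx, fp in enumerate(train_fingerprints)
    )
    nearest = [idx for _, idx in ranked[:k]]
    x = sum(train_positions[i][0] for i in nearest) / len(nearest)
    y = sum(train_positions[i][1] for i in nearest) / len(nearest)
    return x, y
```

Swapping `math.dist` for another distance function and sweeping `k` over a validation split corresponds to the kind of hyperparameter tuning the abstract describes.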
-
A Reproducible Comparison of RSSI Fingerprinting Localization Methods Using LoRaWAN
Authors:
Grigorios G. Anagnostopoulos,
Alexandros Kalousis
Abstract:
The use of fingerprinting localization techniques in outdoor IoT settings has started to gain popularity over the recent years. Communication signals of Low Power Wide Area Networks (LPWAN), such as LoRaWAN, are used to estimate the location of low power mobile devices. In this study, a publicly available dataset of LoRaWAN RSSI measurements is utilized to compare different machine learning methods and their accuracy in producing location estimates. The tested methods are: the k Nearest Neighbours method, the Extra Trees method and a neural network approach using a Multilayer Perceptron. To facilitate the reproducibility of tests and the comparability of results, the code and the train/validation/test split of the dataset used in this study have become available. The neural network approach was the method with the highest accuracy, achieving a mean error of 358 meters and a median error of 204 meters.
Submitted 14 August, 2019;
originally announced August 2019.
-
Learning Hash Function through Codewords
Authors:
Yinjie Huang,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
In this paper, we propose a novel hash learning approach that has the following main distinguishing features, when compared to past frameworks. First, the codewords are utilized in the Hamming space as ancillary techniques to accomplish its hash learning task. These codewords, which are inferred from the data, attempt to capture grouping aspects of the data's hash codes. Furthermore, the proposed framework is capable of addressing supervised, unsupervised and, even, semi-supervised hash learning scenarios. Additionally, the framework adopts a regularization term over the codewords, which automatically chooses the codewords for the problem. To efficiently solve the problem, one Block Coordinate Descent algorithm is showcased in the paper. We also show that one step of the algorithm can be cast into several Support Vector Machine problems, which enables our algorithm to utilize efficient software packages. For the regularization term, a closed form solution of the proximal operator is provided in the paper. A series of comparative experiments focused on content-based image retrieval highlights its performance advantages.
Submitted 22 February, 2019;
originally announced February 2019.
-
Reduced-Rank Local Distance Metric Learning for k-NN Classification
Authors:
Yinjie Huang,
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
We propose a new method for local distance metric learning based on sample similarity as side information. These local metrics, which utilize conical combinations of metric weight matrices, are learned from the pooled spatial characteristics of the data, as well as the similarity profiles between the pairs of samples, whose distances are measured. The main objective of our framework is to yield metrics, such that the resulting distances between similar samples are small and distances between dissimilar samples are above a certain threshold. For learning and inference purposes, we describe a transductive, as well as an inductive algorithm; the former approach naturally befits our framework, while the latter one is provided in the interest of faster learning. Experimental results on a collection of classification problems imply that the new methods may exhibit notable performance advantages over alternative metric learning approaches that have recently appeared in the literature.
Submitted 21 February, 2019;
originally announced February 2019.
-
Multi-Task Learning Using Neighborhood Kernels
Authors:
Niloofar Yousefi,
Cong Li,
Mansooreh Mollaghasemi,
Georgios Anagnostopoulos,
Michael Georgiopoulos
Abstract:
This paper introduces a new and effective algorithm for learning kernels in a Multi-Task Learning (MTL) setting. Although we consider an MTL scenario here, our approach can be easily applied to standard single task learning as well. As shown by our empirical results, our algorithm consistently outperforms the traditional kernel learning algorithms such as uniform combination solution, convex combinations of base kernels as well as some kernel alignment-based models, which have been proven to give promising results in the past. We present a Rademacher complexity bound based on which a new Multi-Task Multiple Kernel Learning (MT-MKL) model is derived. In particular, we propose a Support Vector Machine-regularized model in which, for each task, an optimal kernel is learned based on a neighborhood-defining kernel that is not restricted to be positive semi-definite. Comparative experimental results are showcased that underline the merits of our neighborhood-defining framework in both classification and regression problems.
Submitted 11 July, 2017;
originally announced July 2017.
-
Local Rademacher Complexity-based Learning Guarantees for Multi-Task Learning
Authors:
Niloofar Yousefi,
Yunwen Lei,
Marius Kloft,
Mansooreh Mollaghasemi,
Georgios Anagnostopoulos
Abstract:
We show a Talagrand-type concentration inequality for Multi-Task Learning (MTL), using which we establish sharp excess risk bounds for MTL in terms of distribution- and data-dependent versions of the Local Rademacher Complexity (LRC). We also give a new bound on the LRC for norm regularized as well as strongly convex hypothesis classes, which applies not only to MTL but also to the standard i.i.d. setting. Combining both results, one can now easily derive fast-rate bounds on the excess risk for many prominent MTL methods, including---as we demonstrate---Schatten-norm, group-norm, and graph-regularized MTL. The derived bounds reflect a relationship akin to a conservation law of asymptotic convergence rates. This very relationship allows for trading off slower rates w.r.t. the number of tasks for faster rates with respect to the number of available samples per task, when compared to the rates obtained via a traditional, global Rademacher analysis.
Submitted 9 February, 2017; v1 submitted 18 February, 2016;
originally announced February 2016.
-
Multi-Task Learning with Group-Specific Feature Space Sharing
Authors:
Niloofar Yousefi,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
When faced with learning a set of inter-related tasks from a limited amount of usable data, learning each task independently may lead to poor generalization performance. Multi-Task Learning (MTL) exploits the latent relations between tasks and overcomes data scarcity limitations by co-learning all these tasks simultaneously to offer improved performance. We propose a novel Multi-Task Multiple Kernel Learning framework based on Support Vector Machines for binary classification tasks. By considering pair-wise task affinity in terms of similarity between a pair's respective feature spaces, the new framework, compared to other similar MTL approaches, offers a high degree of flexibility in determining how similar feature spaces should be, as well as which pairs of tasks should share a common feature space in order to benefit overall performance. The associated optimization problem is solved via a block coordinate descent, which employs a consensus-form Alternating Direction Method of Multipliers algorithm to optimize the Multiple Kernel Learning weights and, hence, to determine task affinities. Empirical evaluation on seven data sets exhibits a statistically significant improvement of our framework's results compared to the ones of several other Clustered Multi-Task Learning methods.
Submitted 13 August, 2015;
originally announced August 2015.
-
Hash Function Learning via Codewords
Authors:
Yinjie Huang,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
In this paper we introduce a novel hash learning framework that has two main distinguishing features, when compared to past approaches. First, it utilizes codewords in the Hamming space as ancillary means to accomplish its hash learning task. These codewords, which are inferred from the data, attempt to capture similarity aspects of the data's hash codes. Secondly and more importantly, the same framework is capable of addressing supervised, unsupervised and, even, semi-supervised hash learning tasks in a natural manner. A series of comparative experiments focused on content-based image retrieval highlights its performance advantages.
Submitted 18 August, 2015; v1 submitted 13 August, 2015;
originally announced August 2015.
-
Conic Multi-Task Classification
Authors:
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
Traditionally, Multi-task Learning (MTL) models optimize the average of task-related objective functions, which is an intuitive approach and which we will be referring to as Average MTL. However, a more general framework, referred to as Conic MTL, can be formulated by considering conic combinations of the objective functions instead; in this framework, Average MTL arises as a special case, when all combination coefficients equal 1. Although the advantage of Conic MTL over Average MTL has been shown experimentally in previous works, no theoretical justification has been provided to date. In this paper, we derive a generalization bound for the Conic MTL method, and demonstrate that the tightest bound is not necessarily achieved, when all combination coefficients equal 1; hence, Average MTL may not always be the optimal choice, and it is important to consider Conic MTL. As a byproduct of the generalization bound, it also theoretically explains the good experimental results of previous relevant works. Finally, we propose a new Conic MTL model, whose conic combination coefficients minimize the generalization bound, instead of choosing them heuristically as has been done in previous methods. The rationale and advantage of our model is demonstrated and verified via a series of experiments by comparing with several other methods.
Submitted 20 August, 2014;
originally announced August 2014.
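The relationship between Average MTL and Conic MTL stated in the abstract above can be written compactly. The notation is assumed for illustration and is not taken from the paper: $f_t$ denotes the objective of task $t$, $T$ the number of tasks, $\theta$ the model parameters, and $\lambda_t$ the conic combination coefficients.

```latex
% Conic MTL: a non-negative weighted sum of the task objectives
\min_{\theta} \; \sum_{t=1}^{T} \lambda_t \, f_t(\theta),
\qquad \lambda_t \ge 0, \quad t = 1, \dots, T

% Average MTL: the special case with all coefficients equal to 1
\min_{\theta} \; \sum_{t=1}^{T} f_t(\theta)
```

Minimizing the plain sum has the same minimizer as minimizing the average, since the two differ only by the constant factor $1/T$.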
-
Pareto-Path Multi-Task Multiple Kernel Learning
Authors:
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
A traditional and intuitively appealing Multi-Task Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with (partially) shared kernel function, which allows information sharing amongst tasks. We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a Multi-Objective Optimization (MOO) problem, which considers the concurrent optimization of all task objectives involved in the Multi-Task Learning (MTL) problem. Motivated by this last observation and arguing that the former approach is heuristic, we propose a novel Support Vector Machine (SVM) MT-MKL framework, that considers an implicitly-defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of objective functions as a special case. Using algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance, when compared to other similar MTL approaches.
Submitted 11 April, 2014;
originally announced April 2014.
-
A Unifying Framework for Typical Multi-Task Multiple Kernel Learning Problems
Authors:
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
Over the past few years, Multi-Kernel Learning (MKL) has received significant attention among data-driven feature selection techniques in the context of kernel-based learning. MKL formulations have been devised and solved for a broad spectrum of machine learning problems, including Multi-Task Learning (MTL). Solving different MKL formulations usually involves designing algorithms that are tailored to the problem at hand, which is, typically, a non-trivial accomplishment.
In this paper we present a general Multi-Task Multi-Kernel Learning (Multi-Task MKL) framework that subsumes well-known Multi-Task MKL formulations, as well as several important MKL approaches on single-task problems. We then derive a simple algorithm that can solve the unifying framework. To demonstrate the flexibility of the proposed framework, we formulate a new learning problem, namely Partially-Shared Common Space (PSCS) Multi-Task MKL, and demonstrate its merits through experimentation.
Submitted 20 January, 2014;
originally announced January 2014.
-
Multi-Task Classification Hypothesis Space with Improved Generalization Bounds
Authors:
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
This paper presents an RKHS, in general, of vector-valued functions intended to be used as a hypothesis space for multi-task classification. It extends similar hypothesis spaces that have previously been considered in the literature. Assuming this space, an improved Empirical Rademacher Complexity-based generalization bound is derived. The analysis is itself extended to an MKL setting. The connection between the proposed hypothesis space and a Group-Lasso type regularizer is discussed. Finally, experimental results on some SVM-based Multi-Task Learning problems underline the quality of the derived bounds and validate the paper's analysis.
Submitted 9 December, 2013;
originally announced December 2013.
-
Kernel-based Distance Metric Learning in the Output Space
Authors:
Cong Li,
Michael Georgiopoulos,
Georgios C. Anagnostopoulos
Abstract:
In this paper we present two related, kernel-based Distance Metric Learning (DML) methods. Their respective models non-linearly map data from their original space to an output space, and subsequent distance measurements are performed in the output space via a Mahalanobis metric. The dimensionality of the output space can be directly controlled to facilitate the learning of a low-rank metric. Both methods allow for simultaneous inference of the associated metric and the mapping to the output space, which can be used to visualize the data, when the output space is 2- or 3-dimensional. Experimental results for a collection of classification tasks illustrate the advantages of the proposed methods over other traditional and kernel-based DML approaches.
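The pipeline described in this abstract (a non-linear map into an output space, followed by a Mahalanobis distance there) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the Gaussian-kernel feature map, the anchor points, the projection matrix `A`, and the metric matrix `M` are all hypothetical and are drawn at random here instead of being learned.

```python
import numpy as np


def rbf_features(x, anchors, gamma=1.0):
    """Gaussian kernel evaluations between x and each (hypothetical) anchor point."""
    diffs = anchors - x
    return np.exp(-gamma * np.sum(diffs ** 2, axis=1))


def to_output_space(x, anchors, A, gamma=1.0):
    """Non-linear map: kernel features followed by a linear projection A.

    The number of rows of A sets the output dimensionality (2 here, so the
    mapped points could be plotted directly).
    """
    return A @ rbf_features(x, anchors, gamma)


def mahalanobis_sq(y1, y2, M):
    """Squared Mahalanobis distance (y1 - y2)^T M (y1 - y2) for PSD M."""
    d = y1 - y2
    return float(d @ M @ d)


rng = np.random.default_rng(0)
anchors = rng.normal(size=(5, 3))   # assumed: 5 anchor points in a 3-D input space
A = rng.normal(size=(2, 5))         # assumed: random projection to a 2-D output space
L = rng.normal(size=(2, 2))
M = L.T @ L                         # positive semidefinite by construction

x1, x2 = rng.normal(size=3), rng.normal(size=3)
y1 = to_output_space(x1, anchors, A)
y2 = to_output_space(x2, anchors, A)
dist = mahalanobis_sq(y1, y2, M)
```

In the methods the abstract describes, the mapping and the metric are inferred simultaneously from data; the random matrices above only stand in for those learned quantities.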
Submitted 28 April, 2014; v1 submitted 9 December, 2013;
originally announced December 2013.