-
Fully Dynamic Euclidean Bi-Chromatic Matching in Sublinear Update Time
Authors:
Gramoz Goranci,
Peter Kiss,
Neel Patel,
Martin P. Seybold,
Eva Szilagyi,
Da Wei Zheng
Abstract:
We consider the Euclidean bi-chromatic matching problem in the dynamic setting, where the goal is to efficiently process point insertions and deletions while maintaining a high-quality solution. Computing the minimum cost bi-chromatic matching is one of the core problems in geometric optimization that has found many applications, most notably in estimating Wasserstein distance between two distributions. In this work, we present the first fully dynamic algorithm for Euclidean bi-chromatic matching with sub-linear update time. For any fixed $\varepsilon > 0$, our algorithm achieves $O(1/\varepsilon)$-approximation and handles updates in $O(n^{\varepsilon})$ time. Our experiments show that our algorithm enables effective monitoring of the distributional drift in the Wasserstein distance on real and synthetic data sets, while outperforming the runtime of baseline approximations by orders of magnitude.
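For orientation, the static problem that the abstract dynamizes can be stated in a few lines. The following is a minimal brute-force sketch, not the paper's algorithm: it computes the exact minimum-cost bi-chromatic matching between two equal-size point sets by trying every assignment, which is only feasible for a handful of points. The function name `min_cost_bichromatic_matching` and the sample points are illustrative assumptions.

```python
from itertools import permutations
from math import dist


def min_cost_bichromatic_matching(red, blue):
    """Exact minimum-cost perfect matching between two equal-size point sets.

    Brute force over all assignments (factorial time); a reference baseline
    for tiny inputs only, not the dynamic algorithm from the abstract.
    """
    if len(red) != len(blue):
        raise ValueError("point sets must have equal size")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(blue))):
        # perm[i] is the index of the blue point matched to red point i.
        cost = sum(dist(red[i], blue[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(enumerate(best_perm))


# Hypothetical toy input: two red and two blue points in the plane.
red = [(0.0, 0.0), (1.0, 0.0)]
blue = [(1.0, 1.0), (0.0, 1.0)]
cost, pairs = min_cost_bichromatic_matching(red, blue)
```

Dividing the returned cost by the number of points gives the 1-Wasserstein distance between the two uniform empirical distributions, which is the quantity the abstract's experiments monitor.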
Submitted 13 May, 2025;
originally announced May 2025.
-
Deterministic Dynamic Maximal Matching in Sublinear Update Time
Authors:
Aaron Bernstein,
Sayan Bhattacharya,
Peter Kiss,
Thatchaphol Saranurak
Abstract:
We give a fully dynamic deterministic algorithm for maintaining a maximal matching of an $n$-vertex graph in $\tilde{O}(n^{8/9})$ amortized update time. This breaks the long-standing $Ω(n)$-update-time barrier on dense graphs, achievable by trivially scanning all incident vertices of the updated edge, and affirmatively answers a major open question repeatedly asked in the literature [BGS15, BCHN18, Sol22]. We also present a faster randomized algorithm against an adaptive adversary with $\tilde{O}(n^{3/4})$ amortized update time.
Our approach employs the edge degree constrained subgraph (EDCS), a central object for optimizing approximation ratio, in a completely novel way; we instead use it for maintaining a matching that matches all high degree vertices in sublinear update time so that it remains to handle low degree vertices rather straightforwardly. To optimize this approach, we employ tools never used in the dynamic matching literature prior to our work, including sublinear-time algorithms for matching high degree vertices, random walks on directed expanders, and the monotone Even-Shiloach tree for dynamic shortest paths.
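The trivial baseline mentioned in the abstract, which scans the neighbours of the endpoints of an updated edge, can be sketched as follows. This is only the folklore $O(n)$-per-update approach that the paper improves on, not the paper's EDCS-based algorithm; the class name `NaiveDynamicMaximalMatching` and its methods are illustrative assumptions, and edges are assumed to have two distinct endpoints.

```python
class NaiveDynamicMaximalMatching:
    """Folklore baseline: maintain a maximal matching with O(n) work per update."""

    def __init__(self):
        self.adj = {}   # vertex -> set of neighbours
        self.mate = {}  # matched vertex -> its partner

    def _try_match(self, u):
        # Scan all neighbours of a free vertex u for a free partner.
        if u in self.mate:
            return
        for v in self.adj.get(u, ()):
            if v not in self.mate:
                self.mate[u], self.mate[v] = v, u
                return

    def insert(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        # A new edge violates maximality only if both endpoints are free.
        if u not in self.mate and v not in self.mate:
            self.mate[u], self.mate[v] = v, u

    def delete(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Deleting a matched edge frees both endpoints; rematch each by scanning.
        if self.mate.get(u) == v:
            del self.mate[u], self.mate[v]
            self._try_match(u)
            self._try_match(v)

    def matching(self):
        return {frozenset((u, v)) for u, v in self.mate.items()}

    def is_maximal(self):
        # Maximal means no edge has both endpoints unmatched.
        return all(u in self.mate or v in self.mate
                   for u in self.adj for v in self.adj[u])


# Hypothetical usage on the path a - b - c - d.
m = NaiveDynamicMaximalMatching()
m.insert("b", "c")
m.insert("a", "b")
m.insert("c", "d")
m.delete("b", "c")
```

The scan in `_try_match` is what costs up to $n$ steps on dense graphs, which is the barrier the abstract refers to.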
Submitted 29 April, 2025;
originally announced April 2025.
-
Improved Bounds for Fully Dynamic Matching via Ordered Ruzsa-Szemeredi Graphs
Authors:
Sepehr Assadi,
Sanjeev Khanna,
Peter Kiss
Abstract:
In a very recent breakthrough, Behnezhad and Ghafari [FOCS'24] developed a novel fully dynamic randomized algorithm for maintaining a $(1-ε)$-approximation of maximum matching with amortized update time potentially much better than the trivial $O(n)$ update time. The runtime of the BG algorithm is parameterized via the following graph theoretical concept:
* For any $n$, define $ORS(n)$ -- standing for Ordered RS Graph -- to be the largest number of edge-disjoint matchings $M_1,\ldots,M_t$ of size $Θ(n)$ in an $n$-vertex graph such that for every $i \in [t]$, $M_i$ is an induced matching in the subgraph $M_{i} \cup M_{i+1} \cup \ldots \cup M_t$.
Then, for any fixed $ε> 0$, the BG algorithm runs in \[
O\left( \sqrt{n^{1+O(ε)} \cdot ORS(n)} \right) \] amortized update time with high probability, even against an adaptive adversary. $ORS(n)$ is a close variant of a more well-known quantity regarding RS graphs (which require every matching to be induced regardless of the ordering). It is currently only known that $n^{o(1)} \leqslant ORS(n) \leqslant n^{1-o(1)}$, and closing this gap appears to be a notoriously challenging problem.
In this work, we further strengthen the result of Behnezhad and Ghafari and push it to the limit to obtain a randomized algorithm with amortized update time of \[
n^{o(1)} \cdot ORS(n) \] with high probability, even against an adaptive adversary. In the limit, i.e., if current lower bounds for $ORS(n) = n^{o(1)}$ are almost optimal, our algorithm achieves an $n^{o(1)}$ update time for $(1-ε)$-approximation of maximum matching, almost fully resolving this fundamental question. In its current stage also, this fully reduces the algorithmic problem of designing dynamic matching algorithms to a purely combinatorial problem of upper bounding $ORS(n)$ with no algorithmic considerations.
Submitted 18 October, 2024; v1 submitted 19 June, 2024;
originally announced June 2024.
-
Near-Optimal Dynamic Rounding of Fractional Matchings in Bipartite Graphs
Authors:
Sayan Bhattacharya,
Peter Kiss,
Aaron Sidford,
David Wajc
Abstract:
We study dynamic $(1-ε)$-approximate rounding of fractional matchings -- a key ingredient in numerous breakthroughs in the dynamic graph algorithms literature. Our first contribution is a surprisingly simple deterministic rounding algorithm in bipartite graphs with amortized update time $O(ε^{-1} \log^2 (ε^{-1} \cdot n))$, matching an (unconditional) recourse lower bound of $Ω(ε^{-1})$ up to logarithmic factors. Moreover, this algorithm's update time improves provided the minimum (non-zero) weight in the fractional matching is lower bounded throughout. Combining this algorithm with novel dynamic \emph{partial rounding} algorithms to increase this minimum weight, we obtain several algorithms that improve this dependence on $n$. For example, we give a high-probability randomized algorithm with $\tilde{O}(ε^{-1}\cdot (\log\log n)^2)$-update time against adaptive adversaries. (We use Soft-Oh notation, $\tilde{O}$, to suppress polylogarithmic factors in the argument, i.e., $\tilde{O}(f)=O(f\cdot \mathrm{poly}(\log f))$.) Using our rounding algorithms, we also round known $(1-ε)$-decremental fractional bipartite matching algorithms with no asymptotic overhead, thus improving on state-of-the-art algorithms for the decremental bipartite matching problem. Further, we provide extensions of our results to general graphs and to maintaining almost-maximal matchings.
Submitted 23 February, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Incremental $(1-ε)$-approximate dynamic matching in $O(poly(1/ε))$ update time
Authors:
Joakim Blikstad,
Peter Kiss
Abstract:
In the dynamic approximate maximum bipartite matching problem we are given a bipartite graph $G$ undergoing updates and our goal is to maintain a matching of $G$ which is large compared to the maximum matching size $μ(G)$. We define a dynamic matching algorithm to be $α$ (respectively $(α, β)$)-approximate if it maintains a matching $M$ such that at all times $|M| \geq μ(G) \cdot α$ (respectively $|M| \geq μ(G) \cdot α- β$).
We present the first deterministic $(1-ε)$-approximate dynamic matching algorithm with $O(poly(ε^{-1}))$ amortized update time for graphs undergoing edge insertions. Previous solutions required update time that was either super-constant [Gupta FSTTCS'14, Bhattacharya-Kiss-Saranurak SODA'23] or exponential in $1/ε$ [Grandoni-Leonardi-Sankowski-Schwiegelshohn-Solomon SODA'19]. Our implementation is arguably simpler than the mentioned algorithms and its description is self-contained. Moreover, we show that if we allow for additive $(1, ε\cdot n)$-approximation our algorithm seamlessly extends to also handle vertex deletions, on top of edge insertions. This makes our algorithm one of the few small update time algorithms for $(1-ε)$-approximate dynamic matching allowing for updates both increasing and decreasing the maximum matching size of $G$ in a fully dynamic manner.
Submitted 12 July, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Dynamic $(1+ε)$-Approximate Matching Size in Truly Sublinear Update Time
Authors:
Sayan Bhattacharya,
Peter Kiss,
Thatchaphol Saranurak
Abstract:
We show a fully dynamic algorithm for maintaining $(1+ε)$-approximate \emph{size} of maximum matching of the graph with $n$ vertices and $m$ edges using $m^{0.5-Ω_ε(1)}$ update time. This is the first polynomial improvement over the long-standing $O(n)$ update time, which can be trivially obtained by periodic recomputation. Thus, we resolve the value version of a major open question of the dynamic graph algorithms literature (see, e.g., [Gupta and Peng FOCS'13], [Bernstein and Stein SODA'16],[Behnezhad and Khanna SODA'22]).
Our key technical component is the first algorithm for $(1,εn)$-approximate maximum matching with sublinear running time on dense graphs. All previous algorithms suffered a multiplicative approximation factor of at least $1.499$ or assumed that the graph has a very small maximum degree.
Submitted 28 April, 2024; v1 submitted 9 February, 2023;
originally announced February 2023.
-
Sublinear Algorithms for $(1.5+ε)$-Approximate Matching
Authors:
Sayan Bhattacharya,
Peter Kiss,
Thatchaphol Saranurak
Abstract:
We study sublinear time algorithms for estimating the size of maximum matching. After a long line of research, the problem was finally settled by Behnezhad [FOCS'22], in the regime where one is willing to pay an approximation factor of $2$. Very recently, Behnezhad et al. [SODA'23] improved the approximation factor to $(2-\frac{1}{2^{O(1/γ)}})$ using $n^{1+γ}$ time. This improvement over the factor $2$ is, however, minuscule and they asked if even $1.99$-approximation is possible in $n^{2-Ω(1)}$ time. We give a strong affirmative answer to this open problem by showing $(1.5+ε)$-approximation algorithms that run in $n^{2-Θ(ε^{2})}$ time. Our approach is conceptually simple and diverges from all previous sublinear-time matching algorithms: we show a sublinear time algorithm for computing a variant of the edge-degree constrained subgraph (EDCS), a concept that has previously been exploited in dynamic [Bernstein Stein ICALP'15, SODA'16], distributed [Assadi et al. SODA'19] and streaming [Bernstein ICALP'20] settings, but never before in the sublinear setting. Independent work: Behnezhad, Roghani and Rubinstein [BRR'23] independently showed sublinear algorithms similar to our Theorem 1.2 in both adjacency list and matrix models. Furthermore, in [BRR'23], they show additional results on strictly better-than-1.5 approximate matching algorithms in both upper and lower bound sides.
Submitted 26 April, 2023; v1 submitted 30 November, 2022;
originally announced December 2022.
-
Matrix Factorization for Cache Optimization in Content Delivery Networks (CDN)
Authors:
Adolf Kamuzora,
Wadie Skaf,
Ermiyas Birihanu,
Jiyan Mahmud,
Péter Kiss,
Tamás Jursonovics,
Peter Pogrzeba,
Imre Lendák,
Tomáš Horváth
Abstract:
Content delivery networks (CDNs) are key components of high throughput, low latency services on the internet. CDN cache servers have limited storage and bandwidth and implement state-of-the-art cache admission and eviction algorithms to select the most popular and relevant content for the customers served. The aim of this study was to utilize state-of-the-art recommender system techniques for predicting ratings for cache content in CDNs. Matrix factorization was used in predicting content popularity, which is valuable information in content eviction and content admission algorithms run on CDN edge servers. A custom-implemented matrix factorization class and MyMediaLite were utilized. The input CDN logs were received from a European telecommunication service provider. We built a matrix factorization model with that data and utilized grid search to tune its hyper-parameters. Experimental results indicate that the proposed approaches are promising, and we showed that a low root mean square error value can be achieved on the real-life CDN log data.
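To make the matrix factorization step concrete, here is a minimal sketch of a regularized factorization trained by stochastic gradient descent and scored by root mean square error. It is an assumption-laden illustration, not the authors' custom class and not MyMediaLite: the function names `factorize` and `rmse`, the hyper-parameter defaults, and the toy (client, content, rating) triples are all hypothetical.

```python
import random


def predict(P, Q, u, i):
    # Predicted rating is the dot product of the two latent vectors.
    return sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    squared_error = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (squared_error / len(ratings)) ** 0.5


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Regularized matrix factorization trained with plain SGD.

    ratings is a list of (user_index, item_index, rating) triples.
    All hyper-parameter defaults are illustrative, not tuned values.
    """
    rng = random.Random(seed)
    P = [[rng.random() * 0.1 for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.random() * 0.1 for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical toy data: (client, content item, popularity score).
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = factorize(toy_ratings, n_users=3, n_items=3, epochs=0)
P, Q = factorize(toy_ratings, n_users=3, n_items=3)
rmse_before = rmse(toy_ratings, P0, Q0)
rmse_after = rmse(toy_ratings, P, Q)
```

A grid search of the kind the abstract mentions would wrap `factorize` in loops over candidate values of `k`, `lr`, and `reg`, and keep the setting with the lowest RMSE on held-out ratings.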
Submitted 5 October, 2022;
originally announced November 2022.
-
Client Error Clustering Approaches in Content Delivery Networks (CDN)
Authors:
Ermiyas Birihanu,
Jiyan Mahmud,
Péter Kiss,
Adolf Kamuzora,
Wadie Skaf,
Tomáš Horváth,
Tamás Jursonovics,
Peter Pogrzeba,
Imre Lendák
Abstract:
Content delivery networks (CDNs) are the backbone of the Internet and are key in delivering high quality video on demand (VoD), web content and file services to billions of users. CDNs usually consist of hierarchically organized content servers positioned as close to the customers as possible. CDN operators face a significant challenge when analyzing billions of web server and proxy logs generated by their systems. The main objective of this study was to analyze the applicability of various clustering methods in CDN error log analysis. We worked with real-life CDN proxy logs, identified key features included in the logs (e.g., content type, HTTP status code, time-of-day, host) and clustered the log lines corresponding to different host types offering live TV, video on demand, file caching and web content. Our experiments were run on a dataset consisting of proxy logs collected over a 7-day period from a single, physical CDN server running multiple types of services (VoD, live TV, file). The dataset consisted of 2.2 billion log lines. Our analysis showed that CDN error clustering is a viable approach towards identifying recurring errors and improving overall quality of service.
Submitted 11 October, 2022;
originally announced October 2022.
-
Dynamic Algorithms for Packing-Covering LPs via Multiplicative Weight Updates
Authors:
Sayan Bhattacharya,
Peter Kiss,
Thatchaphol Saranurak
Abstract:
In the dynamic linear program (LP) problem, we are given an LP undergoing updates and we need to maintain an approximately optimal solution. Recently, significant attention (e.g., [Gupta et al. STOC'17; Arar et al. ICALP'18, Wajc STOC'20]) has been devoted to the study of special cases of dynamic packing and covering LPs, such as the dynamic fractional matching and set cover problems. But until now, there has been no non-trivial dynamic algorithm for general packing and covering LPs.
In this paper, we settle the complexity of dynamic packing and covering LPs, up to a polylogarithmic factor in update time. More precisely, in the partially dynamic setting (where updates can either only relax or only restrict the feasible region), we give near-optimal deterministic $ε$-approximation algorithms with polylogarithmic amortized update time. Then, we show that both partially dynamic updates and amortized update time are necessary; without any of these conditions, the trivial algorithm that recomputes the solution from scratch after every update is essentially the best possible, assuming SETH.
To obtain our results, we initiate a systematic study of the multiplicative weights update (MWU) method in the dynamic setting. As by-products of our techniques, we also obtain the first online $(1+ε)$-competitive algorithms for both covering and packing LPs with polylogarithmic recourse, and the first streaming algorithms for covering and packing LPs with linear space and polylogarithmic passes.
Submitted 15 July, 2022;
originally announced July 2022.
-
Dynamic Matching with Better-than-2 Approximation in Polylogarithmic Update Time
Authors:
Sayan Bhattacharya,
Peter Kiss,
Thatchaphol Saranurak,
David Wajc
Abstract:
We present dynamic algorithms with polylogarithmic update time for estimating the size of the maximum matching of a graph undergoing edge insertions and deletions with approximation ratio strictly better than $2$. Specifically, we obtain a $1+\frac{1}{\sqrt{2}}+ε\approx 1.707+ε$ approximation in bipartite graphs and a $1.973+ε$ approximation in general graphs. We thus answer in the affirmative the major open question first posed in the influential work of Onak and Rubinfeld (STOC'10) and repeatedly asked in the dynamic graph algorithms literature. Our randomized algorithms also work against an adaptive adversary and guarantee worst-case polylog update time, both w.h.p.
Our algorithms are based on simulating new two-pass streaming matching algorithms in the dynamic setting. Our key new idea is to invoke the recent sublinear-time matching algorithm of Behnezhad (FOCS'21) in a white-box manner to efficiently simulate the second pass of our streaming algorithms, while bypassing the well-known vertex-update barrier.
Submitted 27 April, 2023; v1 submitted 15 July, 2022;
originally announced July 2022.
-
Multimodal E-Commerce Product Classification Using Hierarchical Fusion
Authors:
Tsegaye Misikir Tashu,
Sara Fattouh,
Peter Kiss,
Tomas Horvath
Abstract:
In this work, we present a multi-modal model for commercial product classification that combines features extracted by multiple neural network models from textual (CamemBERT and FlauBERT) and visual data (SE-ResNeXt-50), using simple fusion techniques. The proposed method significantly outperformed the unimodal models' performance and the reported performance of similar models on our specific task. We experimented with multiple fusion techniques and found that the best performing technique to combine the individual embeddings of the unimodal networks is based on combining concatenation and averaging of the feature vectors. Each modality complemented the shortcomings of the other modalities, demonstrating that increasing the number of modalities can be an effective method for improving the performance of multi-label and multimodal classification problems.
Submitted 7 July, 2022;
originally announced July 2022.
-
Deterministic Dynamic Matching In Worst-Case Update Time
Authors:
Peter Kiss
Abstract:
We present deterministic algorithms for maintaining a $(3/2 + ε)$ and $(2 + ε)$-approximate maximum matching in a fully dynamic graph with worst-case update times $\hat{O}(\sqrt{n})$ and $\tilde{O}(1)$ respectively. The fastest known deterministic worst-case update time algorithms for achieving approximation ratio $(2 - δ)$ (for any $δ> 0$) and $(2 + ε)$ were both shown by Roghani et al. [2021] with update times $O(n^{3/4})$ and $O_ε(\sqrt{n})$ respectively. We close the gap between worst-case and amortized algorithms for the two approximation ratios as the best deterministic amortized update times for the problem are $O_ε(\sqrt{n})$ and $\tilde{O}(1)$ which were shown in Bernstein and Stein [SODA'2021] and Bhattacharya and Kiss [ICALP'2021] respectively.
In order to achieve both results we explicitly state a method implicitly used in Nanongkai and Saranurak [STOC'2017] and Bernstein et al. [arXiv'2020] which allows transforming dynamic algorithms capable of processing the input in batches into dynamic algorithms with worst-case update time.
\textbf{Independent Work:} Independently of and concurrently with our work, Grandoni et al. [arXiv'2021] have presented a fully dynamic algorithm for maintaining a $(3/2 + ε)$-approximate maximum matching with deterministic worst-case update time $O_ε(\sqrt{n})$.
Submitted 19 November, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Deterministic Rounding of Dynamic Fractional Matchings
Authors:
Sayan Bhattacharya,
Peter Kiss
Abstract:
We present a framework for deterministically rounding a dynamic fractional matching. Applying our framework in a black-box manner on top of existing fractional matching algorithms, we derive the following new results: (1) The first deterministic algorithm for maintaining a $(2-δ)$-approximate maximum matching in a fully dynamic bipartite graph, in arbitrarily small polynomial update time. (2) The first deterministic algorithm for maintaining a $(1+δ)$-approximate maximum matching in a decremental bipartite graph, in polylogarithmic update time. (3) The first deterministic algorithm for maintaining a $(2+δ)$-approximate maximum matching in a fully dynamic general graph, in small polylogarithmic (specifically, $O(\log^4 n)$) update time. These results are respectively obtained by applying our framework on top of the fractional matching algorithms of Bhattacharya et al. [STOC'16], Bernstein et al. [FOCS'20], and Bhattacharya and Kulkarni [SODA'19].
Prior to our work, there were two known general-purpose rounding schemes for dynamic fractional matchings. Both these schemes, by Arar et al. [ICALP'18] and Wajc [STOC'20], were randomized.
Our rounding scheme works by maintaining a good {\em matching-sparsifier} with bounded arboricity, and then applying the algorithm of Peleg and Solomon [SODA'16] to maintain a near-optimal matching in this low arboricity graph. To the best of our knowledge, this is the first dynamic matching algorithm that works on general graphs by using an algorithm for low-arboricity graphs as a black-box subroutine. This feature of our rounding scheme might be of independent interest.
Submitted 4 May, 2021;
originally announced May 2021.
-
Optimized Tracking of Topic Evolution
Authors:
Patrick Kiss,
Elaheh Momeni
Abstract:
Topic evolution modeling has been researched for a long time and has gained considerable interest. A recent state-of-the-art method uses word modeling algorithms in combination with community detection mechanisms to achieve better results in a more effective way. We analyse results of this approach and discuss the two major challenges that this approach still faces. Although the topics that have resulted from the recent algorithm are good in general, they are very noisy because many topics are unimportant owing to their size, words, or ambiguity. Additionally, the number of words defining each topic is too large, making it difficult to analyse them in their unsorted state. In this paper, we propose approaches to tackle these challenges by adding topic filtering and network analysis metrics to define the importance of a topic. We test different combinations of these metrics to see which combination yields the best results. Furthermore, we add word filtering and ranking to each topic to identify the words with the highest novelty automatically. We evaluate our enhancement methods in two ways: human qualitative evaluation and automatic quantitative evaluation. Moreover, we created two case studies to test the quality of the clusters and words. In the quantitative evaluation, we use the pairwise mutual information score to test the coherency of topics. The quantitative evaluation also includes an analysis of execution times for each part of the program. The results of the experimental evaluations show that the two evaluation methods agree on the positive feasibility of the algorithm. We then show possible extensions in the form of usability and future improvements to the algorithm.
Submitted 16 December, 2019;
originally announced December 2019.
-
Main-Belt Asteroids in the K2 Engineering Field of View
Authors:
R. Szabó,
K. Sárneczky,
Gy. M. Szabó,
A. Pál,
Cs. P. Kiss,
B. Csák,
L. Illés,
G. Rácz,
L. L. Kiss
Abstract:
Unlike NASA's original Kepler Discovery Mission, the renewed K2 Mission will stare at the plane of the Ecliptic, observing each field for approximately 75 days. This will bring new opportunities and challenges, in particular the presence of a large number of main-belt asteroids that will contaminate the photometry. The large pixel size makes K2 data susceptible to the effect of apparent minor planet encounters. Here we investigate the effects of asteroid encounters on photometric precision using a sub-sample of the K2 Engineering data taken in February 2014. We show examples of asteroid contamination to facilitate their recognition and distinguish these events from other error sources. We conclude that main-belt asteroids will have considerable effects on K2 photometry of a large number of photometric targets during the Mission, which will have to be taken into account. These results will be readily applicable for future space photometric missions applying large-format CCDs, such as TESS and PLATO.
Submitted 23 January, 2015;
originally announced January 2015.