-
Emoji Driven Crypto Assets Market Reactions
Authors:
Xiaorui Zuo,
Yao-Tsung Chen,
Wolfgang Karl Härdle
Abstract:
In the burgeoning realm of cryptocurrency, social media platforms like Twitter have become pivotal in influencing market trends and investor sentiment. In our study, we leverage GPT-4 and a fine-tuned transformer-based BERT model for a multimodal sentiment analysis, focusing on the impact of emoji sentiment on cryptocurrency markets. By translating emojis into quantifiable sentiment data, we correlate these insights with key market indicators like BTC Price and the VCRIX index. Our architecture's analysis of emoji sentiment demonstrated a distinct advantage over FinBERT's pure text sentiment analysis in predictive power. This approach may inform the development of trading strategies that use social media elements to identify and forecast market trends. Crucially, our findings suggest that strategies based on emoji sentiment can help avoid significant market downturns and contribute to the stabilization of returns. This research underscores the practical benefits of integrating advanced AI-driven analyses into financial strategies, offering a nuanced perspective on the interplay between digital communication and market dynamics in an academic context.
Submitted 4 May, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Deep Learning and NLP in Cryptocurrency Forecasting: Integrating Financial, Blockchain, and Social Media Data
Authors:
Vincent Gurgul,
Stefan Lessmann,
Wolfgang Karl Härdle
Abstract:
We introduce novel approaches to cryptocurrency price forecasting, leveraging Machine Learning (ML) and Natural Language Processing (NLP) techniques, with a focus on Bitcoin and Ethereum. By analysing news and social media content, primarily from Twitter and Reddit, we assess the impact of public sentiment on cryptocurrency markets. A distinctive feature of our methodology is the application of the BART MNLI zero-shot classification model to detect bullish and bearish trends, significantly advancing beyond traditional sentiment analysis. Additionally, we systematically compare a range of pre-trained and fine-tuned deep learning NLP models against conventional dictionary-based sentiment analysis methods. Another key contribution of our work is the adoption of local extrema alongside daily price movements as predictive targets, reducing trading frequency and portfolio volatility. Our findings demonstrate that integrating textual data into cryptocurrency price forecasting not only improves forecasting accuracy but also consistently enhances the profitability and Sharpe ratio across various validation scenarios, particularly when applying deep learning NLP techniques. The entire codebase of our experiments is made available via an online repository: https://anonymous.4open.science/r/crypto-forecasting-public
Submitted 25 October, 2024; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Robustifying Markowitz
Authors:
Wolfgang Karl Härdle,
Yegor Klochkov,
Alla Petukhina,
Nikita Zhivotovskiy
Abstract:
Markowitz mean-variance portfolios with sample mean and covariance as input parameters feature numerous issues in practice. They perform poorly out of sample due to estimation error, and they exhibit extreme weights together with high sensitivity to changes in input parameters. The heavy-tail characteristics of financial time series are in fact the cause of these erratic fluctuations of weights, which consequently create substantial transaction costs. In robustifying the weights, we present a toolbox for stabilizing costs and weights for global minimum Markowitz portfolios. Utilizing a projected gradient descent (PGD) technique, we avoid the estimation and inversion of the covariance operator as a whole and concentrate on robust estimation of the gradient descent increment. Using modern tools of robust statistics, we construct a computationally efficient estimator with almost Gaussian properties based on median-of-means uniformly over weights. This robustified Markowitz approach is confirmed by empirical studies on equity markets. We demonstrate that robustified portfolios reach the lowest turnover compared to shrinkage-based and constrained portfolios while preserving or slightly improving out-of-sample performance.
Submitted 28 December, 2022;
originally announced December 2022.
-
Shapley Curves: A Smoothing Perspective
Authors:
Ratmir Miftachov,
Georg Keilbar,
Wolfgang Karl Härdle
Abstract:
This paper addresses the limited statistical understanding of Shapley values as a variable importance measure from a nonparametric (or smoothing) perspective. We introduce population-level Shapley curves to measure the true variable importance, determined by the conditional expectation function and the distribution of covariates. Having defined the estimand, we derive minimax convergence rates and asymptotic normality under general conditions for the two leading estimation strategies. For finite sample inference, we propose a novel version of the wild bootstrap procedure tailored for capturing lower-order terms in the estimation of Shapley curves. Numerical studies confirm our theoretical findings, and an empirical application analyzes the determining factors of vehicle prices.
Submitted 3 April, 2024; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Quantinar: a blockchain p2p ecosystem for honest scientific research
Authors:
Raul Bag,
Bruno Spilak,
Julian Winkel,
Wolfgang Karl Härdle
Abstract:
In the Information Age, the power of data and correct statistical analysis has never been more prevalent. Academics and practitioners nowadays require an accurate application of quantitative methods. Yet many branches are subject to a crisis of integrity, which shows in the improper use of statistical models, p-hacking, HARKing, or failure to replicate results. We propose the use of a Peer-to-Peer (P2P) ecosystem based on a blockchain network, Quantinar (quantinar.com), to support quantitative analytics knowledge paired with code in the form of Quantlets (quantlet.com) or software snippets. The integration of blockchain technology makes Quantinar a decentralized autonomous organization (DAO) that ensures fully transparent and reproducible scientific research.
Submitted 31 March, 2023; v1 submitted 13 November, 2022;
originally announced November 2022.
-
A Data-driven Case-based Reasoning in Bankruptcy Prediction
Authors:
Wei Li,
Wolfgang Karl Härdle,
Stefan Lessmann
Abstract:
There has been intensive research regarding machine learning models for predicting bankruptcy in recent years. However, the lack of interpretability limits their growth and practical implementation. This study proposes a data-driven explainable case-based reasoning (CBR) system for bankruptcy prediction. Empirical results from a comparative study show that the proposed approach outperforms existing alternative CBR systems and is competitive with state-of-the-art machine learning models. We also demonstrate that the asymmetrical feature similarity comparison mechanism in the proposed CBR system can effectively capture the asymmetrically distributed nature of financial attributes, such as a few companies controlling more cash than the majority, hence improving both the accuracy and explainability of predictions. In addition, we carefully examine the explainability of the CBR system in the decision-making process of bankruptcy prediction. While much research suggests a trade-off between improving prediction accuracy and explainability, our findings point to a prospective research avenue in which an explainable model that thoroughly incorporates data attributes by design can reconcile the dilemma.
Submitted 2 November, 2022;
originally announced November 2022.
-
Blockchain mechanism and distributional characteristics of cryptos
Authors:
Min-Bin Lin,
Kainat Khowaja,
Cathy Yi-Hsuan Chen,
Wolfgang Karl Härdle
Abstract:
We investigate the relationship between the underlying blockchain mechanism of cryptocurrencies and their distributional characteristics. In addition to price, we emphasise using actual block size and block time as the operational features of cryptos. We use distributional characteristics such as Fourier power spectrum, moments, quantiles, global optimums, as well as measures of long-term dependencies, risk and noise to summarise the information from crypto time series. With the hypothesis that the blockchain structure explains the distributional characteristics of cryptos, we use characteristic-based spectral clustering to cluster the selected cryptos into five groups. We scrutinise these clusters and find that indeed, the clusters of cryptos share similar mechanisms such as origin of fork, difficulty adjustment frequency, and the nature of block size. This paper provides crypto creators and users with a better understanding of the connection between the blockchain protocol design and distributional characteristics of cryptos.
Submitted 24 August, 2021; v1 submitted 26 November, 2020;
originally announced November 2020.
-
How to Measure the Performance of a Collaborative Research Center
Authors:
Alona Zharova,
Janine Tellinger-Rice,
Wolfgang Karl Härdle
Abstract:
New Public Management helps universities and research institutions to perform in a highly competitive research environment. Evaluating publicly financed research improves transparency, helps in reflection and self-assessment, and provides information for strategic decision making. In this paper we provide empirical evidence using data from a Collaborative Research Center (CRC) on financial inputs and research output from 2005 to 2016. After selecting performance indicators suitable for a CRC, we describe the main properties of the data using visualization techniques. To study the relationship between the dimensions of research performance, we use a time fixed effects panel data model and a fixed effects Poisson model. With the help of year dummy variables, we show how the pattern of research productivity changes over time after controlling for staff and travel costs. The joint depiction of the time fixed effects and the research project's life cycle allows a better understanding of the development of the number of discussion papers over time.
Submitted 16 September, 2020;
originally announced September 2020.