Search | arXiv e-print repository

Monolithic Hybrid Recommender System for Suggesting Relevant Movies

Abstract: Recommendation systems have become fundamental services that facilitate users' access to information. Generally, a recommendation system works by filtering historical behaviors to understand and learn users' preferences. With the growth of online information, recommendations have become of crucial importance in information filtering to prevent the information overload problem. In this study, we considered hybrid post-fusion of two approaches of collaborative filtering, by using sequences of watched movies and considering the related movie ratings. After considering both techniques and applying the weights matrix, the recommendations would be modified to correspond to the users' preferences as needed. We discussed that various weights would be set based on use cases. For instance, in cases where we have the rating for most classes, we will assign a higher weight to the rating matrix, and in cases where the rating is unavailable for the majority of cases, the higher weights might be assigned to the sequential dataset. The sequence of watched movies was used in conjunction with the ratings because a sequential model alone might be inadequate for distinguishing users' long-term preferences and does not account for the ratings of the watched movies. Extensive discussion was made regarding the literature and the methodological approach to solving the problem.

Submitted 16 November, 2024; originally announced December 2024.
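
As a rough illustration of the weighted post-fusion described in this abstract, the sketch below blends two per-movie score sets, one from a sequence-based model and one from a rating-based model. The abstract mentions a weights matrix; a single scalar weight `w_rating` is used here as a simplifying assumption, and the function names and score dictionaries are hypothetical rather than taken from the paper.

```python
def fuse_scores(seq_scores, rating_scores, w_rating):
    """Blend two per-movie score dicts with a convex weight.

    w_rating is the weight on the rating-based scores; the sequence-based
    scores receive 1 - w_rating. A movie missing from one source is scored
    0.0 by that source (an assumption made for this sketch).
    """
    if not 0.0 <= w_rating <= 1.0:
        raise ValueError("w_rating must lie in [0, 1]")
    fused = {}
    for movie in set(seq_scores) | set(rating_scores):
        s = seq_scores.get(movie, 0.0)
        r = rating_scores.get(movie, 0.0)
        fused[movie] = w_rating * r + (1.0 - w_rating) * s
    return fused


def top_k(fused, k):
    """Return the k highest-scoring movies, ties broken by name."""
    return sorted(fused, key=lambda m: (-fused[m], m))[:k]


seq = {"A": 0.9, "B": 0.2, "C": 0.5}   # hypothetical sequence-model scores
rat = {"A": 0.1, "B": 0.8, "C": 0.5}   # hypothetical rating-model scores

# Ratings widely available: lean on the rating-based scores.
print(top_k(fuse_scores(seq, rat, 0.8), 2))  # ['B', 'C']
# Ratings mostly missing: lean on the sequence-based scores.
print(top_k(fuse_scores(seq, rat, 0.2), 2))  # ['A', 'C']
```

Changing the weight alone flips the ranking, which is the use-case-dependent behavior the abstract points to.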

arXiv:2410.23433 [pdf]

Assessing Concordance between RNA-Seq and NanoString Technologies in Ebola-Infected Nonhuman Primates Using Machine Learning

Authors: Mostafa Rezapour, Aarthi Narayanan, Wyatt H. Mowery, Metin Nafi Gurcan

Abstract: This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates (NHPs) infected with Ebola virus (EBOV). We performed a detailed comparison of both platforms, demonstrating a strong correlation between them, with Spearman coefficients for 56 out of 62 samples ranging from 0.78 to 0.88, with a mean of 0.83 and a median of 0.85. Bland-Altman analysis further confirmed high consistency, with most measurements falling within 95% confidence limits. A machine learning approach, using the Supervised Magnitude-Altitude Scoring (SMAS) method trained on NanoString data, identified OAS1 as a key marker for distinguishing RT-qPCR positive from negative samples. Remarkably, when applied to RNA-Seq data, OAS1 also achieved 100% accuracy in differentiating infected from uninfected samples using logistic regression, demonstrating its robustness across platforms. Further differential expression analysis identified 12 common genes including ISG15, OAS1, IFI44, IFI27, IFIT2, IFIT3, IFI44L, MX1, MX2, OAS2, RSAD2, and OASL which demonstrated the highest levels of statistical significance and biological relevance across both platforms. Gene Ontology (GO) analysis confirmed that these genes are directly involved in key immune and viral infection pathways, reinforcing their importance in EBOV infection. In addition, RNA-Seq uniquely identified genes such as CASP5, USP18, and DDX60, which play key roles in immune regulation and antiviral defense. This finding highlights the broader detection capabilities of RNA-Seq and underscores the complementary strengths of both platforms in providing a comprehensive and accurate assessment of gene expression changes during Ebola virus infection.

Submitted 30 October, 2024; originally announced October 2024.
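
For readers unfamiliar with the per-sample Spearman coefficients quoted in this abstract, the following sketch shows how such a coefficient can be computed: rank each platform's values (averaging ranks over ties), then take the Pearson correlation of the ranks. It is a generic standard-library illustration, not the authors' pipeline; the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))
```

In the setting of the abstract, `x` and `y` would hold one sample's expression values for the same genes on the two platforms.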

arXiv:2405.08931 [pdf, ps, other]

A QPTAS for Facility Location on Unit Disk graphs

Authors: Zachary Friggstad, Mohsen Rezapour, Mohammad R. Salavatipour, Hao Sun

Abstract: We study the classic \textsc{(Uncapacitated) Facility Location} problem on Unit Disk Graphs (UDGs). For a given point set $P$ in the plane, the unit disk graph UDG(P) on $P$ has vertex set $P$ and an edge between two distinct points $p, q \in P$ if and only if their Euclidean distance $|pq|$ is at most 1. The weight of the edge $pq$ is equal to their distance $|pq|$. An instance of \textsc{Facility Location} on UDG(P) consists of a set $C\subseteq P$ of clients and a set $F\subseteq P$ of facilities, each having an opening cost $f_i$. The goal is to pick a subset $F'\subseteq F$ to open while minimizing $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$, where $d(v,F')$ is the distance of $v$ to the nearest facility in $F'$ through UDG(P). In this paper, we present the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem. While approximation schemes are well-established for facility location problems on sparse geometric graphs (such as planar graphs), there is a lack of such results for dense graphs. Specifically, prior to this study, to the best of our knowledge, there was no approximation scheme for any facility location problem on UDGs in the general setting.

Submitted 14 May, 2024; originally announced May 2024.
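
The objective defined in this abstract can be made concrete with a small sketch that builds UDG(P) from a point set and evaluates $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$ for a given set of opened facilities. This only evaluates the cost of a candidate solution; it is not the QPTAS from the paper. The function names are hypothetical, and points are referred to by their index in the input list.

```python
import heapq
import math


def build_udg(points):
    """Adjacency lists of the unit disk graph: edge iff distance <= 1."""
    adj = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= 1.0:
                adj[i].append((j, d))
                adj[j].append((i, d))
    return adj


def shortest_paths(adj, source):
    """Dijkstra from source; unreachable vertices keep distance inf."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def facility_location_cost(points, clients, opening_cost, opened):
    """Opening costs of `opened` plus each client's graph distance to it.

    `opened` must be non-empty; the result is inf if some client cannot
    reach any opened facility through the graph.
    """
    adj = build_udg(points)
    total = sum(opening_cost[i] for i in opened)
    for v in clients:
        dist = shortest_paths(adj, v)
        total += min(dist[i] for i in opened)
    return total
```

Note that distances are measured along graph paths, so two points more than one unit apart are connected only through intermediate points.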

arXiv:2404.14030 [pdf, other]

Towards Using Behavior Trees in Industrial Automation Controllers

Authors: Aleksandr Sidorenko, Mahdi Rezapour, Achim Wagner, Martin Ruskowski

Abstract: The Industry 4.0 paradigm manifests the shift towards mass customization and cyber-physical production systems (CPPS) and sets new requirements for industrial automation software in terms of modularity, flexibility, and short development cycles of control programs. Though programmable logic controllers (PLCs) have been evolving into versatile and powerful edge devices, there is a lack of PLC software flexibility and integration between low-level programs and high-level task-oriented control frameworks. Behavior trees (BTs) is a novel framework, which enables rapid design of modular hierarchical control structures. It combines improved modularity with a simple and intuitive design of control logic. This paper proposes an approach for improving the industrial control software design by integrating BTs into PLC programs and separating hardware related functionalities from the coordination logic. Several strategies for integration of BTs into PLCs are shown. The first two integrate BTs with the IEC 61131 based PLCs and are based on the use of the PLCopen Common Behavior Model. The last one utilizes event-based BTs and shows the integration with the IEC 61499 based controllers. An application example demonstrates the approach. The paper contributes in the following ways. First, we propose a new PLC software design, which improves modularity, supports better separation of concerns, and enables rapid development and reconfiguration of the control software. Second, we show and evaluate the integration of the BT framework into both IEC 61131 and IEC 61499 based PLCs, as well as the integration of the PLCopen function blocks with the external BT library. This leads to better integration of the low-level PLC code and the AI-based task-oriented frameworks. It also improves the skill-based programming approach for PLCs by using BTs for skills composition.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2403.15454 [pdf]

Emotion Detection with Transformers: A Comparative Study

Authors: Mahdi Rezapour

Abstract: In this study, we explore the application of transformer-based models for emotion classification on text data. We train and evaluate several pre-trained transformer models on the Emotion dataset using different variants of transformers. The paper also analyzes some factors that influence the performance of the model, such as the fine-tuning of the transformer layer, the trainability of the layer, and the preprocessing of the text data. Our analysis reveals that commonly applied techniques like removing punctuation and stop words can hinder model performance. This might be because transformers' strength lies in understanding contextual relationships within text. Elements like punctuation and stop words can still convey sentiment or emphasis, and removing them might disrupt this context.

Submitted 27 July, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.14050 [pdf]

Extracting Emotion Phrases from Tweets using BART

Authors: Mahdi Rezapour

Abstract: Sentiment analysis is a natural language processing task that aims to identify and extract the emotional aspects of a text. However, many existing sentiment analysis methods primarily classify the overall polarity of a text, overlooking the specific phrases that convey sentiment. In this paper, we applied an approach to sentiment analysis based on a question-answering framework. Our approach leverages the power of Bidirectional Autoregressive Transformer (BART), a pre-trained sequence-to-sequence model, to extract a phrase from a given text that amplifies a given sentiment polarity. We create a natural language question that identifies the specific emotion to extract and then guide BART to pay attention to the relevant emotional cues in the text. We use a classifier within BART to predict the start and end positions of the answer span within the text, which helps to identify the precise boundaries of the extracted emotion phrase. Our approach offers several advantages over most sentiment analysis studies, including capturing the complete context and meaning of the text and extracting precise token spans that highlight the intended sentiment. We achieved an end loss of 87% and Jaccard score of 0.61.

Submitted 27 July, 2024; v1 submitted 20 March, 2024; originally announced March 2024.
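
The Jaccard score reported in this abstract compares an extracted phrase with a reference phrase. The abstract does not say how the score is tokenized, so the sketch below assumes a word-level set Jaccard with lowercase whitespace tokenization, which is a common choice for span-extraction tasks; the function name is hypothetical.

```python
def jaccard(predicted, reference):
    """Word-level Jaccard similarity: |A ∩ B| / |A ∪ B|.

    Both strings are lowercased and split on whitespace (an assumption of
    this sketch). Two empty strings are treated as a perfect match.
    """
    a = set(predicted.lower().split())
    b = set(reference.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two shared words out of three distinct words overall.
print(jaccard("so happy today", "happy today"))
```

Averaging this value over all test examples would give a corpus-level score comparable in kind to the figure quoted above.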

arXiv:2401.08738 [pdf]

doi 10.3389/frai.2024.1405332

Machine Learning-Based Analysis of Ebola Virus' Impact on Gene Expression in Nonhuman Primates

Authors: Mostafa Rezapour, Muhammad Khalid Khan Niazi, Hao Lu, Aarthi Narayanan, Metin Nafi Gurcan

Abstract: This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis. SMAS effectively combines gene selection based on statistical significance and expression changes, employing linear classifiers such as logistic regression to accurately differentiate between RT-qPCR positive and negative NHP samples. A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6 and IFI27, genes, including MX1, OAS1, and ISG15, were significantly upregulated, highlighting their essential roles in the immune response to EBOV. Our results underscore the efficacy of the SMAS method in revealing complex genetic interactions and response mechanisms during EBOV infection. This research provides valuable insights into EBOV pathogenesis and aids in developing more precise diagnostic tools and therapeutic strategies to address EBOV infection in particular and viral infection in general.

Submitted 22 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

arXiv:2309.15990 [pdf]

Machine Learning Based Analytics for the Significance of Gait Analysis in Monitoring and Managing Lower Extremity Injuries

Authors: Mostafa Rezapour, Rachel B. Seymour, Stephen H. Sims, Madhav A. Karunakar, Nahir Habet, Metin Nafi Gurcan

Abstract: This study explored the potential of gait analysis as a tool for assessing post-injury complications, e.g., infection, malunion, or hardware irritation, in patients with lower extremity fractures. The research focused on the proficiency of supervised machine learning models predicting complications using consecutive gait datasets. We identified patients with lower extremity fractures at an academic center. Patients underwent gait analysis with a chest-mounted IMU device. Using software, raw gait data was preprocessed, emphasizing 12 essential gait variables. Machine learning models including XGBoost, Logistic Regression, SVM, LightGBM, and Random Forest were trained, tested, and evaluated. Attention was given to class imbalance, addressed using SMOTE. We introduced a methodology to compute the Rate of Change (ROC) for gait variables, independent of the time difference between gait analyses. XGBoost was the optimal model both before and after applying SMOTE. Prior to SMOTE, the model achieved an average test AUC of 0.90 (95% CI: [0.79, 1.00]) and test accuracy of 86% (95% CI: [75%, 97%]). Feature importance analysis attributed importance to the duration between injury and gait analysis. Data patterns showed early physiological compensations, followed by stabilization phases, emphasizing prompt gait analysis. This study underscores the potential of machine learning, particularly XGBoost, in gait analysis for orthopedic care. Predicting post-injury complications, early gait assessment becomes vital, revealing intervention points. The findings support a shift in orthopedics towards a data-informed approach, enhancing patient outcomes.

Submitted 27 September, 2023; originally announced September 2023.

Comments: 13 pages, 6 figures

arXiv:2309.09412 [pdf]

Cross-attention-based saliency inference for predicting cancer metastasis on whole slide images

Authors: Ziyu Su, Mostafa Rezapour, Usama Sajjad, Shuo Niu, Metin Nafi Gurcan, Muhammad Khalid Khan Niazi

Abstract: Although multiple instance learning (MIL) methods are widely used for automatic tumor detection on whole slide images (WSI), they suffer from the extreme class imbalance within the small tumor WSIs. This occurs when the tumor comprises only a few isolated cells. For early detection, it is of utmost importance that MIL algorithms can identify small tumors, even when they are less than 1% of the size of the WSI. Existing studies have attempted to address this issue using attention-based architectures and instance selection-based methodologies, but have not yielded significant improvements. This paper proposes cross-attention-based salient instance inference MIL (CASiiMIL), which involves a novel saliency-informed attention mechanism, to identify breast cancer lymph node micro-metastasis on WSIs without the need for any annotations. Apart from this new attention mechanism, we introduce a negative representation learning algorithm to facilitate the learning of saliency-informed attention weights for improved sensitivity on tumor WSIs. The proposed model outperforms the state-of-the-art MIL methods on two popular tumor metastasis detection datasets, and demonstrates great cross-center generalizability. In addition, it exhibits excellent accuracy in classifying WSIs with small tumor lesions. Moreover, we show that the proposed model has excellent interpretability attributed to the saliency-informed attention weights. We strongly believe that the proposed method will pave the way for training algorithms for early tumor detection on large datasets where acquiring fine-grained annotations is practically impossible.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2301.07700 [pdf]

doi 10.1016/j.compbiomed.2023.107607

Attention2Minority: A salient instance inference-based multiple instance learning for classifying small lesions in whole slide images

Authors: Ziyu Su, Mostafa Rezapour, Usama Sajjad, Metin Nafi Gurcan, Muhammad Khalid Khan Niazi

Abstract: Multiple instance learning (MIL) models have achieved remarkable success in analyzing whole slide images (WSIs) for disease classification problems. However, with regard to gigapixel WSI classification problems, current MIL models are often incapable of differentiating a WSI with extremely small tumor lesions. This minute tumor-to-normal area ratio in a MIL bag inhibits the attention mechanism from properly weighting the areas corresponding to minor tumor lesions. To overcome this challenge, we propose salient instance inference MIL (SiiMIL), a weakly-supervised MIL model for WSI classification. Our method initially learns representations of normal WSIs, and it then compares the normal WSI representations with all the input patches to infer the salient instances of the input WSI. Finally, it employs attention-based MIL to perform the slide-level classification based on the selected patches of the WSI. Our experiments imply that SiiMIL can accurately identify tumor instances, which could only take up less than 1% of a WSI, so that the ratio of tumor to normal instances within a bag can increase by two to four times. It is worth mentioning that it performs equally well for large tumor lesions. As a result, SiiMIL achieves a significant improvement in performance over the state-of-the-art MIL methods.

Submitted 11 December, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

arXiv:2202.07441 [pdf, other]

doi 10.1371/journal.pone.0276767

Artificial Intelligence-Based Analytics for Impacts of COVID-19 and Online Learning on College Students' Mental Health

Authors: Mostafa Rezapour, Scott K. Elmshaeuser

Abstract: COVID-19, the disease caused by the novel coronavirus (SARS-CoV-2), first emerged in Wuhan, China late in December 2019. Not long after, the virus spread worldwide and was declared a pandemic by the World Health Organization in March 2020. This caused many changes around the world and in the United States, including an educational shift towards online learning. In this paper, we seek to understand how the COVID-19 pandemic and increase in online learning impact college students' emotional wellbeing. We use several machine learning and statistical models to analyze data collected by the Faculty of Public Administration at the University of Ljubljana, Slovenia in conjunction with an international consortium of universities, other higher education institutions, and students' associations. Our results indicate that features related to students' academic life have the largest impact on their emotional wellbeing. Other important factors include students' satisfaction with their university's and government's handling of the pandemic as well as students' financial security.

Submitted 5 September, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: 41 pages, 31 Figures

arXiv:2112.12901 [pdf, other]

A machine learning analysis of the relationship between some underlying medical conditions and COVID-19 susceptibility

Authors: Mostafa Rezapour, Colin A. Varady

Abstract: For the past couple of years, the Coronavirus, commonly known as COVID-19, has significantly affected the daily lives of all citizens residing in the United States by imposing several fatal health risks that cannot go unnoticed. In response to the growing fear and danger COVID-19 inflicts upon societies in the USA, several vaccines and boosters have been created as a permanent remedy for individuals to take advantage of. In this paper, we investigate the relationship between the COVID-19 vaccines and boosters and the total case count for the Coronavirus across multiple states in the USA. Additionally, this paper discusses the relationship between several selected underlying health conditions and COVID-19. To discuss these relationships effectively, this paper will utilize statistical tests and machine learning methods for analysis and discussion purposes. Furthermore, this paper reflects upon conclusions made about the relationship between educational attainment, race, and COVID-19 and the possible connections that can be established with underlying health conditions, vaccination rates, and COVID-19 total case and death counts.

Submitted 14 February, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

Comments: 38 pages, 23 figures

arXiv:2112.06261 [pdf, other]

A Machine Learning Analysis of Impact of the Covid-19 Pandemic on Alcohol Consumption Habit Changes Among Healthcare Workers in the U.S

Authors: Mostafa Rezapour

Abstract: In this paper, we discuss the impact of the Covid-19 pandemic on alcohol consumption habit changes among healthcare workers in the United States. We utilize multiple supervised and unsupervised machine learning methods and models such as Decision Trees, Logistic Regression, Naive Bayes classifier, k-Nearest Neighbors, Support Vector Machines, Multilayer perceptron, XGBoost, CatBoost, LightGBM, Chi-Squared Test and mutual information method on mental health survey data obtained from the University of Michigan Inter-University Consortium for Political and Social Research to find relationships between COVID-19 related negative effects and alcohol consumption habit changes among healthcare workers. Our findings suggest that COVID-19-related school closures, COVID-19-related work schedule changes and COVID-related news exposure may lead to an increase in alcohol use among healthcare workers in the United States.

Submitted 7 September, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

arXiv:2112.00227 [pdf, other]

doi 10.1038/s41598-022-19314-1

A Machine Learning Analysis of COVID-19 Mental Health Data

Authors: Mostafa Rezapour, Lucas Hansen

Abstract: In late December 2019, the novel coronavirus (SARS-CoV-2) and the resulting disease COVID-19 were first identified in Wuhan, China. The disease slipped through containment measures, with the first known case in the United States being identified on January 20th, 2020. In this paper, we utilize survey data from the Inter-university Consortium for Political and Social Research and apply several statistical and machine learning models and techniques such as Decision Trees, Multinomial Logistic Regression, Naive Bayes, k-Nearest Neighbors, Support Vector Machines, Neural Networks, Random Forests, Gradient Tree Boosting, XGBoost, CatBoost, LightGBM, Synthetic Minority Oversampling, and Chi-Squared Test to analyze the impacts the COVID-19 pandemic has had on the mental health of frontline workers in the United States. Through the interpretation of the many models applied to the mental health survey data, we have concluded that the most important factor in predicting the mental health decline of a frontline worker is the healthcare role the individual is in (Nurse, Emergency Room Staff, Surgeon, etc.), followed by the amount of sleep the individual has had in the last week, the amount of COVID-19 related news an individual has consumed on average in a day, the age of the worker, and the usage of alcohol and cannabis.

Submitted 10 May, 2022; v1 submitted 30 November, 2021; originally announced December 2021.

Comments: 29 pages

arXiv:2107.06321 [pdf, other]

A New Multipoint Symmetric Secant Method with a Dense Initial Matrix

Authors: Jennifer B. Erway, Mostafa Rezapour

Abstract: In large-scale optimization, when either forming or storing Hessian matrices is prohibitively expensive, quasi-Newton methods are often used in lieu of Newton's method because they only require first-order information to approximate the true Hessian. Multipoint symmetric secant (MSS) methods can be thought of as generalizations of quasi-Newton methods in that they attempt to impose additional requirements on their approximation of the Hessian. Given an initial Hessian approximation, MSS methods generate a sequence of possibly-indefinite matrices using rank-2 updates to solve nonconvex unconstrained optimization problems. For practical reasons, up to now, the initialization has been a constant multiple of the identity matrix. In this paper, we propose a new limited-memory MSS method for large-scale nonconvex optimization that allows for dense initializations. Numerical results on the CUTEst test problems suggest that the MSS method using a dense initialization outperforms the standard initialization. Numerical results also suggest that this approach is competitive with both a basic L-SR1 trust-region method and an L-PSB method.

Submitted 9 August, 2022; v1 submitted 13 July, 2021; originally announced July 2021.

arXiv:2104.11594 [pdf, other]

Dynamic investment portfolio optimization using a Multivariate Merton Model with Correlated Jump Risk

Authors: Bahareh Afhami, Mohsen Rezapour, Mohsen Madadi, Vahed Maroufy

Abstract: In this paper, we are concerned with the optimization of a dynamic investment portfolio in which the securities, which follow a multivariate Merton model with dependent jumps, are periodically invested. We proceed by approximating the Conditional Value-at-Risk (CVaR) by comonotonic bounds and maximizing the expected terminal wealth. Numerical studies as well as applications of our results to real datasets are also provided.

Submitted 22 April, 2021; originally announced April 2021.

MSC Class: C630; C580; C650

arXiv:2104.10240 [pdf, ps, other]

Portfolio Selection under Multivariate Merton Model with Correlated Jump Risk

Authors: Bahareh Afhami, Mohsen Rezapour, Mohsen Madadi, Vahed Maroufy

Abstract: Portfolio selection in the periodic investment of securities modeled by a multivariate Merton model with dependent jumps is considered. The optimization framework is designed to maximize expected terminal wealth when portfolio risk is measured by the Conditional Value-at-Risk ($CVaR$). Solving the portfolio optimization problem by Monte Carlo simulation often requires intensive and time-consuming computation; hence a faster and more efficient portfolio optimization method based on closed-form comonotonic bounds for the risk measure $CVaR$ of the terminal wealth is proposed.

Submitted 20 April, 2021; originally announced April 2021.

MSC Class: C630; C580; C650
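As a rough illustration of the risk measure named in the abstract above, the sketch below estimates CVaR from simulated losses by averaging the worst (1 - alpha) share of samples. This is the plain Monte Carlo estimator that the abstract describes as costly, not the authors' closed-form comonotonic bounds; the function name `empirical_cvar` and the default `alpha` are assumptions made for this example.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Average of the worst (1 - alpha) share of sampled losses.

    Hypothetical helper for illustration only; larger values mean larger losses.
    """
    if not losses:
        raise ValueError("need at least one sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)  # largest losses first
    # Round before taking the ceiling so floating-point noise in
    # n * (1 - alpha) does not add a spurious extra sample to the tail.
    tail_size = max(1, math.ceil(round(len(ordered) * (1.0 - alpha), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)
```

In a simulation study, `losses` would be the negated terminal wealth over many simulated paths, which is the step the comonotonic bounds are meant to avoid.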

arXiv:2004.09058 [pdf, other]

Neural-trust-region algorithm for unconstrained optimization (Part 1)

Authors: Mostafa Rezapour, Thomas Asaki

Abstract: In this paper (part 1), we describe a derivative-free trust-region method for solving unconstrained optimization problems. We will discuss a method when we relax the model order assumption and use artificial neural network techniques to build a computationally relatively inexpensive model. We directly find an estimate of the objective function minimizer without explicitly constructing a model function. Therefore, we need to have the neural-network model derivatives, which can be obtained simply through a back-propagation process.

Submitted 25 May, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

arXiv:2001.01610 [pdf, ps, other]

A new sigmoidal fractional derivative for regularization

Authors: Mostafa Rezapour, Adebowale Sijuwade, Thomas J. Asaki

Abstract: In this paper, we propose a new fractional derivative, which is based on a Caputo-type derivative with a smooth kernel. We show that the proposed fractional derivative reduces to the classical derivative and has a smoothing effect which is compatible with $\ell_{1}$ regularization. Moreover, it satisfies some classical properties.

Submitted 16 March, 2020; v1 submitted 3 January, 2020; originally announced January 2020.

MSC Class: 26A33

arXiv:1912.12810 [pdf, ps, other]

A new Laplace-type fractional derivative

Authors: Mostafa Rezapour, Adebowale Sijuwade

Abstract: In this paper, we present a new derivative via the Laplace transform. The Laplace transform leads to a natural form of the fractional derivative which is equivalent to a Riemann-Liouville derivative with fixed terminal point. We first consider a representation which interacts well with periodic functions, examine some rudimentary properties and propose a generalization. The interest for this new approach arose from recent developments in fractional differential equations involving Caputo-type derivatives and applications in regularization problems.

Submitted 30 January, 2020; v1 submitted 29 December, 2019; originally announced December 2019.

arXiv:1707.04295 [pdf, other]

Approximation Schemes for Clustering with Outliers

Authors: Zachary Friggstad, Kamyar Khodamoradi, Mohsen Rezapour, Mohammad R. Salavatipour

Abstract: Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, $k$-median, and $k$-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons. We study clustering problems with outliers. More specifically, we look at Uncapacitated Facility Location (UFL), $k$-Median, and $k$-Means. In UFL with outliers, we have to open some centres, discard up to $z$ points of $\cal X$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In $k$-Median and $k$-Means, we have to open up to $k$ centres but there are no opening costs. In $k$-Means, the cost of assigning $j$ to $i$ is $δ^2(j,i)$. We present several results. Our main focus is on cases where $δ$ is a doubling metric or is the shortest path metrics of graphs from a minor-closed family of graphs. For uniform-cost UFL with outliers on such metrics we show that a multiswap simple local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the $k$-Median and $k$-Means problems in the same metrics where, for any constant $ε> 0$, we can find a solution using $(1+ε)k$ centres whose cost is at most a $(1+ε)$-factor of the optimum and uses at most $z$ outliers.
We also show that natural local search heuristics that do not violate the number of clusters and outliers for $k$-Median (or $k$-Means) will have unbounded gap even in Euclidean metrics. Furthermore, we show how our analysis can be extended to general metrics for $k$-Means with outliers to obtain a $(25+ε,1+ε)$ bicriteria.

Submitted 13 July, 2017; originally announced July 2017.

arXiv:1703.07272 [pdf, ps, other]

Heavy Tails for an Alternative Stochastic Perpetuity Model

Authors: Thomas Mikosch, Mohsen Rezapour, Olivier Wintenberger

Abstract: In this paper we consider a stochastic model of perpetuity-type. In contrast to the classical affine perpetuity model of Kesten [12] and Goldie [8] all discount factors in the model are mutually independent. We prove that the tails of the distribution of this model are regularly varying both in the univariate and multivariate cases. Due to the additional randomness in the model the tails are not pure power laws as in the Kesten-Goldie setting but involve a logarithmic term.

Submitted 21 March, 2017; originally announced March 2017.

arXiv:1605.02563 [pdf, other]

The eigenvalues of the sample covariance matrix of a multivariate heavy-tailed stochastic volatility model

Authors: Anja Janßen, Thomas Mikosch, Mohsen Rezapour, Xiaolei Xie

Abstract: We consider a multivariate heavy-tailed stochastic volatility model and analyze the large-sample behavior of its sample covariance matrix. We study the limiting behavior of its entries in the infinite-variance case and derive results for the ordered eigenvalues and corresponding eigenvectors. Essentially, we consider two different cases where the tail behavior either stems from the i.i.d. innovations of the process or from its volatility sequence. In both cases, we make use of a large deviations technique for regularly varying time series to derive multivariate $α$-stable limit distributions of the sample covariance matrix. While we show that in the case of heavy-tailed innovations the limiting behavior resembles that of completely independent observations, we also derive that in the case of a heavy-tailed volatility sequence the possible limiting behavior is more diverse, i.e. allowing for dependencies in the limiting distributions which are determined by the structure of the underlying volatility sequence.

Submitted 9 May, 2016; originally announced May 2016.

MSC Class: 60B20 (Primary) 60F05; 60G10; 60G70; 62M10 (Secondary)

arXiv:1603.08976 [pdf, other]

Local Search Yields a PTAS for k-Means in Doubling Metrics

Authors: Zachary Friggstad, Mohsen Rezapour, Mohammad R. Salavatipour

Abstract: The most well known and ubiquitous clustering problem encountered in nearly every branch of science is undoubtedly $k$-means: given a set of data points and a parameter $k$, select $k$ centres and partition the data points into $k$ clusters around these centres so that the sum of squares of distances of the points to their cluster centre is minimized. Typically these data points lie in $\mathbb{R}^d$ for some $d\geq 2$. $k$-means and the first algorithms for it were introduced in the 1950's. Since then, hundreds of papers have studied this problem and many algorithms have been proposed for it. The most commonly used algorithm is known as Lloyd-Forgy, which is also referred to as "the" $k$-means algorithm, and various extensions of it often work very well in practice. However, they may produce solutions whose cost is arbitrarily large compared to the optimum solution. Kanungo et al. [2004] analyzed a simple local search heuristic to get a polynomial-time algorithm with approximation ratio $9+ε$ for any fixed $ε>0$ for $k$-means in Euclidean space. Finding an algorithm with a better approximation guarantee has remained one of the biggest open questions in this area, in particular whether one can get a true PTAS for fixed dimension Euclidean space. We settle this problem by showing that a simple local search algorithm provides a PTAS for $k$-means in $\mathbb{R}^d$ for any fixed $d$.
More precisely, for any error parameter $ε>0$, the local search algorithm that considers swaps of up to $ρ=d^{O(d)}\cdotε^{-O(d/ε)}$ centres at a time finds a solution using exactly $k$ centres whose cost is at most a $(1+ε)$-factor greater than the optimum. Finally, we provide the first demonstration that local search yields a PTAS for the uncapacitated facility location problem and $k$-median with non-uniform opening costs in doubling metrics.

Submitted 9 January, 2017; v1 submitted 29 March, 2016; originally announced March 2016.
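The local search idea in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's algorithm: it swaps only one centre at a time (the abstract allows up to $ρ$ simultaneous swaps), restricts candidate centres to the input points, and starts from the first $k$ points. The names `kmeans_cost` and `single_swap_local_search` are hypothetical.

```python
def kmeans_cost(points, centres):
    """Sum over points of squared Euclidean distance to the nearest centre."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, centre)) for centre in centres)
        for point in points
    )


def single_swap_local_search(points, k):
    """Swap one centre at a time while the k-means cost strictly drops.

    Assumes 1 <= k <= len(points); candidate centres are the input points.
    """
    centres = list(points[:k])  # arbitrary start: the first k points
    best = kmeans_cost(points, centres)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in centres:
                    continue
                # Replace centre i by the candidate and keep the swap
                # only if it strictly lowers the cost.
                trial = centres[:i] + [candidate] + centres[i + 1:]
                cost = kmeans_cost(points, trial)
                if cost < best:
                    centres, best, improved = trial, cost, True
    return centres, best
```

Because each accepted swap strictly lowers the cost and there are finitely many centre sets, the loop terminates at a local optimum; the quality guarantees quoted in the abstract apply to the multi-swap variant, not to this single-swap sketch.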

arXiv:1312.2780 [pdf, ps, other]

doi 10.3150/12-BEJ426

Stochastic volatility models with possible extremal clustering

Authors: Thomas Mikosch, Mohsen Rezapour

Abstract: In this paper we consider a heavy-tailed stochastic volatility model, $X_t=σ_tZ_t$, $t\in\mathbb{Z}$, where the volatility sequence $(σ_t)$ and the i.i.d. noise sequence $(Z_t)$ are assumed independent, $(σ_t)$ is regularly varying with index $α>0$, and the $Z_t$'s have moments of order larger than $α$. In the literature (see Ann. Appl. Probab. 8 (1998) 664-675, J. Appl. Probab. 38A (2001) 93-104, In Handbook of Financial Time Series (2009) 355-364 Springer), it is typically assumed that $(\logσ_t)$ is a Gaussian stationary sequence and the $Z_t$'s are regularly varying with some index $α$ (i.e., $(σ_t)$ has lighter tails than the $Z_t$'s), or that $(Z_t)$ is i.i.d. centered Gaussian. In these cases, we see that the sequence $(X_t)$ does not exhibit extremal clustering. In contrast to this situation, under the conditions of this paper, both situations are possible; $(X_t)$ may or may not have extremal clustering, depending on the clustering behavior of the $σ$-sequence.

Submitted 10 December, 2013; originally announced December 2013.

Comments: Published in at http://dx.doi.org/10.3150/12-BEJ426 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ426

Journal ref: Bernoulli 2013, Vol. 19, No. 5A, 1688-1713

Showing 1–25 of 25 results for author: Rezapour, M