-
Bridging Domain Gaps in Agricultural Image Analysis: A Comprehensive Review From Shallow Adaptation to Deep Learning
Authors:
Xing Hu,
Siyuan Chen,
Xuming Huang,
Qianqian Duan,
LingKun Luo,
Ruijiao Li,
Huiliang Shang,
Linhua Jiang,
Jianping Yang,
Hamid Reza Karimi,
Dawei Zhang
Abstract:
With the growing application of computer vision in agriculture, image analysis has become essential for tasks such as crop health monitoring and pest detection. However, significant domain shifts caused by environmental variations, different crop types, and diverse data acquisition methods hinder model generalization across regions, seasons, and complex agricultural settings. This paper investigates how Domain Adaptation (DA) techniques can address these challenges by improving cross-domain transferability in agricultural image analysis. Given the limited availability of labeled data, weak model adaptability, and dynamic field conditions, DA has emerged as a promising solution. The review systematically summarizes recent advances in DA for agricultural imagery, focusing on applications such as crop health monitoring, pest detection, and fruit recognition, where DA methods have enhanced performance across diverse domains. DA approaches are categorized into shallow and deep learning methods, including supervised, semi-supervised, and unsupervised strategies, with particular attention to adversarial learning-based techniques that have demonstrated strong potential in complex scenarios. In addition, the paper reviews key public agricultural image datasets, evaluating their strengths and limitations in DA research. Overall, this work offers a comprehensive framework and critical insights to guide future research and development of domain adaptation in agricultural vision tasks.
Submitted 20 June, 2025; v1 submitted 6 June, 2025;
originally announced June 2025.
-
Exploring Large Language Models for Climate Forecasting
Authors:
Yang Wang,
Hassan A. Karimi
Abstract:
With the increasing impacts of climate change, there is a growing demand for accessible tools that can provide reliable future climate information to support planning, finance, and other decision-making applications. Large language models (LLMs), such as GPT-4, present a promising approach to bridging the gap between complex climate data and the general public, offering a way for non-specialist users to obtain essential climate insights through natural language interaction. However, an essential challenge remains under-explored: evaluating the ability of LLMs to provide accurate and reliable future climate predictions, which is crucial for applications that rely on anticipating climate trends. In this study, we investigate the capability of GPT-4 in predicting rainfall at short-term (15-day) and long-term (12-month) scales. We designed a series of experiments to assess GPT's performance under different conditions, including scenarios with and without expert data inputs. Our results indicate that GPT, when operating independently, tends to generate conservative forecasts, often reverting to historical averages in the absence of clear trend signals. This study highlights both the potential and challenges of applying LLMs for future climate predictions, providing insights into their integration with climate-related applications and suggesting directions for enhancing their predictive capabilities in the field.
Submitted 20 November, 2024;
originally announced November 2024.
-
Distortion of Multi-Winner Elections on the Line Metric: The Polar Comparison Rule
Authors:
Negar Babashah,
Hasti Karimi,
Masoud Seddighin,
Golnoosh Shahkarami
Abstract:
We study the problem of selecting a committee of size $k$ from a set of $m$ alternatives, based solely on the ordinal preferences of voters. Both voters and alternatives lie on the line metric, and the goal is to minimize a social cost function based on metric distances. While the distances to committee members fully determine the social cost, voting rules only have access to the ordinal preference list of each voter. The distortion of a voting rule is the worst-case ratio between the cost of the selected committee and the cost of the optimal one, over all consistent distance metrics. Extending distortion to multi-winner elections requires defining how a voter's cost is aggregated over the committee. Caragiannis et al. (2022) studied $q$-cost, where the cost is defined as the distance to the voter's $q$th closest committee member. In this work, we focus on the additive cost, where a voter's cost is the sum of their distances to all committee members. The overall social cost is either utilitarian (sum of individual costs) or egalitarian (maximum individual cost).
We introduce a new voting rule, the Polar Comparison Rule, and analyze its distortion for the utilitarian additive cost. We show that it achieves a distortion of roughly $7/3$ for any committee size $k$. More specifically, for $k = 2$ and $k = 3$, we establish tight bounds of $1 + \sqrt{2} \approx 2.41$ and $7/3 \approx 2.33$, respectively. Moreover, we provide lower bounds that depend on the parity of $k$, and analyze both small and large committee sizes. Finally, we study the egalitarian additive cost and analyze the distortion bounds in multi-winner elections.
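The quantities named in this abstract can be written out as follows. This is a sketch in assumed notation: the symbols $V$ (voters), $S$ (a committee of size $k$), $d$ (the line metric), $\sigma$ (the ordinal preference profile), $f$ (a voting rule), and $d \rhd \sigma$ (the metric $d$ is consistent with $\sigma$) are illustrative choices, not taken from the paper.

$$\mathrm{cost}_i(S) = \sum_{c \in S} d(i, c) \qquad \text{(additive cost of voter } i\text{)}$$

$$\mathrm{SC}_{\mathrm{util}}(S) = \sum_{i \in V} \mathrm{cost}_i(S), \qquad \mathrm{SC}_{\mathrm{egal}}(S) = \max_{i \in V} \mathrm{cost}_i(S)$$

$$\mathrm{dist}(f) = \sup_{\sigma} \; \sup_{d \,\rhd\, \sigma} \; \frac{\mathrm{SC}\big(f(\sigma)\big)}{\min_{|S| = k} \mathrm{SC}(S)}$$

Under this reading, the stated bounds of $1 + \sqrt{2}$ for $k = 2$ and $7/3$ for $k = 3$ are values of $\mathrm{dist}(f)$ with $\mathrm{SC} = \mathrm{SC}_{\mathrm{util}}$.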
Submitted 2 June, 2025; v1 submitted 20 November, 2024;
originally announced November 2024.
-
BTS: A Comprehensive Benchmark for Tie Strength Prediction
Authors:
Xueqi Cheng,
Catherine Yang,
Yuying Zhao,
Yu Wang,
Hamid Karimi,
Tyler Derr
Abstract:
The rapid rise of online social networks underscores the need to understand the heterogeneous strengths of online relationships. Yet, efforts to assess tie strength (TS) are hindered by the lack of ground-truth labels, differing research perspectives, and limited model performance in real-world settings. To address this gap, we introduce BTS, a comprehensive Benchmark for Tie Strength prediction, aiming to establish a standardized foundation for evaluating and advancing TS prediction methodologies. Specifically, our contributions are: TS Pseudo-Label Techniques -- we categorize TS into seven standardized pseudo-labeling techniques based on prior literature; TS Dataset Collection -- we present a representative collection of three social networks and perform data analysis by investigating the class distributions and correlations across the generated pseudo-labels; TS Pseudo-Label Evaluation Framework -- we propose a standardized framework to evaluate the pseudo-label quality from the perspective of tie resilience; Benchmarking -- we evaluate existing tie strength prediction model performance using the BTS dataset collection, exploring the effects of different experiment settings, models, and evaluation criteria on the results. Furthermore, we derive key insights to enhance existing methods and shed light on promising directions for future research in this domain. The BTS dataset collection, along with the curation code and experimental scripts, is available at: https://github.com/XueqiC/Awesome-Tie-Strength-Prediction.
Submitted 7 June, 2025; v1 submitted 24 October, 2024;
originally announced October 2024.
-
CNN-based Labelled Crack Detection for Image Annotation
Authors:
Mohsen Asghari Ilani,
Leila Amini,
Hossein Karimi,
Maryam Shavali Kuhshuri
Abstract:
Numerous image processing techniques (IPTs) have been employed to detect crack defects, offering an alternative to human-conducted onsite inspections. These IPTs manipulate images to extract defect features, particularly cracks in surfaces produced through Additive Manufacturing (AM). This article presents a vision-based approach that utilizes deep convolutional neural networks (CNNs) for crack detection in AM surfaces. Traditional image processing techniques face challenges with diverse real-world scenarios and varying crack types. To overcome these challenges, our proposed method leverages CNNs, eliminating the need for extensive feature extraction. Annotation for CNN training is facilitated by LabelImg without the requirement for additional IPTs. The trained CNN, enhanced by OpenCV preprocessing techniques, achieves an outstanding 99.54% accuracy on a dataset of 14,982 annotated images with resolutions of 1536 x 1103 pixels. Evaluation metrics exceeding 96% precision, 98% recall, and a 97% F1-score highlight the precision and effectiveness of the entire process.
Submitted 4 December, 2024; v1 submitted 20 August, 2024;
originally announced August 2024.
-
Evidential Uncertainty Sets in Deep Classifiers Using Conformal Prediction
Authors:
Hamed Karimi,
Reza Samavi
Abstract:
In this paper, we propose the Evidential Conformal Prediction (ECP) method for image classifiers to generate conformal prediction sets. Our method is designed based on a non-conformity score function that has its roots in Evidential Deep Learning (EDL) as a method of quantifying model (epistemic) uncertainty in DNN classifiers. We use evidence derived from the logit values of target labels to compute the components of our non-conformity score function: the heuristic notion of uncertainty in CP, uncertainty surprisal, and expected utility. Our extensive experimental evaluation demonstrates that ECP outperforms three state-of-the-art methods for generating CP sets, in terms of their set sizes and adaptivity while maintaining the coverage of true labels.
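For readers unfamiliar with conformal prediction sets, the sketch below shows generic split conformal prediction with the common non-conformity score of one minus the softmax probability of the true class. It is not the ECP score described in this abstract, whose evidential components are not specified here; the function names, the `alpha` parameter, and the toy numbers are all assumptions made for illustration.

```python
import math
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold on held-out data (split conformal)."""
    n = len(cal_labels)
    # Non-conformity score: 1 - predicted probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)), capped at n.
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return np.sort(scores)[k - 1]

def prediction_set(probs, threshold):
    """Return every class whose score does not exceed the threshold."""
    return [int(c) for c in np.flatnonzero(1.0 - probs <= threshold)]

# Toy calibration set: 4 samples, 3 classes (made-up numbers).
cal_probs = np.array([
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.3, 0.1, 0.6],
    [0.7, 0.2, 0.1],
])
cal_labels = np.array([0, 1, 2, 0])
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
pred = prediction_set(np.array([0.7, 0.25, 0.05]), threshold)
```

A method such as ECP would swap in a different score function while keeping the same calibrate-then-threshold structure; set size and adaptivity then depend on how well the score ranks the true label.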
Submitted 30 July, 2024; v1 submitted 15 June, 2024;
originally announced June 2024.
-
Assessing the Promise and Pitfalls of ChatGPT for Automated Code Generation
Authors:
Muhammad Fawad Akbar Khan,
Max Ramsdell,
Erik Falor,
Hamid Karimi
Abstract:
This paper presents a comprehensive evaluation of the code generation capabilities of ChatGPT, a prominent large language model, compared to human programmers. A novel dataset of 131 code-generation prompts across 5 categories was curated to enable robust analysis. Code solutions were generated by both ChatGPT and humans for all prompts, resulting in 262 code samples. A meticulous manual assessment methodology prioritized evaluating correctness, comprehensibility, and security using 14 established code quality metrics. The key findings reveal ChatGPT's strengths in crafting concise, efficient code with advanced constructs, showcasing strengths in data analysis tasks (93.1% accuracy) but limitations in visual-graphical challenges. Comparative analysis with human code highlights ChatGPT's inclination towards modular design and superior error handling. Additionally, machine learning models effectively distinguished ChatGPT from human code with up to 88% accuracy, suggesting detectable coding style disparities. By providing profound insights into ChatGPT's code generation capabilities and limitations through quantitative metrics and qualitative analysis, this study makes valuable contributions toward advancing AI-based programming assistants. The curated dataset and methodology offer a robust foundation for future research in this nascent domain. All data and codes are available on https://github.com/DSAatUSU/ChatGPT-promises-and-pitfalls.
Submitted 5 November, 2023;
originally announced November 2023.
-
Enhancing the Performance of Automated Grade Prediction in MOOC using Graph Representation Learning
Authors:
Soheila Farokhi,
Aswani Yaramala,
Jiangtao Huang,
Muhammad F. A. Khan,
Xiaojun Qi,
Hamid Karimi
Abstract:
In recent years, Massive Open Online Courses (MOOCs) have gained significant traction as a rapidly growing phenomenon in online learning. Unlike traditional classrooms, MOOCs offer a unique opportunity to cater to a diverse audience from different backgrounds and geographical locations. Renowned universities and MOOC-specific providers, such as Coursera, offer MOOC courses on various subjects. Automated assessment tasks like grade and early dropout predictions are necessary due to the high enrollment and limited direct interaction between teachers and learners. However, current automated assessment approaches overlook the structural links between different entities involved in the downstream tasks, such as the students and courses. Our hypothesis suggests that these structural relationships, manifested through an interaction graph, contain valuable information that can enhance the performance of the task at hand. To validate this, we construct a unique knowledge graph for a large MOOC dataset, which will be publicly available to the research community. Furthermore, we utilize graph embedding techniques to extract latent structural information encoded in the interactions between entities in the dataset. These techniques do not require ground truth labels and can be utilized for various tasks. Finally, by combining entity-specific features, behavioral features, and extracted structural features, we enhance the performance of predictive machine learning models in student assignment grade prediction. Our experiments demonstrate that structural features can significantly improve the predictive performance of downstream assessment tasks. The code and data are available at https://github.com/DSAatUSU/MOOPer_grade_prediction
Submitted 18 October, 2023;
originally announced October 2023.
-
A novel asymmetrical autoencoder with a sparsifying discrete cosine Stockwell transform layer for gearbox sensor data compression
Authors:
Xin Zhu,
Daoguang Yang,
Hongyi Pan,
Hamid Reza Karimi,
Didem Ozevin,
Ahmet Enis Cetin
Abstract:
The lack of an efficient compression model remains a challenge for the wireless transmission of gearbox data in non-contact gear fault diagnosis problems. In this paper, we present a signal-adaptive asymmetrical autoencoder with a transform domain layer to compress sensor signals. First, a new discrete cosine Stockwell transform (DCST) layer is introduced to replace linear layers in a multi-layer autoencoder. A trainable filter is implemented in the DCST domain by utilizing the multiplication property of the convolution. A trainable hard-thresholding layer is applied to reduce redundant data in the DCST layer to make the feature map sparse. In comparison to the linear layer, the DCST layer reduces the number of trainable parameters and improves the accuracy of data reconstruction. Second, training the autoencoder with a sparsifying DCST layer only requires a small number of datasets. The proposed method is superior to other autoencoder-based methods on the University of Connecticut (UoC) and Southeast University (SEU) gearbox datasets, as the average quality score is improved by 2.00% at the lowest and 32.35% at the highest with a limited number of training samples.
Submitted 4 October, 2023;
originally announced October 2023.
-
SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and its Departure from Current Machine Learning
Authors:
Kiana Kheiri,
Hamid Karimi
Abstract:
This study presents a thorough examination of various Generative Pretrained Transformer (GPT) methodologies in sentiment analysis, specifically in the context of Task 4 on the SemEval 2017 dataset. Three primary strategies are employed: 1) prompt engineering using the advanced GPT-3.5 Turbo, 2) fine-tuning GPT models, and 3) an inventive approach to embedding classification. The research yields detailed comparative insights among these strategies and individual GPT models, revealing their unique strengths and potential limitations. Additionally, the study compares these GPT-based methodologies with other current, high-performing models previously used with the same dataset. The results illustrate the significant superiority of the GPT approaches in terms of predictive performance, with an improvement of more than 22% in F1-score over the state of the art. Further, the paper sheds light on common challenges in sentiment analysis tasks, such as understanding context and detecting sarcasm. It underscores the enhanced capabilities of the GPT models to effectively handle these complexities. Taken together, these findings highlight the promising potential of GPT models in sentiment analysis, setting the stage for future research in this field. The code can be found at https://github.com/DSAatUSU/SentimentGPT
Submitted 23 July, 2023; v1 submitted 16 July, 2023;
originally announced July 2023.
-
Quantifying Deep Learning Model Uncertainty in Conformal Prediction
Authors:
Hamed Karimi,
Reza Samavi
Abstract:
Precise estimation of predictive uncertainty in deep neural networks is a critical requirement for reliable decision-making in machine learning and statistical modeling, particularly in the context of medical AI. Conformal Prediction (CP) has emerged as a promising framework for representing the model uncertainty by providing well-calibrated confidence levels for individual predictions. However, the quantification of model uncertainty in conformal prediction remains an active research area, yet to be fully addressed. In this paper, we explore state-of-the-art CP methodologies and their theoretical foundations. We propose a probabilistic approach to quantifying the model uncertainty derived from the produced prediction sets in conformal prediction and provide certified boundaries for the computed uncertainty. By doing so, we allow model uncertainty measured by CP to be compared with other uncertainty quantification methods such as Bayesian (e.g., MC-Dropout and DeepEnsemble) and Evidential approaches.
Submitted 3 January, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
A Persian Benchmark for Joint Intent Detection and Slot Filling
Authors:
Masoud Akbari,
Amir Hossein Karimi,
Tayyebeh Saeedi,
Zeinab Saeidi,
Kiana Ghezelbash,
Fatemeh Shamsezat,
Mohammad Akbari,
Ali Mohades
Abstract:
Natural Language Understanding (NLU) is important in today's technology as it enables machines to comprehend and process human language, leading to improved human-computer interactions and advancements in fields such as virtual assistants, chatbots, and language-based AI systems. This paper highlights the significance of advancing the field of NLU for low-resource languages. With intent detection and slot filling being crucial tasks in NLU, the widely used datasets ATIS and SNIPS have been utilized in the past. However, these datasets only cater to the English language and do not support other languages. In this work, we aim to address this gap by creating a Persian benchmark for joint intent detection and slot filling based on the ATIS dataset. To evaluate the effectiveness of our benchmark, we employ state-of-the-art methods for intent detection and slot filling.
Submitted 1 March, 2023;
originally announced March 2023.
-
The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification
Authors:
Haochen Liu,
Wei Jin,
Hamid Karimi,
Zitao Liu,
Jiliang Tang
Abstract:
It is evident that deep text classification models trained on human data could be biased. In particular, they produce biased outcomes for texts that explicitly include identity terms of certain demographic groups. We refer to this type of bias as explicit bias, which has been extensively studied. However, deep text classification models can also produce biased outcomes for texts written by authors of certain demographic groups. We refer to such bias as implicit bias of which we still have a rather limited understanding. In this paper, we first demonstrate that implicit bias exists in different text classification tasks for different demographic groups. Then, we build a learning-based interpretation method to deepen our knowledge of implicit bias. Specifically, we verify that classifiers learn to make predictions based on language features that are related to the demographic attributes of the authors. Next, we propose a framework Debiased-TC to train deep text classifiers to make predictions on the right features and consequently mitigate implicit bias. We conduct extensive experiments on three real-world datasets. The results show that the text classification models trained under our proposed framework outperform traditional models significantly in terms of fairness, and also slightly in terms of classification performance.
Submitted 6 May, 2021;
originally announced May 2021.
-
Road to the White House: Analyzing the Relations Between Mainstream and Social Media During the U.S. Presidential Primaries
Authors:
Aaron Brookhouse,
Tyler Derr,
Hamid Karimi,
H. Russell Bernard,
Jiliang Tang
Abstract:
Information is crucial to the function of a democratic society where well-informed citizens can make rational political decisions. While in the past political entities were primarily utilizing newspaper and later television to inform the public, with the rise of the Internet and online social media, the political arena has transformed into a more complex structure. Now, more than ever, people express themselves online while mainstream news agencies attempt to seize the power of the Internet to spread their agenda. To grasp the political coexistence of mainstream media and online social media, in this paper, we perform an analysis between these two sources of information in the context of the U.S. 2020 presidential election. In particular, we collect data during the 2020 Democratic Party presidential primaries pertaining to the candidates and by analyzing this data, we highlight similarities and differences between these two main types of sources, detect the potential impact they have on each other, and understand how this impact relationship can change over time. To supplement these two main sources and to establish a baseline, we also include Google Trends search results and Polling results for each of the candidates that are being analyzed.
Submitted 19 September, 2020;
originally announced September 2020.
-
Characterizing the Decision Boundary of Deep Neural Networks
Authors:
Hamid Karimi,
Tyler Derr,
Jiliang Tang
Abstract:
Deep neural networks, and in particular deep neural classifiers, have become an integral part of many modern applications. Despite their practical success, we still have limited knowledge of how they work, and the demand for such an understanding is ever-growing. In this regard, one crucial aspect of deep neural network classifiers that can help us deepen our knowledge about their decision-making behavior is to investigate their decision boundaries. Nevertheless, this is contingent upon having access to samples populating the areas near the decision boundary. To achieve this, we propose a novel approach we call Deep Decision boundary Instance Generation (DeepDIG). DeepDIG utilizes a method based on adversarial example generation as an effective way of generating samples near the decision boundary of any deep neural network model. Then, we introduce a set of important principled characteristics that take advantage of the generated instances near the decision boundary to provide multifaceted understandings of deep neural networks. We have performed extensive experiments on multiple representative datasets across various deep neural network models and characterized their decision boundaries. The code is publicly available at https://github.com/hamidkarimi/DeepDIG/.
Submitted 3 June, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Learning Hierarchical Discourse-level Structure for Fake News Detection
Authors:
Hamid Karimi,
Jiliang Tang
Abstract:
On the one hand, nowadays, fake news articles are easily propagated through various online media platforms and have become a grand threat to the trustworthiness of information. On the other hand, our understanding of the language of fake news is still minimal. Incorporating hierarchical discourse-level structure of fake and real news articles is one crucial step toward a better understanding of how these articles are structured. Nevertheless, this has rarely been investigated in the fake news detection domain and faces tremendous challenges. First, existing methods for capturing discourse-level structure rely on annotated corpora which are not available for fake news datasets. Second, how to extract useful information from such discovered structures is another challenge. To address these challenges, we propose Hierarchical Discourse-level Structure for Fake news detection (HDSF). HDSF learns and constructs a discourse-level structure for fake/real news articles in an automated and data-driven manner. Moreover, we identify insightful structure-related properties, which can explain the discovered structures and boost our understanding of fake news. Conducted experiments show the effectiveness of the proposed approach. Further structural analysis suggests that real and fake news present substantial differences in the hierarchical discourse-level structures.
Submitted 10 April, 2019; v1 submitted 26 February, 2019;
originally announced March 2019.
-
Deep Adversarial Network Alignment
Authors:
Tyler Derr,
Hamid Karimi,
Xiaorui Liu,
Jiejun Xu,
Jiliang Tang
Abstract:
Network alignment, in general, seeks to discover the hidden underlying correspondence between nodes across two (or more) networks when given their network structure. However, most existing network alignment methods have added assumptions of additional constraints to guide the alignment, such as having a set of seed node-node correspondences across the networks or the existence of side-information. Instead, we seek to develop a general network alignment algorithm that makes no additional assumptions. Recently, network embedding has proven effective in many network analysis tasks, but embeddings of different networks are not aligned. Thus, we present our Deep Adversarial Network Alignment (DANA) framework that first uses deep adversarial learning to discover complex mappings for aligning the embedding distributions of the two networks. Then, using our learned mapping functions, DANA performs an efficient nearest neighbor node alignment. We perform experiments on real world datasets to show the effectiveness of our framework for first aligning the graph embedding distributions and then discovering node alignments that outperform existing methods.
Submitted 26 February, 2019;
originally announced February 2019.
-
An Artificial Neural Network for Gait Analysis to Estimate Blood Alcohol Content Level
Authors:
Pedram Gharani,
Brian Suffoletto,
Tammy Chung,
Hassan Karimi
Abstract:
Impairments in gait occur after alcohol consumption, and, if detected in real-time, could guide the delivery of "just-in-time" injury prevention interventions. We aimed to identify the salient features of gait that could be used for estimating blood alcohol content (BAC) level in a typical drinking environment. We recruited 10 young adults with a history of heavy drinking to test our research app. During four consecutive Fridays and Saturdays, every hour from 8pm to 12am, they were prompted to use the app to report alcohol consumption and complete a 5-step straight-line walking task, during which 3-axis acceleration and angular velocity data was sampled at a frequency of 100 Hz. BAC for each subject was calculated. From sensor signals, 24 features were calculated using a sliding window technique, including energy, mean, and standard deviation. Using an artificial neural network (ANN), we performed regression analysis to define a model determining the association between gait features and BACs. 70% of the data was used as a training dataset, and the results were tested and validated using the remaining samples. We evaluated different training algorithms for the neural network, and the results showed that a Bayesian regularization neural network (BRNN) was the most efficient and accurate. Analyses support the use of the tandem gait task paired with our approach to reliably estimate BAC based on gait features. Results from this work could be useful in designing effective prevention interventions to reduce risky behaviors during periods of alcohol consumption.
Submitted 14 December, 2017; v1 submitted 2 December, 2017;
originally announced December 2017.
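The sliding-window feature extraction mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `window_features`, the window length, the step size, and the definition of energy as the mean squared sample value are all assumptions, since the abstract does not specify them.

```python
import math

def window_features(samples, window, step):
    """Compute mean, standard deviation, and energy for each sliding window.

    `samples` is one sensor axis as a list of floats. `window` and `step`
    are given in samples and are hypothetical parameters chosen by the caller.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    features = []
    # Slide over the signal; windows that would run past the end are dropped.
    for start in range(0, len(samples) - window + 1, step):
        seg = samples[start:start + window]
        mean = sum(seg) / window
        variance = sum((x - mean) ** 2 for x in seg) / window  # population variance
        energy = sum(x * x for x in seg) / window  # assumed definition: mean square
        features.append({"mean": mean, "std": math.sqrt(variance), "energy": energy})
    return features

if __name__ == "__main__":
    # Toy signal; at 100 Hz a real recording would be far longer.
    for row in window_features([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], window=4, step=2):
        print(row)
```

Applying such a function to each accelerometer and gyroscope axis yields one feature vector per window, which could then be fed to a regression model.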
-
Using Phone Sensors and an Artificial Neural Network to Detect Gait Changes During Drinking Episodes in the Natural Environment
Authors:
Brian Suffoletto,
Pedram Gharani,
Tammy Chung,
Hassan Karimi
Abstract:
Phone sensors could be useful in assessing changes in gait that occur with alcohol consumption. This study determined (1) feasibility of collecting gait-related data during drinking occasions in the natural environment, and (2) how gait-related features measured by phone sensors relate to estimated blood alcohol concentration (eBAC). Ten young adult heavy drinkers were prompted to complete a 5-step gait task every hour from 8pm to 12am over four consecutive weekends. We collected 3-axis accelerometer, gyroscope, and magnetometer data from phone sensors, and computed 24 gait-related features using a sliding window technique. eBAC levels were calculated at each time point based on Ecological Momentary Assessment (EMA) of alcohol use. We used an artificial neural network model to analyze associations between sensor features and eBACs in training (70% of the data) and validation and test (30% of the data) datasets. We analyzed 128 data points where both eBAC and gait-related sensor data was captured, either when not drinking (n=60), while eBAC was ascending (n=55) or eBAC was descending (n=13). 21 data points were captured at times when the eBAC was greater than the legal limit (0.08 mg/dl). Using a Bayesian regularized neural network, gait-related phone sensor features showed a high correlation with eBAC (Pearson's r > 0.9), and >95% of estimated eBAC would fall between -0.012 and +0.012 of actual eBAC. It is feasible to collect gait-related data from smartphone sensors during drinking occasions in the natural environment. Sensor-based features can be used to infer gait changes associated with elevated blood alcohol content.
Submitted 13 November, 2017; v1 submitted 9 November, 2017;
originally announced November 2017.
-
Synthesizing Deep Neural Network Architectures using Biological Synaptic Strength Distributions
Authors:
A. H. Karimi,
M. J. Shafiee,
A. Ghodsi,
A. Wong
Abstract:
In this work, we perform an exploratory study on synthesizing deep neural networks using biological synaptic strength distributions, and the potential influence of different distributions on modelling performance, particularly for the scenario associated with small data sets. Surprisingly, a CNN with convolutional layer synaptic strengths drawn from biologically-inspired distributions such as log-normal or correlated center-surround distributions performed relatively well, suggesting a possibility for designing deep neural network architectures that do not require many data samples to learn, and can sidestep current training procedures while maintaining or boosting modelling performance.
Submitted 30 June, 2017;
originally announced July 2017.
-
Effective optimization using sample persistence: A case study on quantum annealers and various Monte Carlo optimization methods
Authors:
Hamed Karimi,
Gili Rosenberg,
Helmut G. Katzgraber
Abstract:
We present and apply a general-purpose, multi-start algorithm for improving the performance of low-energy samplers used for solving optimization problems. The algorithm iteratively fixes the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are smaller and less connected, and samplers tend to give better low-energy samples for these problems. The algorithm is trivially parallelizable, since each start in the multi-start algorithm is independent, and could be applied to any heuristic solver that can be run multiple times to give a sample. We present results for several classes of hard problems solved using simulated annealing, path-integral quantum Monte Carlo, parallel tempering with isoenergetic cluster moves, and a quantum annealer, and show that the success metrics as well as the scaling are improved substantially. When combined with this algorithm, the quantum annealer's scaling was substantially improved for native Chimera graph problems. In addition, with this algorithm the scaling of the time to solution of the quantum annealer is comparable to the Hamze–de Freitas–Selby algorithm on the weak-strong cluster problems introduced by Boixo et al. Parallel tempering with isoenergetic cluster moves was able to consistently solve 3D spin glass problems with 8000 variables when combined with our method, whereas without our method it could not solve any.
Submitted 27 October, 2017; v1 submitted 23 June, 2017;
originally announced June 2017.
-
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Authors:
Hamed Karimi,
Julie Nutini,
Mark Schmidt
Abstract:
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the main conditions that have been explored to show linear convergence rates without strong convexity over the last 25 years. We also use the PL inequality to give new analyses of randomized and greedy coordinate descent methods, sign-based gradient descent methods, and stochastic gradient methods in the classic setting (with decreasing or constant step-sizes) as well as the variance-reduced setting. We further propose a generalization that applies to proximal-gradient methods for non-smooth optimization, leading to simple proofs of linear convergence of these methods. Along the way, we give simple convergence results for a wide variety of problems in machine learning: least squares, logistic regression, boosting, resilient backpropagation, L1-regularization, support vector machines, stochastic dual coordinate ascent, and stochastic variance-reduced gradient methods.
Submitted 12 September, 2020; v1 submitted 16 August, 2016;
originally announced August 2016.
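For reference, the condition named in the abstract above is usually written as follows. This is the standard textbook statement rather than a quotation from the listing: the symbols \(L\) (Lipschitz constant of the gradient), \(\mu\) (PL constant), and \(f^*\) (optimal value) are notation assumed here.

```latex
% Assumptions: f has an L-Lipschitz continuous gradient and a nonempty
% solution set with optimal value f^*.
% Polyak-Lojasiewicz (PL) inequality, for some \mu > 0 and all x:
\frac{1}{2}\,\bigl\|\nabla f(x)\bigr\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr)

% Gradient descent with step size 1/L,
%   x_{k+1} = x_k - \frac{1}{L}\,\nabla f(x_k),
% then converges linearly in function value:
f(x_k) - f^{*} \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_0) - f^{*}\bigr)
```

Strong convexity with parameter \(\mu\) implies the PL inequality, but the inequality can also hold for functions that are not convex, which is why it serves as a weaker sufficient condition.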
-
Boosting quantum annealer performance via sample persistence
Authors:
Hamed Karimi,
Gili Rosenberg
Abstract:
We propose a novel method for reducing the number of variables in quadratic unconstrained binary optimization problems, using a quantum annealer (or any sampler) to fix the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are usually much easier for the quantum annealer to solve, due to their being smaller and consisting of disconnected components. This approach significantly increases the success rate and number of observations of the best known energy value in samples obtained from the quantum annealer, when compared with calling the quantum annealer without the method, even when using fewer annealing cycles. Use of the method results in a considerable improvement in success metrics even for problems with high-precision couplers and biases, which are more challenging for the quantum annealer to solve. The results are further enhanced by applying the method iteratively and combining it with classical pre-processing. We present results for both Chimera graph-structured problems and embedded problems from a real-world application.
Submitted 18 May, 2017; v1 submitted 24 June, 2016;
originally announced June 2016.
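The variable-fixing idea described in the abstract above can be illustrated with a small sketch. It is a simplified stand-in, not the authors' algorithm: the rule used here (fix a variable when every sample in the lowest-energy "elite" fraction agrees on its value), the `elite_fraction` parameter, and all function names are assumptions made for illustration. The QUBO is stored as a dictionary mapping index pairs `(i, j)` to coefficients.

```python
def qubo_energy(Q, x):
    """Energy of binary assignment x under QUBO coefficients Q[(i, j)]."""
    return sum(coef * x[i] * x[j] for (i, j), coef in Q.items())

def persistent_variables(samples, energies, elite_fraction=0.5):
    """Return {index: value} for variables that agree across all elite samples.

    Unanimity among the lowest-energy samples is an assumed proxy for
    "high probability of being optimal".
    """
    order = sorted(range(len(samples)), key=lambda k: energies[k])
    n_elite = max(1, int(len(samples) * elite_fraction))
    elite = [samples[k] for k in order[:n_elite]]
    fixed = {}
    for i in range(len(elite[0])):
        values = {s[i] for s in elite}
        if len(values) == 1:
            fixed[i] = values.pop()
    return fixed

def reduce_qubo(Q, fixed):
    """Substitute fixed values; return (reduced QUBO, constant energy offset).

    Free variables keep their original indices. A coupling to a variable
    fixed at 1 becomes a linear (diagonal) term on the free variable.
    """
    reduced, offset = {}, 0.0
    for (i, j), coef in Q.items():
        if i in fixed and j in fixed:
            offset += coef * fixed[i] * fixed[j]
        elif i in fixed:
            if fixed[i]:
                reduced[(j, j)] = reduced.get((j, j), 0.0) + coef
        elif j in fixed:
            if fixed[j]:
                reduced[(i, i)] = reduced.get((i, i), 0.0) + coef
        else:
            reduced[(i, j)] = reduced.get((i, j), 0.0) + coef
    return reduced, offset

if __name__ == "__main__":
    Q = {(0, 0): -1, (1, 1): 2, (0, 1): 1, (2, 2): -1, (1, 2): -3}
    # In practice these samples would come from an annealer or another sampler.
    samples = [(1, 0, 1), (1, 1, 1), (1, 0, 0), (0, 0, 0)]
    energies = [qubo_energy(Q, s) for s in samples]
    fixed = persistent_variables(samples, energies)
    print(fixed, reduce_qubo(Q, fixed))
```

The reduced problem would then be passed back to the sampler, and the fixed values reinserted into its solutions; repeating the loop gives the iterative variant the abstract mentions.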