Showing 1–2 of 2 results for author: Alzahrani, E

Search v0.5.6 released 2020-02-24

arXiv:2109.13890 [pdf]

cs.CL

How Different Text-preprocessing Techniques Using The BERT Model Affect The Gender Profiling of Authors

Authors: Esam Alzahrani, Leon Jololian

Abstract: Forensic author profiling plays an important role in indicating possible profiles for suspects. Among the many automated solutions recently proposed for author profiling, transfer learning outperforms many other state-of-the-art techniques in natural language processing. Nevertheless, the sophisticated technique has yet to be fully exploited for author profiling. At the same time, whereas current… ▽ More Forensic author profiling plays an important role in indicating possible profiles for suspects. Among the many automated solutions recently proposed for author profiling, transfer learning outperforms many other state-of-the-art techniques in natural language processing. Nevertheless, the sophisticated technique has yet to be fully exploited for author profiling. At the same time, whereas current methods of author profiling, all largely based on features engineering, have spawned significant variation in each model used, transfer learning usually requires a preprocessed text to be fed into the model. We reviewed multiple references in the literature and determined the most common preprocessing techniques associated with authors' genders profiling. Considering the variations in potential preprocessing techniques, we conducted an experimental study that involved applying five such techniques to measure each technique's effect while using the BERT model, chosen for being one of the most-used stock pretrained models. We used the Hugging face transformer library to implement the code for each preprocessing case. In our five experiments, we found that BERT achieves the best accuracy in predicting the gender of the author when no preprocessing technique is applied. Our best case achieved 86.67% accuracy in predicting the gender of authors. △ Less

Submitted 28 September, 2021; originally announced September 2021.

Journal ref: Advances in Machine Learning 3rd International Conference on Machine Learning & Applications (CMLA 2021), September 25~26, 2021, Toronto, Canada Volume Editors : David C. Wyld, Dhinaharan Nagamalai (Eds) ISBN : 978-1-925953-49-7
arXiv:1910.00796 [pdf, other]

cs.IT math.CO

doi 10.1109/TIT.2023.3247860

Transition Waste Optimization for Coded Elastic Computing

Authors: Hoang Dau, Ryan Gabrys, Yu-Chih Huang, Chen Feng, Quang-Hung Luu, Eidah Alzahrani, Zahir Tari

Abstract: Distributed computing, in which a resource-intensive task is divided into subtasks and distributed among different machines, plays a key role in solving large-scale problems. Coded computing is a recently emerging paradigm where redundancy for distributed computing is introduced to alleviate the impact of slow machines (stragglers) on the completion time. We investigate coded computing solutions o… ▽ More Distributed computing, in which a resource-intensive task is divided into subtasks and distributed among different machines, plays a key role in solving large-scale problems. Coded computing is a recently emerging paradigm where redundancy for distributed computing is introduced to alleviate the impact of slow machines (stragglers) on the completion time. We investigate coded computing solutions over elastic resources, where the set of available machines may change in the middle of the computation. This is motivated by recently available services in the cloud computing industry (e.g., EC2 Spot, Azure Batch) where low-priority virtual machines are offered at a fraction of the price of the on-demand instances but can be preempted on short notice. Our contributions are three-fold. We first introduce a new concept called transition waste that quantifies the number of tasks existing machines must abandon or take over when a machine joins/leaves. We then develop an efficient method to minimize the transition waste for the cyclic task allocation scheme recently proposed in the literature (Yang et al. ISIT'19). Finally, we establish a novel solution based on finite geometry achieving zero transition wastes given that the number of active machines varies within a fixed range. △ Less

Submitted 14 March, 2023; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: 24 pages, accepted by IEEE Transactions on Information Theory

Search v0.5.6 released 2020-02-24