Search | arXiv e-print repository

Alignment at Pre-training! Towards Native Alignment for Arabic LLMs

Authors: Juhao Liang, Zhenyang Cai, Jianqing Zhu, Huang Huang, Kewei Zong, Bang An, Mosen Alharthi, Juncai He, Lian Zhang, Haizhou Li, Benyou Wang, Jinchao Xu

Abstract: The alignment of large language models (LLMs) is critical for developing effective and safe language models. Traditional approaches focus on aligning models during the instruction tuning or reinforcement learning stages, referred to in this paper as `post alignment'. We argue that alignment during the pre-training phase, which we term `native alignment', warrants investigation. Native alignment aims to prevent unaligned content from the beginning, rather than relying on post-hoc processing. This approach leverages extensively aligned pre-training data to enhance the effectiveness and usability of pre-trained models. Our study specifically explores the application of native alignment in the context of Arabic LLMs. We conduct comprehensive experiments and ablation studies to evaluate the impact of native alignment on model performance and alignment stability. Additionally, we release open-source Arabic LLMs that demonstrate state-of-the-art performance on various benchmarks, providing significant benefits to the Arabic LLM community.

Submitted 4 December, 2024; originally announced December 2024.

Comments: Accepted to NeurIPS 2024 main conference. See https://github.com/FreedomIntelligence/AceGPT-v2

arXiv:2407.10240 [pdf]

xLSTMTime : Long-term Time Series Forecasting With xLSTM

Authors: Musleh Alharthi, Ausif Mahmood

Abstract: In recent years, transformer-based models have gained prominence in multivariate long-term time series forecasting (LTSF), demonstrating significant advancements despite facing challenges such as high computational demands, difficulty in capturing temporal dynamics, and managing long-term dependencies. The emergence of LTSF-Linear, with its straightforward linear architecture, has notably outperformed transformer-based counterparts, prompting a reevaluation of the transformer's utility in time series forecasting. In response, this paper presents an adaptation of a recent architecture termed extended LSTM (xLSTM) for LTSF. xLSTM incorporates exponential gating and a revised memory structure with higher capacity that has good potential for LTSF. Our adopted architecture for LTSF, termed xLSTMTime, surpasses current approaches. We compare xLSTMTime's performance against various state-of-the-art models across multiple real-world datasets, demonstrating superior forecasting capabilities. Our findings suggest that refined recurrent architectures can offer competitive alternatives to transformer-based models in LTSF tasks, potentially redefining the landscape of time series forecasting.

Submitted 11 August, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

arXiv:2407.09925 [pdf]

Resilience in PON-based data centre architectures with two-tier cascaded AWGRs

Authors: Mohammed Alharthi, Sanaa H. Mohamed, Taisir E. H. El-Gorashi, Jaafar M. H. Elmirghani

Abstract: This paper investigates the performance of a two-tier AWGR-based Passive Optical Network (PON) data centre architecture against an AWGR-based PON data centre architecture by considering various scenarios involving link failures to evaluate the resilience of both designs. To optimize traffic routing under different failure scenarios, a Mixed Integer Linear Programming (MILP) model is developed and the power consumption and delay performance are assessed. The results demonstrate that the two-tier AWGR architecture reduced the power consumption and the delay compared to the AWGR-based architecture by up to 10% and 61%, respectively.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2305.18937 [pdf]

WDM/TDM over Passive Optical Networks with Cascaded-AWGRs for Data Centers

Authors: Mohammed Alharthi, Sanaa H. Mohamed, Taisir E. H. El-Gorashi, Jaafar M. H. Elmirghani

Abstract: Data centers based on Passive Optical Networks (PONs) can provide high capacity, low cost, scalability, elasticity and high energy-efficiency. This paper introduces the use of WDM-TDM multiple access in a PON-based data center that offers multipath routing via two-tier cascaded Arrayed Waveguide Grating Routers (AWGRs) to improve the utilization of resources. A Mixed Integer Linear Programming (MILP) model is developed to optimize resource allocation while considering multipath routing. The results show that all-to-all connectivity is achieved in the architecture through the use of two different wavelengths within different time slots for the communication between racks in the same or different cells, as well as with the OLT switches.

Submitted 30 May, 2023; originally announced May 2023.

arXiv:2203.12761 [pdf]

Energy-Efficient VM Placement in PON-based Data Center Architectures with Cascaded AWGRs

Authors: Mohammed Alharthi, Sanaa H. Mohamed, Barzan Yosuf, Taisir E. H. El-Gorashi, Jaafar M. H. Elmirghani

Abstract: Data centers based on Passive Optical Networks (PONs) can offer scalability, low cost and high energy-efficiency. Applications in data centers can use Virtual Machines (VMs) to provide efficient utilization of the physical resources. This paper investigates the impact of VM placement on the energy efficiency in a PON-based data center architecture that utilizes cascaded Arrayed Waveguide Grating Routers (AWGRs). In this paper, we develop a Mixed Integer Linear Programming (MILP) optimization model to optimize the VM placement in the proposed PON-based data center architecture. This optimization aims to minimize the power consumption of the networking and computing by placing the VMs and their demands in the optimum number of resources (i.e., servers and networking devices) in the data center. We first minimize the processing power consumption only and then we minimize the processing and networking power consumption. The results show that a reduction in the networking power consumption by up to 75% is achieved when performing joint minimization of processing and networking power consumption compared to considering the minimization of the processing power consumption only.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2112.15561 [pdf, other]

SOK: On the Analysis of Web Browser Security

Authors: Jungwon Lim, Yonghwi Jin, Mansour Alharthi, Xiaokuan Zhang, Jinho Jung, Rajat Gupta, Kuilin Li, Daehee Jang, Taesoo Kim

Abstract: Web browsers are integral parts of everyone's daily life. They are commonly used for security-critical and privacy-sensitive tasks, like banking transactions and checking medical records. Unfortunately, modern web browsers are too complex to be bug free (e.g., 25 million lines of code in Chrome), and their role as an interface to the cyberspace makes them an attractive target for attacks. Accordingly, web browsers naturally become an arena for demonstrating advanced exploitation techniques by attackers and state-of-the-art defenses by browser vendors. Web browsers, arguably, are the most exciting place to learn the latest security issues and techniques, but remain as a black art to most security researchers because of their fast-changing characteristics and complex code bases. To bridge this gap, this paper attempts to systematize the security landscape of modern web browsers by studying the popular classes of security bugs, their exploitation techniques, and deployed defenses. More specifically, we first introduce a unified architecture that faithfully represents the security design of four major web browsers. Second, we share insights from a 10-year longitudinal study on browser bugs. Third, we present a timeline and context of mitigation schemes and their effectiveness. Fourth, we share our lessons from a full-chain exploit used in the 2020 Pwn2Own competition, and the implications of bug bounty programs for web browser security. We believe that the key takeaways from this systematization can shed light on how to advance the status quo of modern web browsers, and, importantly, how to create secure yet complex software in the future.

Submitted 31 December, 2021; originally announced December 2021.

arXiv:2111.01263 [pdf]

Optimized Passive Optical Networks with Cascaded-AWGRs for Data Centers

Authors: Mohammed Alharthi, Sanaa H. Mohamed, Barzan Yosuf, Taisir E. H. El-Gorashi, Jaafar M. H. Elmirghani

Abstract: The use of Passive Optical Networks (PONs) in modern and future data centers can provide energy efficiency, high capacity, low cost, scalability, and elasticity. This paper introduces a passive optical network design with 2-tier cascaded Arrayed Waveguide Grating Routers (AWGRs) to connect groups of racks (i.e. cells) within a data center. This design employs a Software-Defined Networking (SDN) controller to manage the routing and assignment of the networking resource while introducing multiple paths between any two cells to improve routing, load balancing and resilience. We provide benchmarking results for the power consumption to compare the energy efficiency of this design to state-of-the-art data centers. The results indicate that the cascaded AWGRs architecture can achieve up to 43% saving in the networking power consumption compared to Fat-Tree data center architecture.

Submitted 1 November, 2021; originally announced November 2021.

arXiv:1710.04977 [pdf, other]

Bayes factors for partially observed stochastic epidemic models

Authors: Muteb Alharthi, Theodore Kypraios, Philip D. O'Neill

Abstract: We consider the problem of model choice for stochastic epidemic models given partial observation of a disease outbreak through time. Our main focus is on the use of Bayes factors. Although Bayes factors have appeared in the epidemic modelling literature before, they can be hard to compute and little attention has been given to fundamental questions concerning their utility. In this paper we derive analytic expressions for Bayes factors given complete observation through time, which suggest practical guidelines for model choice problems. We extend the power posterior method for computing Bayes factors so as to account for missing data and apply this approach to partially observed epidemics. For comparison, we also explore the use of a deviance information criterion for missing data scenarios. The methods are illustrated via examples involving both simulated and real data.

Submitted 13 October, 2017; originally announced October 2017.

Showing 1–8 of 8 results for author: Alharthi, M