-
Multi-Programming Language Ensemble for Code Generation in Large Language Model
Authors:
Tengfei Xue,
Xuefeng Li,
Tahir Azim,
Roman Smirnov,
Jianhui Yu,
Arash Sadrieh,
Babak Pahlavan
Abstract:
Large language models (LLMs) have significantly improved code generation, particularly in one-pass code generation. However, most existing approaches focus solely on generating code in a single programming language, overlooking the potential of leveraging the multi-language capabilities of LLMs. LLMs have varying patterns of errors across different languages, suggesting that a more robust approach could be developed by leveraging these multi-language outputs. In this study, we propose Multi-Programming Language Ensemble (MPLE), a novel ensemble-based method that utilizes code generation across multiple programming languages to enhance overall performance. By treating each language-specific code generation process as an individual "weak expert" and effectively integrating their outputs, our method mitigates language-specific errors and biases. This multi-language ensemble strategy leverages the complementary strengths of different programming languages, enabling the model to produce more accurate and robust code. Our approach can be seamlessly integrated with commonly used techniques such as the reflection algorithm and Monte Carlo tree search to improve code generation quality further. Experimental results show that our framework consistently enhances baseline performance by up to 17.92% on existing benchmarks (HumanEval and HumanEval-plus), with a standout result of 96.25% accuracy on the HumanEval benchmark, achieving new state-of-the-art results across various LLM models. The code will be released at https://github.com/NinjaTech-AI/MPLE
Submitted 6 September, 2024;
originally announced September 2024.
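The "weak expert" idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how outputs are integrated, so the sketch assumes the simplest possible rule (try each language-specific expert in turn and keep the first candidate that passes the visible tests). The names `ensemble_generate`, `python_expert`, `java_expert`, and `passes_tests` are hypothetical, and the experts are toy stand-ins for LLM calls.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical types (assumptions, not from the paper): a generator maps a
# task description to candidate source code; a checker reports whether the
# candidate passes the visible tests.
Generator = Callable[[str], str]
Checker = Callable[[str], bool]


def ensemble_generate(
    task: str,
    experts: Iterable[Tuple[str, Generator]],
    passes_tests: Checker,
) -> Optional[Tuple[str, str]]:
    """Try each language-specific expert in order.

    Returns the first (language, candidate) pair whose candidate passes
    the tests, or None if every expert fails.
    """
    for language, generate in experts:
        candidate = generate(task)
        if passes_tests(candidate):
            return language, candidate
    return None


# Toy stand-ins for LLM calls. Both return Python source so the checker can
# run it; the second one represents a solution drafted via another language
# and translated back (an assumption made for this sketch).
def python_expert(task: str) -> str:
    return "def add(a, b): return a - b"  # deliberately buggy


def java_expert(task: str) -> str:
    return "def add(a, b): return a + b"


def passes_tests(candidate: str) -> bool:
    namespace: dict = {}
    exec(candidate, namespace)  # acceptable for a toy example only
    return namespace["add"](2, 3) == 5


result = ensemble_generate(
    "add two numbers",
    [("python", python_expert), ("java", java_expert)],
    passes_tests,
)
```

Here the first expert's candidate fails the test, so the ensemble falls through to the second expert. A real system would replace the toy experts with model calls and could combine the loop with reflection or tree search, as the abstract mentions.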
-
NinjaLLM: Fast, Scalable and Cost-effective RAG using Amazon SageMaker and AWS Trainium and Inferentia2
Authors:
Tengfei Xue,
Xuefeng Li,
Roman Smirnov,
Tahir Azim,
Arash Sadrieh,
Babak Pahlavan
Abstract:
Retrieval-augmented generation (RAG) techniques are widely used today to retrieve and present information in a conversational format. This paper presents a set of enhancements to traditional RAG techniques, focusing on large language models (LLMs) fine-tuned and hosted on AWS Trainium and Inferentia2 AI chips via SageMaker. These chips are characterized by their elasticity, affordability, and efficient performance for AI compute tasks. Besides enabling deployment on these chips, this work aims to improve tool usage, add citation capabilities, and mitigate the risks of hallucinations and unsafe responses due to context bias. We benchmark our RAG system's performance on the Natural Questions and HotPotQA datasets, achieving accuracies of 62% and 59%, respectively, exceeding other models such as DBRX and Mixtral Instruct.
Submitted 11 July, 2024;
originally announced July 2024.
-
Instanced model simplification using combined geometric and appearance-related metric
Authors:
Sadia Tariq,
Anis Ur Rahman,
Tahir Azim,
Rehman Gull Khan
Abstract:
Evolution of 3D graphics and graphical worlds has brought issues like content optimization, real-time processing, rendering, and shared storage limitation under consideration. Generally, different simplification approaches are used to make 3D meshes viable for rendering. However, many of these approaches ignore vertex attributes for instanced 3D meshes. In this paper, we implement and evaluate a simple and improved version to simplify instanced 3D textured models. The approach uses different vertex attributes in addition to geometry to simplify mesh instances. The resulting simplified models demonstrate efficient time-space requirements and better visual quality.
Submitted 7 January, 2021;
originally announced January 2021.
-
A Green Enterprise Computing Architecture for Developing Countries
Authors:
Rabia Akbar,
Tahir Azim
Abstract:
Developing countries often have access to limited energy resources, which frequently results in power cuts and failures. During these power cuts, enterprises rely on backup sources for power such as uninterruptible power supplies (UPS) and electric generators. This paper proposes AnywareDC, an architecture that builds on the recent work on Anyware to reduce energy utilization in the presence of such intermittent power supplies. Anyware reduces energy usage by providing enterprise users laptops instead of desktops, while maintaining performance using a central compute cluster. Our basic insight is that in the presence of power cuts, only the routers and the cluster need to be provided power: the laptops can continue to run on their own batteries. This reduces both energy usage and UPS load, allowing the UPS to supply power for longer and thus also saving generator fuel costs. Simulations show that this architecture reduces energy usage by up to 80% compared to one not using Anyware, and by up to 20% compared to Anyware.
Submitted 4 February, 2016;
originally announced February 2016.
-
Mobile Computing in Physics Analysis - An Indicator for eScience
Authors:
A. Ali,
A. Anjum,
T. Azim,
J. Bunn,
A. Ikram,
R. McClatchey,
H. Newman,
C. Steenberg,
M. Thomas,
I. Willers
Abstract:
This paper presents the design and implementation of a Grid-enabled physics analysis environment for handheld and other resource-limited computing devices as one example of the use of mobile devices in eScience. Handheld devices offer great potential because they provide ubiquitous access to data and round-the-clock connectivity over wireless links. Our solution aims to provide users of handheld devices the capability to launch heavy computational tasks on computational and data Grids, monitor job status during execution, and retrieve results after job completion. Users carry their jobs on their handheld devices in the form of executables (and associated libraries). Users can transparently view the status of their jobs and get back their outputs without having to know where they are being executed. In this way, our system is able to act as a high-throughput computing environment where devices ranging from powerful desktop machines to small handhelds can employ the power of the Grid. The results shown in this paper are readily applicable to the wider eScience community.
Submitted 5 July, 2007;
originally announced July 2007.
-
JClarens: A Java Framework for Developing and Deploying Web Services for Grid Computing
Authors:
Michael Thomas,
Conrad Steenberg,
Frank van Lingen,
Harvey Newman,
Julian Bunn,
Arshad Ali,
Richard McClatchey,
Ashiq Anjum,
Tahir Azim,
Waqas ur Rehman,
Faisal Khan,
Jang Uk In
Abstract:
High Energy Physics (HEP) and other scientific communities have adopted Service Oriented Architectures (SOA) as part of a larger Grid computing effort. This effort involves the integration of many legacy applications and programming libraries into a SOA framework. The Grid Analysis Environment (GAE) is such a service oriented architecture based on the Clarens Grid Services Framework and is being developed as part of the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) at the European Laboratory for Particle Physics (CERN). Clarens provides a set of authorization, access control, and discovery services, as well as XMLRPC and SOAP access to all deployed services. Two implementations of the Clarens Web Services Framework (Python and Java) offer integration possibilities for a wide range of programming languages. This paper describes the Java implementation of the Clarens Web Services Framework, called JClarens, and several web services of interest to the scientific and Grid community that have been deployed using JClarens.
Submitted 11 April, 2005;
originally announced April 2005.
-
Heterogeneous Relational Databases for a Grid-enabled Analysis Environment
Authors:
Arshad Ali,
Ashiq Anjum,
Tahir Azim,
Julian Bunn,
Saima Iqbal,
Richard McClatchey,
Harvey Newman,
S. Yousaf Shah,
Tony Solomonides,
Conrad Steenberg,
Michael Thomas,
Frank van Lingen,
Ian Willers
Abstract:
Grid based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system should provide an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism which can hide the heterogeneity of the backend databases from the client applications. This paper focuses on accessing data stored in disparate relational databases through a web service interface, and exploits the features of a Data Warehouse and Data Marts. We present a middleware that enables applications to access data stored in geographically distributed relational databases without being aware of their physical locations and underlying schema. A web service interface is provided to enable applications to access this middleware in a language and platform independent way. A prototype implementation was created based on Clarens [4], Unity [7] and POOL [8]. This ability to access the data stored in the distributed relational databases transparently is likely to be a very powerful one for Grid users, especially the scientific community wishing to collate and analyze data distributed over the Grid.
Submitted 10 April, 2005;
originally announced April 2005.
-
Resource Management Services for a Grid Analysis Environment
Authors:
Arshad Ali,
Ashiq Anjum,
Tahir Azim,
Julian Bunn,
Atif Mehmood,
Richard McClatchey,
Harvey Newman,
Waqas ur Rehman,
Conrad Steenberg,
Michael Thomas,
Frank van Lingen,
Ian Willers,
Muhammad Adeel Zafar
Abstract:
Selecting optimal resources for submitting jobs on a computational Grid or accessing data from a data grid is one of the most important tasks of any Grid middleware. Most modern Grid software today satisfies this responsibility and gives a best-effort performance to solve this problem. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the entire process. To solve this problem, a more interactive set of services and middleware is desired that provides users more information about Grid weather, and gives them more control over the decision making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE) being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimator service that have been designed and written using a common Grid-enabled Web Services framework named Clarens. The paper also presents a performance analysis of the developed services to show that they have indeed resulted in a more interactive and powerful system for user-centric Grid-enabled physics analysis.
Submitted 10 April, 2005;
originally announced April 2005.
-
A Grid-enabled Interface to Condor for Interactive Analysis on Handheld and Resource-limited Devices
Authors:
Arshad Ali,
Ashiq Anjum,
Tahir Azim,
Julian Bunn,
Ahsan Ikram,
Richard McClatchey,
Harvey Newman,
Conrad Steenberg,
Michael Thomas,
Ian Willers
Abstract:
This paper was withdrawn by the authors.
Submitted 30 September, 2004; v1 submitted 5 July, 2004;
originally announced July 2004.
-
Distributed Analysis and Load Balancing System for Grid Enabled Analysis on Hand-held devices using Multi-Agents Systems
Authors:
Naveed Ahmad,
Arshad Ali,
Ashiq Anjum,
Tahir Azim,
Julian Bunn,
Ali Hassan,
Ahsan Ikram,
Frank van Lingen,
Richard McClatchey,
Harvey Newman,
Conrad Steenberg,
Michael Thomas,
Ian Willers
Abstract:
Handheld devices, while growing rapidly, are inherently constrained and lack the capability of executing resource-hungry applications. This paper presents the design and implementation of a distributed analysis and load-balancing system for hand-held devices using a multi-agent system. This system enables low-resource mobile handheld devices to act as potential clients for Grid-enabled applications and analysis environments. We propose a system in which mobile agents will transport, schedule, execute and return results for heavy computational jobs submitted by handheld devices. In this way, our system provides a high-throughput computing environment for hand-held devices.
Submitted 5 July, 2004;
originally announced July 2004.