Search | arXiv e-print repository

Chitrarth: Bridging Vision and Language for a Billion People

Authors: Shaharukh Khan, Ayush Tarun, Abhinav Ravi, Ali Faraz, Akshat Patidar, Praveen Kumar Pokala, Anagha Bhangare, Raja Kolla, Chandra Khatri, Shubham Agarwal

Abstract: Recent multimodal foundation models are primarily trained on English or high resource European language data, which hinders their applicability to other medium and low-resource languages. To address this limitation, we introduce Chitrarth (Chitra: Image; Artha: Meaning), an inclusive Vision-Language Model (VLM), specifically targeting the rich linguistic diversity and visual reasoning across 10 prominent Indian languages. Our model effectively integrates a state-of-the-art (SOTA) multilingual Large Language Model (LLM) with a vision module, primarily trained on multilingual image-text data. Furthermore, we also introduce BharatBench, a comprehensive framework for evaluating VLMs across various Indian languages, ultimately contributing to more diverse and effective AI systems. Our model achieves SOTA results for benchmarks across low resource languages while retaining its efficiency in English. Through our research, we aim to set new benchmarks in multilingual-multimodal capabilities, offering substantial improvements over existing models and establishing a foundation to facilitate future advancements in this arena.

Submitted 21 February, 2025; originally announced February 2025.

arXiv:2502.09642 [pdf, other]

Krutrim LLM: Multilingual Foundational Model for over a Billion People

Authors: Aditya Kallappa, Palash Kamble, Abhinav Ravi, Akshat Patidar, Vinayak Dhruv, Deepak Kumar, Raghav Awasthi, Arveti Manjunath, Himanshu Gupta, Shubham Agarwal, Kumar Ashish, Gautam Bhargava, Chandra Khatri

Abstract: India is a diverse society with unique challenges in developing AI systems, including linguistic diversity, oral traditions, data accessibility, and scalability. Existing foundation models are primarily trained on English, limiting their effectiveness for India's population. Indic languages comprise only 1 percent of Common Crawl corpora despite India representing 18 percent of the global population, leading to linguistic biases. Thousands of regional languages, dialects, and code mixing create additional representation challenges due to sparse training data. We introduce Krutrim LLM, a 2 trillion token multilingual model designed for India's linguistic landscape. It incorporates the largest known Indic dataset, mitigating data scarcity and ensuring balanced performance across dialects. Krutrim outperforms or matches state-of-the-art models on Indic benchmarks while maintaining competitive English performance. Despite being significantly smaller in training flops, Krutrim LLM matches or exceeds models like LLAMA-2 on 10 out of 16 tasks, with an average score of 0.57 versus 0.55. This evidences Krutrim's flexible multilingual fluency across diverse linguistic contexts. Krutrim is integrated with real-time search to improve factual accuracy in conversational AI applications. This enhances accessibility for over 1 billion users worldwide. Through intentional design choices addressing data imbalances, Krutrim LLM signifies meaningful progress in building ethical, globally representative AI models.

Submitted 24 February, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

arXiv:2406.05243 [pdf, other]

MARTINI Coarse-grained Force Field for Thermoplastic Starch Nanocomposites

Authors: Ankit Patidar, Gaurav Goel

Abstract: Thermoplastic starch (TPS) is an excellent film-forming material, and adding fillers such as tetramethylammonium-montmorillonite (TMA-MMT) clay has significantly expanded its use in packaging applications. We first used all-atom (AA) simulation to predict several macroscopic (Young modulus, glass transition temperature, density) and microscopic (conformation along 1-4 and 1-6 glycosidic linkages, composite morphology) properties of TPS melt and TPS-TMA-MMT composite. The interplay of polymer-surface, plasticizer-surface, and polymer-plasticizer interactions leads to conformational and dynamic properties distinct from those in systems with either attractive or repulsive polymer-surface interactions. A subset of AA properties was used to parameterize the MARTINI coarse-grained (CG) force field (FF) for the melt and composite systems. Specifically, we determined the missing bonded parameters of amylose and amylopectin and rationalized the bead types for 1-4 and 1-6 linked alpha-D glucose using two-body excess entropy, density, and bond and angle distributions in AA TPS melt. The MARTINI CG model for TPS was combined with an existing parameter set for TMA-MMT. The liquid-liquid partitioning-based MARTINI-2 FF shows freezing and compaction of polymer chains near the sheet surface, further accentuated by lowering of dispersive interactions between pairs of high covalent coordination ring units of TPS polymers and MMT sheet. A rescaling of the dispersive component of TPS MMT cross-interactions was used to optimize the FF for the composite system, with structural, thermodynamic, and dynamic properties obtained from long AA simulations forming the constraints for optimization. The obtained CG FF parameters provided excellent estimates for several other properties of the melt and composite systems not used in parameter estimation, thus establishing the robustness of the developed model.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:1902.04055 [pdf, ps, other]

doi 10.23638/DMTCS-21-2-5

On the number of pancake stacks requiring four flips to be sorted

Authors: Saúl A. Blanco, Charles Buehrle, Akshay Patidar

Abstract: Using existing classification results for the 7- and 8-cycles in the pancake graph, we determine the number of permutations that require 4 pancake flips (prefix reversals) to be sorted. A similar characterization of the 8-cycles in the burnt pancake graph, due to the authors, is used to derive a formula for the number of signed permutations requiring 4 (burnt) pancake flips to be sorted. We furthermore provide an analogous characterization of the 9-cycles in the burnt pancake graph. Finally we present numerical evidence that polynomial formulae exist giving the number of signed permutations that require $k$ flips to be sorted, with $5\leq k\leq9$.

Submitted 26 October, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

Comments: We have finalized the paper for publication in DMTCS, updated a reference to its published version, moved the abstract to its proper location, and added a thank you to the referees. The paper has 27 pages, 6 figures, and 2 tables

MSC Class: 05A15; 05A05; 68R10 ACM Class: G.2.0; G.2.1; G.2.2

Journal ref: Discrete Mathematics & Theoretical Computer Science, Vol. 21 no. 2, Permutation Patterns 2018 (November 4, 2019) dmtcs:5214
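
The counting problem in the entry above can be checked by brute force for small stacks. The following is a minimal sketch, not the authors' method (which relies on classifying cycles in the pancake graph): it runs a breadth-first search from the sorted stack and tallies permutations by distance. The function name `flip_distance_counts` is hypothetical. Since every prefix reversal is its own inverse, the distance from the sorted stack equals the number of flips needed to sort.

```python
from collections import deque


def flip_distance_counts(n):
    """Return a list whose entry d is the number of permutations of 1..n
    that need exactly d prefix reversals (pancake flips) to be sorted."""
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        perm = queue.popleft()
        # A flip of size k reverses the first k entries; k = 1 does nothing.
        for k in range(2, n + 1):
            nxt = perm[:k][::-1] + perm[k:]
            if nxt not in dist:
                dist[nxt] = dist[perm] + 1
                queue.append(nxt)
    counts = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        counts[d] += 1
    return counts


print(flip_distance_counts(3))  # [1, 2, 2, 1]
print(flip_distance_counts(4))  # [1, 3, 6, 11, 3]
```

The search visits all n! permutations, so it is only practical for small n; closed formulas such as the one described in the abstract are what make larger n tractable.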

arXiv:1811.11997 [pdf]

Hand Gesture Detection and Conversion to Speech and Text

Authors: K. Manikandan, Ayush Patidar, Pallav Walia, Aneek Barman Roy

Abstract: Hand gestures are one of the typical methods used in sign language. It is very difficult for hearing-impaired people to communicate with the world. This project presents a solution that will not only automatically recognize hand gestures but will also convert them into speech and text output so that an impaired person can easily communicate with other people. A camera attached to a computer will capture images of the hand, and contour feature extraction is used to recognize the hand gestures of the person. Based on the recognized gestures, the recorded soundtrack will be played.

Submitted 29 November, 2018; originally announced November 2018.

Comments: 5 pages, 5 figures, International Conference on Innovations and Discoveries in Science, Engineering and Technology (ICIDSET) 2018

Journal ref: International Journal of Pure and Applied Mathematics, Volume 120 No. 6 2018, 1347-1362, ISSN: 1314-3395 (on-line version)

arXiv:1808.04890 [pdf, ps, other]

Cycles in the burnt pancake graphs

Authors: Saúl A. Blanco, Charles Buehrle, Akshay Patidar

Abstract: The pancake graph $P_n$ is the Cayley graph of the symmetric group $S_n$ on $n$ elements generated by prefix reversals. $P_n$ has been shown to have properties that make it a useful network scheme for parallel processors. For example, it is $(n-1)$-regular, vertex-transitive, and one can embed cycles in it of length $\ell$ with $6\leq\ell\leq n!$. The burnt pancake graph $BP_n$, which is the Cayley graph of the group of signed permutations $B_n$ using prefix reversals as generators, has similar properties. Indeed, $BP_n$ is $n$-regular and vertex-transitive. In this paper, we show that $BP_n$ has every cycle of length $\ell$ with $8\leq\ell\leq 2^n n!$. The proof given is a constructive one that utilizes the recursive structure of $BP_n$. We also present a complete characterization of all the $8$-cycles in $BP_n$ for $n \geq 2$, which are the smallest cycles embeddable in $BP_n$, by presenting their canonical forms as products of the prefix reversal generators.

Submitted 24 July, 2019; v1 submitted 14 August, 2018; originally announced August 2018.

Comments: Added a reference, clarified some definitions, fixed some typos. 42 pages, 9 figures, 20 pages of appendices

MSC Class: 68R10; 05C25; 05C45 ACM Class: G.2.2; C.2.1
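
The structural claims in the entry above ($BP_n$ has $2^n n!$ vertices, is $n$-regular, and has smallest cycles of length 8) can be spot-checked for small $n$. The sketch below is an illustrative brute-force check and not the constructive proof from the paper; the names `prefix_flip` and `explore_burnt_pancake_graph` are hypothetical. It assumes the usual convention that a burnt pancake flip reverses a prefix and negates every entry in it. Because the graph is vertex-transitive, the shortest cycle through the identity has the same length as the shortest cycle overall, so a single breadth-first search suffices.

```python
from collections import deque
from math import factorial


def prefix_flip(signed_perm, k):
    """Reverse the first k entries and negate each one (a burnt pancake flip)."""
    return tuple(-x for x in reversed(signed_perm[:k])) + signed_perm[k:]


def explore_burnt_pancake_graph(n):
    """Breadth-first search of BP_n from the identity.

    Returns (vertex_count, set_of_degrees_seen, shortest_cycle_length).
    The cycle length is None when the graph has no cycle (n = 1).
    """
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    parent = {identity: None}
    queue = deque([identity])
    degrees = set()
    shortest_cycle = None
    while queue:
        u = queue.popleft()
        neighbours = {prefix_flip(u, k) for k in range(1, n + 1)}
        degrees.add(len(neighbours))
        for v in neighbours:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
            elif parent[u] != v:
                # A non-tree edge closes a cycle through the BFS tree.
                length = dist[u] + dist[v] + 1
                if shortest_cycle is None or length < shortest_cycle:
                    shortest_cycle = length
    return len(dist), degrees, shortest_cycle


for n in (2, 3):
    vertices, degrees, girth = explore_burnt_pancake_graph(n)
    print(n, vertices, 2**n * factorial(n), degrees, girth)
```

For $n = 2$ the graph is a single 8-cycle, which makes a convenient sanity check before trying larger $n$.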

Showing 1–6 of 6 results for author: Patidar, A