-
Chitrarth: Bridging Vision and Language for a Billion People
Authors:
Shaharukh Khan,
Ayush Tarun,
Abhinav Ravi,
Ali Faraz,
Akshat Patidar,
Praveen Kumar Pokala,
Anagha Bhangare,
Raja Kolla,
Chandra Khatri,
Shubham Agarwal
Abstract:
Recent multimodal foundation models are primarily trained on English or high-resource European language data, which hinders their applicability to other medium- and low-resource languages. To address this limitation, we introduce Chitrarth (Chitra: Image; Artha: Meaning), an inclusive Vision-Language Model (VLM), specifically targeting the rich linguistic diversity and visual reasoning across 10 prominent Indian languages. Our model effectively integrates a state-of-the-art (SOTA) multilingual Large Language Model (LLM) with a vision module, primarily trained on multilingual image-text data. We also introduce BharatBench, a comprehensive framework for evaluating VLMs across various Indian languages, ultimately contributing to more diverse and effective AI systems. Our model achieves SOTA results for benchmarks across low-resource languages while retaining its efficiency in English. Through our research, we aim to set new benchmarks in multilingual-multimodal capabilities, offering substantial improvements over existing models and establishing a foundation to facilitate future advancements in this arena.
Submitted 21 February, 2025;
originally announced February 2025.
-
Krutrim LLM: Multilingual Foundational Model for over a Billion People
Authors:
Aditya Kallappa,
Palash Kamble,
Abhinav Ravi,
Akshat Patidar,
Vinayak Dhruv,
Deepak Kumar,
Raghav Awasthi,
Arveti Manjunath,
Himanshu Gupta,
Shubham Agarwal,
Kumar Ashish,
Gautam Bhargava,
Chandra Khatri
Abstract:
India is a diverse society with unique challenges in developing AI systems, including linguistic diversity, oral traditions, data accessibility, and scalability. Existing foundation models are primarily trained on English, limiting their effectiveness for India's population. Indic languages comprise only 1 percent of Common Crawl corpora despite India representing 18 percent of the global population, leading to linguistic biases. Thousands of regional languages, dialects, and code mixing create additional representation challenges due to sparse training data.
We introduce Krutrim LLM, a 2 trillion token multilingual model designed for India's linguistic landscape. It incorporates the largest known Indic dataset, mitigating data scarcity and ensuring balanced performance across dialects. Krutrim outperforms or matches state-of-the-art models on Indic benchmarks while maintaining competitive English performance. Despite using significantly fewer training FLOPs, Krutrim LLM matches or exceeds models like LLAMA-2 on 10 out of 16 tasks, with an average score of 0.57 versus 0.55. This evidences Krutrim's flexible multilingual fluency across diverse linguistic contexts.
Krutrim is integrated with real-time search to improve factual accuracy in conversational AI applications. This enhances accessibility for over 1 billion users worldwide. Through intentional design choices addressing data imbalances, Krutrim LLM signifies meaningful progress in building ethical, globally representative AI models.
Submitted 24 February, 2025; v1 submitted 10 February, 2025;
originally announced February 2025.
-
MARTINI Coarse-grained Force Field for Thermoplastic Starch Nanocomposites
Authors:
Ankit Patidar,
Gaurav Goel
Abstract:
Thermoplastic starch (TPS) is an excellent film-forming material, and adding fillers such as tetramethylammonium-montmorillonite (TMA-MMT) clay has significantly expanded its use in packaging applications. We first used all-atom (AA) simulation to predict several macroscopic (Young's modulus, glass transition temperature, density) and microscopic (conformation along 1-4 and 1-6 glycosidic linkages, composite morphology) properties of TPS melt and TPS-TMA-MMT composite. The interplay of polymer-surface, plasticizer-surface, and polymer-plasticizer interactions leads to conformational and dynamic properties distinct from those in systems with either attractive or repulsive polymer-surface interactions. A subset of AA properties was used to parameterize the MARTINI coarse-grained (CG) force field (FF) for the melt and composite systems. Specifically, we determined the missing bonded parameters of amylose and amylopectin and rationalized the bead types for 1-4 and 1-6 linked alpha-D glucose using two-body excess entropy, density, and bond and angle distributions in AA TPS melt. The MARTINI CG model for TPS was combined with an existing parameter set for TMA-MMT. The liquid-liquid partitioning-based MARTINI-2 FF shows freezing and compaction of polymer chains near the sheet surface, further accentuated by lowering of dispersive interactions between pairs of high covalent coordination ring units of TPS polymers and MMT sheet. A rescaling of the dispersive component of TPS-MMT cross-interactions was used to optimize the FF for the composite system, with structural, thermodynamic, and dynamic properties obtained from long AA simulations forming the constraints for optimization. The obtained CG FF parameters provided excellent estimates for several other properties of the melt and composite systems not used in parameter estimation, thus establishing the robustness of the developed model.
Submitted 7 June, 2024;
originally announced June 2024.
-
On the number of pancake stacks requiring four flips to be sorted
Authors:
Saúl A. Blanco,
Charles Buehrle,
Akshay Patidar
Abstract:
Using existing classification results for the 7- and 8-cycles in the pancake graph, we determine the number of permutations that require 4 pancake flips (prefix reversals) to be sorted. A similar characterization of the 8-cycles in the burnt pancake graph, due to the authors, is used to derive a formula for the number of signed permutations requiring 4 (burnt) pancake flips to be sorted. We furthermore provide an analogous characterization of the 9-cycles in the burnt pancake graph. Finally, we present numerical evidence that polynomial formulae exist giving the number of signed permutations that require $k$ flips to be sorted, with $5\leq k\leq 9$.
Submitted 26 October, 2019; v1 submitted 11 February, 2019;
originally announced February 2019.
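The quantity counted in the abstract above (permutations needing exactly k prefix reversals to be sorted) can be checked by brute force for small n. The following is a minimal sketch, not the authors' method: it runs a breadth-first search over the pancake graph from the sorted stack. The function names `flip` and `flip_distance_counts` are hypothetical, chosen for illustration only.

```python
from collections import deque


def flip(stack, k):
    """Reverse the top k pancakes (a prefix reversal of length k)."""
    return stack[:k][::-1] + stack[k:]


def flip_distance_counts(n):
    """Return a list where entry d is the number of stacks of n pancakes
    whose shortest sorting sequence uses exactly d flips."""
    start = tuple(range(1, n + 1))  # the sorted stack
    dist = {start: 0}
    queue = deque([start])
    while queue:
        stack = queue.popleft()
        # A flip of length 1 changes nothing, so lengths run from 2 to n.
        for k in range(2, n + 1):
            nxt = flip(stack, k)
            if nxt not in dist:
                dist[nxt] = dist[stack] + 1
                queue.append(nxt)
    counts = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        counts[d] += 1
    return counts
```

Because every flip is its own inverse, the distance from the sorted stack to a permutation equals the number of flips needed to sort it. For n = 3, `flip_distance_counts(3)` returns `[1, 2, 2, 1]`. The search visits all n! stacks, so it is only practical for small n; the paper's contribution is a closed-form count, which this sketch does not reproduce.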
-
Hand Gesture Detection and Conversion to Speech and Text
Authors:
K. Manikandan,
Ayush Patidar,
Pallav Walia,
Aneek Barman Roy
Abstract:
Hand gestures are one of the typical methods used in sign language. It is very difficult for hearing-impaired people to communicate with the world. This project presents a solution that not only automatically recognizes hand gestures but also converts them into speech and text output, so that a hearing-impaired person can easily communicate with hearing people. A camera attached to a computer captures images of the hand, and contour feature extraction is used to recognize the person's hand gestures. Based on the recognized gestures, the recorded soundtrack is played.
Submitted 29 November, 2018;
originally announced November 2018.
-
Cycles in the burnt pancake graphs
Authors:
Saúl A. Blanco,
Charles Buehrle,
Akshay Patidar
Abstract:
The pancake graph $P_n$ is the Cayley graph of the symmetric group $S_n$ on $n$ elements generated by prefix reversals. $P_n$ has been shown to have properties that make it a useful network scheme for parallel processors. For example, it is $(n-1)$-regular, vertex-transitive, and one can embed cycles in it of length $\ell$ with $6\leq\ell\leq n!$. The burnt pancake graph $BP_n$, which is the Cayley graph of the group of signed permutations $B_n$ using prefix reversals as generators, has similar properties. Indeed, $BP_n$ is $n$-regular and vertex-transitive. In this paper, we show that $BP_n$ has every cycle of length $\ell$ with $8\leq\ell\leq 2^n n!$. The proof given is a constructive one that utilizes the recursive structure of $BP_n$. We also present a complete characterization of all the $8$-cycles in $BP_n$ for $n \geq 2$, which are the smallest cycles embeddable in $BP_n$, by presenting their canonical forms as products of the prefix reversal generators.
Submitted 24 July, 2019; v1 submitted 14 August, 2018;
originally announced August 2018.
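The structural claims in the abstract above ($BP_n$ has $2^n n!$ vertices, is $n$-regular, and has no cycle shorter than 8) can be sanity-checked numerically for small $n$. The sketch below is an illustration under stated assumptions, not the paper's constructive proof: signed permutations are modeled as tuples of nonzero integers, a burnt prefix reversal reverses the first k entries and negates them, and the names `burnt_flip`, `neighbors`, and `bfs_summary` are hypothetical.

```python
from collections import deque


def burnt_flip(stack, k):
    """Reverse the top k pancakes and turn each one over (negate its sign)."""
    return tuple(-x for x in reversed(stack[:k])) + stack[k:]


def neighbors(stack):
    """All stacks reachable by one burnt prefix reversal (lengths 1..n)."""
    return [burnt_flip(stack, k) for k in range(1, len(stack) + 1)]


def bfs_summary(n):
    """Breadth-first search from the identity of BP_n.

    Returns (number of vertices reached, length of the shortest cycle
    through the identity, or None if no cycle exists)."""
    start = tuple(range(1, n + 1))
    dist = {start: 0}
    parent = {start: None}
    shortest = None
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
            elif parent[u] != v:
                # A non-tree edge closes a walk through the start vertex.
                length = dist[u] + dist[v] + 1
                if shortest is None or length < shortest:
                    shortest = length
    return len(dist), shortest
```

Since the abstract states that $BP_n$ is vertex-transitive, a shortest cycle through the identity is a shortest cycle of the whole graph, which is why a single search suffices here. For example, `bfs_summary(2)` returns `(8, 8)`: $BP_2$ has $2^2 \cdot 2! = 8$ vertices and is itself a single 8-cycle.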