-
xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs
Authors:
Michael S. Ryoo,
Honglu Zhou,
Shrikant Kendre,
Can Qin,
Le Xue,
Manli Shu,
Jongwoo Park,
Kanchana Ranasinghe,
Silvio Savarese,
Ran Xu,
Caiming Xiong,
Juan Carlos Niebles
Abstract:
We present xGen-MM-Vid (BLIP-3-Video): a multimodal language model for videos, particularly designed to efficiently capture temporal information over multiple frames. BLIP-3-Video takes advantage of the 'temporal encoder' in addition to the conventional visual tokenizer, which maps a sequence of tokens over multiple frames into a compact set of visual tokens. This enables BLIP-3-Video to use far fewer visual tokens than its competing models (e.g., 32 vs. 4608 tokens). We explore different types of temporal encoders, including learnable spatio-temporal pooling as well as sequential models like Token Turing Machines. We experimentally confirm that BLIP-3-Video obtains video question-answering accuracies comparable to much larger state-of-the-art models (e.g., 34B), while being much smaller (i.e., 4B) and more efficient by using fewer visual tokens. The project website is at https://www.salesforceairesearch.com/opensource/xGen-MM-Vid/index.html
Submitted 9 June, 2025; v1 submitted 21 October, 2024;
originally announced October 2024.
-
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
Authors:
Le Xue,
Manli Shu,
Anas Awadalla,
Jun Wang,
An Yan,
Senthil Purushwalkam,
Honglu Zhou,
Viraj Prabhu,
Yutong Dai,
Michael S Ryoo,
Shrikant Kendre,
Jieyu Zhang,
Can Qin,
Shu Zhang,
Chia-Chih Chen,
Ning Yu,
Juntao Tan,
Tulika Manoj Awalgaonkar,
Shelby Heinecke,
Huan Wang,
Yejin Choi,
Ludwig Schmidt,
Zeyuan Chen,
Silvio Savarese,
Juan Carlos Niebles
, et al. (2 additional authors not shown)
Abstract:
This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs. xGen-MM, short for xGen-MultiModal, expands the Salesforce xGen initiative on foundation AI models. Our models undergo rigorous evaluation across a range of tasks, including both single and multi-image benchmarks. Our pre-trained base model exhibits strong in-context learning capabilities and the instruction-tuned model demonstrates competitive performance among open-source LMMs with similar model sizes. In addition, we introduce a safety-tuned model with DPO, aiming to mitigate harmful behaviors such as hallucinations and improve safety. We open-source our models, curated large-scale datasets, and our fine-tuning codebase to facilitate further advancements in LMM research. Associated resources will be available on our project page above.
Submitted 28 August, 2024; v1 submitted 16 August, 2024;
originally announced August 2024.
-
Limit cycles of piecewise smooth differential systems with nilpotent center and linear saddle
Authors:
Nanasaheb Phatangare,
Krishnat Masalkar,
Subhash Kendre
Abstract:
In this paper, we study the number of limit cycles of a piecewise smooth differential system separated by one or two parallel straight lines or rays, formed by a nilpotent center or degenerate center and a linear saddle. Piecewise linear differential systems separated by one or two parallel straight lines, with one of the subsystems of nilpotent center type and the other subsystems of linear saddle type, can have at most two limit cycles, and there are systems in these classes having one limit cycle. In particular, the limit cycle consists of saddle separatrices of the subsystem.
Submitted 21 July, 2024;
originally announced July 2024.
-
Limit cycles of piecewise smooth differential systems of the type nonlinear center and saddle
Authors:
Nanasaheb Phatangare,
Krishnat Masalkar,
Subhash Kendre
Abstract:
Piecewise linear differential systems separated by two parallel straight lines of the type of center-center-Hamiltonian saddle and the center-Hamiltonian saddle-Hamiltonian saddle can have at most one limit cycle and there are systems in these classes having one limit cycle. In this paper, we study the limit cycles of a piecewise smooth differential system separated by two parallel straight lines formed by nonlinear centers and a Hamiltonian saddle.
Submitted 12 June, 2024;
originally announced June 2024.
-
Theta function identities and q series
Authors:
Hemant Masal,
Subhash Kendre,
Hemant Bhate
Abstract:
We establish some functional identities of theta functions, an elementary proof of classical fourth-order identities, Landen transformations, and q series from the eigenvectors of the discrete Fourier transform. We also derive a connection between a Rogers-Ramanujan type identity and a theta function identity.
Submitted 12 December, 2023; v1 submitted 10 December, 2023;
originally announced December 2023.
-
FDM Printing: a Fabrication Method for Fluidic Soft Circuits?
Authors:
Savita V. Kendre,
Lehong Wang,
Ethan Wilke,
Nicholas Pacheco,
Loris Fichera,
Markus P. Nemitz
Abstract:
Existing fluidic soft logic gates for the control of soft robots either rely on extensive manual fabrication processes or expensive printing techniques. In our work, we explore Fused Deposition Modeling for creating fully 3D printed fluidic logic gates. We print a soft bistable valve from thermoplastic polyurethane using a desktop FDM printer. We introduce a new printing nozzle for extruding tubing. Our fabrication strategy reduces the production time of soft bistable valves from 27 hours with replica molding to 3 hours with an FDM printer. Our rapid and cost-effective fabrication process for fluidic logic gates seeks to democratize fluidic circuitry for the control of soft robots.
Submitted 2 December, 2023;
originally announced December 2023.
-
STREAM: Software Tool for Routing Efficiently Advanced Macrofluidics
Authors:
Lehong Wang,
Savita V. Kendre,
Haotian Liu,
Markus P. Nemitz
Abstract:
The current fabrication and assembly of fluidic circuits for soft robots relies heavily on manual processes; as the complexity of fluidic circuits increases, manual assembly becomes increasingly arduous, error-prone, and time-consuming. We introduce a software tool that generates printable fluidic networks automatically. We provide a library of fluidic logic elements that are easily 3D printed from thermoplastic polyurethanes using Fused Deposition Modeling only. Our software tool and component library allow the development of arbitrary soft digital circuits. We demonstrate a variable frequency ring oscillator and a full adder. The simplicity of our approach, which uses FDM printers only, democratizes fluidic circuit implementation beyond specialized laboratories. Our software is available on GitHub (https://github.com/roboticmaterialsgroup/FluidLogic).
Submitted 2 December, 2023;
originally announced December 2023.
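The full adder mentioned in this abstract can be illustrated at the logic level. The sketch below is not taken from the STREAM software: it assumes, purely for illustration, that every element in the circuit is a two-input NAND gate, and the names `nand` and `full_adder` are hypothetical.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values (assumed building block, not STREAM's API)."""
    return 0 if (a and b) else 1


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for three input bits using nine NAND gates."""
    # First XOR stage: half_sum = a XOR b, built from four NANDs.
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    half_sum = nand(n2, n3)

    # Second XOR stage: total = half_sum XOR carry_in, four more NANDs.
    n4 = nand(half_sum, carry_in)
    n5 = nand(half_sum, n4)
    n6 = nand(carry_in, n4)
    total = nand(n5, n6)

    # carry_out = (a AND b) OR (half_sum AND carry_in), reusing n1 and n4.
    carry_out = nand(n1, n4)
    return total, carry_out


if __name__ == "__main__":
    # Print the full truth table of the adder.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder(a, b, c))
```

A physical fluidic circuit would also have to account for pressure levels and switching delays, which this Boolean model ignores.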
-
Ramanujan Theta Function Identities and Quadratic Numbers
Authors:
Hemant Masal,
Hemant Bhate,
Subhash Kendre
Abstract:
Eigenvectors of the discrete Fourier transform can be expressed using Ramanujan theta functions. New theta function identities, Ramanujan theta function identities, and generating functions for the quadratic numbers are a consequence.
Submitted 20 January, 2023;
originally announced January 2023.
-
Printable Flexible Robots for Remote Learning
Authors:
Savita V. Kendre,
Gus. T. Teran,
Lauryn Whiteside,
Tyler Looney,
Ryley Wheelock,
Surya Ghai,
Markus P. Nemitz
Abstract:
The COVID-19 pandemic has revealed the importance of digital fabrication to enable online learning, which remains a challenge for robotics courses. We introduce a teaching methodology that allows students to participate remotely in a hands-on robotics course involving the design and fabrication of robots. Our methodology employs 3D printing techniques with flexible filaments to create innovative soft robots; robots are made from flexible, as opposed to rigid, materials. Students design flexible robotic components such as actuators, sensors, and controllers using CAD software, upload their designs to a remote 3D printing station, monitor the print with a web camera, and inspect the components with lab staff before the components are mailed to them for testing and assembly. At the end of the course, students will have iterated through several designs and created fluidically-driven soft robots. Our remote teaching methodology enables educators to utilize 3D printing resources to teach soft robotics and cultivate creativity among students to design novel and innovative robots. Our methodology seeks to democratize robotics engineering by decoupling hands-on learning experiences from expensive equipment in the learning environment.
Submitted 15 July, 2022;
originally announced July 2022.
-
Tube-Balloon Logic for the Exploration of Fluidic Control Elements
Authors:
Jovanna A. Tracz,
Lukas Wille,
Dylan Pathiraja,
Savita V. Kendre,
Ron Pfisterer,
Ethan Turett,
Gus T. Teran,
Christoffer K. Abrahamsson,
Samuel E. Root,
Won-Kyu Lee,
Daniel J. Preston,
Haihui Joy Jiang,
George M. Whitesides,
Markus P. Nemitz
Abstract:
The control of pneumatically driven soft robots typically requires electronics. Microcontrollers are connected to power electronics that switch valves and pumps on and off. As a recent alternative, fluidic control methods have been introduced, in which soft digital logic gates permit multiple actuation states to be achieved in soft systems. Such systems have demonstrated autonomous behaviors without the use of electronics. However, fluidic controllers have required complex fabrication processes. To democratize the exploration of fluidic controllers, we developed tube-balloon logic circuitry, which consists of logic gates made from straws and balloons. Each tube-balloon logic device takes a novice five minutes to fabricate and costs $0.45. Tube-balloon logic devices can also operate at pressures of up to 200 kPa and oscillate at frequencies of up to 15 Hz. We configure the tube-balloon logic device as NOT-, NAND-, and NOR-gates and assemble them into a three-ring oscillator to demonstrate a vibrating sieve that separates sugar from rice. Because tube-balloon logic devices are low-cost, easy to fabricate, and their operating principle is simple, they are well suited for exploring fundamental concepts of fluidic control schemes while encouraging design inquiry for pneumatically driven soft robots.
Submitted 8 February, 2022;
originally announced February 2022.
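The three-ring oscillator named in this abstract can be sketched as an idealized discrete-time model. The code below assumes three NOT gates connected in a loop, each with the same unit switching delay; it models only the Boolean behavior, not the pneumatics, and the function names `step` and `simulate` are hypothetical.

```python
def step(state: tuple[int, ...]) -> tuple[int, ...]:
    """Advance one gate delay: gate i outputs NOT of gate i-1's previous output."""
    n = len(state)
    return tuple(1 - state[(i - 1) % n] for i in range(n))


def simulate(initial: tuple[int, ...], steps: int) -> list[tuple[int, ...]]:
    """Return the list of ring states, starting with the initial state."""
    states = [tuple(initial)]
    for _ in range(steps):
        states.append(step(states[-1]))
    return states


# Start from a state in which exactly one gate output is high.
trace = simulate((1, 0, 0), 6)

if __name__ == "__main__":
    for t, s in enumerate(trace):
        print(t, s)
```

In this model the ring returns to its starting state after six gate delays, that is, twice the number of gates times the per-gate delay, which is why an odd number of inverters in a loop oscillates rather than settling.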
-
The Soft Compiler: A Web-Based Tool for the Design of Modular Pneumatic Circuits for Soft Robots
Authors:
Lauryn Whiteside,
Savita V. Kendre,
Tian Y. Fan,
Jovanna A. Tracz,
Gus T. Teran,
Thomas C. Underwood,
Mohammed E. Sayed,
Haihui J. Jiang,
Adam A. Stokes,
Daniel J. Preston,
George M. Whitesides,
Markus P. Nemitz
Abstract:
Developing soft circuits from individual soft logic gates poses a unique challenge: with increasing numbers of logic gates, the design and implementation of circuits leads to inefficiencies due to mathematically unoptimized circuits and wiring mistakes during assembly. It is therefore practically important to introduce design tools that support the development of soft circuits. We developed a web-based graphical user interface, the Soft Compiler, that accepts a user-defined robot behavior as a truth table to generate a mathematically optimized circuit diagram that guides the assembly of a soft fluidic circuit. We describe the design and experimental verification of three soft circuits of increasing complexity, using the Soft Compiler as a design tool and a novel pneumatic glove as an input interface. In one example, we reduce the size of a soft circuit from the original 11 logic gates to 4 logic gates while maintaining circuit functionality. The Soft Compiler is a web-based design tool for fluidic soft circuits and is published under the open-source MIT License.
Submitted 8 February, 2022;
originally announced February 2022.
-
Bifurcations of limit cycles in piecewise smooth Hamiltonian system with boundary perturbation
Authors:
Nanasaheb Phatangare,
Krishnat Masalkar,
Subhash Kendre
Abstract:
In this paper, the general planar piecewise smooth Hamiltonian system with period annulus around the center at the origin is considered. We obtain the expressions for the first order and the second order Melnikov functions of its general second order perturbation, which can be used to find the number of limit cycles bifurcated from periodic orbits. Further, we have shown that the number of limit cycles of the system $\dot{X}=\begin{cases} (H_y^+,-H_x^+) & \mbox{if}~y>\varepsilon f(x)\\ (H_y^-,-H_x^-) & \mbox{if}~y<\varepsilon f(x) \end{cases}$ equals the number of positive zeros of $f$ when at $\varepsilon=0$ the system has a period annulus around the origin.
Submitted 20 December, 2019;
originally announced December 2019.
-
Detecting Hate Speech and Offensive Language on Twitter using Machine Learning: An N-gram and TFIDF based Approach
Authors:
Aditya Gaydhani,
Vikrant Doma,
Shrikant Kendre,
Laxmi Bhagwat
Abstract:
Toxic online content has become a major issue in today's world due to an exponential increase in the use of the internet by people of different cultures and educational backgrounds. Differentiating hate speech and offensive language is a key challenge in automatic detection of toxic text content. In this paper, we propose an approach to automatically classify tweets on Twitter into three classes: hateful, offensive and clean. Using a Twitter dataset, we perform experiments considering n-grams as features and passing their term frequency-inverse document frequency (TFIDF) values to multiple machine learning models. We perform comparative analysis of the models considering several values of n in n-grams and TFIDF normalization methods. After tuning the model giving the best results, we achieve 95.6% accuracy upon evaluating it on test data. We also create a module which serves as an intermediary between the user and Twitter.
Submitted 23 September, 2018;
originally announced September 2018.
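The feature step described in this abstract, word n-grams weighted by TFIDF, can be sketched in plain Python. This is a minimal illustration and not the authors' pipeline: the whitespace tokenizer, the smoothed IDF formula, the L2 normalization, the toy sentences, and the names `word_ngrams` and `tfidf` are all assumptions, and no classifier is included.

```python
import math
from collections import Counter


def word_ngrams(text: str, n_max: int) -> list[str]:
    """Return all word n-grams for n = 1..n_max, using whitespace tokenization."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs: list[str], n_max: int = 2) -> list[dict[str, float]]:
    """Return one L2-normalized TF-IDF weight dictionary per document."""
    counted = [Counter(word_ngrams(d, n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter(g for c in counted for g in c)
    n_docs = len(docs)
    vectors = []
    for c in counted:
        # One common smoothed IDF variant (an assumption, not from the paper).
        weights = {
            g: tf * (math.log((1 + n_docs) / (1 + doc_freq[g])) + 1.0)
            for g, tf in c.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({g: w / norm for g, w in weights.items()})
    return vectors


if __name__ == "__main__":
    # Neutral toy sentences stand in for tweets.
    docs = ["have a nice day", "have a nice evening", "what a terrible day"]
    for vec in tfidf(docs, n_max=2):
        print(sorted(vec.items(), key=lambda kv: -kv[1])[:3])
```

N-grams that occur in every document, such as "a" in the toy data, receive the lowest weight, while n-grams unique to one document are weighted highest. The resulting sparse vectors are what would be passed to a classifier.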