-
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese
Authors:
Khang T. Doan,
Bao G. Huynh,
Dung T. Hoang,
Thuc D. Pham,
Nhat H. Pham,
Quan T. M. Nguyen,
Bang Q. Vo,
Suong N. Hoang
Abstract:
In this report, we introduce Vintern-1B, a reliable 1-billion-parameter multimodal large language model (MLLM) for Vietnamese language tasks. By integrating the Qwen2-0.5B-Instruct language model with the InternViT-300M-448px visual model, Vintern-1B is optimized for a range of applications, including optical character recognition (OCR), document extraction, and general question answering in Vietnamese contexts. The model is fine-tuned on an extensive dataset of over 3 million image-question-answer pairs, achieving robust performance and reliable results across multiple Vietnamese language benchmarks such as OpenViVQA and ViTextVQA. Vintern-1B is small enough to fit easily into various on-device applications. Additionally, we have open-sourced several Vietnamese vision question answering (VQA) datasets for text and diagrams, created with Gemini 1.5 Flash. Our models are available at: https://huggingface.co/5CD-AI/Vintern-1B-v2.
Submitted 23 August, 2024; v1 submitted 22 August, 2024;
originally announced August 2024.
-
0
Authors:
Quan Thoi Minh Nguyen
Abstract:
What is the funniest number in cryptography? 0. The reason is that for all x, x*0 = 0, i.e., the equation is always satisfied no matter what x is. This article discusses crypto bugs in four BLS signature libraries (ethereum/py_ecc, supranational/blst, herumi/bls, sigp/milagro_bls) that revolve around 0. Furthermore, we develop "splitting zero" attacks to show a weakness in the proof-of-possession aggregate signature scheme standardized in BLS RFC draft v4. The Eth2 bug bounty program generously awarded $35,000 in total for the reported bugs.
Submitted 20 April, 2021;
originally announced April 2021.
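As a rough illustration of the x*0 = 0 observation in the abstract above, the following toy sketch replaces the elliptic-curve groups and pairings of real BLS with integers modulo a small prime. The names `Q`, `toy_sign`, and `toy_verify` are hypothetical and do not come from any of the four libraries; the sketch offers no security and only shows why a zero public key paired with a zero signature satisfies a multiplicative verification equation for every message.

```python
# Toy illustration only: integers modulo a small prime stand in for the
# elliptic-curve groups and pairing used by real BLS signatures.
# Q, toy_sign and toy_verify are hypothetical names, not a library API.
Q = 101  # assumed small prime modulus, chosen for readability


def toy_sign(secret_key: int, message_hash: int) -> int:
    """Return secret_key * message_hash (mod Q) as a stand-in signature."""
    return (secret_key * message_hash) % Q


def toy_verify(public_key: int, message_hash: int, signature: int) -> bool:
    """Accept when signature == public_key * message_hash (mod Q).

    In this toy the public key equals the secret key, so nothing here is
    secure; only the algebraic shape of the check matters.
    """
    return signature % Q == (public_key * message_hash) % Q


# An ordinary key verifies only the message that was actually signed.
honest_signature = toy_sign(7, 5)
honest_ok = toy_verify(7, 5, honest_signature)
wrong_message_ok = toy_verify(7, 6, honest_signature)

# With a zero key and a zero signature, the check reduces to 0 == 0 * h,
# which holds for every message hash h.
zero_always_ok = all(toy_verify(0, h, 0) for h in range(Q))
```

A verifier of this shape would therefore need to reject a zero key or zero signature explicitly before evaluating the equation; how each real library handles that case is described in the article itself, not here.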
-
A "Final" Security Bug
Authors:
Quan Thoi Minh Nguyen
Abstract:
This article discusses a fixed critical security bug in Google Tink's Ed25519 Java implementation. The bug allows remote attackers to extract the private key with only two Ed25519 signatures. The vulnerability stems from a misunderstanding of what "final" means in the Java programming language. The bug was discovered during security review before Google Tink was officially released. It reinforces the challenge of writing safe cryptographic code and the importance of the security review process, even for code written by professional cryptographers.
Submitted 3 April, 2020;
originally announced April 2020.
-
Intuitive Understanding of Quantum Computation and Post-Quantum Cryptography
Authors:
Quan Thoi Minh Nguyen
Abstract:
Post-quantum cryptography is inevitable. The National Institute of Standards and Technology (NIST) has started standardizing quantum-resistant public-key cryptography (aka post-quantum cryptography). The reason is that investment in quantum computing is booming, which poses significant threats to our currently deployed cryptographic algorithms. As a security engineer, to prepare for the apocalypse in advance, I've been watching the development of quantum computers and post-quantum cryptography closely. Never mind, I simply made up an excuse to study these fascinating scientific fields. However, they are extremely hard to understand, at least to an amateur like me. This article shares my notes with you in the hope that you will gain an intuitive understanding of the beautiful and mind-blowing quantum algorithms and post-quantum cryptography. Update: The multivariate signature scheme Rainbow has been broken by Ward Beullens, and the Supersingular Isogeny Diffie-Hellman protocol (SIDH) has been broken by Wouter Castryck and Thomas Decru.
Submitted 31 July, 2022; v1 submitted 17 March, 2020;
originally announced March 2020.