Showing 1–2 of 2 results for author: Chauvin, T

Search v0.5.6 released 2020-02-24

arXiv:2407.08708 [pdf, other]

cs.CR cs.AI cs.LG

eyeballvul: a future-proof benchmark for vulnerability detection in the wild

Authors: Timothee Chauvin

Abstract: Long contexts of recent LLMs have enabled a new use case: asking models to find security vulnerabilities in entire codebases. To evaluate model performance on this task, we introduce eyeballvul: a benchmark designed to test the vulnerability detection capabilities of language models at scale, that is sourced and updated weekly from the stream of published vulnerabilities in open-source repositories. The benchmark consists of a list of revisions in different repositories, each associated with the list of known vulnerabilities present at that revision. An LLM-based scorer is used to compare the list of possible vulnerabilities returned by a model to the list of known vulnerabilities for each revision. As of July 2024, eyeballvul contains 24,000+ vulnerabilities across 6,000+ revisions and 5,000+ repositories, and is around 55GB in size.

Submitted 13 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

Comments: Due to a bug in the litellm library (we haven't tracked exactly which one, but probably at least https://github.com/BerriAI/litellm/commit/2452753e084e8134c0c484b32c63fb5f2950c5ba), our Gemini 1.5 Pro inference costs were incorrect. We've updated the relevant plot (Fig 7) and its interpretation (both Gemini 1.5 Pro and Claude 3.5 Sonnet stand out from the other models, not just Gemini 1.5 Pro).

arXiv:2111.07959 [pdf, other]

cs.IT

Neural Normalized Min-Sum Message-Passing vs. Viterbi Decoding for the CCSDS Line Product Code

Authors: Jonathan Nguyen, Linfang Wang, Chester Hulse, Sahil Dani, Amaael Antonini, Todd Chauvin, Divsalar Dariush, Richard Wesel

Abstract: The Consultative Committee for Space Data Systems (CCSDS) 141.11-O-1 Line Product Code (LPC) provides a rare opportunity to compare maximum-likelihood decoding and message passing. The LPC considered in this paper is intended to serve as the inner code in conjunction with a (255,239) Reed Solomon (RS) code whose symbols are bytes of data. This paper represents the 141.11-O-1 LPC as a bipartite graph and uses that graph to formulate both maximum likelihood (ML) and message passing algorithms. ML decoding must, of course, have the best frame error rate (FER) performance. However, a fixed point implementation of a Neural-Normalized MinSum (N-NMS) message passing decoder closely approaches ML performance with a significantly lower complexity.

Submitted 15 November, 2021; originally announced November 2021.

Comments: This paper has been submitted to ICC 2022
