-
An Analysis of Hyper-Parameter Optimization Methods for Retrieval Augmented Generation
Authors:
Matan Orbach,
Ohad Eytan,
Benjamin Sznajder,
Ariel Gera,
Odellia Boni,
Yoav Kantor,
Gal Bloch,
Omri Levy,
Hadas Abraham,
Nitzan Barzilay,
Eyal Shnarch,
Michael E. Factor,
Shila Ofek-Koifman,
Paula Ta-Shma,
Assaf Toledo
Abstract:
Finding the optimal Retrieval-Augmented Generation (RAG) configuration for a given use case can be complex and expensive. Motivated by this challenge, frameworks for RAG hyper-parameter optimization (HPO) have recently emerged, yet their effectiveness has not been rigorously benchmarked. To address this gap, we present a comprehensive study involving 5 HPO algorithms over 5 datasets from diverse d…
▽ More
Finding the optimal Retrieval-Augmented Generation (RAG) configuration for a given use case can be complex and expensive. Motivated by this challenge, frameworks for RAG hyper-parameter optimization (HPO) have recently emerged, yet their effectiveness has not been rigorously benchmarked. To address this gap, we present a comprehensive study involving 5 HPO algorithms over 5 datasets from diverse domains, including a new one collected for this work on real-world product documentation. Our study explores the largest HPO search space considered to date, with two optimized evaluation metrics. Analysis of the results shows that RAG HPO can be done efficiently, either greedily or with iterative random search, and that it significantly boosts RAG performance for all datasets. For greedy HPO approaches, we show that optimizing models first is preferable to the prevalent practice of optimizing sequentially according to the RAG pipeline order.
△ Less
Submitted 6 May, 2025;
originally announced May 2025.
-
CarGait: Cross-Attention based Re-ranking for Gait recognition
Authors:
Gavriel Habib,
Noa Barzilay,
Or Shimshi,
Rami Ben-Ari,
Nir Darshan
Abstract:
Gait recognition is a computer vision task that identifies individuals based on their walking patterns. Gait recognition performance is commonly evaluated by ranking a gallery of candidates and measuring the accuracy at the top Rank-$K$. Existing models are typically single-staged, i.e. searching for the probe's nearest neighbors in a gallery using a single global feature representation. Although…
▽ More
Gait recognition is a computer vision task that identifies individuals based on their walking patterns. Gait recognition performance is commonly evaluated by ranking a gallery of candidates and measuring the accuracy at the top Rank-$K$. Existing models are typically single-staged, i.e. searching for the probe's nearest neighbors in a gallery using a single global feature representation. Although these models typically excel at retrieving the correct identity within the top-$K$ predictions, they struggle when hard negatives appear in the top short-list, leading to relatively low performance at the highest ranks (e.g., Rank-1). In this paper, we introduce CarGait, a Cross-Attention Re-ranking method for gait recognition, that involves re-ordering the top-$K$ list leveraging the fine-grained correlations between pairs of gait sequences through cross-attention between gait strips. This re-ranking scheme can be adapted to existing single-stage models to enhance their final results. We demonstrate the capabilities of CarGait by extensive experiments on three common gait datasets, Gait3D, GREW, and OU-MVLP, and seven different gait models, showing consistent improvements in Rank-1,5 accuracy, superior results over existing re-ranking methods, and strong baselines.
△ Less
Submitted 5 March, 2025;
originally announced March 2025.
-
Camera Calibration and Stereo via a Single Image of a Spherical Mirror
Authors:
Nissim Barzilay,
Ofek Narinsky,
Michael Werman
Abstract:
This paper presents a novel technique for camera calibration using a single view that incorporates a spherical mirror. Leveraging the distinct characteristics of the sphere's contour visible in the image and its reflections, we showcase the effectiveness of our method in achieving precise calibration. Furthermore, the reflection from the mirrored surface provides additional information about the s…
▽ More
This paper presents a novel technique for camera calibration using a single view that incorporates a spherical mirror. Leveraging the distinct characteristics of the sphere's contour visible in the image and its reflections, we showcase the effectiveness of our method in achieving precise calibration. Furthermore, the reflection from the mirrored surface provides additional information about the surrounding scene beyond the image frame.
Our method paves the way for the development of simple catadioptric stereo systems. We explore the challenges and opportunities associated with employing a single mirrored sphere, highlighting the potential applications of this setup in practical scenarios. The paper delves into the intricacies of the geometry and calibration procedures involved in catadioptric stereo utilizing a spherical mirror.
Experimental results, encompassing both synthetic and real-world data, are presented to illustrate the feasibility and accuracy of our approach.
△ Less
Submitted 24 September, 2024;
originally announced September 2024.
-
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Authors:
David Wadden,
Kejian Shi,
Jacob Morrison,
Aakanksha Naik,
Shruti Singh,
Nitzan Barzilay,
Kyle Lo,
Tom Hope,
Luca Soldaini,
Shannon Zejiang Shen,
Doug Downey,
Hannaneh Hajishirzi,
Arman Cohan
Abstract:
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…
▽ More
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed task specifications, and complex structured outputs. While instruction-following resources are available in specific domains such as clinical medicine and chemistry, SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields. To demonstrate the utility of SciRIFF, we develop a sample-efficient strategy to adapt a general instruction-following model for science by performing additional finetuning on a mix of general-domain and SciRIFF demonstrations. In evaluations on nine held-out scientific tasks, our model -- called SciTulu -- improves over a strong LLM baseline by 28.1% and 6.5% at the 7B and 70B scales respectively, while maintaining general instruction-following performance within 2% of the baseline. We are optimistic that SciRIFF will facilitate the development and evaluation of LLMs to help researchers navigate the ever-growing body of scientific literature. We release our dataset, model checkpoints, and data processing and evaluation code to enable further research.
△ Less
Submitted 19 August, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Watch Where You Head: A View-biased Domain Gap in Gait Recognition and Unsupervised Adaptation
Authors:
Gavriel Habib,
Noa Barzilay,
Or Shimshi,
Rami Ben-Ari,
Nir Darshan
Abstract:
Gait Recognition is a computer vision task aiming to identify people by their walking patterns. Although existing methods often show high performance on specific datasets, they lack the ability to generalize to unseen scenarios. Unsupervised Domain Adaptation (UDA) tries to adapt a model, pre-trained in a supervised manner on a source domain, to an unlabelled target domain. There are only a few wo…
▽ More
Gait Recognition is a computer vision task aiming to identify people by their walking patterns. Although existing methods often show high performance on specific datasets, they lack the ability to generalize to unseen scenarios. Unsupervised Domain Adaptation (UDA) tries to adapt a model, pre-trained in a supervised manner on a source domain, to an unlabelled target domain. There are only a few works on UDA for gait recognition proposing solutions to limited scenarios. In this paper, we reveal a fundamental phenomenon in adaptation of gait recognition models, caused by the bias in the target domain to viewing angle or walking direction. We then suggest a remedy to reduce this bias with a novel triplet selection strategy combined with curriculum learning. To this end, we present Gait Orientation-based method for Unsupervised Domain Adaptation (GOUDA). We provide extensive experiments on four widely-used gait datasets, CASIA-B, OU-MVLP, GREW, and Gait3D, and on three backbones, GaitSet, GaitPart, and GaitGL, justifying the view bias and showing the superiority of our proposed method over prior UDA works.
△ Less
Submitted 10 December, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
MISS GAN: A Multi-IlluStrator Style Generative Adversarial Network for image to illustration translation
Authors:
Noa Barzilay,
Tal Berkovitz Shalev,
Raja Giryes
Abstract:
Unsupervised style transfer that supports diverse input styles using only one trained generator is a challenging and interesting task in computer vision. This paper proposes a Multi-IlluStrator Style Generative Adversarial Network (MISS GAN) that is a multi-style framework for unsupervised image-to-illustration translation, which can generate styled yet content preserving images. The illustrations…
▽ More
Unsupervised style transfer that supports diverse input styles using only one trained generator is a challenging and interesting task in computer vision. This paper proposes a Multi-IlluStrator Style Generative Adversarial Network (MISS GAN) that is a multi-style framework for unsupervised image-to-illustration translation, which can generate styled yet content preserving images. The illustrations dataset is a challenging one since it is comprised of illustrations of seven different illustrators, hence contains diverse styles. Existing methods require to train several generators (as the number of illustrators) to handle the different illustrators' styles, which limits their practical usage, or require to train an image specific network, which ignores the style information provided in other images of the illustrator. MISS GAN is both input image specific and uses the information of other images using only one trained model.
△ Less
Submitted 12 August, 2021;
originally announced August 2021.