
SAM 2: Segment Anything in Images and Videos

Authors: Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer

Abstract: We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date. Our model is a simple transformer architecture with streaming memory for real-time video processing. SAM 2 trained on our data provides strong performance across a wide range of tasks. In video segmentation, we observe better accuracy, using 3x fewer interactions than prior approaches. In image segmentation, our model is more accurate and 6x faster than the Segment Anything Model (SAM). We believe that our data, model, and insights will serve as a significant milestone for video segmentation and related perception tasks. We are releasing our main model, dataset, as well as code for model training and our demo.

Submitted 28 October, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

Comments: Website: https://ai.meta.com/sam2

arXiv:2309.00035 [pdf, other]

FACET: Fairness in Computer Vision Evaluation Benchmark

Authors: Laura Gustafson, Chloe Rolland, Nikhila Ravi, Quentin Duval, Aaron Adcock, Cheng-Yang Fu, Melissa Hall, Candace Ross

Abstract: Computer vision models have known performance disparities across attributes such as gender and skin tone. This means during tasks such as classification and detection, model performance differs for certain classes based on the demographics of the people in the image. These disparities have been shown to exist, but until now there has not been a unified approach to measure these differences for common use-cases of computer vision models. We present a new benchmark named FACET (FAirness in Computer Vision EvaluaTion), a large, publicly available evaluation set of 32k images for some of the most common vision tasks - image classification, object detection and segmentation. For every image in FACET, we hired expert reviewers to manually annotate person-related attributes such as perceived skin tone and hair type, manually draw bounding boxes and label fine-grained person-related classes such as disk jockey or guitarist. In addition, we use FACET to benchmark state-of-the-art vision models and present a deeper understanding of potential performance disparities and challenges across sensitive demographic attributes. With the exhaustive annotations collected, we probe models using single demographics attributes as well as multiple attributes using an intersectional approach (e.g. hair color and perceived skin tone). Our results show that classification, detection, segmentation, and visual grounding models exhibit performance disparities across demographic attributes and intersections of attributes. These harms suggest that not all people represented in datasets receive fair and equitable treatment in these vision tasks. We hope current and future results using our benchmark will contribute to fairer, more robust vision models. FACET is available publicly at https://facet.metademolab.com/

Submitted 31 August, 2023; originally announced September 2023.

arXiv:2304.02643 [pdf, other]

Segment Anything

Authors: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick

Abstract: We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive -- often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images at https://segment-anything.com to foster research into foundation models for computer vision.

Submitted 5 April, 2023; originally announced April 2023.

Comments: Project web-page: https://segment-anything.com

arXiv:2111.09604 [pdf, other]

doi 10.1103/PhysRevX.12.021006

Emission of photon multiplets by a dc-biased superconducting circuit

Authors: G. C. Ménard, A. Peugeot, C. Padurariu, C. Rolland, B. Kubala, Y. Mukharsky, Z. Iftikhar, C. Altimiras, P. Roche, H. le Sueur, P. Joyez, D. Vion, D. Esteve, J. Ankerhold, F. Portier

Abstract: We observe the emission of bunches of $k \geqslant 1$ photons by a circuit made of a microwave resonator in series with a voltage-biased tunable Josephson junction. The bunches are emitted at specific values $V_k$ of the bias voltage, for which each Cooper pair tunneling across the junction creates exactly $k$ photons in the resonator. The latter is a micro-fabricated spiral coil which resonates and leaks photons at 4.4 GHz in a measurement line. Its characteristic impedance of 1.97 k$\Omega$ is high enough to reach a strong junction-resonator coupling and a bright emission of the $k$-photon bunches. We show that a RWA treatment of the system accounts quantitatively for the observed radiation intensity, from $k=1$ to $6$, and over three orders of magnitude when varying the Josephson energy $E_J$. We also measure the second order correlation function of the radiated microwave to determine its Fano factor $F_k$, which in the low $E_J$ limit, confirms with $F_k=k$ the emission of $k$ photon bunches. At larger $E_J$, a more complex behavior is observed in quantitative agreement with numerical simulations.

Submitted 8 March, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

Journal ref: Phys. Rev. X 12, 021006 (2022)

arXiv:1810.06217 [pdf, other]

doi 10.1103/PhysRevLett.122.186804

Antibunched photons emitted by a dc biased Josephson junction

Authors: C. Rolland, A. Peugeot, S. Dambach, M. Westig, B. Kubala, Y. Mukharsky, C. Altimiras, H. le Sueur, P. Joyez, D. Vion, P. Roche, D. Esteve, J. Ankerhold, F. Portier

Abstract: We show experimentally that a dc biased Josephson junction in series with a high-enough impedance microwave resonator emits antibunched photons. Our resonator is made of a simple micro-fabricated spiral coil that resonates at 4.4 GHz and reaches a 1.97 k$\Omega$ characteristic impedance. The second order correlation function of the power leaking out of the resonator drops down to 0.3 at zero delay, which demonstrates the antibunching of the photons emitted by the circuit at a rate of $6 \times 10^7$ photons per second. Results are found in quantitative agreement with our theoretical predictions. This simple scheme could offer an efficient and bright single-photon source in the microwave domain.

Submitted 13 May, 2019; v1 submitted 15 October, 2018; originally announced October 2018.

Journal ref: Phys. Rev. Lett. 122, 186804 (2019)

arXiv:1307.2058 [pdf, other]

doi 10.1063/1.4809576

Inhomogeneous Si-doping of gold-seeded InAs nanowires grown by molecular beam epitaxy

Authors: Chloé Rolland, Philippe Caroff, Christophe Coinon, Xavier Wallart, Renaud Leturcq

Abstract: We have investigated in-situ Si doping of InAs nanowires grown by molecular beam epitaxy from gold seeds. The effectiveness of n-type doping is confirmed by electrical measurements showing an increase of the electron density with the Si flux. We also observe an increase of the electron density along the nanowires from the tip to the base, attributed to the dopant incorporation on the nanowire facets whereas no detectable incorporation occurs through the seed. Furthermore the Si incorporation strongly influences the lateral growth of the nanowires without giving rise to significant tapering, revealing the complex interplay between axial and lateral growth.

Submitted 8 July, 2013; originally announced July 2013.

Comments: 5 pages, 3 figures

Journal ref: Appl. Phys. Lett. 102, 223105 (2013)

arXiv:0911.0430 [pdf]

Enhancing the Guidance of the Intentional Model "MAP": Graph Theory Application

Authors: Rebecca Deneckere, Elena Kornyshova, Colette Rolland

Abstract: The MAP model was introduced in information system engineering in order to model processes in a flexible way. The intentional level of this model helps an engineer to execute a process with a strong relationship to the situation of the project at hand. In the literature, attempts at a practical use of maps are not numerous. Our aim is to enhance the guidance mechanisms of the process execution by reusing graph algorithms. After clarifying the existing relationship between graphs and maps, we improve the MAP model by adding qualitative criteria. We then offer a way to express maps with graphs and propose to use graph theory algorithms to offer an automatic guidance of the map. We illustrate our proposal by an example and discuss its limitations.

Submitted 2 November, 2009; originally announced November 2009.

Comments: 9 pages

Journal ref: Research challenges in Information Systems, Fes : Morocco (2009)

Showing 1–7 of 7 results for author: Rolland, C