SSE: Multimodal Semantic Data Selection and Enrichment for Industrial-scale Data Assimilation

Shen, Maying; Chang, Nadine; Liu, Sifei; Alvarez, Jose M.

arXiv:2409.13860 (cs)

[Submitted on 20 Sep 2024]

Abstract: In recent years, the volume of data collected for artificial intelligence has grown beyond what can be managed. Particularly in industrial applications such as autonomous vehicles, model training computation budgets are being exceeded while model performance is saturating, and yet more data continues to pour in. To navigate this flood of data, we propose a framework that selects the most semantically diverse and important portion of a dataset. We then semantically enrich that portion by discovering meaningful new data in a massive unlabeled data pool. Importantly, we provide explainability by leveraging foundation models to generate semantics for every data point. We show quantitatively that our Semantic Selection and Enrichment framework (SSE) can a) maintain model performance with a smaller training dataset and b) improve model performance by enriching the smaller dataset without exceeding the original dataset size. Consequently, we demonstrate that semantic diversity is imperative for optimal data selection and model performance.
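
The abstract does not specify the selection or enrichment algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each data point already has a non-zero semantic embedding vector (for example, one produced by a foundation model), and it swaps in greedy farthest-point selection under cosine distance as a stand-in for "semantic diversity". The function names `select_diverse` and `enrich` and the parameters `budget` and `extra` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_diverse(embeddings, budget):
    """Greedy farthest-point selection (an assumed stand-in for SSE selection).

    Starts from item 0 and repeatedly adds the item whose distance to its
    nearest already-selected item is largest. Returns selected indices.
    """
    if not embeddings or budget <= 0:
        return []
    selected = [0]
    # min_dist[i] = distance from item i to its nearest selected item
    min_dist = [cosine_distance(e, embeddings[0]) for e in embeddings]
    while len(selected) < min(budget, len(embeddings)):
        nxt = max(range(len(embeddings)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], cosine_distance(e, embeddings[nxt]))
    return selected


def enrich(selected_embs, pool_embs, extra):
    """Pick up to `extra` pool items that are most novel relative to the kept set.

    Novelty is the distance to the nearest kept embedding (hypothetical
    criterion). Returns indices into `pool_embs`.
    """
    kept = list(selected_embs)
    candidates = list(range(len(pool_embs)))
    chosen = []
    for _ in range(min(extra, len(pool_embs))):
        best = max(
            candidates,
            key=lambda i: min(
                (cosine_distance(pool_embs[i], k) for k in kept),
                default=float("inf"),
            ),
        )
        chosen.append(best)
        kept.append(pool_embs[best])
        candidates.remove(best)
    return chosen
```

In this sketch, keeping the sum of `budget` and `extra` at or below the original dataset size mirrors the abstract's constraint that the enriched dataset does not exceed the original one.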

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2409.13860 [cs.CV]
	(or arXiv:2409.13860v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.13860

Submission history

From: Nadine Chang
[v1] Fri, 20 Sep 2024 19:17:52 UTC (11,892 KB)
