Notes on Applicability of GPT-4 to Document Understanding

Borchmann, Łukasz

arXiv:2405.18433 (cs)

[Submitted on 28 May 2024]

Abstract: We perform a so-far missing, reproducible evaluation of all publicly available GPT-4 family models in the field of Document Understanding, where it is frequently necessary to comprehend the spatial arrangement of text and visual cues in addition to textual semantics. Benchmark results indicate that although it is hard to achieve satisfactory results with text-only models, GPT-4 Vision Turbo performs well when given both text recognized by an external OCR engine and document images as input. The evaluation is followed by analyses that suggest possible contamination of textual GPT-4 models and indicate a significant performance drop for lengthy documents.

Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2405.18433 [cs.CL]
(or arXiv:2405.18433v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2405.18433

Submission history

From: Lukasz Borchmann
[v1] Tue, 28 May 2024 17:59:53 UTC (7,670 KB)
