FishEye8K: A Benchmark and Dataset for Fisheye Camera Object Detection

Gochoo, Munkhjargal; Otgonbold, Munkh-Erdene; Ganbold, Erkhembayar; Hsieh, Jun-Wei; Chang, Ming-Ching; Chen, Ping-Yang; Dorj, Byambaa; Jassmi, Hamad Al; Batnasan, Ganzorig; Alnajjar, Fady; Abduljabbar, Mohammed; Lin, Fang-Pang

arXiv:2305.17449 (cs)

[Submitted on 27 May 2023 (v1), last revised 6 Jun 2023 (this version, v2)]

Abstract: With the advance of AI, road object detection has become a prominent topic in computer vision, with most work using perspective cameras. A fisheye lens provides omnidirectional, wide coverage, so fewer cameras are needed to monitor a road intersection, but it introduces view distortions. To our knowledge, there is no existing open dataset prepared for traffic surveillance with fisheye cameras. This paper introduces FishEye8K, an open benchmark dataset for road object detection that comprises 157K bounding boxes across five classes (Pedestrian, Bike, Car, Bus, and Truck). In addition, we present benchmark results for state-of-the-art (SoTA) models, including variations of YOLOv5, YOLOR, YOLOv7, and YOLOv8. The dataset comprises 8,000 images recorded in 22 videos using 18 fisheye cameras for traffic monitoring in Hsinchu, Taiwan, at resolutions of 1080$\times$1080 and 1280$\times$1280. Data annotation and validation were arduous and time-consuming because the ultra-wide panoramic and hemispherical fisheye images have large distortion and contain numerous road participants, particularly people riding scooters. To avoid bias, all frames from a given camera were assigned to either the training set or the test set, maintaining a ratio of about 70:30 for both the number of images and the number of bounding boxes in each class. Experimental results show that YOLOv8 and YOLOR achieve the best results at input sizes of 640$\times$640 and 1280$\times$1280, respectively. The dataset will be available on GitHub in PASCAL VOC, MS COCO, and YOLO annotation formats. The FishEye8K benchmark will contribute to fisheye video analytics and smart city applications.
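
The camera-level split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names (`choose_test_cameras`, `split_by_camera`), the greedy selection rule, and the toy frame counts are all assumptions made for illustration, and the sketch balances only the number of frames, whereas the paper also balances bounding boxes per class.

```python
from collections import Counter


def choose_test_cameras(frames, test_fraction=0.3):
    """Greedily pick whole cameras for the test set (hypothetical helper).

    frames: list of (frame_id, camera_id) pairs.
    Cameras are visited from largest to smallest, and a camera is added
    only if the test set stays at or below the target frame count.
    """
    counts = Counter(camera for _, camera in frames)
    target = test_fraction * sum(counts.values())
    test_cameras, test_total = set(), 0
    for camera, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if test_total + n <= target:
            test_cameras.add(camera)
            test_total += n
    return test_cameras


def split_by_camera(frames, test_cameras):
    """Send every frame of a camera to exactly one side of the split."""
    train = [fid for fid, cam in frames if cam not in test_cameras]
    test = [fid for fid, cam in frames if cam in test_cameras]
    return train, test


# Toy data (made up): four cameras with 40, 28, 20, and 12 frames.
frames = [
    (f"cam{c}_{i:03d}", f"cam{c}")
    for c, n in [(1, 40), (2, 28), (3, 20), (4, 12)]
    for i in range(n)
]
test_cameras = choose_test_cameras(frames)
train_ids, test_ids = split_by_camera(frames, test_cameras)
print(sorted(test_cameras), len(train_ids), len(test_ids))
```

Because assignment happens per camera rather than per frame, no camera viewpoint appears on both sides of the split, which is the property the abstract relies on to avoid bias.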

Comments:	CVPR Workshops 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2305.17449 [cs.CV]
	(or arXiv:2305.17449v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2305.17449

Submission history

From: Munkhjargal Gochoo
[v1] Sat, 27 May 2023 11:26:25 UTC (10,956 KB)
[v2] Tue, 6 Jun 2023 07:02:32 UTC (10,957 KB)
