Multi-stream CNN based Video Semantic Segmentation for Automated Driving

Sistu, Ganesh; Chennupati, Sumanth; Yogamani, Senthil

doi:10.5220/0007248401730180

arXiv:1901.02511 (cs)

[Submitted on 8 Jan 2019]


Abstract: The majority of semantic segmentation algorithms operate on a single frame, even when the input is video. In this work, the goal is to exploit temporal information within the model to leverage motion cues and temporal consistency. We propose two simple high-level architectures based on Recurrent FCN (RFCN) and Multi-Stream FCN (MSFCN) networks. In RFCN, a recurrent network, namely an LSTM, is inserted between the encoder and decoder. MSFCN combines the encoders of different frames into a fused encoder via 1x1 channel-wise convolution. We use a ResNet50 network as the baseline encoder and construct three networks: MSFCN of order 2 and 3, and RFCN of order 2. MSFCN-3 produces the best results, with accuracy improvements of 9% and 15% for the Highway and New York-like city scenarios in the SYNTHIA-CVPR'16 dataset, measured by mean IoU. MSFCN-3 also produced improvements of 11% and 6% on the SegTrack V2 and DAVIS datasets over the baseline FCN network. We also designed efficient versions of MSFCN-2 and RFCN-2 that share weights between the two encoders. The efficient MSFCN-2 provided improvements of 11% and 5% for KITTI and SYNTHIA with a negligible increase in computational complexity compared to the baseline version.
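The multi-stream fusion step described in the abstract can be pictured with a small sketch. This is not the authors' implementation. It assumes that per-frame encoder outputs are concatenated along the channel axis and then mixed by a 1x1 convolution, which is a per-pixel linear map across channels. The function names (`concat_channels`, `conv1x1`, `msfcn_fuse`), the nested-list tensor layout, and the weights are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names, shapes, and weights are assumptions,
# not taken from the paper. A feature map is a list of channels, and each
# channel is an H x W nested list of floats.

def concat_channels(frame_features):
    """Concatenate the feature maps of several frames along the channel axis."""
    fused = []
    for features in frame_features:
        fused.extend(features)
    return fused


def conv1x1(features, weights, bias=None):
    """Apply a 1x1 convolution: a per-pixel linear map across channels.

    features: list of C_in channels, each H x W.
    weights:  list of C_out rows, each holding C_in coefficients.
    """
    c_in = len(features)
    height, width = len(features[0]), len(features[0][0])
    if any(len(row) != c_in for row in weights):
        raise ValueError("each weight row needs one coefficient per input channel")
    if bias is None:
        bias = [0.0] * len(weights)
    out = []
    for row, b in zip(weights, bias):
        # Each output channel is a weighted sum of all input channels at that pixel.
        channel = [
            [b + sum(w * features[c][y][x] for c, w in enumerate(row))
             for x in range(width)]
            for y in range(height)
        ]
        out.append(channel)
    return out


def msfcn_fuse(frame_features, weights, bias=None):
    """Fuse encoder outputs from several frames into one feature map."""
    return conv1x1(concat_channels(frame_features), weights, bias)


# Two frames, one 2x2 channel each; equal weights average the two frames.
previous_frame = [[[1.0, 2.0], [3.0, 4.0]]]
current_frame = [[[3.0, 4.0], [5.0, 6.0]]]
fused = msfcn_fuse([previous_frame, current_frame], [[0.5, 0.5]])
```

In a real network the weights would be learned and the feature maps would be tensors produced by an encoder such as ResNet50. The nested lists here only make the channel mixing explicit.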

Comments:	Accepted for Oral Presentation at VISAPP 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1901.02511 [cs.CV]
	(or arXiv:1901.02511v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1901.02511
Related DOI:	https://doi.org/10.5220/0007248401730180

Submission history

From: Sumanth Chennupati
[v1] Tue, 8 Jan 2019 20:45:49 UTC (5,399 KB)
