Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

A, Sanayya; Shetty, Amoolya; Sharma, Abhijeet; Ravichandran, Venkatesh; Gosuvarapalli, Masthan Wali; Jain, Sarthak; Nanjundiah, Priyamvada; Dutta, Ujjal Kr; Sharma, Divya

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.11807 (cs)

[Submitted on 14 Mar 2025]

Title:Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

Authors:Sanayya A, Amoolya Shetty, Abhijeet Sharma, Venkatesh Ravichandran, Masthan Wali Gosuvarapalli, Sarthak Jain, Priyamvada Nanjundiah, Ujjal Kr Dutta, Divya Sharma

View PDF HTML (experimental)

Abstract:In agricultural management, precise Ground Truth (GT) data is crucial for accurate Machine Learning (ML) based crop classification. Yet, issues like crop mislabeling and incorrect land identification are common. We propose a multi-level GT cleaning framework while utilizing multi-temporal Sentinel-2 data to address these issues. Specifically, this framework utilizes generating embeddings for farmland, clustering similar crop profiles, and identification of outliers indicating GT errors. We validated clusters with False Colour Composite (FCC) checks and used distance-based metrics to scale and automate this verification process. The importance of cleaning the GT data became apparent when the models were trained on the clean and unclean data. For instance, when we trained a Random Forest model with the clean GT data, we achieved upto 70\% absolute percentage points higher for the F1 score metric. This approach advances crop classification methodologies, with potential for applications towards improving loan underwriting and agricultural decision-making.

Comments:	Accepted In IEEE India Geoscience and Remote Sensing Symposium (InGARSS) 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2503.11807 [cs.CV]
	(or arXiv:2503.11807v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.11807

Submission history

From: Ujjal Kr Dutta [view email]
[v1] Fri, 14 Mar 2025 18:50:30 UTC (1,216 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators