As part of a larger contract with a client, I developed an image segmentation method to detect shadows on landscape imagery, primarily in mountainous regions. This particular task presented two primary challenges:
- There are only a few datasets available to train a shadow segmentation network.
- None of the available datasets contained mountain imagery.
Tackling the first point, the SBU Shadow dataset was used to train an initial shadow detection model. This dataset contains numerous images with pixel-level labels of shadows. While this dataset is useful, its shadows appear only on flat surfaces and are quite distinct from their surroundings, as in the example below:
An initial UNet model was trained on this dataset. While our results on a holdout evaluation set from SBU were decent, we had no way to measure performance on mountain imagery.
To address this issue, we used imagery from geopose3k. Geopose3k contains over three thousand precise camera poses of mountain landscape images. Several methods for automatically producing labeled images were attempted, but none proved effective. Instead, pixel-level shadow annotations were applied manually to the original pose imagery using sparse annotation, in which only part of each image is labeled. While these labels were not extremely accurate, they were fast to produce. Around 150 annotated images were created.
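Sparse annotation leaves most pixels without a label, so any loss computed on these images has to skip the unlabeled pixels. The sketch below shows one common way to do that with a masked binary cross-entropy, written in NumPy for clarity. The `IGNORE = -1` marker, the function name, and the choice of loss are assumptions made for illustration; the write-up does not say how the sparse labels were encoded or which loss was used.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for pixels the annotator left unlabeled


def masked_bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy computed over labeled pixels only.

    probs  -- predicted shadow probabilities in [0, 1], any shape
    labels -- same shape; 1 = shadow, 0 = not shadow, IGNORE = unlabeled
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    labels = np.asarray(labels)

    labeled = labels != IGNORE  # boolean mask of annotated pixels
    if not labeled.any():
        return 0.0  # nothing to learn from in this image

    y = labels[labeled].astype(np.float64)
    p = probs[labeled]
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())
```

Because unlabeled pixels are dropped before averaging, predictions in those regions neither help nor hurt the loss, which is what makes quick, partial annotations usable as a training signal.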
Using this new labeled dataset, we were now able to evaluate the model trained on SBU Shadow on mountain imagery. The segmentation model trained on SBU Shadow was then further trained on just the sparsely labeled data. This produced more accurate results overall at test time.
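Evaluating against sparse labels raises the same issue as training on them: only annotated pixels can count toward the score. As a minimal sketch, the function below computes intersection-over-union (IoU) for the shadow class restricted to labeled pixels. IoU is used here only as a familiar segmentation metric; the write-up does not state which metric was reported, and the `IGNORE = -1` convention and function name are hypothetical.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for pixels the annotator left unlabeled


def shadow_iou(pred_mask, labels):
    """IoU of the shadow class, restricted to labeled pixels.

    pred_mask -- binary prediction (truthy = shadow), any shape
    labels    -- same shape; 1 = shadow, 0 = not shadow, IGNORE = unlabeled
    """
    pred = np.asarray(pred_mask).astype(bool)
    labels = np.asarray(labels)

    labeled = labels != IGNORE
    truth = labels == 1

    # Count overlap and union only where an annotation exists.
    intersection = np.logical_and(pred, truth)[labeled].sum()
    union = np.logical_or(pred, truth)[labeled].sum()

    if union == 0:
        return 1.0  # no shadow predicted or annotated: treat as perfect
    return float(intersection / union)
```

For example, with a prediction of `[[1, 1], [0, 1]]` and labels `[[1, 0], [-1, 1]]`, three pixels are labeled, two of them are correctly predicted as shadow, and one is a false positive, giving an IoU of 2/3.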
To provide even better results, hyperparameter tuning was performed with Ax. Ax is a hyperparameter search tool that uses Bayesian and bandit optimization to intelligently search for the best hyperparameters. This process helped to provide more stable training and better overall results.
Quantitative results on an evaluation dataset are shown in the table above. Together, these changes gave us a roughly 12% improvement on a small holdout of the sparsely labeled geopose3k images. Some cleaner qualitative results can be seen below: