Note: SpaceNet’s mission is to accelerate geospatial machine learning and is supported by the SpaceNet member organizations. To learn more visit https://spacenetchallenge.github.io/.

This post is part 3 of a series about the SpaceNet 4: Off-Nadir Dataset and Building Detection Challenge. For the first two parts of the series, click here and here.

The SpaceNet Competition Round 4: Off-Nadir Building Detection Challenge has begun! In a recent post, we described competition baseline models that we built, and outlined some of the challenges we encountered while training models to identify building footprints from the dataset. While evaluating the performance of these models, we observed a very unusual phenomenon: “jagged” plot lines for the SpaceNet IoU F1 Score when stratifying evaluation data by look angle. Strangely, building prediction in images taken at nearly identical look angles — for example, 29 and 30 degrees — produced radically different performance scores. The graph depicting this phenomenon is reproduced below.

Building detection performance vs. evaluation angle for SpaceNet Round 4 baseline models. For more details about the baseline, click here. The black dashed line represents the threshold to transition from “nadir” to “off-nadir”, and the red dashed line represents the transition to “very off-nadir”.

So, what is happening? Why does imagery acquired at such similar angles produce such different predictions? A first hint appeared when we overlaid imagery and predictions at one chip location from two different collects:

Predictions overlaid on top of imagery for two different collects taken of the same location. Building detection for the collect containing one of these images achieved an F1 score of >0.5, while the other achieved <0.2. One of these images comes from a 29 degree off-nadir collect; the other is 30 degrees off-nadir.

Both predictions find the roofs of buildings in the imagery quite well, but the roofs move! What’s wrong? To find out, let’s overlay both of the building predictions with the manually produced “ground truth” building labels for the same area:

Manually generated building labels (gray) for the image above, overlaid with predictions based on the 29 degree (pink) and 30 degree (green) nadir angle collects. Neither set of predictions is perfect, but one clearly matches the manual labels better than the other.

As you can see, only one of the predicted label sets matches the manually labeled building footprints well. Interestingly, the manual labeling was done on the 7 degree imagery (the closest collect to nadir in the dataset), which doesn’t match either of these images perfectly. This suggests that the model is learning to account for look angle in one of these images but not the other: the predictions from the other image are geospatially shifted by 10–15 pixels, which corresponds to roughly 5–7.5 meters on the ground. Why would this be?

To understand the problem, we need to understand two statistics related to remote sensing collects: the look (nadir) angle and the target azimuth angle.

Understanding Look Angle and Target Azimuth Angle

There are two important angles to consider when analyzing off-nadir remote sensing data: the look angle, or how far from directly overhead the satellite was when it imaged the target, and the target azimuth angle, which is effectively the compass direction (relative to North) in which the satellite points to view its target. See the schematic below.

Even if two collects are taken at the exact same look angle, the images they acquire can differ if they were taken from different target azimuth angles. Take the case below as an example:

Building roofs will project to very different positions on the ground in imagery acquired at different target azimuth angles, even if the look angle is the same.

Here, two satellite collects are taken at the same nadir angle, but from opposite sides of a building (target azimuth angles 180° apart). The roof of the building will be projected to very different ground locations in these images! The same phenomenon occurs in the satellite imagery shown above. Notice that the height of a structure dictates how much the projection will deviate: the parking lot surrounding the buildings in the bottom of the imagery stays in the exact same place between the two different images, even as the roof shifts. The taller the building, the more the geospatial location of its roof will be distorted as azimuth angle varies. The effect of this distortion will also be amplified as nadir angle increases. Models designed to geolocate objects in off-nadir imagery will need to account for this phenomenon.
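To make the geometry concrete, here is a minimal sketch of how far a roof appears to move between two collects. It is not part of the SpaceNet tooling: the function names are hypothetical, terrain is treated as flat, the look angle stands in for the incidence angle at the ground, and the target azimuth is assumed to be the compass bearing from the target toward the satellite. The building height is an illustrative value, not a measurement from the dataset.

```python
import math


def roof_offset_m(height_m, look_angle_deg, target_azimuth_deg):
    """Approximate (east, north) offset in meters of a roof from its footprint.

    Flat-terrain assumption: the roof appears displaced by
    height * tan(look angle), directly away from the sensor.
    Assumes the azimuth is the bearing from target to satellite,
    measured clockwise from North.
    """
    shift = height_m * math.tan(math.radians(look_angle_deg))
    away = math.radians(target_azimuth_deg + 180.0)  # direction away from the sensor
    return shift * math.sin(away), shift * math.cos(away)


def offset_between_collects_m(height_m, collect_a, collect_b):
    """Distance between apparent roof positions in two (look, azimuth) collects."""
    east_a, north_a = roof_offset_m(height_m, *collect_a)
    east_b, north_b = roof_offset_m(height_m, *collect_b)
    return math.hypot(east_a - east_b, north_a - north_b)


# Illustrative only: a 5 m tall building seen at a 30 degree look angle
# from opposite sides (azimuths 180 degrees apart).
separation_m = offset_between_collects_m(5.0, (30.0, 0.0), (30.0, 180.0))
print(f"apparent roof separation: {separation_m:.2f} m")
```

Under these assumptions the separation is twice the height times the tangent of the look angle, so it grows with both building height and look angle, while anything at ground level (height zero) does not move at all. For the illustrative 5 m building this comes to a little under 6 m, the same order as the shift seen in the predictions above.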

Look Angle, Target Azimuth Angle, and the SpaceNet 4 Dataset

For the SpaceNet 4 dataset, we can use the target azimuth angle to determine where the satellite was with respect to Atlanta during imaging. Based on the look and target azimuth angles, we can put together the following depiction of where the collects took place:

Locations from which each collect was taken to generate the SpaceNet 4 Off-Nadir dataset. This not-to-scale representation is simplified: in reality, the satellite did not pass directly over Atlanta, but nearby. See this paper and the dataset metadata for additional details.

There are two key things to note here:

1. There are far more collects taken from the South side of Atlanta than from the North.

2. All of the collects with reduced prediction scores were taken from the North.

Having determined this, we re-plotted model performance using negative values for look angles at the beginning of the series of collects (imagery taken from the North), and positive values for look angles after the satellite had passed over Atlanta. The results are striking: rather than the jagged performance line observed earlier, we see an asymmetric peak in performance.
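The re-plotting step can be sketched roughly as follows. This is an illustration rather than the code used for the figure: `signed_look_angle` is a hypothetical helper, the collect values are made up, and the azimuth convention (bearing from the target toward the satellite, clockwise from North) is an assumption. If the metadata you are working with uses the opposite convention, flip the comparison.

```python
import math


def signed_look_angle(look_angle_deg, target_azimuth_deg):
    """Return the look angle, negated when the satellite was north of the target.

    Assumes the azimuth is the compass bearing from the target toward the
    satellite, so a positive cosine means the satellite was on the north side.
    """
    north_component = math.cos(math.radians(target_azimuth_deg))
    return -look_angle_deg if north_component > 0 else look_angle_deg


# Hypothetical (look angle, target azimuth) pairs, not real SpaceNet 4 metadata.
collects = [(30.0, 170.0), (29.0, 20.0), (8.0, 190.0), (13.0, 10.0)]

# Signed angles give one continuous x-axis running from north-side to south-side collects.
x_axis = sorted(signed_look_angle(look, az) for look, az in collects)
print(x_axis)
```

With the signed values on the x-axis, two collects at nearly the same look angle but on opposite sides of the city land far apart on the plot instead of next to each other.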

Model evaluation performance versus look angle, with the collects taken on opposite sides of the city separated. Black dashed lines represent the threshold to transition from “nadir” to “off-nadir”, and the red dashed line represents the transition to “very off-nadir”.

Even though the off-nadir training dataset includes collects from the “negative” look angle range, the off-nadir-trained model detects buildings very poorly from images in those same collects. All of this goes to show that a single, generalized model will not necessarily perform well across all remote sensing data, even if it is imagery of the exact same location and nearly the same look angle, taken less than 5 minutes apart, merely at a different target azimuth angle! Critical features of the imagery, such as sunlight reflection off of structures and the appearance of shadows, can change dramatically as target azimuth angle varies. These issues must be considered when developing models that utilize off-nadir imagery.