Global patterns of gully occurrence

This dataset accompanies the manuscript "Global Patterns of Gully Occurrence and their Sensitivity to Environmental Changes". It provides the first global assessment of gully head density (GHD) and gully susceptibility at a resolution of 1 km², based on a machine learning framework.
Registration is requested: No
Year: 2026

Title: Global Patterns of Gully Occurrence and their Sensitivity to Environmental Changes
Description: Gully formation is a significant driver of soil erosion and land degradation worldwide and often has important downstream impacts. Nonetheless, our understanding of its global patterns and controlling factors remains limited. The maps presented here were derived from a large global database of gully observations. In total, 17,499 representative 1 km × 1 km study sites were mapped using high-resolution Google Earth imagery. At each site, the number of visible gully heads was assessed in a semi-quantitative way, yielding a probabilistic estimate of gully head density (GHD). Environmental predictor variables describing topography, geomorphology, soil, land use/cover, and climate were extracted from global geospatial datasets. A final set of 26 features was selected through recursive feature elimination and correlation analysis. Using these observations and predictors, Random Forest (RF) models were trained and validated. An ensemble of 100 RF regression models was used to simulate global GHD patterns, while an ensemble of 100 RF classification models simulated gully susceptibility (the probability of at least one gully head per km²). Validation against independent test data confirmed that the models robustly capture regional patterns of gully occurrence. The datasets provided here are the outputs of these ensembles, representing average predictions and associated uncertainty. They can be used to identify global hotspots of gully erosion and to distinguish regions where gully occurrence is mainly driven by natural factors from those strongly influenced by land use and land cover.
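The relation between an ensemble of model outputs and the average and range layers described here can be sketched as follows. This is a minimal illustration on synthetic numbers, not the authors' processing code: the function name `summarize_ensemble` and the toy array are assumptions, and nodata handling is left out.

```python
import numpy as np


def summarize_ensemble(predictions):
    """Collapse per-model maps, shape (n_models, rows, cols), into an
    average map and a (maximum - minimum) range map."""
    stack = np.asarray(predictions, dtype=float)
    average = stack.mean(axis=0)
    value_range = stack.max(axis=0) - stack.min(axis=0)
    return average, value_range


# Synthetic stand-in for 100 model outputs on a tiny 2 x 3 grid (not real data).
rng = np.random.default_rng(seed=0)
toy_predictions = rng.uniform(0.0, 5.0, size=(100, 2, 3))

ghd_average, ghd_range = summarize_ensemble(toy_predictions)
print(ghd_average.shape, ghd_range.shape)
```

A wide range at a pixel means the individual models disagree there, so the average at that pixel deserves less confidence.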

The following datasets are available (figure numbers refer to the cited publication):

  • Featurelist.xlsx: List of the 26 predictor variables selected for the Random Forest ensemble, with explanations of their source and meaning.
  • Fig1b_GHD_Average.tif: GeoTIFF of the global simulated gully head density map (average of 100 RF models). Corresponds to Fig. 1b in the article.
  • Fig1c_Susceptibility_Average.tif: GeoTIFF of the global simulated gully susceptibility map (average of 100 RF models). Corresponds to Fig. 1c in the article.
  • Fig4a_Map_DominantFeature.tif: GeoTIFF underlying Fig. 4a in the article. Each pixel indicates the environmental feature group (topography, soil, land use/cover, or climate) that had the strongest influence on the GHD prediction. See Featurelist.xlsx for details.
  • Fig4b_GHD_Average_LU_dominant_Positive_OthersDominant_Negative.tif: GeoTIFF underlying Fig. 4b in the article. Positive values indicate estimated GHD where land use/land cover features had a dominant effect. Negative values indicate GHD where other environmental factors were dominant.
  • FigSI5_GHD_MaxRange.tif: GeoTIFF underlying Supplementary Fig. SI.5. Shows the prediction range (maximum – minimum) in estimated GHD values across the 100 RF models, providing an indication of model uncertainty.
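Because the Fig. 4b raster packs two pieces of information into the sign of each value, it may help to separate them before analysis. The sketch below assumes the raster has already been read into a NumPy array (for example with a raster library such as rasterio). The function name, the optional `nodata` argument, and the treatment of exact zeros (assigned to neither group) are illustrative assumptions, since the file description does not specify them.

```python
import numpy as np


def split_signed_ghd(signed_ghd, nodata=None):
    """Split a sign-encoded GHD array into magnitude and dominance masks.

    Positive values: land use/land cover features dominant.
    Negative values: other environmental factors dominant.
    Zeros are assigned to neither mask (an assumption, not from the source).
    """
    values = np.asarray(signed_ghd, dtype=float)
    valid = np.isfinite(values)
    if nodata is not None:
        # Hypothetical nodata value; check the GeoTIFF metadata for the real one.
        valid &= values != nodata
    land_use_dominant = valid & (values > 0)
    others_dominant = valid & (values < 0)
    ghd = np.where(valid, np.abs(values), np.nan)
    return ghd, land_use_dominant, others_dominant


# Toy array, not real data.
toy = np.array([[1.5, -0.4], [0.0, np.nan]])
ghd, lu_mask, other_mask = split_signed_ghd(toy)
print(ghd)
print(lu_mask)
print(other_mask)
```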

Spatial coverage: Global
Pixel size: 1 km x 1 km
Projection: EPSG:4326 - WGS 84

Reference: Chen, Y., De Geeter, S., Poesen, J., Matthews, F., Campforts, B., Borrelli, P., Panagos, P., Vanmaercke, M. (2025). Global Patterns of Gully Occurrence and their Sensitivity to Environmental Changes. International Soil and Water Conservation Research. DOI: 10.1016/j.iswcr.2025.09.004


 

Note: The datasets are intended for global- to regional-scale analyses. Predictions at local scale should be interpreted with caution due to inherent uncertainties in the modelling framework.
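One possible way to follow this guidance is to aggregate the 1 km pixels into coarser blocks before interpreting the values. The sketch below is an illustration only: `block_mean` and the block size are hypothetical choices, and the unweighted mean ignores the fact that, on a grid that is regular in degrees, cell area varies with latitude.

```python
import numpy as np


def block_mean(grid, factor):
    """Average a 2-D array over non-overlapping factor x factor blocks,
    ignoring NaN cells. Trailing rows/columns that do not fill a whole
    block are dropped."""
    values = np.asarray(grid, dtype=float)
    rows, cols = values.shape
    rows_trim = rows - rows % factor
    cols_trim = cols - cols % factor
    trimmed = values[:rows_trim, :cols_trim]
    blocks = trimmed.reshape(rows_trim // factor, factor,
                             cols_trim // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))


# Toy 4 x 4 grid (not real data), aggregated to 2 x 2 blocks.
toy_grid = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_mean(toy_grid, factor=2)
print(coarse)
```

Blocks that contain only NaN cells come out as NaN (NumPy emits a warning in that case).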