Comparisons of two global built area land cover datasets in methods to disaggregate human population in eleven countries from the global South

Stevens, Forrest R.; Gaughan, Andrea E.; Nieves, Jeremiah J.; King, Adam; Sorichetta, Alessandro; Linard, Catherine; Tatem, Andrew J.

doi:10.6084/m9.figshare.9891563.v1

tjde_a_1633424_sm9000.docx (4.37 MB)


journal contribution

posted on 2019-09-23, 10:39 authored by Forrest R. Stevens, Andrea E. Gaughan, Jeremiah J. Nieves, Adam King, Alessandro Sorichetta, Catherine Linard, Andrew J. Tatem

Mapping built land cover at unprecedented detail has been facilitated by the increasing availability of global high-resolution imagery and image processing methods. These advances in urban feature extraction and built-area detection can refine the mapping of human population densities, especially in lower income countries where rapid urbanization and changing populations are accompanied by frequently out-of-date or inaccurate census data. However, in these contexts it is unclear how best to use built-area data to disaggregate areal, count-based census data. Here we tested two methods that use remotely sensed, built-area land cover data to disaggregate population data: simple areal weighting and more complex statistical models that incorporate other ancillary information. Outcomes were assessed across eleven countries, representing different world regions that vary in population density, types of built infrastructure, and environmental characteristics. We found that for seven of the eleven countries a Random Forest-based machine learning approach outperformed simple, binary dasymetric disaggregation into remotely sensed built areas. For these more complex models there was little evidence to support using any single built land cover input over the rest, and in most cases using more than one built-area data product resulted in higher predictive capacity. We discuss these results and their implications for future population modeling approaches.
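To illustrate the simpler of the two approaches, binary dasymetric disaggregation can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function name `binary_dasymetric`, the flat-list grid representation, and the toy numbers are all hypothetical, and the fallback to plain areal weighting for units with no built cells is an assumption made here so that census totals are preserved.

```python
def binary_dasymetric(unit_ids, built_mask, census_counts):
    """Spread each census unit's count evenly over its built cells.

    unit_ids      -- census unit ID for each grid cell (flat list)
    built_mask    -- 1 if the cell is classified as built, else 0
    census_counts -- dict mapping unit ID to its census population count

    Assumption (not from the source): a unit with no built cells falls
    back to simple areal weighting across all of its cells.
    """
    # Count total cells and built cells in each census unit.
    total_cells = {}
    built_cells = {}
    for uid, built in zip(unit_ids, built_mask):
        total_cells[uid] = total_cells.get(uid, 0) + 1
        built_cells[uid] = built_cells.get(uid, 0) + (1 if built else 0)

    # Assign a population value to every cell.
    population = []
    for uid, built in zip(unit_ids, built_mask):
        count = census_counts[uid]
        if built_cells[uid] == 0:
            # No built area detected: areal weighting over the whole unit.
            population.append(count / total_cells[uid])
        elif built:
            # Built cell: equal share of the unit's count.
            population.append(count / built_cells[uid])
        else:
            # Non-built cell in a unit that has built area: no population.
            population.append(0.0)
    return population


# Toy example: unit "A" has two built cells, unit "B" has none.
unit_ids = ["A", "A", "A", "A", "B", "B"]
built_mask = [1, 0, 1, 0, 0, 0]
census_counts = {"A": 100, "B": 30}
grid = binary_dasymetric(unit_ids, built_mask, census_counts)
print(grid)
```

In this toy grid, unit "A" places its 100 people in its two built cells, while unit "B" has no detected built area and spreads its 30 people evenly. The gridded values always sum back to the census totals. The Random Forest approach described above would instead replace the binary mask with a modeled, continuous weighting layer.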