In a previous article, "Introducing GlobalBuildingAtlas: A Global Building Height Dataset", I shared a global building dataset that includes height information. Some readers raised concerns about its data quality, so I would like to share two other global building footprint datasets I came across while working on CIM-related projects: one from Microsoft (open-source) and one from Google. Both come from major tech companies, and you can judge their usefulness by testing them yourself.

Microsoft GlobalMLBuildingFootprints

Microsoft's open-source GlobalMLBuildingFootprints repository provides a global building footprint dataset. It contains approximately 1.4 billion buildings detected from multi-source Bing Maps imagery captured between 2014 and 2024, and it is freely available under the ODbL license. The data is distributed as line-delimited GeoJSON (one feature per line, despite the .csv.gz file extension) in the EPSG:4326 coordinate system and includes attributes such as estimated height and confidence scores.

The repository includes a make-gis-friendly.py script that converts the data into formats compatible with QGIS and ArcGIS. The dataset covers multiple regions and is continuously updated. Because it uses the ODbL license, it is also compatible with the OpenStreetMap ecosystem, which makes it a solid source of foundational data for GIS spatial analysis, urban planning, and other applications.

Official website: https://github.com/microsoft/GlobalMLBuildingFootprints
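
As a rough illustration of the line-delimited format described above, the following Python sketch reads a downloaded .csv.gz file using only the standard library and wraps the features in a regular GeoJSON FeatureCollection that desktop GIS tools can open. It is not the repository's make-gis-friendly.py script. The function names are hypothetical, and the assumption that each feature stores a confidence value under properties should be checked against a real file.

```python
import gzip
import json
from typing import Iterable, Iterator


def iter_features(path: str) -> Iterator[dict]:
    """Yield one GeoJSON feature per non-empty line of a gzip-compressed file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def keep_confident(features: Iterable[dict], min_confidence: float) -> Iterator[dict]:
    """Keep features whose confidence is at least min_confidence.

    Assumption: the score is stored as properties["confidence"].
    Features without a score are dropped.
    """
    for feature in features:
        confidence = (feature.get("properties") or {}).get("confidence")
        if confidence is not None and confidence >= min_confidence:
            yield feature


def write_feature_collection(features: Iterable[dict], out_path: str) -> int:
    """Write the features as one FeatureCollection and return how many were written.

    All features are held in memory, so this suits a single tile, not a continent.
    """
    collection = {"type": "FeatureCollection", "features": list(features)}
    with open(out_path, "w", encoding="utf-8") as out:
        json.dump(collection, out)
    return len(collection["features"])
```

With a hypothetical input file named tile.csv.gz, calling write_feature_collection(keep_confident(iter_features("tile.csv.gz"), 0.8), "tile.geojson") would produce a file that can be dragged into QGIS. The 0.8 threshold is only an example value.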

Google Open Buildings

Google Research's Open Buildings project is an open data initiative focused on sharing global building footprint data. Multiple versions of the dataset can be accessed via a dedicated page. The project uses high-resolution satellite imagery and machine learning to extract building footprints as polygons. The core dataset covers over 1 billion buildings and includes key attributes such as latitude and longitude, footprint area, and confidence scores. Some versions also provide additional information such as building height and number of floors.

The data can be filtered and downloaded by region and imported directly into tools like QGIS and ArcGIS for spatial analysis. With its extensive coverage, especially in underserved regions such as Africa and Southeast Asia, the dataset provides foundational spatial data with high temporal and spatial resolution for GIS applications such as urban planning, disaster response, and population statistics.

Official website: https://sites.research.google/gr/open-buildings/
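
To make the confidence and area attributes mentioned above more concrete, here is a minimal Python sketch that filters a gzip-compressed CSV download by both values. It assumes the file has columns named confidence, area_in_meters, and geometry (a WKT polygon string); those column names, the function name, and the default threshold are assumptions to verify against the header row of the file you actually download.

```python
import csv
import gzip
from typing import Iterator


def iter_buildings(
    path: str,
    min_confidence: float = 0.75,
    min_area_m2: float = 0.0,
) -> Iterator[dict]:
    """Yield rows that pass both the confidence and the footprint-area threshold.

    Assumed columns: confidence, area_in_meters, geometry (WKT text).
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            confidence = float(row["confidence"])
            area = float(row["area_in_meters"])
            if confidence >= min_confidence and area >= min_area_m2:
                yield {
                    "confidence": confidence,
                    "area_m2": area,
                    "wkt": row["geometry"],  # left as text; parse with a GIS library if needed
                }
```

Because the function streams rows one at a time, it can be used on large regional files without loading them fully into memory. The WKT string is passed through unchanged so the example stays free of third-party dependencies.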

Why Is Data for China Missing?

If you intend to use the data for regions within China, you may be disappointed. This is precisely why I previously recommended the GlobalBuildingAtlas 3D LOD1 dataset. Although both datasets claim global coverage, data for China is absent.
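
One way to check coverage for yourself is to inspect the index of download links that the Microsoft repository publishes as a CSV file. The sketch below assumes that this index has a column named Location holding country or region names. That column name and both function names are assumptions for illustration, so compare them with the header of the index file you download.

```python
import csv
import io


def locations_in_index(index_csv_text: str, column: str = "Location") -> set:
    """Return the distinct, non-empty values of the given column.

    Assumption: the index CSV has a header row containing that column.
    """
    reader = csv.DictReader(io.StringIO(index_csv_text))
    return {row[column].strip() for row in reader if row.get(column)}


def has_location(index_csv_text: str, name: str) -> bool:
    """Case-insensitive check for whether a location name appears in the index."""
    known = {location.casefold() for location in locations_in_index(index_csv_text)}
    return name.casefold() in known
```

Reading the index file into a string and calling has_location(text, "China") gives a quick yes-or-no answer before you start downloading tiles. The exact spelling of location names in the index may differ from what you expect, so printing locations_in_index(text) first is a sensible sanity check.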

Why is this the case? Google's Open Buildings project is not developed in an open-source repository, but I came across the following response in the issue tracker of Microsoft's open-source repository:

In summary, this is due to the specific regulatory constraints surrounding geospatial data in China (a point most readers are likely aware of). To ensure compliance and avoid complications, the data for China is simply not released.

Conclusion

Both datasets serve as excellent resources for GIS work involving global building footprint data. However, if your focus is on China, you may need to explore alternative datasets. In future articles, I will continue to share datasets specifically covering building footprints within China.