A community approach to land-use mapping to reduce deforestation
By Alicia Sullivan, Product Manager & Katelyn Tarrio, Student Researcher, Google Earth Engine
At Google, we have a long history of investing in efforts to combat deforestation. Earlier this year, we shared how Google Earth and Earth Engine have helped countries avoid an estimated 95 megatons of CO2-equivalent emissions from deforestation and receive REDD+ payments in return. Now, we’re going a step further by focusing on one of the major drivers of deforestation: commodity crop production.
Working to halt deforestation
Commodity crops such as oil palm, cocoa and rubber are major contributors to global forest loss, yet broadly accessible maps that identify where these crops are grown are still lacking. Meanwhile, regulations such as the European Union Deforestation Regulation (EUDR), the United Kingdom’s Environment Act and proposed bills like the US FOREST Act are shifting deforestation-free commodity production from voluntary commitments to mandatory requirements for businesses operating in global markets. The need for consistent, open, geospatial forest data is what inspired us to become a founding member of the Forest Data Partnership (FDaP), a consortium of industry, government, and institutional leaders whose mission is to “halt and reverse forest loss from commodity production by collaboratively improving global monitoring and supply chain tracking and accelerating restoration.”
To support the Forest Data Partnership, we recently released maps of rubber, cocoa and palm for top producing countries for 2020 and 2023, along with a global map of forest changes as of 2020. These datasets are a critical step to help track deforestation linked to commodity crop production, empowering users to make data-driven decisions to accelerate the shift toward sustainable sourcing and forest conservation. We are also releasing the models used to create them under an open license, available on GitHub.
Collaborating for impact
Recently, a variety of organizations have produced satellite-based products to understand commodity-driven deforestation. If you are familiar with the EUDR and other sustainable sourcing policies, you’ve probably seen a proliferation of data solutions over the last year. A challenge with these products is that they may not be freely accessible, and the mechanisms for validating or improving their accuracy are not always clear.
The Forest Data Partnership is unique in its inclusive, “community” approach that prioritizes collaboration, uniting stakeholders across academia, government, NGOs, and industry to pool diverse datasets to train, test, and validate machine learning models. This synergy enhances the quality of resulting models and maps, providing a more comprehensive view of commodities and deforestation. We offer avenues for stakeholders to contribute data and feedback to the effort, improving the outcome for all users.
A key to our success with this approach is the continuous iteration and improvement of models and maps over time, driven by ongoing contributions from the community. To support that, the Google team has built a robust ML process, combining the cloud-scale geospatial data analysis capabilities of Earth Engine with the latest machine learning advances to quickly re-train models with new community contributions and publish updated probability maps. This creates a fast, reliable way to improve the models and maps over time.
These models were developed with data contributions from FDaP organizations like the World Resources Institute, SERVIR and the Food & Agriculture Organization (FAO) of the United Nations, as well as openly licensed contributions from academic, government and non-profit organizations around the world.
Tree commodity maps
Using community models, we produced probability maps for palm, rubber and cocoa at 10-meter resolution for the years 2020 and 2023. Model inputs are annual composites of Sentinel-1, Sentinel-2 and ALOS PALSAR-2 data, plus slope derived from digital elevation models (described in detail here). The resulting maps represent per-pixel normalized outputs (from 0 to 1) measuring the model’s confidence that the underlying area is occupied by rubber, cocoa or palm. Our selection of commodities and geographies was based on the availability of high-quality reference data we received from the community: palm maps cover Southeast Asia, West Africa and parts of South America; cocoa centers on Ghana and Cote d’Ivoire, the top two producers globally; rubber maps cover Southeast Asia and parts of West Africa.
Accurately classifying different tree crops is a challenging task, particularly in places where smallholder plantations (<10 hectares) dominate. These areas are routinely underrepresented in existing maps, even though many commodities are grown not on industrial farms but on small plantations. One such area is Cote d’Ivoire, a highly heterogeneous landscape where tree crop cultivation is widespread. Below we show how our community models for 2023 perform over an area in central Cote d’Ivoire densely packed with small, intermixed plantations. Probability estimates for each commodity (palm, rubber and cocoa) at the 0.80 threshold cover the majority of each small plot, effectively identifying and differentiating each tree crop. There’s certainly room for improvement, especially with confusion between palm and rubber. Tricky places with high probabilities across multiple commodity types are actually very important: they help identify priority areas where training data improvements, such as field data collection, are needed for better landscape representation. These areas will serve as benchmarks for testing improvement in future model iterations.
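To make the thresholding idea concrete, here is a minimal Python sketch on made-up pixel values. It is not the Forest Data Partnership's code: the helper name `label_pixels`, the sample probabilities, the choice of "at or above" for the 0.80 cutoff, and the rule of flagging pixels where more than one commodity clears the threshold are all assumptions for illustration.

```python
from typing import Dict, List

THRESHOLD = 0.80  # cutoff used in the example above; "at or above" is an assumption


def label_pixels(probs: Dict[str, List[float]], threshold: float = THRESHOLD) -> List[str]:
    """Label each pixel from per-commodity probabilities (hypothetical helper).

    Returns the commodity name when exactly one commodity meets the threshold,
    "ambiguous" when several do, and "none" otherwise.
    """
    names = list(probs)
    n_pixels = len(probs[names[0]])
    labels = []
    for i in range(n_pixels):
        hits = [name for name in names if probs[name][i] >= threshold]
        if len(hits) == 1:
            labels.append(hits[0])
        elif len(hits) > 1:
            labels.append("ambiguous")
        else:
            labels.append("none")
    return labels


# Made-up probabilities for four pixels (not real model output).
sample = {
    "palm":   [0.92, 0.85, 0.10, 0.30],
    "rubber": [0.05, 0.88, 0.12, 0.40],
    "cocoa":  [0.03, 0.02, 0.95, 0.35],
}
print(label_pixels(sample))  # ['palm', 'ambiguous', 'cocoa', 'none']
```

In this toy example, the second pixel scores high for both palm and rubber, so it is flagged as "ambiguous": the kind of location where additional field data would be most useful.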
For crops like cocoa, palm and rubber, tree cover changes often result from routine management activities like harvesting, tree replacement and thinning. Without accurate commodity maps, these changes may be mistaken for deforestation events. So in addition to helping identify where crops are, these maps help identify where deforestation has not occurred.
Our goal is to create global probability maps for palm, rubber, cocoa, and coffee, key drivers of deforestation. While our community approach offers significant benefits, map accuracy depends on training data availability and varies regionally. For more details on limitations, see the dataset descriptions in the Forest Data Partnership publisher catalog.
Though there’s still work ahead, this is a crucial step in showcasing the value of the community approach. We look forward to fostering ongoing data contributions to improve the models and commodity maps. The good news is, you — the community — can help! If you have feedback or data to share, please reach out to the Forest Data Partnership here.
Forest Persistence
Equally important in a deforestation-free supply chain assessment is knowing where natural forests are and if they’ve undergone changes. As mentioned above, these assessments are now a requirement in many markets. The timing of deforestation or degradation events, relative to regulatory baselines, is critical for risk assessment. For example, the EUDR mandates that producers demonstrate their commodities were not sourced from a forest disturbed after December 31, 2020.
To address this issue, we developed Forest Persistence, a global data layer to estimate whether a forest has been disturbed over time. Forest Persistence was created by combining multiple published datasets related to forest cover in an ensemble to reduce biases of individual sources and highlight areas where they align. As a result, Forest Persistence represents the current consensus from available open data on the location of undisturbed forests.
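As a rough illustration of how an ensemble can surface agreement between sources, the Python sketch below averages per-pixel scores from several layers. The unweighted mean, the function name `consensus_score`, and the three toy sources are assumptions made for this example; the actual method used to build Forest Persistence is documented with the dataset and may differ.

```python
from statistics import mean
from typing import List


def consensus_score(source_scores: List[List[float]]) -> List[float]:
    """Average per-pixel scores across sources (illustrative, unweighted)."""
    return [mean(pixel_values) for pixel_values in zip(*source_scores)]


# Three hypothetical sources scoring the same four pixels
# (1.0 = undisturbed forest, 0.0 = not forest or disturbed).
source_a = [1.0, 1.0, 0.0, 1.0]
source_b = [1.0, 0.0, 0.0, 1.0]
source_c = [1.0, 1.0, 0.0, 0.5]

scores = consensus_score([source_a, source_b, source_c])
print([round(s, 2) for s in scores])  # [1.0, 0.67, 0.0, 0.83]
```

Pixels where every source agrees end up at 0 or 1, while disagreement between sources pulls the score toward the middle of the range.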
Unlike other forest products with discrete labels of “primary” or “natural”, “persistence” is represented as a score (from 0 to 1) to reflect the continuous nature of forests: areas that are mostly undisturbed and areas that are completely undisturbed exist on a gradient rather than a strict binary. The example below shows part of the Amazon rainforest in central Colombia, where pastures drive deforestation. Between 2017 and 2024, some forest areas were converted to pastures, while some pastures were abandoned and trees began to regrow. Forest Persistence captures these changes because it includes LandTrendr applied to the Google Timelapse dataset from 1984 to 2020. Pastures (white) have low ‘persistence’ values near 0; abandoned pastures with regrowth (light green) score between 0.5 and 0.8; forests with no signs of disturbance (green) score near 1. A threshold of 0.95 effectively excludes pastures, regrowing secondary forests and the trees surrounding pastures that likely experienced degradation. This continuous feature gives users flexibility in choosing different Forest Persistence values for various applications. The inclusion of changes prior to 2020 aligns with the EUDR cutoff date, helping users identify potential deforestation risk.
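The flexibility of a continuous score can be shown with a short Python sketch. The function name `forest_mask` and the sample scores are hypothetical, chosen to mirror the ranges described above (pasture near 0, regrowth between 0.5 and 0.8, undisturbed forest near 1); treating the threshold as "at or above" is also an assumption.

```python
from typing import List


def forest_mask(scores: List[float], threshold: float) -> List[bool]:
    """Return True for pixels whose persistence score meets the threshold."""
    return [score >= threshold for score in scores]


# Made-up scores: pasture, regrowing pasture, likely degraded edge, intact forest.
scores = [0.02, 0.65, 0.90, 0.99]

# A strict threshold keeps only forest with no signs of disturbance.
print(forest_mask(scores, 0.95))  # [False, False, False, True]

# A looser threshold also admits regrowth and possibly degraded forest.
print(forest_mask(scores, 0.50))  # [False, True, True, True]
```

Which threshold is appropriate depends on the application; a compliance-style screening would presumably lean toward the stricter value.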
Mapping global commodity-driven deforestation is a daunting challenge, but as the saying goes, “many hands make light work.” The Forest Data Partnership invites the geospatial community to help improve our models for the benefit of everyone. To share feedback or contribute to the community effort, reach out here.
How to get started
If you’re an Earth Engine user, you can access the 2020 and 2023 probability maps for rubber, cocoa and palm in the Forest Data Partnership publisher catalog. Please note that if you are a commercial user of Earth Engine, you’ll need to apply for access and accept commercial terms. Non-commercial users can access the maps without any additional steps under CC BY-NC 4.0.
Forest Persistence for 2020 is available in the Forest Data Partnership publisher catalog under CC-BY 4.0 for all users.
If you are interested in trying out the community models, they are available on GitHub under an MIT license and can be hosted within your own GCP environment using Vertex AI. The README files for each model describe the inputs, methods and limitations for each product.