Background & Summary

The mortality of wildlife resulting from collisions with vehicles is considered one of the main negative effects of roads on numerous species1. Roadkill can increase the risk of local extinction by reducing effective population size and genetic diversity, while also limiting demographic and genetic rescue mechanisms2,3. Such effects have the potential to disrupt species distribution, spatial population dynamics, and genetic structure, thereby compromising long-term conservation efforts4,5.

Estimates reach 340 million birds killed on roads in the USA6, 194 million birds and 29 million mammals in Europe7, and 17 million birds and mammals in Latin America8. Although roadkill numbers are high, populations of common and abundant species may be able to persist into the future9,10. Conversely, species that are already threatened by other factors and have slow reproductive rates may struggle to recover from the loss of individuals, even when roadkill rates are low, which increases the risk of local extinction11,12,13. Several studies assessing current and future impacts on mammal populations highlighted that populations of maned wolf (Chrysocyon brachyurus), tiger (Panthera tigris), leopard (Panthera pardus) and brown hyena (Hyaena brunnea) are particularly exposed to road traffic, with an associated increase in extinction risk13,14,15,16.

In addition to conservation concerns, the socio-economic impacts of roadkill have been extensively documented, particularly in North America and North and Central Europe17,18,19. In the United States, collisions have been estimated to cause over 200 human fatalities and 29,000 injuries annually, with associated costs ranging from US$ 6 to 12 billion18. Similarly, in Europe, collisions with ungulates resulted in approximately 300 human fatalities and 30,000 injuries per year, with associated costs surpassing one billion dollars17.

The “Belt and Road” and “Global Gateway” initiatives are strategies for investing in transportation and energy infrastructure projects worldwide over the next few decades, including the expansion of the road network by millions of kilometres20,21,22. These initiatives aim to meet global infrastructure development needs, mainly in emerging economies with well-conserved areas, and threaten more than 250 species of conservation concern23.

A comprehensive dataset of roadkill locations is essential to evaluate the factors contributing to roadkill risk and to improve our understanding of its impact on wildlife populations and the associated socio-economic costs. Scientists and consultants have conducted road surveys, compiled opportunistic roadkill locations, and developed citizen science applications to understand the mechanisms underlying roadkill risk and provide guidance for mitigation. However, a significant portion of the literature on this topic, including research papers, dissertations, reports, and other forms of grey literature, is either highly localized or does not report geographic coordinates for roadkill locations. Efforts have been made to compile roadkill data and make them accessible, for instance the Brazil roadkill data24, databases collected via citizen science25 and the Global Primate Roadkill Database26. However, these compilations are often confined to specific regions or groups of species.

We present the GLOBAL ROADKILL DATA, the largest worldwide compilation of roadkill data on terrestrial vertebrates. The workflow (Fig. 1) illustrates the sequential steps of the study, in which we merged local-scale survey datasets and opportunistic records into a large unified roadkill dataset comprising 208,570 roadkill records. These records include 2283 species and subspecies from 54 countries across six continents and span 1971 to 2024.

Fig. 1. The workflow to compile roadkill data of terrestrial vertebrates.

Large roadkill datasets prevent the collection of redundant data and are valuable resources for both local and macro-scale analyses of roadkill rates, road and landscape features associated with roadkill risk, species more vulnerable to road traffic, and populations at risk due to additional mortality. Standardizing the data (scientific names, coordinate projection, and units) in a user-friendly format makes them readily accessible to a broader scientific and non-scientific community, including NGOs, consultants, public administration officials, and road managers. The open-access approach promotes collaboration among researchers and road practitioners, facilitating the replication of studies, validation of findings, and expansion of previous work. Moreover, researchers can use such datasets to develop new hypotheses, conduct meta-analyses, address pressing challenges more efficiently and strengthen the robustness of road ecology research. Ensuring widespread access to roadkill data fosters a more diverse and inclusive research community. This not only provides researchers in emerging economies with more data for analysis, but also cultivates a diverse array of perspectives and insights that advance infrastructure ecology.

Methods

Information sources

A core team from different continents performed a systematic literature search in Web of Science and Google Scholar for published peer-reviewed papers and dissertations. The team searched for the following terms: “roadkill*” OR “road-kill” OR “road mortality” AND (country), in English, Portuguese, Spanish, French and/or Mandarin. The initiative was also disseminated through mailing lists associated with transport infrastructure: The CCSG Transport Working Group (WTG), Infrastructure & Ecology Network Europe (IENE) and Latin American & Caribbean Transport Working Group (LACTWG) (Fig. 1). The core team identified 750 scientific papers and dissertations that used roadkill information but did not make it available. The first authors of these publications were contacted to request georeferenced roadkill locations and to offer co-authorship on this data article. Of the 824 authors contacted, 145 agreed to share georeferenced roadkill locations, often involving additional colleagues who contributed to data collection. Since our main goal was to provide open access to data that had never been shared in this format before, data already available from citizen science projects (e.g., globalroakill.net) or in published papers were not included.

Data compilation

A total of 423 co-authors compiled the following information: continent, country, latitude and longitude in WGS 84 decimal degrees of the roadkill, coordinate uncertainty, class, order, family, scientific name of the roadkill, vernacular name, IUCN status, number of roadkill, year, month, and day of the record, identification of the road, type of road, survey type, observers that recorded the roadkill and the reference containing analyses that included roadkill occurrences (Supplementary Information: Table S1, description of the fields; Table S2, reference list). When roadkill data were derived from systematic surveys, the dataset includes additional information on the length of road surveyed, latitude and longitude of the road (initial and final part of the road segment), survey period, start year of the survey, final year of the survey, 1st month of the year surveyed, last month of the year surveyed, and frequency of the survey. We consolidated 142 valid datasets into a single dataset. We complemented these data with OccurenceID (a UUID generated using Java code), basisOfRecord, countryCode, locality using OpenStreetMap’s API (https://www.openstreetmap.org), geodeticDatum, verbatimScientificName, Kingdom, phylum, genus, specificEpithet, infraspecificEpithet, acceptedNameUsage, scientific name authorship, matchType, taxonRank using the Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters), AssociatedReferenceID and link of the associatedReference (URL).
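As an illustration of the record identifier described above, the following minimal sketch assigns a random UUID to each record. It uses Python rather than the Java code used for the dataset, assumes a version 4 (random) UUID, and assumes the key name occurrenceID (the Darwin Core spelling); the function name and the toy records are hypothetical.

```python
import uuid


def new_occurrence_id() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


# Toy records; the real dataset has many more fields.
records = [
    {"scientificName": "Capreolus capreolus"},
    {"scientificName": "Dama dama"},
]

# Assumption: the identifier column is labelled with the Darwin Core term.
for record in records:
    record["occurrenceID"] = new_occurrence_id()
```

Random UUIDs need no central counter, so identifiers can be generated independently for each contributed dataset before merging.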

Data collection

This dataset includes two types of data, opportunistic and systematic, clearly labelled in the surveyType column. Opportunistic data have no set methodology and represent data points that the authors collected by chance. Systematic data result from surveys with a clearly defined duration, location and frequency of sampling. All systematic data points thus include information on: the latitude and longitude of the start (initialRoadLatitude and initialRoadLongitude) and the end (finalRoadLatitude and finalRoadLongitude) of the sampled road segment; the length in km of the sampled road segment (roadLength); the starting year (startYear) and month (firstMonthOfTheYear) of the survey, as well as its final year (finalYear) and month (lastMonthOfTheYear); the duration of the survey in months (surveyPeriod); and, finally, the frequency of sampling effort (surveyFrequency). All roadkill data collected during each survey are fully represented in the dataset.
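A user who wants to confirm that a systematic record carries the survey fields listed above could run a check along these lines. This is a sketch only: the field names are those given in the text, but the exact label "systematic" in surveyType and the representation of a record as a dictionary are assumptions.

```python
# Survey fields named in the text for systematic records.
SYSTEMATIC_FIELDS = (
    "initialRoadLatitude", "initialRoadLongitude",
    "finalRoadLatitude", "finalRoadLongitude",
    "roadLength", "startYear", "firstMonthOfTheYear",
    "finalYear", "lastMonthOfTheYear",
    "surveyPeriod", "surveyFrequency",
)


def missing_survey_fields(record: dict) -> list[str]:
    """List survey fields that are absent or empty in a systematic record."""
    # Assumption: systematic records are labelled "systematic" in surveyType.
    if str(record.get("surveyType", "")).strip().lower() != "systematic":
        return []  # other records are not expected to carry survey metadata
    return [f for f in SYSTEMATIC_FIELDS if record.get(f) in (None, "")]
```

An empty list means the record is either not systematic or has every survey field filled in.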

Data standardization

We conducted a clustering analysis on all text fields to identify similar entries with minor variations, such as typos, and corrected them using OpenRefine (http://openrefine.org). We also standardized all date values using OpenRefine. Coordinate uncertainties listed as 0 m were adjusted to either 30 m or 100 m, depending on whether they were recorded after or before 2000, respectively, following the recommendation in the Darwin Core Reference Guide (https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters).
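The adjustment of zero coordinate uncertainties can be sketched as a small function. The function name is hypothetical, and treating the year 2000 itself as "after" is an assumption, since the text does not say how that boundary year was handled.

```python
def adjust_uncertainty(uncertainty_m: float, year: int) -> float:
    """Replace a 0 m coordinate uncertainty with 30 m or 100 m by record year."""
    if uncertainty_m != 0:
        return uncertainty_m  # non-zero values are kept unchanged
    # Assumption: records from 2000 onwards count as "after 2000".
    return 30.0 if year >= 2000 else 100.0
```

For example, a record from 2015 with a listed uncertainty of 0 m would be assigned 30 m, and one from 1985 would be assigned 100 m.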

Taxonomy

We cross-referenced all species names with the Global Biodiversity Information Facility (GBIF) Backbone Taxonomy using Java and GBIF’s API (https://doi.org/10.15468/39omei). This process aimed to rectify classification errors, include additional fields such as Kingdom, Phylum, and scientific authorship, and gather comprehensive taxonomic information to address any gap within the datasets. For species not automatically matched (matchType - Table S1), we manually searched for correct synonyms when available.
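The name-matching step can be approximated with GBIF's public species-match endpoint. The sketch below uses Python rather than Java and is not the code used for the dataset. The rule that any matchType other than EXACT is sent for manual checking is an assumption, as are the function names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def match_url(name: str) -> str:
    """Build the request URL for one scientific name."""
    return f"{GBIF_MATCH_URL}?{urlencode({'name': name})}"


def fetch_match(name: str) -> dict:
    """Query GBIF (requires network access) and return the parsed JSON."""
    with urlopen(match_url(name), timeout=30) as response:
        return json.load(response)


def needs_manual_check(match: dict) -> bool:
    """Flag names that GBIF did not match exactly."""
    # Assumption: only EXACT matches are accepted automatically.
    return match.get("matchType", "NONE") != "EXACT"
```

Calling `fetch_match("Dama dama")` returns a dictionary that includes a `matchType` entry, which `needs_manual_check` then inspects.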

Species conservation status

Using the species names, we retrieved their conservation status and also vernacular names by cross-referencing with the database downloaded from the IUCN Red List of Threatened Species (https://www.iucnredlist.org). Species without a match were categorized as “Not Evaluated”.
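The fallback for unmatched species amounts to a lookup with a default value. In this sketch the Red List download is represented by a plain dictionary keyed by scientific name, which is an assumption about its structure; the function name is hypothetical.

```python
def conservation_status(name: str, red_list: dict[str, str]) -> str:
    """Return the Red List category for a species, or "Not Evaluated"."""
    return red_list.get(name.strip(), "Not Evaluated")


# Toy excerpt standing in for the downloaded Red List table.
red_list = {"Myrmecophaga tridactyla": "Vulnerable"}
```

A species absent from the table, such as a name that could not be matched, is returned as "Not Evaluated".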

Data Records

GLOBAL ROADKILL DATA is available at Figshare27 https://doi.org/10.6084/m9.figshare.25714233. The dataset incorporates opportunistic (collected incidentally without data collection efforts) and systematic data (collected through planned, structured, and controlled methods designed to ensure consistency and reliability). In total, it comprises 208,570 roadkill records across 177,428 different locations (Fig. 2). Data were collected from the road network of 54 countries from 6 continents: Europe (n = 19), Asia (n = 16), South America (n = 7), North America (n = 4), Africa (n = 6) and Oceania (n = 2).

Fig. 2. Distribution and number of roadkill records per country.

All data are georeferenced in WGS 84 decimal degrees with a maximum uncertainty of 5000 m. Approximately 92% of records have a location uncertainty of 30 m or less, with only 1138 records having location uncertainties ranging from 1000 to 5000 m. Mammals have the highest number of roadkill records (61%), followed by amphibians (21%), reptiles (10%) and birds (8%). The species with the highest number of records were roe deer (Capreolus capreolus, n = 44,268), pool frog (Pelophylax lessonae, n = 11,999) and European fallow deer (Dama dama, n = 7,426).

We collected information on 126 threatened species with a total of 4570 records. Among the threatened species, the giant anteater (Myrmecophaga tridactyla, VULNERABLE) has the highest number of records (n = 1199), followed by the common fire salamander (Salamandra salamandra, VULNERABLE, n = 1043), and European rabbit (Oryctolagus cuniculus, ENDANGERED, n = 440). Records ranged from 1971 to 2024, with 72% of the roadkill recorded since 2013. Over 46% of the records were obtained from systematic surveys, with road length and survey period averaging, respectively, 66 km (min-max: 0.09–855 km) and 780 days (1–25,720 days).

Technical Validation

We employed the OpenStreetMap API through Java to detect location inaccuracies and verify whether the geographic coordinates aligned with the specified country. We calculated the distance of each occurrence to the nearest road using the GRIP global roads database28, ensuring that all records were within the defined coordinate uncertainty. We verified that the survey duration matched the provided initial and final survey dates. We calculated the distance between the provided initial and final road coordinates and cross-checked it with the given road length. We identified and merged duplicate entries within the same dataset (same location, species, and date), aggregating the number of roadkills for each occurrence.
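Two of these checks can be sketched in a few lines: comparing the straight-line distance between the road endpoints with the reported road length, and merging duplicate entries. This is an illustration in Python under stated assumptions, not the validation code used for the dataset. The haversine formula, the 0.5 km tolerance, and the record keys decimalLatitude, decimalLongitude, eventDate and count are all assumptions. Because a road is never shorter than the straight line between its ends, the first check only flags segments whose endpoint distance exceeds the stated length.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def road_length_plausible(record: dict, tolerance_km: float = 0.5) -> bool:
    """True if the endpoint distance does not exceed the reported roadLength."""
    straight = haversine_km(
        record["initialRoadLatitude"], record["initialRoadLongitude"],
        record["finalRoadLatitude"], record["finalRoadLongitude"],
    )
    return straight <= record["roadLength"] + tolerance_km


def merge_duplicates(records: list[dict]) -> list[dict]:
    """Merge records sharing location, species and date, summing their counts."""
    merged: dict[tuple, dict] = {}
    for r in records:
        # Hypothetical key names; adapt them to the actual column labels.
        key = (r["decimalLatitude"], r["decimalLongitude"],
               r["scientificName"], r["eventDate"])
        if key in merged:
            merged[key]["count"] += r["count"]
        else:
            merged[key] = dict(r)  # copy so the input is left untouched
    return list(merged.values())
```

Records that fail `road_length_plausible` are candidates for a closer look rather than proven errors, since the tolerance is arbitrary.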

Usage Notes

The GLOBAL ROADKILL DATA is a compilation of roadkill records and was designed to serve as a valuable resource for a wide range of analyses. Nevertheless, to prevent the generation of meaningless results, users should be aware of the following limitations:

  • Geographic representation - There is an evident bias in the distribution of records. Data originated predominantly from Europe (60% of records), South America (22%), and North America (12%). Conversely, there is a notable lack of records from Asia (5%), Oceania (1%) and Africa (0.3%). This dataset represents 36% of the initial contacts that provided geo-referenced records, which may not necessarily correspond to locations where high-impact roads are present.

  • Location accuracy - Location accuracy was low for about 1% of the data (uncertainty ranging from 1000 to 5000 m), owing to factors such as survey methods, recording practices, or timing of the survey.

  • Sampling effort - This dataset comprises both opportunistic data and records from systematic surveys, with high variability in survey duration and frequency. As a result, combining opportunistic and systematic records may bias the relative abundance of roadkill, making sound comparisons among species or areas difficult.

  • Detectability and carcass removal bias - Although several studies had a high frequency of road surveys, the duration of carcass persistence on roads may vary with species size and environmental conditions, affecting detectability. Accordingly, several approaches account for survey frequency and target species to estimate more realistic roadkill rates29,30.

Acknowledging these limitations, it is important to highlight that this dataset is the largest available roadkill compilation on terrestrial vertebrate species worldwide. Records obtained from systematic surveys enable the estimation of roadkill rates, exploration of spatial and temporal patterns of roadkill, modelling of factors potentially explaining roadkill risk across diverse species, and analysis of the potential population impacts. Opportunistic data have the potential to generate extensive datasets across large areas, providing an overview of which species and regions are most exposed to and at risk from traffic. By integrating both systematic and opportunistic data, we can compile lists of species affected by collisions with vehicles, identify threatened species at risk, and assess local and landscape-level factors influencing mortality probabilities using presence-only approaches. Furthermore, this dataset facilitates the identification of geographic gaps in roadkill surveys, helping scientists and road agencies focus their efforts on data-deficient areas. Finally, beyond its applications in road ecology, this dataset contributes species occurrences for distribution modelling and broader macroecological and conservation studies.
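As a starting point for the roadkill rates mentioned above, a naive rate can be computed from a systematic survey as individuals per kilometre per year. This is a sketch only, with a hypothetical function name. It does not correct for carcass persistence or detectability, and it takes the survey duration in days as an explicit argument, so users should check the unit of surveyPeriod in Table S1 before passing it in.

```python
DAYS_PER_YEAR = 365.25


def naive_roadkill_rate(n_individuals: int, road_length_km: float,
                        survey_days: float) -> float:
    """Uncorrected roadkill rate in individuals per km per year."""
    if road_length_km <= 0 or survey_days <= 0:
        raise ValueError("road length and survey duration must be positive")
    survey_years = survey_days / DAYS_PER_YEAR
    return n_individuals / (road_length_km * survey_years)
```

Rates computed this way are comparable only among surveys with similar frequency and target species, for the reasons given in the limitations above.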