Ten years of ecosystem services matrix: Review of a (r)evolution

© Campagne C et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
With the Ecosystem Service (ES) concept's popularisation, the need for robust and practical methodologies for ES assessments has increased. The ES matrix approach, linking ecosystem types or other geospatial units with ES in easy-to-apply lookup tables, was first developed ten years ago and, since then, has been broadly used. Whereas detailed methodological guidelines can be found in the literature, the ES matrix approach often seems to be used in a quick (and maybe even "quick and dirty") way. Based on a review of scientific publications in which the ES matrix approach was used, we present the diversity of application contexts, highlight trends of uses and propose future recommendations for improved applications of the ES matrix. A total of 109 studies applying the ES matrix approach and one methodological study without concrete applications were considered for the review. Amongst the main patterns observed, the ES matrix approach allows the assessment of a higher number of ES than other ES assessment methods. ES can be jointly assessed with indicators for ecosystem condition and biodiversity in the ES matrix. Although the ES matrix allows many data sources to be considered when deriving the assessment scores for the individual ES, the reviewed studies relied mainly on expert-based scoring (73%) and/or on ES scores that were based on an already-published ES matrix or deduced from information found in related scientific publications (51%). We must acknowledge that 27% of the studies did not clearly explain their methodology. This points to a lack of explanation of how the data were used and where the scores came from.
Although some studies addressed the need to consider variabilities and uncertainties in ES assessments, only a minority of studies (15%) did so. Our review shows that, in 29% of the studies, an already-existing matrix was used as an initial matrix for the assessment (mainly the same matrix from one of the Burkhard et al. papers). In 16% of the reviewed studies, no other data were used for the matrix scores and no adaptation of the existing matrix was made. However, the actual idea of the ES scores included in the Burkhard et al. matrices published 10 years ago was to provide some examples and give inspiration for one's own studies. Therefore, we recommend using only scores assessed for a specific study or, if one wishes to use pre-existing scores from another study, revising them in depth, taking into account the local context of the new assessment. We also recommend systematically reporting and considering variabilities and uncertainties in each ES assessment. We emphasise the need for all scientific studies to describe clearly and extensively the whole methodology used to score or evaluate ES, so that the quality of the scores obtained can be rated. In conclusion, the application of the ES matrix has to become more transparent and integrate more variability analyses. The increasing number of studies that use the ES matrix approach confirms its success, appropriability, flexibility and utility for decision-making, as well as its ability to increase awareness of ES.


Introduction
Since the Ecosystem Service (ES) concept was largely popularised by the Millennium Ecosystem Assessment (2005), the demand for robust and applicable methodologies for ES assessments has increased. A wide range of methods for assessing and mapping ES has now been listed (e.g. in Martinez-Harms and Balvanera 2012, Crossman et al. 2012, Egoh et al. 2012, Burkhard and Maes 2017), illustrating the need for diverse methods and degrees of related expertise from the people implementing them and the need to harness data of varying quantity and quality (Harrison et al. 2017). The choice of the right method should reflect the goals of the respective ES assessment and mapping exercise, but also the applicability and appropriation of the methods and the results expected by potential users of the assessment outcomes including, for instance, various stakeholders, policy- and decision-makers and land managers.
The ES matrix approach was originally published by Burkhard et al. (2009), with subsequent updates published in 2012 and 2014. Of course, other comparable matrix/look-up table-based approaches were also published around the same time (Dechazal et al. 2008, Koschke et al. 2012). However, one of the strengths of the ES matrix approach presented by Burkhard et al. (2009) and in the subsequent updates is that it is a highly flexible way to assess and map ES, based on various data sources and methods and in all kinds of study area settings, from local to regional and national scales. Numerous ES matrix applications have been developed since its initial publication in 2009 (Campagne and Roche 2018). A decade after the matrix approach was first proposed, we would now like to analyse the diversity of ES matrix applications and highlight the different trends of the various uses. Our purpose is to identify the strengths and weaknesses of this ES assessment method, based on actual studies. Additionally, given the flexibility and apparent ease of the method, we hypothesise that, in many cases, the ES matrix approach was applied in an oversimplified way, leading to comparably weak ES assessments. Through a review of scientific studies and related publications in which the ES matrix approach was used, we present the diversity of application contexts, highlight trends of uses and propose future recommendations for improving ES matrix applications. We looked at 10 years of publications after the first seminal paper was published.

The initial ES matrix approach
The ES matrix approach is based on the use of a lookup table that relates geospatial units, which can be, for instance, Ecosystem Types (ET), habitat types or Land Use and Land Cover (LULC) types, to the set of ES that is to be assessed in a specific study area. Thus, the selection of the study area is the starting point of the ES matrix approach, followed by the selection of relevant ET or geospatial units and the selection of relevant ES to form the rows and columns of the matrix (look-up) table. Then, suitable indicators for the ES quantification and appropriate ES quantification methods have to be defined. Based on that, a score for each of the ES considered is generated, referring to ES potential, ES supply, ES flow/use or demand for ES (see Grunewald 2017 or Burkhard et al. 2012 for detailed definitions). In their seminal publication, Burkhard et al. (2009) proposed using semi-quantitative scores on a relative scale ranging from 0 to 5. These scores can be based on or integrate diverse sources of data, from expert judgements and statistical data to quantitative data from process-based models or direct or indirect measurements. The resulting ES matrix table can easily be joined to geospatial units in order "to evaluate capacities to provide ecosystem services in a spatial manner" (Burkhard et al. 2009: p. 4).
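The join between a matrix table and geospatial units described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in any of the reviewed studies: the LULC class names, the ES names and all scores are hypothetical placeholders on the 0 to 5 scale, and the "map" is a tiny grid of class labels rather than real GIS data.

```python
# Hypothetical ES matrix: rows are LULC types, columns are ES,
# scores are illustrative values on the relative 0-5 scale.
ES_MATRIX = {
    "forest":   {"timber": 5, "water supply": 1, "recreation": 4},
    "cropland": {"timber": 0, "water supply": 0, "recreation": 1},
    "lake":     {"timber": 0, "water supply": 5, "recreation": 4},
}

# Hypothetical land-cover map: each grid cell holds one LULC type.
LULC_MAP = [
    ["forest", "forest", "cropland"],
    ["lake", "cropland", "cropland"],
]


def map_service(lulc_map, es_matrix, service):
    """Replace each LULC label in the grid by its matrix score for one ES."""
    return [[es_matrix[cell][service] for cell in row] for row in lulc_map]


timber_map = map_service(LULC_MAP, ES_MATRIX, "timber")
water_map = map_service(LULC_MAP, ES_MATRIX, "water supply")

for name, grid in (("timber", timber_map), ("water supply", water_map)):
    print(name)
    for row in grid:
        print("  ", row)
```

With real data, the same lookup would typically be done as an attribute join in a GIS, using the LULC class code as the key.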

A systematic review of the matrix approach
We conducted a systematic review of published studies through Web of Science and Scopus (terms used for the review research in Suppl. material 1), considering all articles published between 2009 and 2019 that used the ES matrix approach (we also included two studies from 2008 that applied similar approaches: Dechazal et al. 2008 and Haines-Young and Potschin 2008). With regard to the term "ES matrix approach", we mean the use of a look-up table relating ES and geospatial units as described in Burkhard et al. (2009). The initial queries, done in September 2019, returned 880 different studies. Only studies published in the English language were considered. The selection of the final studies was made through a manual verification of the titles, keywords and abstracts and of the full article in case of further doubts, reducing the pool of relevant studies to 110 altogether (all references in Suppl. material 1). The reduction in the number of papers is mostly due to the fact that the word "matrix" is used in a wide variety of methods and so many papers did not use the ES matrix approach as we mean it. A total of 109 studies applying the ES matrix approach and one methodological study without concrete application were selected during the review.
In the following, the results of the review are presented, referring to analysed attributes including case study location, the matrix elements, the scoring system and the methods used.

Applications of the ES matrix approach
Over the last 10 years, the number of published studies increased progressively, especially during the last five years (Fig. 1).
The flow of ES from nature to society is not always as straightforward as one might expect. Instead, it comprises several components, including the ecosystem-based supply of ES and the societal demand for ES. In the literature, many different terms are used, depending on the ES framework, the perception of the authors and the individual application. In the reviewed studies, we also found a diversity of ES components that were assessed through the ES matrix approach: ES capacity (e.g. Vihervaara et al. 2010), ES potential (e.g. Depellegrin et al. 2016), potential ES supply (e.g. Kamlun and Arndt 2019), potential and actual ES supply (e.g. Hainz-Renetzeder et al. 2015), ES supply (e.g. Nedkov et al. 2014; Sohel et al. 2015), ES flow, ES use (e.g. Karstens et al. 2019), current ES use (e.g. Nahuelhual et al. 2013) and demand for ES. As the concepts behind the terms were not very clear in a majority of cases, and for reasons of simplification, ES "capacity", "potential", "potential supply" and "supply" were grouped into "ES supply". With this grouping, the ES matrix approach was mainly used for assessing and mapping the "supply" of ES (Fig. 1). Matrices of ES flow or use were likewise grouped, and matrices of ES demand are shown here without regrouping. The number of assessments dealing with demand for ES has increased during the last years of the reviewed period (e.g. Tao et al. 2018; Nurokhmah et al. 2019), as has the number of studies of ES flow or use (e.g. Li et al. 2016) (Fig. 1).
Figure 1.
Number of analysed studies using the ES matrix approach to assess ES supply, demand or flows/use (at the end of 2019).
The ES matrix approach has been applied mainly in Europe (73 analysed studies), with a concentration of studies in Germany and neighbouring countries (Fig. 2). There is also a notable and increasing number of studies outside Europe, especially with 18 applications in East Asia and the Pacific (e.g. Li et al. 2016; Cai et al. 2017; Sun et al. 2018; Tao et al. 2018). The ES matrix approach was used at different spatial scales, for example, continental scale (Stoll et al. 2015), national scale (Depellegrin et al. 2016) and local scale. The ES matrix approach was, however, mainly applied at the local (54 studies) and the regional scale (33 studies). The continental scale was used in 12 studies (Fig. 2). The extent of the individual case study sites varies between less than 1 km² and the area of the whole of Europe.
The ES matrix approach has been applied for a large diversity of purposes. While each study presents its own context and objectives, we can observe a broad pattern of application types (using illustrative examples): • ES assessments in data-scarce areas: for instance in Nepal by Tamang (2011), in Burkina Faso by Sinare et al. (2016) or in Kenya by Wangai et al. (2018); • Assessment of a specific ES: for instance flood regulating services or nutrient regulating services. Mangi (2016) carried out an impact assessment on ecological integrity and ES supply before and after the creation of a shallow water area as part of the river Elbe's tide management.

ES matrix elements
The ES matrix approach has been used to assess an average of 15.6 different ES per matrix, whereof 7.0 (on average) were regulating ES, 6.7 provisioning ES and 3.8 cultural ES. The geospatial units that were mainly used in the different ES matrices were related to LULC types, and many studies used the European CORINE Land Cover typology or a related typology (EEA 1995). Besides CORINE Land Cover, the EUNIS habitat classification (European Nature Information System, EEA 2017) was used, notably for marine and benthic habitats (e.g. Salomidi et al. 2012; Galparsoro et al. 2014). The ES matrix was generally used across different ecosystem types but, in some cases, it was also used for specific ecosystem types, such as agroecosystems in Augstburger et al. (2018) or wetland ecosystems in Ricaurte et al. (2017). A further study from 2018 combined the ES matrix assessment with the Delphi approach, confidence ratings, standardised confidence levels and scenario assessment.
Several methodological steps are common to all applications of the ES matrix. We therefore look more closely at the data and approaches used in and with the matrix, the scoring systems and the scoring process used, as well as the confidence and reliability analyses done in the analysed papers.
The ES matrix approach involves a scoring process to assess ES (supply, flow/use, demand) in ecosystem types or other explicit geospatial units. These scores can be based on or can integrate data from diverse sources of varying quantity and quality. Following the "tiered approach" (Grêt-Regamey et al. 2015), the data used in the ES matrix can be seen as a gradient of increasing complexity (Fig. 4). Spatial GIS data, such as LULC data, are the main data used in the reviewed studies to define the geospatial units in the matrix. At the same time, LULC types can be used as proxies for the supply and use of many ES (e.g. rather obvious LULC-ES relations such as timber provisioning ES in forest ET or water supply ES in water bodies). Thus, LULC can be a suitable base for ES mapping and has been used intensively for this purpose.
For the ES scoring process, expert scoring was the dominant data source, as it was used in 82 of the reviewed studies. When the scoring was expert-based, the number of experts involved in the scoring exercises varied between 2 and 170 with a mean of 31 experts. Nevertheless, the number of involved experts was not specified in 32 studies out of 82. Expert consultation was undertaken through workshops in 34 studies, interviews were conducted in 15 studies and specific surveys were carried out in 9 studies.
The second dominant data source was literature data transfer, which is when ES scores are based on an already-published ES matrix or deduced from information found in related scientific publications. This was used in 57 studies, more than half of the studies. Other data or approaches, such as statistical data (13 studies, e.g. national statistics of yield production), models (12 studies), remote sensing data (6 studies) and field data (6 studies), were used less often in the analysed studies (Fig. 4).
Several types of data or approaches were used in 57 studies, of which 29 only combined two types of data or approaches: literature data transfer and expert scoring.

Scoring systems
One main characteristic of the ES matrix approach is that it expresses ES provision on an ordinal scale and so allows the comparison of different ES. Several ranking scales were used to fill in the matrix. However, a numeric score ranging from 0 to 5, as in the original approach, was used most often. In two studies, the scoring system used non-numeric values, either (+) to (+++) (Maltby et al. 2017) or colour codes with 4 levels (Geange et al. 2019).

ES scoring process
A first step for implementing a matrix-based ES evaluation is to define the initial matrix that is to be used (Campagne and Roche 2018). It can be either based on an existing matrix from an already-published study or an empty ES matrix that is to be filled. Seventy of the analysed studies used an empty ES matrix, whereas in 32 studies, it was specified that the initial ES matrix (with or without modification afterwards) came from an already-existing ES matrix or several existing matrices that were found in literature (Fig. 5 and Table 1). Amongst those studies, 23 studies specified that the initial ES scores came from one of the Burkhard et al. publications (Burkhard et al. 2009). For 7 studies, unclear information on the initial ES matrix was provided. For example, the number of experts involved is presented in Bhandari et al. (2016), but no information on the initial matrix, the fill-in process or whether there has been a compilation of scores can be found in the paper. After the definition of the initial matrix, the scoring process can be carried out with diverse sources of data. Of the 32 studies that used an existing ES matrix as an initial matrix, 18 studies used no additional data to define the matrix scores and, therefore, made no adaptation of the values provided in the existing matrix (Fig. 5), whereas 14 studies modified the existing initial matrix, based on literature, expert opinions or, in one study, models. For the studies with an empty initial matrix, the scoring was mainly done through expert scoring or by a combination of expert scoring and data extracted from literature (Fig. 5).

Table 1.
Number of times the published matrix was used as an initial matrix.

Figure 5.
Number of analysed studies with explicit information on the initial matrix (in the centre) and on the scoring data and approaches used in the initial matrix (arrows).

In addition, Maebe et al. (2019) and Ma et al. (2019) used a literature review and a mix of data from published matrices and published quantitative data. They are noted here as having an empty initial matrix and literature-based scoring data.
Finally, the methodology to define the final scores, using all kinds of data and approaches presented in Fig. 5, should be detailed in the studies (e.g. for experts' scoring, how the experts' scores are collected and merged; when several data types and sources are used, how they are combined into a final score). Of the 109 studies that were reviewed, only 61 explained clearly how the scores were obtained, and for 19 studies the scores came from an existing ES matrix without modification or review of literature. Altogether, 29 studies did not provide any information about the method used to determine the scores, i.e. how exactly the scores were determined with the different data used in the paper.
One study from 2018 used a confidence level that was "developed for the means and standard deviations of the assessments of the ES provision potential and assessments for the confidence rating according to IPCC". Where the ES matrix scores are based on methods from different tiers (see above), the values can be cross-checked in order to find the most suitable, reliable and useful (for the specific purpose) ES quantification method (Burkhard 2017, Roche and Campagne 2019). Data from the different tiers should, of course, be valid for the same area, time and spatial scale and be of comparable resolution. In the reviewed studies, only 13 studies did a confidence analysis, including eight newer studies from 2018 and 2019.
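How individual expert scores are merged, and how their variability can be reported alongside the final score, can be illustrated with a short Python sketch. It is an assumption-laden example rather than a method taken from the reviewed studies: the two matrix cells, the four expert scores per cell and the choice of the arithmetic mean and the sample standard deviation as summary statistics are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical scores (0-5) given by four experts for two matrix cells,
# keyed by (LULC type, ecosystem service).
EXPERT_SCORES = {
    ("forest", "timber"): [5, 4, 5, 4],
    ("cropland", "recreation"): [0, 3, 1, 4],
}


def summarise(scores):
    """Return the mean, the sample standard deviation and the panel size."""
    return {"mean": mean(scores), "sd": stdev(scores), "n": len(scores)}


summary = {cell: summarise(scores) for cell, scores in EXPERT_SCORES.items()}

for (lulc, service), stats in summary.items():
    print(
        f"{lulc} / {service}: "
        f"mean={stats['mean']:.2f}, sd={stats['sd']:.2f}, n={stats['n']}"
    )
```

Reporting the standard deviation and the number of experts next to each final score makes cells on which the panel disagreed visible, here the hypothetical cropland/recreation cell.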

Discussion
We considered a total of 109 studies over a period of 10 years that have applied the ES matrix approach. Those studies were mainly carried out in Europe, but an increasing number of applications outside Europe can be noted, particularly in Asia. Applications mostly focused on ES supply assessments, whereas ES demand and ES flow/use assessments remain a minority.
Our review shows that a mean of 15.6 ± 1.9 different ES was assessed through the matrix approach in 109 studies, whereas a mean of 7.9 ± 4.7 was found in the review by Hölting et al. (2019) of 101 studies using quantitative methods to assess landscape or ecosystem multifunctionality. The ES matrix approach allows the assessment of more ES than other approaches, notably by overcoming the limitations of data availability or the lack of proper proxies to quantitatively evaluate ES. We observed a lower number of cultural and provisioning ES than regulating ES assessed in the reviewed studies. The tendency to assess fewer cultural and provisioning ES than regulating ES was also observed in Egoh et al. (2012). The flexibility of the ES matrix approach was illustrated through the diversity of the applied scoring systems, data sources, matrix elements and the different purposes of applications. Nevertheless, the scoring system that was mostly used was the original "0 to 5" range, based on expert opinions, harnessing existing matrices or scores defined by authors, based on published results.
Our review highlights several major limitations or even mistakes in existing ES matrix approach applications.

The critical use of a pre-existing matrix
The review shows that 29% of the studies used an existing matrix as an initial matrix and that 16% of the studies used no other data in the matrix scores and made no adaptation of the existing matrix values. As for other value-transfer methods, the lack of adaptation bears the risk that incorrect or, for the specific case study, unsuitable values are used. A critical evaluation of the validity of the scores in the matrix should therefore be mandatory. A total of 21 studies specified that the scores came from one of the Burkhard et al. published matrices (matrices in Burkhard et al. 2009). The matrices developed in these studies were initially created for "normal landscapes" in northern Germany, based on an integration of expert knowledge, statistical data, model outcomes and literature data applications derived from several long-term case studies in this specific region (e.g. Fränzle et al. 2008). Hence, the matrix values are basically only valid for comparable landscapes and human-environmental system settings. Otherwise, they need to be adapted to each case study's specific conditions. For example, the process used to create the different matrices in the study of Stoll et al. (2015), based on an existing matrix, could be recommended, as several experts across Europe were involved in the matrix adjustment process. The adaptation of the scores from an existing ES matrix is also possible through a participatory approach with adjustments in consensus, as in Cai et al. (2017) and Tao et al. (2018).

The need for rigorous presentation of methods
In 27% of the reviewed studies, it is not clear how the data were used and where the final scores came from. This leads to a deficit in the scientific robustness and replicability of the studies, as well as a lack of proper consideration of the importance of the data acquisition protocol by the reviewers. It is important to be precise and explicitly transparent about the methods that were used in order to allow the end-user to be aware of uncertainties inherent in the assessment. A categorisation of the data and approaches used according to the "tiered approach" (see "Methodologies used in the studies" Section above) can help to understand the type and complexity of the applied approaches.

Variability and uncertainty analysis of scores should be the norm
The limits and uncertainties of the ES matrix approach have been listed, for instance in Hou et al. (2013), Jacobs et al. (2014) and Campagne et al. (2017), as well as some of the issues regarding the integration of assessments of experts' uncertainties. Campagne and Roche (2018) elaborated in more detail on how to collect and integrate expert knowledge to address some of the biases and limits of the expert elicitation method. In the 84 studies using expert scoring (no matter the initial matrix, Fig. 3), these issues were, unfortunately, not always properly addressed by the users of the ES matrix approach. When a score is expert-based, variability and confidence should be considered in the analysis with the final scores. Some studies developed or used confidence analysis and provided good examples (e.g. Elliott et al. 2019, Geange et al. 2019, Gorn et al. 2018, La Bianca et al. 2018). The results show that the number of studies that consider confidence analysis has increased in the last two years, but this still concerns too few papers.

The use of expert scoring in the ES matrix
A regular critique of the ES matrix approach is that it is too subjective, particularly when based on expert scoring alone. One way to tackle such remarks is to benchmark the ES expert scores against "more quantitative" estimates. However, to date, only a few studies have dealt with the topic of comparing expert-based ES matrix scores with quantitative estimates. Ma et al. (2019) compared five indicators of the global climate regulation ES in Germany, one qualitative (expert-based) and four quantitative (data-based), and found significant correlations. Roche and Campagne (2019) observed high levels of correlation between seven ES scores provided by an expert panel and eight spatial quantitative biophysical indicators at the landscape scale for the French Hauts-de-France Region. It has been shown that more complex ES assessment approaches do not necessarily deliver more robust results than those harnessing expert knowledge. Nevertheless, more comparison studies need to be conducted in different contexts to strengthen the applicability of the ES matrix approach. This is most important, since quantitative ES assessment methods are not necessarily more reliable than expert scoring. Moreover, it could be more important to pay attention to the selection and the number of experts in the panel and to the elicitation methods used to produce the estimates than to compare them with quantitative data. This, however, requires transparency about the methods used to fill in the ES matrix and a detailed description of exactly what has been done to obtain the scores, as stated in the second point of the discussion.
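Benchmarking expert scores against a quantitative indicator, as discussed above, amounts to a rank comparison, because matrix scores are ordinal. The following Python sketch computes a Spearman rank correlation by hand. It is not the procedure of Ma et al. (2019) or Roche and Campagne (2019): the five expert scores and the carbon-stock-like indicator values are invented for illustration, and the sketch assumes that neither series is constant.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both series have non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for five LULC types, listed in the same order.
expert_scores = [5, 4, 1, 0, 2]                      # expert-based, 0-5 scale
indicator_values = [210.0, 150.0, 40.0, 5.0, 60.0]   # invented quantitative proxy

rho = spearman(expert_scores, indicator_values)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based coefficient is used here because it only assumes that both series order the LULC types consistently, not that the 0 to 5 scale is linear in the measured quantity.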

Improve the characteristics of geospatial units to assess ES
Basically, the ES matrix approach is based on spatial units such as LULC categories, which can be considered important proxies for many ES. Nevertheless, LULC alone lacks information regarding important components of ecosystem condition that support ES capacities, such as soil type and quality, water availability, geomorphology or overall ecosystem integrity. These components also vary in space and time within and between LULC categories. One approach is to consider that the generality of LULC, especially when using broad categories, can be associated with high confidence in the ES scores and is, in itself, a strength and the main interest of the ES matrix approach: applicability and genericity. Another approach is to complement the LULC-based ES scoring with other sources of information that can be used to tune the matrix ES scores, based on local ecosystem condition, and thus improve the local validity of the scores. However, this reduces the manageability of the comparably simple look-up-table approach. Jacobs et al. (2014) recommended using broad land-use classes associated with high confidence scores, which are more easily transferable than locally specific LULC classes. Alternatively, the experts could be asked to provide ranges of ES values depending on different ecosystem states, which would allow the ES scores to be adapted in more detail to local ecosystem states.
Despite these limitations, the ES matrix approach has proven its usefulness. The advantages of the approach were listed, amongst others, in Jacobs et al. (2014) and Campagne and Roche (2018). The ES matrix approach is a good compromise for dealing with the 'urgency-uncertainty dilemma', offering: • salience through its relevance and adaptability to local issues, as well as its easy comprehension; • legitimacy in expert workshop processes that promote acceptance and understanding by stakeholders, policy- and decision-makers and land managers; • feasibility through its flexibility, time and resource efficiency and the advantage of overcoming the limitations of available data by creating data, according to experts.

Conclusions
The ES matrix approach is widely applied in a high diversity of contexts and with various data and quantification approaches. Based on our analysis of ES matrix applications and methodologies within a ten-year period, our key recommendations for future improvements include: • Proper communication and transparency of the quantification methods used, including uncertainty assessments; • Avoiding value-transfer from existing matrices to non-comparable case studies or adapting the values with a local participatory approach and local data. This means that an existing ES matrix cannot be used directly to estimate LULC types' ES capacities in a different context/region without being re-evaluated or adjusted, for instance in a dedicated expert panel session; • Improving quantification of ES scores in the matrix, harnessing and integrating methods from different tier levels (besides expert-based quantifications, also use of other data originating, for instance, from statistics, monitoring, citizen science, social media, remote sensing and/or model outcomes).
We also take the opportunity to provide recommendations for improved applications of the ES matrix that go beyond the results of the review (based on Campagne and Roche 2018 and Jacobs et al. 2014): • Appropriate stakeholder/end-user involvement during the assessment process; • Specification of the geospatial units, including information on ecosystem condition and spatial heterogeneities; • Improving the selection of ES that are relevant to be assessed in the specific case study; • Integration of different valuation methods, including biophysical, social-cultural and economic methods where appropriate; • Consideration of each individual study's purpose and flexible choice of respective methods and data to be used.
The simplicity of the method has been acknowledged, on the one hand, as the main strength of the method. On the other hand, this is also considered its key weakness. The success of the method is also linked to its feasibility and its easy comprehensibility that can promote the use and ability to increase awareness of ES for decision-making (Science for Environment Policy 2015). With the integration of data resulting from ES quantification methods that are combined with participatory approaches and the co-production of results, the final outputs are readily appropriable by stakeholders.
Nevertheless, the application of the ES matrix approach has to become more transparent and integrate more confidence analysis. It remains an important task to elaborate which are the most appropriate ES assessment methods for each individual ES or group of ES in different human-environmental system settings and for the different assessment purposes.