The analysis of time series of eddy-covariance (EC) data, as in-situ measurements of the net atmospheric CO2 exchange (NEE) reveals that their temporal variability is a ubiquitous phenomenon which may help investigations of the sensitivity of biogeochemical processes to the environment at the ecosystem scale. Over the past decade, numerous studies have focused on the interannual variability (IAV) of NEE and its environmental drivers at global scale (Jung et al. 2017; Liu et al. 2018; Fu et al. 2019). Studies were conducted for different ecosystems (Delpierre et al. 2012; Shao et al. 2016; Baldocchi et al. 2018; Fu et al., 2018) and individual sites (Wohlfahrt et al. 2008; Marcolla et al. 2011; Aubinet et al. 2018). Synthesis studies have reported significant spatial variability in NEE among ecosystem types (Baldocchi 2008; Niu et al. 2017; Baldocchi et al. 2018), as well as large temporal variability within sites.
The IAV of NEE at the ecosystem scale has been primarily linked to meteorological drivers (including extreme events) such as temperature, precipitation, and radiation, etc. Plant phenology is an additional control acting through, e.g., the length of the growing season, and the timing of leaf onset and shedding. Evidence exists that these drivers are significant contributors to the temporal variability of NEE. Subtler changes in the ecosystem structure due to natural and anthropogenic disturbances (e.g., stand/cropland/grassland management) or delayed physiological response to climate, and edaphic conditions, are also likely causes. Indeed, the variations in temperature, precipitation and incident solar radiation have been reported as the most important climate factors controlling variability in the NEE of different ecosystems (Jung et al. 2017), while less attention has been paid to climate interactions with phenology and physiology (i.e., period and amplitude of the growing season) (Delpierre et al. 2009). The direct and indirect controlling factors of the NEE IAV differ among the ecosystem types, with distinctive differences between forests and non-forests, evergreen needleleaf forests (ENF) and deciduous broadleaf forests (DBF), and between grasslands (GRA) and croplands (CRO) (Shao et al. 2016).
The response of the gross primary productivity (GPP) and ecosystem respiration (Reco) to physical and biological drivers controls the variability of NEE at different time scales, from seconds to years (Stoy et al. 2009). Long-term observations are necessary to capture such effects (Chu et al. 2017). Most recent studies indicate that IAV of NEE is most often associated with variations in GPP (Ciais et al. 2005; Stoy et al. 2009; Ahlström et al. 2015; Novick et al. 2016). In a synthesis study, Shao et al. (2016) found that the maximum photosynthetic rate was the most important factor for the IAV in GPP, which was mainly induced by the variation in vapour pressure deficit (D). For Reco, the most important drivers were GPP and the reference respiratory rate. The relative sensitivity of photosynthesis and respiration to environmental factors shows that GPP is more sensitive to drought stress than Reco in most ecosystems. This phenomenon was especially the case in Mediterranean climates, which experience large variability in the amount of rainfall during the growing season (Pereira et al. 2007; Allard et al. 2008).
Several methodologies have been used in the literature to explore carbon-flux/environment interactions. Common approaches are based on the statistical analysis between carbon cycle components and environmental variables applied on measured data. Among them, look-up tables are used for gap-filling (Falge et al., 2001; Reichstein et al. 2005). A number of numerical, analytical or predictive approaches, either detailed or simplified, have been proposed, which are grouped into the so-called Machine Learning approach (ML). Detailed or simplified ML can be applied to predict carbon fluxes based on environmental parameters. The approach has proved good at demonstrating the nonlinear relationship between ecosystem-based carbon fluxes and environmental variables based on EC measurements (Papale and Valentini 2003). These authors have demonstrated the potential of artificial neural networks in the study of carbon flux dynamics in European forest ecosystems, for gap-filling and spatializing carbon fluxes (Dou et al. 2015; Papale et al. 2015). In predictive studies of soil CO2 fluxes, or carbon and energy fluxes, ML based on the Random Forest approach (RF) (Dou et al. 2018) or regression splines (Tramontana et al. 2016) have been successfully used. The RF approach is highly flexible and can perform two types of data analysis: regression and classification. In our study specifically, we used the regression mode as our variables are continuous. RF is of particular interest because it determines the most important environmental parameters controlling the variations in carbon fluxes at different time scales. Spectral analysis is also an appropriate technique to investigate the control of fluxes in the time-frequency domain, which is therefore complementary to the RF approach. Previous studies used Fourier transform or cross-correlation approaches, (Baldocchi et al. 2001; Stoy et al. 2005, 2009; Vargas et al. 2010a). More recently the alternative technique of wavelet analysis has emerged (Vargas et al. 2010b, 2011; Fares et al. 2013). The latter is particularly relevant for investigating the spectral properties of biometeorological variables that exhibit non-stationary features (Katul et al. 2001) and for partitioning the drivers of their temporal variability along the frequency domain. In that sense, the scalogram — plot of a signal spectrum or a two-signal cross-spectrum as a function of time and frequency — can be more useful than the spectrogram for analyzing real-world signals that are typically non-stationary. Trends in means and/or variability are typical features of such signals, but changing modes of variability could also be present. Scalograms are the dedicated tools for detecting the latter. They also have the advantage of better time localization for short-duration/high-frequency events, and better frequency localization for low-frequency, longer-duration events.
Comparing the interaction between carbon fluxes and environmental parameters among different ecosystems requires concurrent time series of turbulent data obtained with a common methodology and standardized post-processing treatments. Thus far, this requirement has not yet been satisfactorily met for the various methodologies used either when measuring turbulent variables or for processing the data obtained. Moreover, concurrent time series of flux and meteorological data extending beyond 10 years were not available until recently. The present study is built upon time series of data from different types of ecosystems representative of the French network of stations contributing to the European ICOS Research Infrastructure. Among them, three historical stations were launched during the Euroflux project in 1997 and two during the subsequent CarboEuroflux project in 2000 and 2002. The data analyzed represent the carbon flux components, the Net Ecosystem Exchange, Gross Primary Production and total ecosystem Respiration. They cover the five main temperate plant functional types along time series extending from 9 to 13 years. The values of the three flux components were calculated from raw data using a common methodology thus eliminating artificial discrepancies between sites.
Our analysis investigates the respective influence of the drivers of NEE, GPP and Reco. It uses a Random Forest analysis performed along a range of discrete frequencies and a spectral analysis for the determination of cross-coherence between CO2 fluxes and their environmental drivers across the frequency domain.
Material and methods
Five sites from the pre-ICOS network were selected for the analysis, as representative of plant functional types in France:
- A temperate Deciduous Broadleaf Forest (DBF): hereafter referred to as Barbeau
- A temperate Evergreen Needleleaf Forest (ENF): Le Bray
- A Mediterranean Evergreen Broadleaf Forest (EBF): Puechabon
- A temperate/continental extensive Grassland (GRA): Laqueuille
- A temperate Cropland (CRO): Auradé
Meteorological data and flux measurements
The time series of in-situ measurements analyzed by the RF or wavelet approaches are summarized in Table 2. Among them, we used the incident solar radiation (Sd), the air temperature (Tair), the precipitations (P), the wind speed (u), the vapour pressure deficit (D) computed from the air temperature and the relative humidity, and the soil water content (θ). The relative extractable water (θEW) was computed from the volumetric soil water content (θ) normalized by the water holding capacity estimated from the difference between the water content at field capacity and at wilting point (Granier et al. 1999), previously quantified for these specific sites. When θ was not available or when the data series contained too many gaps, a site-specific model for the corresponding site was used to gap-fill θEW (Granier et al. 1999; Mouillot et al. 2001; Loustau et al. 2005; Vuichard et al. 2007; Cabon et al. 2018, supplementary material § S2).
The net ecosystem exchange (NEE) was determined using the eddy-covariance methodology which combines measurement at 20 Hz of the 3-D wind speed and surrogate temperature using a sonic anemometer, and with the molar fractions of CO2/H2O using an infrared gas analyzer (see Table 2 for sensor references at each site).
Eddy-covariance data processing
Half-hourly eddy-covariance fluxes from the five selected sites were calculated using EddyPro® (www.licor.com/eddypro). Standardized processing was applied for all the sites. i) Axis rotation of the wind vector was performed using the planar fit methodology (Wilczak et al. 2001) for Barbeau, Puechabon, Le Bray, Laqueuille and the double rotation for Auradé because of the fast growth rate of the vegetation. The block-average method was used to extract turbulent fluctuations from time series (Gash and Culf 1996). Time lags between vertical wind speed and the variable of interest were determined for each averaging period by an automatic time lag optimization. ii) Low frequency spectral corrections were applied according to the analytic method described by Moncrieff et al. (2005). High frequency spectral corrections were applied depending on the setup: the fully analytic method of Moncrieff et al. (1997) was adopted for the open-path systems (LI-7500, Auradé, Laqueuille, Barbeau), which includes a correction for sensor separation effects; an in-situ spectral correction method (Fratini et al. 2012) was used for the closed-path analyzers (LI-6262, Puechabon and Le Bray) as a suitable method to describe attenuation along the intake tube. For the closed-path analyzers, a correction was also applied to account for sonic anemometer and analyzer separation according to Horst and Lenschow (2009). For the open-path LI-7500 analyzers, the WPL correction (Webb et al. 1980) and the self-heating correction were also applied (Burba et al. 2008). iii) As a QA/QC test of the final fluxes, we used the standard flags (0-1-2) defined by Mauder and Foken (2004) and data with a flag = 2 were discarded. Remaining flux data were also filtered using statistical tests on the raw data (Vickers and Mahrt 1997); these included: despiking, absolute limit determination, amplitude resolution, drop out, skewness and kurtosis tests, and discontinuities. Finally, we tested for insufficient atmospheric turbulence with a friction velocity (u* threshold being determined for each site (Reichstein et al. 2005).
Partitioning and gap-filling
Incident solar radiation (Sd), air temperature, Tair and vapour pressure deficit (D) were gap-filled using the marginal distribution method (Reichstein et al. 2005). To compare CO2 fluxes computed over different integrated time scales from the different sites, we filled gaps in the half-hourly flux time series using the gap-filling tool and the nighttime partitioning method from Reichstein et al. (2005), developed in an R package (Wutzler et al. 2018). An exception was made for Barbeau for which the 2013 NEE long gap of the LI-7500 was gap-filled with the flux computed from the LI-7200 deployed in parallel with the LI-7500 since 2012 (Delpierre N., pers. comm).
Random forest analyses
A random forest (RF) machine-learning algorithm (Breiman 2001) was used to identify how the environmental controls affect the variability of NEE, GPP and Reco in the long term. The RF is a non-parametric statistical estimation technique based on the use of decision trees and multiple regressions. A detailed explanation of the algorithm is given by Breiman (2001). Each of the decision trees that form an RF are built from bootstrap samples of the original dataset (i.e., training data). For each tree, the data that are not included in the bootstrap samples are named Out-Of-Bag (OOB) samples (i.e., test data). The training data is randomly chosen for the splitting rule at each node. Each tree is then fully grown or until each node is pure. The trees are not pruned.
Using the OOB samples, the RF algorithm produces a robust and informative statistic able to provide the importance of an explanatory variable X for predicting the variable of interest Y (McInerney and Nieuwenhuis 2009). This importance of explanatory variables, X, over the response variables, Y, is calculated in terms of percentage increase of the mean squared error (%IncMSE) if the variable X in the model is replaced with a random one (Eq. 1).
More precisely, an explanatory variable, X, is considered important if by “breaking” the link between X and Y, the prediction error increases. To break this link, Breiman (2001) suggested randomly permuting the predictor variable, X, in the OOB samples. The importance corresponds to the mean increase in the prediction error on the tree ensemble. Therefore, if we randomly permute an explanatory variable that does not gain us anything in prediction, then predictions will not change much and we will only see small changes in MSE. On the other hand, the important variables will significantly change the predictions if randomly permuted, so we will see more significant changes in MSE. The higher the difference is, the more important is the explanatory variable.
The package ‘randomForest’ (Version 4.6-14, Liaw and Wiener 2018) available in R was used for these multiple regression analyses and is presented in the supplementary material (S3). The analysis was conducted first to compare five plant functional types (PFTs) at the half-hourly time-step. Secondly, we performed the analyses at weekly, monthly and seasonal scales for only two sites, those presenting contrasting results at the half-hourly time-step. Among the data presented in section 2.2, the data used for the RF analyses were Sd, Tair, P, u, D, θEW, and carbon dioxide concentration (CO2). To make the graphs easier to read at the aggregated time scales, we aggregated radiative parameters into an ‘Energy’ variable (Sd + Tair) and an Index of Stress relative to the air or soil water deficit constraint defined by ‘IStress’ (θEW+ D). In order to cross-compare the %IncMSE computed at each site and for each flux, we present this %IncMSE as a relative value of the total percentage (%Imp).
Missing environmental values are omitted in the analysis. Indeed, to make sure that the final results were not biased by the partitioning and gap-filling method that is based on a relationship between hourly Reco and Tair, and GPP and Sd, we performed the analyses using only measured NEE (qc = 0), nighttime Reco (Sd < 20 W m−2) and daytime GPP (Sd > 20 W m−2) when it was directly estimated from NEE minus Reco.
Wavelet coherence analyses
Local and global wavelet power spectra were computed using the Morlet wavelet transform (Morlet et al. 1982). Morlet proposed the use of a window with a size that depends on the frequency analyzed, with a fixed number of oscillations. The analyzed functions are precisely the Morlet wavelets that allow us to:
- Observe the high frequencies with a high temporal resolution and therefore to provide accurate information on the locations of brief phenomena.
- Observe the low frequencies over a sufficient period of analysis to account for slow phenomena.
The Morlet wavelet transform of a time series (xt) is defined as the convolution of the series with a set of “wavelet daughters” generated by the mother wavelet by translation in time by τ and scaling by s (Eq. 2) (Rösch and Schmidbauer 2018). The position of the particular daughter wavelet in the time domain is determined by the localizing time parameter τ being shifted by a time increment δt. δt is therefore the time resolution, i.e. the sampling resolution in the time domain, and 1/δt represents the number of observations per time unit. The choice of the set of scales s determines the wavelet coverage of the series in the frequency domain. This scale value is a fractional power of 2, a ‘voice’ in an ‘octave’ with 1/δj determining the number of voices per octave. δj corresponds to the frequency resolution, i.e. the sampling resolution in the frequency domain, and therefore, 1/δj refers to the number of suboctaves, i.e., voices per octave.
They also defined the wavelet power spectrum or time-frequency (or time-period) wavelet energy density:
Following the approach of Torrence and Webster (1999), the wavelet coherence between two time series () and () was computed as:
The notion of wavelet coherence requires smoothing of both the cross-wavelet spectrum and the normalizing individual wavelet power spectra, which is indicated by ‘S’ smoothing operator and is defined as S(W(τ, s)) = Sscale (Stime(W(τ, s))). Stime represents smoothing in time and Sscale is smoothing along the wavelet scale axis. is the cross-wavelet transform (Eq. 5), defined as the product of the wavelet transform of and the complex conjugate of the wavelet transform of (Torrence and Compo 1998):
Squared spectral coherence estimates, as described by Eq. 4, express the degree of linear association between the phases and amplitudes of the two data records within each normalized non-overlapping frequency band. These estimates show the proportion of common variance within each frequency band shared by two variables. Eq. (4) is the analogue with the notion of Fourier coherence and the coefficient of determination in statistics. It may actually be regarded as such, which means that a coherence value of zero signifies that the two time series are unrelated, whereas a coherence value of unity indicates the two time series are linearly related at the given frequency and time. Finally, through this cross-wavelet analysis, we can study the two series synchronization in terms of the instantaneous or local phase advance of any periodic component of () with respect to the correspondent component of (), the so-called phase difference of over at each localizing time origin and scale.
Eq. 6 equals the difference of individual phases, Δφ = φx−φy, when converted to an angle in the interval] −π, π].
The package ‘WaveletComp’ (Version 1.1, Rösch and Schmidbauer (2018)) available in R was used for these coherence analyses. The parameters used in the analysis and the scalogram reading guidelines are provided in the supplementary documentation (Supplementary material: § S3.2, S3.3 and Table S2, Fig. S2). All the results are presented as image plots of both the individual wavelet power spectrum (and average) and cross-wavelet coherence, in the time-period domain for the period 2006–2011. This period was selected as a common period for the three selected sites and with the lowest gap-filled data percentage. Since, among the five sites considered, our Random Forest analysis allowed us to identify three main patterns and our power and wavelet results are shown for three sites selected out of the five original sites: a forest with permanent foliage, Puechabon (similar to Le Bray), a forest with seasonal foliage, Barbeau, and the crop site, Auradé (similar to Laqueuille). The cross-coherence analyses focused on the most important meteorological parameters extracted from the Random Forest analysis, i.e. θEW and D. They comprise both the main potential constraints on carbon fluxes (Novick et al. 2016) and exhibit a stronger and less predictive temporal variability than Sd and Tair across sites. In addition, spectral analysis of the cross-coherence of these two variables with carbon flux components from 30 minutes to 6 years has rarely been performed.
Firstly, we are presenting the analyses of the role and importance of environmental drivers in determining the variability of carbon fluxes for the five sites selected over the whole period of measurement provided in Table 2 at each site. This phase does not need any specific description of the environmental conditions as it is based on the RF approach mixing all environmental conditions. Thereafter we are showing the investigations on the temporal correlation between time series for three of the five sites selected by identifying periodicities and quantifying phase differences between series for a range of temporal scales. The analyses were performed on a 6-year period (2006–2011) for which environmental and fluxes patterns are described.
Random forest analysis
At half-hourly resolution, the percentage of variance explained by the RF model showed values between 65% and 90% (Table 3) and was higher by between 4% and 30% for NEE and Reco as compared with GPP. For all five sites, Sd was the most important factor in explaining NEE and GPP, accounting for more than 45% of the half-hourly variability; whereas Tair was responsible for most of the variability of Reco (>70%), followed by θEW (Fig. 2). The atmospheric concentration of CO2 appeared third in the list of factors explaining the variance of GPP, Reco and NEE for poor mixing conditions (Reco) and canopies with lowest roughness (grassland at Laqueuille and crops at Auradé). The precipitation contribution was negligible.
Three main results were revealed by this analysis at half-hourly time-step:
- Globally, the four most important variables extracted were Sd, Tair, D, and θEW (Fig. 2).
- The pattern of RF for NEE was roughly similar to the GPP’s with Sd being the largest contributor and either θEW, D or Tair second.
- Reco was instead systematically dominated by the Tair contribution with θEW ranked second contributor.
The main difference between PFTs came from the variance profiles of GPP and NEE. For those PFTs with strong LAI dynamics during the year, i.e., Barbeau, Laqueuille, Auradé, Tair appears as the second most important variable after the incident radiation. The θEW or D only appeared as third or fourth-most important contributors, wind speed and precipitation being last. Conversely, for evergreen PFTs Puechabon (holm oak) and Le Bray (maritime pine), the θEW and D came second in the variance profiles of GPP and NEE, θEW being the most important parameter at Puechabon and D at Le Bray.
Analysis across integration times
Across time scales, the GPP-NEE similarity was conserved from the 30-minute time scale to the seasonal at the cropland site, but the precipitation overtook the stress contribution at the season scale (Fig. 3B). At Puechabon, the GPP profile was changed and the stress index became the main contributor to GPP at monthly and seasonal scales (Fig. 3A).
According to the integration time considered, the variables selected did capture a variable fraction of the carbon flux variability in the evergreen broadleaf at the Puechabon site and the cropland at Auradé, from 21% (the seasonal variance of GPP at Puechabon) to 89% (the half-hourly Reco at Puechabon) (Table 3). In the EBF, the variance contribution profile was maintained at the different scales, with the dominance of Sd (more than 50% for the different scales) for NEE. However, the constraint index, ‘IStress’, contributed gradually to the variability reaching almost 50% of the seasonal variance (Fig. 3). Reco patterns did not significantly change except for the season for which D and θEW equally contributed to the variability (35%) and the precipitation contribution reached 4% against 0.3% at the hourly scale. Finally, the GPP variability was mainly affected by the θEW from daily to seasonal scale, that became the most important variable across increasing time scales. Including D, these two potential constraints contributed to 64% and 56% at monthly and seasonal scales, respectively.
In the cropland, as regards NEE and GPP, Sd and Istress were still dominant along the different scales (sum >50%). However, precipitation became as important as Istress at seasonal scale. For Reco, we also observed an increasing importance of wind speed especially at seasonal scale (>50%). The difference in performance between the two PFTs, EBF and crops (CRO), decreased at longer time scales.
We focused the next analyses on three sites that presented contrasting results in the RF analyses (Barbeau, Puechabon and Auradé), and for a common period 2006–2011 with the lowest amount of gap-filled data.
Environmental conditions and carbon flux time series over the period 2006–2011
The 2006–2011 period was characterized by contrasting environmental conditions that are depicted for three sites, Barbeau, Auradé and Puechabon, in Fig. 4 and Supplementary material: section S4 and Table S3. The average annual temperatures (See Table S3) were slightly higher than the long-term average (Table 1). The highest monthly average temperature was recorded either in July or August, with values of 19.7 °C, 21.7 °C and 23.3 °C for Barbeau, Auradé and Puechabon, respectively. The warmest year was 2011, followed by 2006 and 2009. The year 2010 was coldest, despite a warm July, in part due to two months in winter and autumn being colder than the long-term average. Solar radiation during the spring varied strongly between years, with periods such as May 2011 presenting a significant increase in Sd at the three sites, compared to the other years. This increase is consistent with the low rainfalls recorded at the same time. Similarly, November 2011 had the lowest solar radiation at Puechabon; consistent with the highest amount of rainfall. Further comments are provided in Supplementary material S4.
Carbon fluxes: 2006–2011
Using the sign convention that a flux from the vegetation to the atmosphere is positive, the annual balance of carbon (NEE) in the evergreen broadleaf forest (EBF) ranged from −381 g C m−2 yr−1 (2007) to −136 g C m−2 yr−1 (2006) and from −579 g C m−2 yr−1 (2007) to −425 g C m−2 yr−1 (2006) in the deciduous broadleaf forest (DBF) with a smaller year-to-year variability than at the other sites during the studied period (Fig. 5, Table S3). At Auradé, the winter wheat rotations were a carbon sink with carbon sequestration between −166 g C m−2 yr−1 and −11 g C m−2 yr−1, whereas rapeseed and sunflower rotations were sources of carbon (+12 and +176 g C m−2 yr−1 respectively).
The temperate DBF site presented the highest gross primary production (GPP) between 1829 g C m−2 yr−1 (2006) and 2162 g C m−2 yr−1 (2011) with no significant differences between years, considering the uncertainty on the measurements (about ±10%, Baldocchi et al. 2018). The Mediterranean EBF GPP ranged from 986 g C m−2 yr−1 (2006) to 1475 g C m−2 yr−1 (2007) and the crop had the lowest annual values of GPP ranging from 642 g C m−2 yr−1 (sunflower rotation in 2007) to 1239 g C m−2 yr−1 (winter wheat, 2008). The temperate DBF site also presented the highest Reco between 1404 g C m−2 yr−1 (2006) and 1616 g C m−2 yr−1 (2011). In the Mediterranean EBF, Reco ranged from 847 g C m−2 yr−1 (2006) to 1091 g C m−2 yr−1 (2007) with no significant difference between years. The annual Reco ranged from 742 g C m−2 yr−1 (for the sunflower, 2007) and 1125 g C m−2 yr−1 (for the winter wheat, 2010). The annual peak in both GPP and Reco was observed in May/June in the cropland (except for the sunflower, in July) in June for the EBF and July for the DBF.
Power and cross-coherence spectra of CO2 fluxes and climate variables at multi-temporal scales
The power and wavelet analyses were performed at three sites representing different plant functional types (PFTs) and for which the RF analysis showed contrasting patterns (Puechabon, Barbeau and Auradé). In what follows, the cross-correlation of ecosystem fluxes with θEW and D was more specifically investigated as these two variables represent a potential stress on carbon fluxes with a stronger and less predictive temporal variability than Sd and Tair across sites. In addition, spectral analysis of the cross-coherence of these two variables with carbon fluxes from 30 minutes to 6 years has rarely been performed. For all scalograms, red colours indicates high power whereas blue colours indicates low power.
Individual power spectra of climate variables (Fig. 6)
Despite the range of climates covered, the power spectra of meteorological variables and their temporal variations from 2006 to 2011 were similar among the three sites. For each variable, the average wavelet power showed two major peaks at 1-year and 1-day bands. The water vapour deficit and soil moisture variables showed, however, the distinct features documented below.
- The water vapour deficit (D) spectrum reflected both the low frequency pattern shown by temperature and the high frequency behaviour of Sd. It exhibited high power at intermediate scales, from 8-day to 3-month periods, only during the summer season. Plausible explanations of the discrepancy between the temperature and water vapour saturation deficit at high frequency may be the frequent zeroing of D values during nighttime and the non-linear relationship between temperature and air water vapour saturation which both amplify the daily D variation for a given step change in temperature.
- At all sites, the relative extractable soil moisture content, θEW, is characterized by narrow bands of high power extending from 1-year down to 2-week periods. It showed little variation at periods shorter than a month. Different patterns can be seen, which appear to depend on the soil thickness. For soil deeper than 1 m, a 12-month periodicity is observed continuously for θEW irrespective of plant functional type (EBF or DBF). Power peaks (red colour) are observed at higher frequency only after heavy precipitation at the end of the summer/autumn in 09–10/2006, 11/2008, and autumn 2011 for Puechabon (EBF) and 10/2008, 08/2010 and 08/2011 for Barbeau (DBF). The rainstorms in late autumn have even induced locally high power peaks of θEW at intermediate frequency bands (periods < 1 month). No such patterns are observed during summer. We assume this is because high frequency precipitation events in summer only marginally affect the soil water volumetric content because of a dilution effect and the interception of precipitation by the vegetation canopy that is proportionally higher for lighter rainfall episodes and during summer. For the shallowest soil (Auradé), the power at the year period is intermittent whereas clear pulse dynamics are observed at high frequency (1 per day). Indeed, due to the shallower soil in the cropland (Auradé), the response of θ and θEW to precipitation events at higher frequencies is strong, as seen by the spikes at 3-hour to 6-month periodicities attributed to wetting-drying cycles. This was observed for all years, except 2008 when the cropland soil was close to saturation throughout the year and the θEW remained above 0.7 from June to September.
Individual power spectra of CO2fluxes (Fig. 7)
Over the 6-year period considered, the effect of the annual cycle of leaf formation and leaf fall is evident in the measured CO2 fluxes. This is particularly clear at Barbeau deciduous broadleaf forest, meanwhile the fluxes from the cropland were controlled by the farming operations between the sowing and the harvest date. The Puechabon evergreen Mediterranean broadleaf forest had no distinctive periodic pattern.
NEE and GPP. At high frequencies and in accordance with the RF analysis, the wavelet decomposition of NEE and GPP were strikingly similar, suggesting that processes driving temporal variations of GPP were also controlling NEE, more than temporal variations of Reco. Power spectra of NEE and GPP had significant high power at the 1-day period and a secondary small peak at 0.5 day during the growing season. Hence, the high frequency peaks were limited to spring-summer for the DBF and cropland, but covered the entire year in the EBF type. There was one exception however during winter 2008–2009 in the cropland where a secondary peak of high variance is observed due to the winter wheat sown in September.
At intermediate and low frequencies, the power spectra of NEE and GPP showed significant peaks at 1-year and 6-month periods especially in the deciduous broadleaf and cropland ecosystems. The NEE annual peak was systematically attenuated as compared with GPP presumably due to a compensating effect of Reco. GPP from the EBF site was characterised by a continuous high power at 1-year and by several hotspots at the 6-month periodicity as in 2006, 2007, 2009 and 2010.
Reco. The power spectrum of Reco was different from those of the GPP and NEE and was enriched at intermediate scales, from a week to 6 months. The Reco power exhibited within the week to 3-month bands has, however, no strong influence on the NEE spectrum. A distinctive feature of the Reco spectra is the absence of spectral gaps at intermediate scales (one to three months). A statistically significant strong and dominant wavelet band was observed at a 1-year periodicity at all sites and a secondary power at 6-month periodicity for the EBF and cropland sites. Specifically, these peaks were detected mostly during the growing season. However, two local events were also observed outside the growing season for the cropland site at the end of 2009 and during the autumn of 2011. These events impacted the power spectrum of NEE and were caused by the rotations between summer (rapeseed) and winter (wheat) crops and related soil disturbances (ploughing).
Cross-coherence spectra (Figs. 8 and 9)
The cross-wavelet coherence (CWC), i.e., the squared spectral coherence, between two signals, identifies the frequencies at which the two variables are most strongly correlated. CWC can be seen as a frequency-domain analogue of the squared correlation coefficient. This analysis allowed us to investigate further the temporal coherence between CO2 flux components (GPP and Reco), and D and θEW. This coherence is represented in the scalogram (Figs. 8 and 9) with a scale between 0 (low coherence, blue colour) and 1 (high coherence, red colour). High coherence is expected to occur when both GPP (Reco) and D (θEW) co-vary. Conversely, low coherence occurs for the periods when variables were constant or uncorrelated.
For the three sites selected, Figs. 8 and 9 show the time series of GPP, Reco, D and θEW together with cross-coherence spectra between GPP or Reco and D (Fig. 8) or θEW (Fig. 9), from 2006 to 2011. The annual harmonic mode — or the annual cycle — showed the dominant effect of D and θEW on GPP and Reco for the three sites as seen by the continuous high coherence (red colour) from 2006 to 2011. At higher frequencies, the daily cycle was marked by a strong coherence with D of both GPP and Reco at all sites. At the day period, the CWC between θEW and Reco was substantial in the cropland only. For intermediate periods (week to 6 months), the relationship between CO2 flux and driving variables was highly non-linear in the modes 1 week to 2 months, as evidenced by the spectral gap between the day and week periods. This gap was however discontinuous for D, for which coherence peaks were observed sporadically during growing seasons. Hotspots of high coherence were detected for periods from a month to 6 months and will be described further below, first for the D and second for θEW.
The D cross-coherences with GPP and Reco were roughly similar. The strong common power (red colour) with GPP at the 1-day period was observed during the growing season only. For the EBF site, the 1-day correlation was attenuated during winter, consistent with the smaller variations in D. At this 1-day period, GPP and D were in phase (arrow pointing up right) and GPP attained its maximum between 2 and 6 hours ahead of D (+π/12 and +π/4) (see Fig. S2). This was expected because D lagged 2 to 3 hours behind incident Sd. At 1-week to 2-month periods, correlations between GPP and D were weaker and phase relationships were more chaotic for the three sites. However, some localized high coherences spots were observed at intermediate scales in the EBF when θEW was not limiting (spring-summer 2007, spring 2011). In most of these situations, GPP and D were in phase with zero lag observed. At longer periods, (1 to 6 months) correlation hotspots between GPP and D were exhibited for the three sites. These seasonal hotspots were well individualized during the DBF growing seasons with GPP lagging behind D by an angle of 7π/4, that is about one week. The springs of 2006 and 2009 correlation hotspots were instead at the seasonal scale (4-month period), during which GPP and D were in anti-phase, D leading GPP by an angle of 3π/4 (∼1 month). In the EBF and cropland type, GPP and D are also correlated at periods of from 3 to 6 months. However, the GPP-D correlation was intermittently lost in the EBF, e.g., in spring 2008, both winter/spring 2009 and 2011, and in the crop site during winters of 2007, 2008, 2011. At one-year period, the three sites presented a strong correlation between GPP and D. The DBF signals were synchronized, while GPP led D by 3 months (+π/4 angle) in EBF and 4 months in the cropland (+π/3 angle).
The coherences of θEW with GPP and Reco were also similar between sites showing two typical frequencies at day−1 and year−1 frequencies. Indeed, the coherence was highest at frequency bands and dates where θEW co-varied with GPP or Reco. Similar co-occurring bands were observed at higher frequencies in the crop site, where the soil is shallowest. As for D, the GPP (Reco) – θEW cross-power showed seasonal discontinuities at 1-day period related to the phenology of the canopies. A spectral gap appeared between the week and month periods where rare, discontinuous, coherence hotspots occurred. At periods from 1 to 6 months, some scattered, discontinuous coherence hotspots emerged during 2006–2011; these were consistent with the θEW power spectrum that showed its maximal power at these frequencies.
We have based our data analyses upon homogenized time series of net CO2 fluxes and their components, GPP and Reco, and their relationship with independent drivers. The primary goal was not to maximize the fraction of variance explained but rather to examine the variability of fluxes as explained by climate drivers and the correlation patterns among PFTs. We limit our analysis to French sites under temperate and Mediterranean climates during the period, 2006–2011, for which a common 6-year long time series of raw data could be homogenized and processed. Our main findings concern the contribution of climate variables to the CO2 fluxes among five PFTs, the relationship between the temporal variations of GPP, Reco and the resulting NEE with climate, soil characteristics, PFT and management. Based on the RF and spectral analyses, they confirm the previous conclusions and intuitive results reached by the pioneering spectral analyses of ecosystem fluxes (Baldocchi et al. (2001), Katul et al. (2001) and Stoy et al. (2005, 2009)).
Contribution of environmental variables
The similarity between the PFTs in the distribution of the half-hourly CO2 flux variances among contributors was assigned to the ubiquitous biological processes involved in the C3 photosynthesis metabolism: CO2 diffusion through stomata and mesophyll, light absorption by chlorophyll antenna, the Calvin enzymatic cycle and RuBisCO kinetics. The high frequency dependency of these processes on incident light, temperature and CO2 concentration is well characterized. Furthermore, such processes have been shown to integrate straightforwardly from the cell to the leaf and canopy levels, with relatively few upscaling effects (Farquhar et al. 1989; Rayment et al. 2002). Such processes respond to environmental drivers with time constants of minutes to an hour (Vialet-Chabrand et al. 2017) so that cross-variations of CO2 fluxes with climate variables are expected to occur from minutes to decades. Since the RF analysis reveals only the direct effect of environmental variables at a 30-minute time-step, we should expect that a convergence among PFTs emerges from the ranking of driving variables by the RF algorithm.
The RF results are consistent with the conclusions reached by previous studies regarding the control of GPP, Reco and NEE at high frequency (Baldocchi et al. 2001; Stoy et al. 2009; Ouyang et al. 2014). Here, the differential role among PFTs of hydrological drivers such as D and θEW on GPP and NEE is also apparent. Besides increasing atmospheric demand for evapotranspiration, increasing D has been assigned to stomata closure in response to increased atmospheric dryness (Farquhar 1978; Medlyn et al. 2011; Novick et al. 2016); although not necessarily through a direct effect of D (Damour et al. 2010). High values of D have been widely observed to coincide with stomatal closure and GPP decrease (Oren et al. 1999; Fu et al., 2018). D is therefore likely to control the CO2 fluxes more tightly for those PFTs which expose the canopy foliage to high values of D for longer periods, such as the evergreen ecosystems at Puechabon and Le Bray in a warmer environment (Rambal et al. 2003). The D control was much less tight under wetter climates and canopies, e.g., the DBF Barbeau site and the alpine grassland at Laqueuille.
Soil drought also affects photosynthesis by degrading the plant and leaf water status, leading to stomatal closure, impaired growth, leaf shedding and ultimately xylem damage and plant mortality (Bréda et al. 2006; Granier et al. 2007). Soil water stress at θEW lower than 0.4 is reported to affect photosynthesis (Granier et al. 1999, 2000). This condition occurred on average on 115, 61 and 54 days for Puechabon (Mediterranean), Barbeau (temperate), and Auradé (crop), respectively, between 2006 and 2011. In the holm oak stand of Puechabon, Pita et al. (2013) showed a low GPP with low diurnal variability in accordance with a strong stomatal control caused by the severe water stress. This difference among sites and climates explains our RF analysis conclusions regarding the between-site differences. The soil water, θEW, is controlled by precipitation, runoff and water uptake by plants, but its variations are dampened by the soil water holding capacity. This combination of controls shifts its power scale pattern towards lower frequencies than those of D, a difference also observed in arid ecosystems by Jia et al. (2018). The θEW power gap in the high frequency range is thus more pronounced where the soil water holding capacity is large, e.g., in the old growth sessile oak stand at Barbeau (∼170 mm) as compared to the cropland Auradé (∼60 mm).
Compared to GPP, the situation is different for Reco, which is controlled by diffusion and other transport processes intervening between the source metabolic activity and putative decarbonation processes in the soil and aboveground biomass. Indeed, not only the Reco-Tair but also the Reco-θEW correlation is explained at least partly by a causal relationship, with an effect of temperature and moisture at periods from an hour to months and a year (e.g., Janssens et al. 2001; Vargas et al. 2010a; Sierra et al. 2015). However, the correlation between half-hourly values of Reco and θEW as observed at all sites might be partly coincidental, because respiration and soil moisture vary seasonally in parallel; they show lesser variations in winter when soil is cold and water-saturated, and larger variations during the growing season (Vargas et al. 2010b).
For the functional types with a discontinuous vegetation cover (Barbeau, Auradé and Laqueuille), the RF ranking of the air temperature, just after the radiation, indirectly suggests the role phenology played in the temporal pattern of GPP and therefore NEE (White et al. 1999). Interestingly, we observed that wind speed and CO2 concentration above the canopy co-varied more closely with GPP in smoother canopies (Auradé and Laqueuille). This observation could be explained by the contrasting atmospheric coupling of forest, grasslands and croplands: in low vegetation, turbulence is less developed and the aerodynamic transfer less efficient, i.e., the vegetation is decoupled from the atmosphere. This process leads to the smoother canopies having a larger diel variation of CO2 concentration that is synchronized and opposite to GPP variations. Therefore, we cannot conclude a direct influence of CO2 on GPP through enhanced photosynthesis as has been suggested by Fernández-Martínez et al. (2017).
The percentage of variance explained by climate variables in the RF analysis decreased with the integration timespan, confirming that other variables or biotic parameters or delayed effects contribute to the monthly to yearly variability of CO2 fluxes. Using regression algorithms with the machine learning approach to predict NEE, Reco and GPP in all types of ecosystems (ENF, DBF, EBF, grasslands, croplands, etc…), Tramontana et al. (2016) were led to similar conclusions. This finding suggests that additional explanatory variables must be introduced to capture CO2 flux variability at low frequency, especially at sites with no clear seasonal cycle as in evergreen broadleaf forest and crops. In different forest ecosystems (ENF, DBF and EBF), Urbanski et al. (2007); Delpierre et al. (2012) and Ouyang et al. (2014) showed that the direct influence of hydroclimatic drivers on canopy fluxes, especially NEE, was masked at longer time scales by the effects of canopy dynamics, photosynthetic thermal acclimation and ecosystem disturbances. As shown here by the RF analysis across integration times, several studies confirmed that climatic variables explain a large fraction of the variance of CO2 fluxes, up to 80%, at high frequency (30-minute flux averages) (Loescher et al. 2003; Hollinger et al. 2004; Moffat et al. 2010), but can be reduced to an explained variance of 50% when scaled to longer time intervals (Law et al. 2002; Luyssaert et al. 2007b). Indeed, the RF analyses cannot account for the putative time lag and memory effects on NEE, GPP and Reco (Papale et al. 2015) that might arise from the transfer of carbon from photosynthesis (GPP) to biomass and soil respiration (Reco) and its variation between PFTs, seasons and species (Dannoura et al. 2011; Vargas et al. 2011).
The main explanation for the uneven RF performances among PFTs is their specific disturbance regimes. The lowest score obtained in the cropland site might be due to management practice (sowing dates) and disturbance (soil preparation, harvest) which are indeed crucial but not accounted for by either the RF or spectral analyses. The long-term effects of disturbances such as management and extreme events (heatwaves, storms and prolonged drought) were also not revealed by our RF analysis (Amiro et al. 2010; Moreaux et al. 2011; Buysse et al. 2017). As no biotic parameters were selected in the present analysis, our results are not overestimated, since as we discussed above, these biotic parameters are known to be relatively important in explaining flux variability, depending on the time scales considered. At the Auradé site, a supplementary test showed an increase up to 92.8% in the GPP variance explained when the vegetation structure and dynamics, such as LAI, was included, as compared to the initial 64.9% otherwise (data not shown) (Ouyang et al. 2014). Missing soil properties and carbon pools in the RF analysis may also explain its lower performance with increasing integration time in the two ecosystems investigated over a range of time scales.
Temporal behaviour of CO2 flux variance partitioning
Our spectral analysis confirms previous conclusions extracted from the Fluxnet dataset (Stoy et al. 2009). Both our cross-coherence analysis across temporal scales and our RF analysis confirm the changes in importance of climate drivers on ecosystem CO2 fluxes, GPP and Reco, with decreasing time frequency. An important result of the RF analysis is to confirm the changes in importance of climate drivers of ecosystem CO2 fluxes, GPP and Reco, with decreasing time frequencies commonly observed in literature. This conclusion is also supported by the cross-coherence analysis across temporal scales. The variance of half-hourly ecosystem fluxes exceeded the variance of driving variables, as observed previously by Katul et al. (2001) at the Harvard Forest site and by Ouyang et al. (2014) in another temperate broadleaf forest in Ohio. It is generally dampened at longer periods, from weeks to a year, though Ouyang et al. (2014) observed a secondary power peak of NEE at 6-month period when compared with air temperature and incident light. The attenuation of the power of CO2 fluxes at low frequency has been interpreted as the control of CO2 fluxes by biological factors and the effects of synoptic weather patterns. Richardson et al. (2007) showed that 40% of the variance in modelled annual NEE can be attributed to variation in environmental drivers, and 55% to variation in the biotic drivers. Going through different time scales, Stoy et al. (2005) and Ouyang et al. (2014) concluded similarly that the control of NEE, GPP and Reco changes as the integration timespan increases from day to season and to year. They found that the incident downward flux density of photosynthetically active radiation, followed by air vapour pressure saturation deficit dominated at a daily scale whereas soil water, temperature or stand characteristics were most influential at seasonal and annual scales. Furthermore, our spectral analysis shows that the time series of CO2 flux power spectra over years are related to the canopy phenology and are markedly different for the studied PFTs. Our results closely confirm those obtained by a similar statistical analysis in a drought-stressed mixed ecosystem where soil moisture appeared as the most relevant predictor of GPP (Fares et al. 2013). Their ENF, which was planted with ponderosa pine, exhibited results with the same ranking order as we found for our ENF, the maritime pine ecosystem Le Bray.
The spectral analysis also showed the similarity of the power spectra of GPP and NEE from half-hourly to 1-month scales, and to a lesser extent at lower frequencies (Hong and Kim 2011; Jia et al. 2018). The GPP and Reco power scale-wise patterns are clearly distinguishable from each other and confirm therefore that NEE diel to monthly variability is dominated by GPP independent of the PFT and climate type (Stoy et al. 2005, 2009; Luyssaert et al. 2007a; Ouyang et al. 2014; Jia et al. 2018). We cannot exclude, however, a putative influence of Reco on NEE at longer time scales, 6-month to a year, although Luyssaert et al. (2007b) showed also that annual NEE anomalies in forests were created by anomalies in GPP and Fu et al. (2019) recently extended this conclusion to 66 sites. The contrasted scale-wise power of GPP and Reco shows that their functional link has actually only a minor influence, if any. We think this is due to Reco, the bulk CO2 emission from the ecosystem, being the end process of the biological part of the carbon cycle and results from a combination of metabolic and transport processes occurring all along the soil-plant continuum (Sierra et al. 2015; Moya et al. 2019). Each respiratory source is using photosynthetic organic carbon fixed by the carboxylation of Ribulose, but the transfer time may vary widely between leaf respiration which can follow assimilation within minutes, to coarse woody debris mineralization which is lagged by months to years. These delays may obscure the functional link between GPP and Reco and their respective temporal patterns. In addition, and contrary to the photosynthesis, which occurs only in the canopy foliage, the processes involved in the ecosystem respiration are located in above and belowground parts of the vegetation and through soil microbes which are subjected to a wide range of temperature and moisture. This positioning makes the relationship between Reco and bulk climate variables more complex than the GPP’s. Last, the ecosystem respiration continues at night and during the dormant season; this may contribute to its power enrichment at intermediates scales and attenuation at high frequency as compared with GPP.
The 6-year period covered by our spectral analyses allowed us to examine how the power spectrum of fluxes, micrometeorological variables and their correlation have been changing from year to year. The soil water was intermittently correlated with both GPP and Reco at intermediate (from twice a week to seasonal) frequencies especially at the end of summer periods and at the 1-yr period for the three ecosystems studied. On the contrary, GPP and Reco were highly correlated to D at the 1-day and 1-yr periods in the three ecosystems with PFT-specific dynamics. For the evergreen broadleaf forest, Puechabon, GPP and Reco were almost continuously in coherence with D at 3-month and 6-month periods; which is not the case in the deciduous forest and the cropland, which are marked by a canopy seasonality and for which GPP and Reco were intermittently uncorrelated at intermediate frequencies. Here, the comparison of spectral power and coherences among PFT showed that the canopy phenology is a key characteristic for assessing the dynamic effects of climate drivers across time scales. As expected, the power of CO2 fluxes and D co-vary with a larger magnitude in spring and summer and least in winter. Less intuitively the scalograms of the cropland revealed some periods where D variations are high but vegetation absent (late summer) and conversely, where D variations are lowest but vegetation is present (winter wheat). Here, the dynamics of the GPP-D cross-coherence clearly depends on the sowing date and leaf area development, and thus ultimately on management practice. This is of course not the case in the deciduous broadleaf forest since GPP-D cross-coherence coincides with the growing season and no management operations happened for the period considered. Accordingly, accounting for the LAI temporal variations would substantially improve the representation of CO2 flux variability (Hong and Kim 2011; Delpierre et al. 2012; Stoy et al. 2013; Jia et al. 2018).
The empirical approaches that we used here, such as the Random Forest approach and spectral analysis, are useful tools for investigating temporal variations in ecological variables but have limited use for disentangling their drivers and causal chain, even when accounting for the phase effects. The canopy photosynthesis GPP appears as the primary process controlling the temporal changes of CO2 exchange from the ecosystems considered, in accordance with the fact that it precedes the other carbon emission processes. The Reco scale-wise variability is conversely clearly disconnected from the GPP: that may be explained by the nature of its complex spatial distribution and multiple sources. The differing phenology of the PFTs is a key characteristic governing their temporal variability. The management itself introduces a supplementary complexity in the time-frequency control of carbon fluxes as observed in the cropland, where management modifies the vegetation dynamics at intra-annual periods. This effect might also be present in forest but at lower frequencies since there, main operations (thinning, harvest, etc) occur at longer periods (5–10 years).
Understanding the impacts of biotic and abiotic variables on ecosystem fluxes and their temporal variability is still challenging due to the nature of the impacts. As pointed out by Stoy et al. (2009), improving the assessment of biotic and abiotic variables involved in the temporal control of CO2 fluxes might benefit from using the upcoming generation of standardized long-term eddy covariance observations, e.g., the ICOS Research Infrastructure in Europe. Multiple site comparative analysis on standardized data series with well-documented ancillary data on management and canopy conditions will allow a better separation of drivers on ecosystem CO2 fluxes.