## 1. Introduction

Atmospheric aerosols are a key but poorly understood component of the climate system (Boucher et al. 2013) and, as air pollutants, also play a critical role in human health (Shiraiwa et al. 2012; Lelieveld et al. 2019). Severe air pollution episodes are a well-known problem in China, a global hotspot of aerosol abundance, and pose a major health risk to the population (e.g., Lelieveld et al. 2015; Yue et al. 2020). Climate-system effects of aerosols are also particularly pronounced in China and over adjacent seas (Andersen et al. 2017; Hasekamp et al. 2019; Toll et al. 2019). Therefore, reaching a quantitative understanding of the effect of anthropogenic aerosol source changes on the atmospheric aerosol loading in this region is relevant for both policy makers and the climate science community.

The recent outbreak of the novel coronavirus disease 2019 (COVID-19) has affected human activities worldwide. As the initial epicentre of the outbreak, China was the first country to implement a number of unprecedented countermeasures, which caused an almost complete shutdown of public life during early 2020 (Tian et al. 2020). While Diamond and Wood (2020) and Huang et al. (2021) have reported that, depending on the sector, only minor reductions of industrial production have taken place, the overall abrupt change in human activity (especially road traffic) has resulted in unprecedented decreases of NO$_2$ and, to a lesser degree, of particulate matter concentrations in China (Diamond and Wood 2020; Shi and Brasseur 2020; Venter et al. 2020; Tang et al. 2021; Li et al. 2021). Despite these clear reductions in specific pollutants, no clear signal of lockdown measures on satellite-based aerosol optical depth (AOD) has been found (Diamond and Wood 2020). This is not necessarily surprising, as NO$_2$ (and NO as well) is strongly driven by traffic emissions (Carslaw 2005), while AOD is influenced by many factors at once, e.g., secondary aerosol formation, natural aerosol source variability or transport, which all need to be carefully considered (detailed discussion below). Secondary aerosol formation in particular has been found to have offset the reductions in primary particle emissions during early 2020 in China. This was driven by changes in atmospheric chemistry, as the increased oxidizing capacity related to the reduced NO$_x$ emissions has been reported to facilitate secondary aerosol formation (Chang et al. 2020; Huang et al. 2021), and was further facilitated by the specific meteorological conditions (higher humidity, lower temperatures) during this time (Huang et al. 2021; Li et al. 2021; Tang et al. 2021). In existing studies analysing the AOD response to the lockdown, none or not all of these factors have been considered explicitly. In this context, the initial lockdown period in China provides a unique test bed to study the effects of massive but short-term reductions of anthropogenic aerosol emissions on satellite-retrieved AOD.

The attribution of aerosol changes to changes in human activity is challenging, as the variability of aerosols is very high in time and space, resulting in a low signal-to-noise ratio. This is because aerosol loading is determined not only by local emissions, but also by regional and long-distance transport, secondary aerosol formation, meteorological variability and terrain features (e.g., Wang et al. 2018). Indeed, Wang et al. (2020a) have simulated air pollution in China in a number of reduced-emission scenarios for the first weeks of 2020 and found that the magnitude of meteorological influences exceeded the air-pollution signal of emission reductions in China during this time. This underscores the necessity of specifically accounting for meteorological variability in order to quantify aerosol changes due to a modified source situation. This is difficult, as day-to-day aerosol variability depends on a number of meteorological factors at local (Tang et al. 2016; Liu et al. 2017; Zhan et al. 2017; Grange et al. 2018; Stirnberg et al. 2020) to synoptic scales (Jia et al. 2015; Ma and Guan 2018; Grange et al. 2018; Leung et al. 2018), which determine natural aerosol emissions (Scott et al. 2018), regional transport and mixing of aerosols (Cermak and Knutti 2009; Tang et al. 2016; Stirnberg et al. 2021), and secondary particle formation (Huang et al. 2014; Liu et al. 2018b; Nieminen et al. 2018; Li et al. 2019; Wang et al. 2020b; Chang et al. 2020; Ding et al. 2021; Li et al. 2021). Synoptic changes not only critically influence day-to-day variations in aerosol loading, but also explain a substantial fraction of longer-term aerosol variability, so that the meteorologically forced variability does not necessarily cancel out over the time period of a month (Che et al. 2019). Also, while satellite-retrieved AOD is a well-validated and commonly used parameter to approximate aerosol patterns, it is affected by aerosol swelling at high humidity and in the vicinity of clouds. Both lead to an increase in AOD at a constant aerosol concentration (Quaas et al. 2010; Schwarz et al. 2017). Adding to the complexity, AOD in China features region-specific seasonal cycles (Guo et al. 2011; Luo et al. 2014; Yu et al. 2017; Ma and Guan 2018; Sogacheva et al. 2018), as well as nonlinear trends, with aerosol loading generally observed to increase from the mid-1990s onwards, but to decrease rapidly since 2013 (e.g., Zheng et al. 2018). Additionally, pollutants in China have been found to vary significantly depending on the day of the week (Li et al. 2019). Given these considerations, a robust attribution of any observed AOD anomalies to changes in anthropogenic emissions is only feasible when all of these factors are controlled for at once (cf. Cermak and Knutti 2009).

The aim of the research presented here is to separate the effects of the assumed reduction in columnar aerosol loading due to the societal and economic lockdown in response to the COVID-19 situation from the natural variability, and thus to reach an attribution of columnar aerosol loading changes due to reduced anthropogenic activities during the COVID-19 period. The overarching scientific question to be answered is: Did the changes in human activity due to the lockdown measures significantly reduce AOD in China? The corresponding hypothesis to be tested is that in densely populated regions in eastern China, the observed AOD is lower than would be expected under the given meteorological conditions of the year 2020. To test this hypothesis, a state-of-the-art machine-learning technique is employed to predict AOD on the basis of a) information on local and synoptic-scale meteorological conditions, b) geographic aspects pertaining to location and topography, and c) a climatologically expected AOD that considers seasonal and weekly cycles, as well as trend information. The predicted climatologically-informed and weather-adjusted AOD expectation is compared to satellite-observed AOD, where differences are interpreted as aerosol source changes.

## 2. Data and methods

### 2.1. Data

This study focuses on a spatial domain centred on the eastern parts of China (20° N–45° N and 100° E–130° E). All data used in this study cover the period 2001–2020. To reduce the amount of data and computational resources needed, as well as the complexity of seasonally dependent relationships between meteorology and aerosol loading, only the months of January, February and March (JFM) are considered.

Data on columnar aerosol loading are taken from the official product suite for the Moderate-resolution Imaging Spectroradiometer (MODIS) sensor aboard the Terra satellite. Specifically, the daily level 3 collection 6.1 AOD product (MOD08_D3, ‘Aerosol_Optical_Depth_Land_Ocean_Mean’) is used, which features a 1° × 1° resolution (Hubanks et al. 2008; Levy et al. 2013). The 1° × 1° spatial scale is chosen because the study focuses on regional-scale aerosol changes and because studies on climate effects of aerosols are commonly conducted at such scales (e.g., Benas et al. 2020). The authors acknowledge that by using data at this spatial resolution, local-scale characteristics, relevant especially for the assessment of local air pollution, are not captured.

To represent meteorological factors that have been shown to influence aerosol loading, ERA5 reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF) are used. ERA5 is the current generation of reanalysis from the ECMWF and the successor of ERA-Interim (Dee et al. 2011). ERA5 features hourly meteorological fields at a spatial resolution of 0.25° (Hersbach 2016). Here, meteorological fields are used at 03 UTC (close to the satellite overpass at ∼11 am local time over the study domain) and resampled to fit the MODIS grid. Meteorological variables that have been found to be relevant drivers of aerosol variability by influencing natural aerosol sources, transport, chemical processes or secondary aerosol formation are chosen from the single-level and pressure-level ERA5 products, and are summarized with the respective references in Table 1. To approximate longer-term advection patterns, wind data are also used as temporal averages over the 24 and 72 hours before the satellite overpass. To approximate past wet scavenging of aerosol particles, total precipitation is averaged over the past 24 hours in the same manner.
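
The temporal averaging and regridding described above can be illustrated with a minimal sketch. It assumes that the averaging window ends just before the overpass hour and that the resampling to the 1° grid is a plain 4 × 4 block mean; neither detail is specified in the text, the function names are hypothetical, and all values are synthetic.

```python
import numpy as np

def trailing_mean(hourly, overpass_idx, window_hours):
    """Mean of the `window_hours` hourly values preceding the overpass hour.

    Assumption: the overpass hour itself is excluded from the window.
    """
    start = overpass_idx - window_hours
    if start < 0:
        raise ValueError("not enough history before the overpass")
    return hourly[start:overpass_idx].mean(axis=0)

def block_mean(field, factor=4):
    """Coarsen a 2-D field by averaging non-overlapping factor x factor blocks.

    Assumption: a plain block mean stands in for the 0.25 deg -> 1 deg resampling.
    """
    ny, nx = field.shape
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))

# Hypothetical hourly series for one grid cell (values are made up).
hours = np.arange(96)                          # four days of hourly steps
u_wind = np.sin(hours / 12.0)                  # m/s
precip = np.where(hours % 24 == 5, 2.0, 0.0)   # mm per hour, one event per day

overpass = 75                                  # 03 UTC on the fourth day (72 + 3)
u_24h = trailing_mean(u_wind, overpass, 24)    # short-term advection proxy
u_72h = trailing_mean(u_wind, overpass, 72)    # longer-term advection proxy
p_24h = trailing_mean(precip, overpass, 24)    # wet-scavenging proxy

coarse = block_mean(np.arange(16.0).reshape(4, 4), factor=2)
```

Whether the window should include the overpass hour, or whether an area-weighted regridding is preferable, would need to be decided against the actual data set.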

As additional information on the vertical layering of the atmosphere, data on pressure levels pertaining to temperature, specific humidity, and the u, v and w components of wind are used and averaged in three vertical layers to approximate the boundary layer (BL, all ERA5 levels between 1000 and 850 hPa), and two free tropospheric layers (FT1, all ERA5 levels between 825 and 700 hPa, and FT2, between 650 and 500 hPa).
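
As an illustration of the layer averaging, the following sketch groups a pressure-level profile into the three layers named above. It assumes an unweighted mean over the levels in each layer and uses the ERA5 pressure levels between 1000 and 500 hPa; the function name and the example profile are hypothetical.

```python
import numpy as np

# Layer bounds in hPa as given in the text (inclusive on both sides).
LAYERS = {"BL": (850, 1000), "FT1": (700, 825), "FT2": (500, 650)}

# ERA5 pressure levels between 1000 and 500 hPa.
ERA5_LEVELS = np.array([1000, 975, 950, 925, 900, 875, 850,
                        825, 800, 775, 750, 700, 650, 600, 550, 500])

def layer_means(levels_hpa, profile):
    """Average a vertical profile over the BL, FT1 and FT2 layers.

    Assumption: a simple unweighted mean over the levels in each layer.
    """
    levels_hpa = np.asarray(levels_hpa)
    profile = np.asarray(profile, dtype=float)
    result = {}
    for name, (low, high) in LAYERS.items():
        in_layer = (levels_hpa >= low) & (levels_hpa <= high)
        result[name] = profile[in_layer].mean(axis=0)
    return result

# Made-up temperature profile (K) that cools with decreasing pressure.
temperature = 288.0 - 0.06 * (1000 - ERA5_LEVELS)
t_layers = layer_means(ERA5_LEVELS, temperature)
```

A pressure-thickness weighting would be a natural alternative where the level spacing is uneven, as it is above 750 hPa.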

As synoptic conditions can be difficult to capture with local meteorological information, the Siberian High position index (SHPI, Jia et al. 2015), which has been found to correlate strongly with wintertime AOD in China, is calculated. The SHPI is defined as the pressure-weighted mean of all longitudes of ERA5 grid cells exceeding 1023 hPa in a large spatial domain in the typical region of the Siberian High (60–145° E and 30–65° N). It therefore approximates the strength and location of the Siberian High, which is closely related to regionally different wintertime aerosol transport patterns to northern and southern China (Jia et al. 2015).
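
The SHPI definition can be sketched as follows. This is a minimal reading of the verbal definition above, assuming that the pressure field is mean sea level pressure and that the pressure values themselves serve as weights; the exact formulation in Jia et al. (2015) may differ, and the function name is hypothetical.

```python
import numpy as np

def siberian_high_position_index(slp_hpa, lats, lons, threshold=1023.0):
    """Pressure-weighted mean longitude of grid cells exceeding `threshold`.

    slp_hpa: 2-D array (lat, lon) of sea level pressure in hPa (assumed field).
    Returns NaN if no grid cell in the domain exceeds the threshold.
    """
    lats = np.asarray(lats)
    lons = np.asarray(lons)
    lat_in = (lats >= 30) & (lats <= 65)
    lon_in = (lons >= 60) & (lons <= 145)
    sub = np.asarray(slp_hpa, dtype=float)[np.ix_(lat_in, lon_in)]
    lon_grid = np.broadcast_to(lons[lon_in], sub.shape)
    high = sub > threshold
    if not high.any():
        return float("nan")
    return float(np.sum(sub[high] * lon_grid[high]) / np.sum(sub[high]))

# Tiny made-up pressure field: the high sits over the central and eastern columns.
lats = np.array([30.0, 40.0])
lons = np.array([80.0, 100.0, 120.0])
slp = np.array([[1020.0, 1030.0, 1025.0],
                [1020.0, 1030.0, 1025.0]])
shpi = siberian_high_position_index(slp, lats, lons)
```

Because the weights differ by only a few hPa, the index in this form responds mainly to where the high-pressure cells are located rather than to how strong they are.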

Along with the meteorological data, additional information on geographical factors is also taken from the reanalysis and used as predictors in the machine learning model, as terrain, coastal influences and geographic location have been shown to be important modulators of aerosol characteristics (e.g., Jia et al. 2015; Tang et al. 2016; Leung et al. 2018; Wang et al. 2018). The factors included are surface elevation, land-sea mask, longitude and latitude.

### 2.2. Methods

A machine learning technique is used to predict an expected aerosol loading, which is then compared to satellite-observed AOD. In essence, this approach uses machine learning for meteorological normalization as suggested by Grange et al. (2018), Grange and Carslaw (2019) and Petetin et al. (2020), but extends it by including information on geographic aspects and temporal characteristics of the time series. In this study, AOD is log-transformed for the training, testing and validation of the machine learning models. This is done because AOD is typically log-normally distributed, so that its log-transformation approximates a normal distribution (e.g., Cermak and Knutti 2009). In tests conducted on the back-transformed data within this study, training on log-transformed data has led to a slight (<5%) increase in predictive performance (higher R$^2$, lower RMSE) of the machine learning models used.
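
The difference between evaluating in log space and on the back-transformed scale can be illustrated with a short sketch. It assumes the natural logarithm and uses synthetic, log-normally distributed values with a noisy stand-in for the model prediction; no values from the study are reproduced.

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(obs, pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

rng = np.random.default_rng(0)
aod = rng.lognormal(mean=-1.0, sigma=0.8, size=1000)       # synthetic AOD
log_aod = np.log(aod)                                      # training target
log_pred = log_aod + rng.normal(0.0, 0.3, size=aod.size)   # stand-in prediction
aod_pred = np.exp(log_pred)                                # back-transformation

skill_log = (r2(log_aod, log_pred), rmse(log_aod, log_pred))
skill_aod = (r2(aod, aod_pred), rmse(aod, aod_pred))
```

Errors that are symmetric in log space become multiplicative after back-transformation, so skill scores on the original scale are dominated by high-AOD cases. This is one reason why the two sets of scores generally differ.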

Gradient Boosted Regression Trees (GBRTs; Friedman 2001; Pedregosa et al. 2011), a state-of-the-art tree-based machine-learning model, are applied to predict log(AOD). GBRTs use an ensemble of weak learners (trees) that are added sequentially, with each new tree fitted to the residual error of the existing ensemble on the basis of gradient descent (Friedman 2001). GBRTs have been successfully applied before to study aerosol and cloud patterns and processes (Fuchs et al. 2018; Pauli et al. 2020; Stirnberg et al. 2020, 2021) and also the impact of COVID-19 on air pollution in Europe (Petetin et al. 2020). The exact architecture of the GBRTs is determined by hyperparameters (e.g., learning rate, number of trees).
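
The sequential fitting to residuals can be made concrete with a hand-written boosting loop. This is a didactic sketch for the squared-error loss only, not the scikit-learn implementation used in the study; the function names, the default settings and the toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_trees=50, learning_rate=0.1, max_depth=2):
    """Fit shallow trees one after another, each to the current residual."""
    baseline = float(np.mean(y))
    prediction = np.full(len(y), baseline)
    trees = []
    for _ in range(n_trees):
        residual = y - prediction   # negative gradient of the squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return baseline, trees

def predict_boosted(baseline, trees, X, learning_rate=0.1):
    """Sum the baseline and the scaled contributions of all trees."""
    prediction = np.full(X.shape[0], baseline)
    for tree in trees:
        prediction += learning_rate * tree.predict(X)
    return prediction

# Toy one-dimensional regression problem (made-up data).
X = np.linspace(0.0, 6.0, 200).reshape(-1, 1)
y = np.sin(X[:, 0])

baseline, trees = fit_boosted_trees(X, y)
mse_baseline = float(np.mean((y - baseline) ** 2))
mse_boosted = float(np.mean((y - predict_boosted(baseline, trees, X)) ** 2))
```

The learning rate and the number of trees trade off against each other: smaller steps need more trees but tend to generalize better, which is why both appear as hyperparameters.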

A two-step modeling approach is chosen, which is illustrated in Fig. 1:

Fig. 1. Schematic overview of the two-step machine learning approach used in this study. Significant differences in the comparison of the simulated AODE (blue) and the observed AOD (red) during the lockdown are interpreted as aerosol source changes.

1. In the first stage, grid-cell specific GBRTs are trained to approximate the region-specific climatological signals of log(AOD) on the basis of only four input features: the year, the day of the week, the day of the year, and the number of days before/after Chinese New Year for each specific observation. This is intended to account for regional differences in a) AOD, b) its temporal cycles (Ma and Guan 2018; Sogacheva et al. 2018), and c) nonlinear trend characteristics (Zheng et al. 2018). The predicted log(AOD) can be viewed as a generalized climatological expectation at a specific point in time and space and is termed log(AODEclim). This is similar to the climatologically expected AOD used in Diamond and Wood (2020), but uses daily data instead of monthly means. To avoid overfitting of the model to the day-to-day variability, a very shallow model architecture is chosen, with only 100 trees and a maximum depth (number of decision layers in each tree) of 2. The robust Huber loss function is chosen as in Fuchs et al. (2018). With these architectural constraints, the models are only capable of a very broad estimation of patterns underlying the time series, and overfitting is not an issue. The models are not technically validated, as the goal is not to accurately predict an existing target (i.e., monthly means, which are influenced by meteorology), but rather to generalize the discussed elements of the AOD time series at each location. As such, optimizing model performance (e.g., on monthly means) would not necessarily suggest that the models generalize (e.g., trends and seasonality) well. The skill of the models in generalizing the underlying temporal patterns (mainly trend and seasonality) is instead checked visually for each grid box, as shown in Fig. 2b) (and Fig. S1). Note that not every grid cell has enough data to train a local model: if the number of valid observations for the whole time period is less than 200 per grid cell (e.g., some mountainous regions in the very north west of the domain), log(AODEclim) is defined as the average log(AOD).

Fig. 2. Domain-average climatologically expected log(AODEclim) grouped by day of year (a)), displaying seasonal characteristics of the AOD data set. The black line represents the mean, the grey area the mean ± 1 standard deviation of the regional models. Panel b) shows an example of a log(AODEclim) time series for Chengdu (‘+’ symbol), together with long-term variability (yearly average, black), and seasonal variability (monthly average, circles) of observed log(AOD).

2. In the second stage, a single domain-wide model is used to predict daily log(AOD) on the basis of the described log(AODEclim), meteorological, and geographical information. As this is a much more complex task, the model architecture is likewise more complex. The data are split up into training (67%, n = 219,033) and test (33%, n = 107,883) data sets. To evaluate potential biases or uncertainties introduced when applying the model to an unseen time period (e.g., 2020), only data from 2001–2018 are used during training, withholding data from 2019 as an additional separate test data set. Within the training data, a recursive feature elimination is done to select only those features that are shown to be useful to predict log(AOD). This is achieved by a step-wise elimination of the least important input feature, with model skill evaluated in a 3-fold cross validation. A generic model architecture is chosen for this approach (number of trees = 5000, maximum depth = 4, all other hyperparameters at the defaults of the scikit-learn implementation (Pedregosa et al. 2011)). The final set of features includes 36 of the original 40 features, which maximizes model skill (see Table 1; the 3 features pertaining to vertical velocity, as well as evaporation, are eliminated during the feature elimination process). Based on this set of features, model hyperparameters are optimized on the training data set in a grid-search approach to determine the best combination of hyperparameters (shown in Table 2), again using 3-fold cross validation (to limit computational costs). To limit overfitting of the model, two regularization strategies are applied: 1) for each decision made by the trees during training, only a fraction of the available features is considered ($\sqrt{N_{\mathit{features}}}$), and 2) model training is stopped when the cross-validation score does not improve for 100 iterations (early stopping). The final set of hyperparameters is the one that performs best in the cross validation (bold numbers in Table 2). The final model performance is then tested on the completely independent test data set and on the year 2019. The predicted log(AOD) can be understood as the expected log(AOD) when meteorology, climatological context and geography are considered and is termed here log(AODE). For all analyses (excluding the validation, Fig. 3a)), the results are transformed back to the original scale of AOD for clarity.
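
The second-stage workflow (train/test split, recursive feature elimination with 3-fold cross validation, and a regularized final model) can be sketched with scikit-learn as follows. The data are synthetic, the model sizes are reduced so that the example runs quickly, and the grid search is omitted. Note also one assumption that deviates from the description above: scikit-learn's built-in early stopping (`n_iter_no_change`) monitors a held-out validation fraction rather than a cross-validation score.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the predictor matrix: six features, two of them pure noise.
rng = np.random.default_rng(42)
n = 600
X = rng.normal(size=(n, 6))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * X[:, 2] * X[:, 3] + rng.normal(0.0, 0.3, n)

# 67% / 33% split as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

# Step-wise elimination of the least important feature, scored by 3-fold CV.
selector = RFECV(
    GradientBoostingRegressor(n_estimators=50, max_depth=4, random_state=0),
    step=1, cv=3, scoring="r2")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Final model with the two regularization strategies named in the text.
model = GradientBoostingRegressor(
    n_estimators=2000,        # upper bound; early stopping usually ends sooner
    max_depth=4,
    learning_rate=0.05,       # illustrative value, not taken from Table 2
    max_features="sqrt",      # consider sqrt(N_features) candidates per split
    n_iter_no_change=100,     # early stopping on a held-out validation fraction
    validation_fraction=0.2,
    random_state=0)
model.fit(X_train_sel, y_train)
test_r2 = model.score(X_test_sel, y_test)
```

In a fuller version, the final estimator would be wrapped in a grid search with 3-fold cross validation, and a temporally separate year would be scored in addition to the random test split.
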

Fig. 3. Validation of the machine-learning model predicting AOD/log(AOD) with independent test data. The top row shows average observed and predicted (back-transformed) AOD of the test data in a) and b), and their difference in c) in the study domain (20° N–45° N and 100° E–130° E). Panel d) shows a scatter plot summarizing the model skill in predicting log(AOD) over the entire domain and therefore represents a combination of spatial and temporal skill. e) Spatial patterns of model skill in predicting the temporal variability of log(AOD) in each grid cell (R$^2$). Panel f) shows the difference between observed AOD and predicted (back-transformed) AODE for the year 2019, with grey ‘+’ symbols showing significance at the 0.01 level of an independent two-sided t test.

The two-stage modeling architecture is chosen for two reasons: 1) to create robust region-specific climatological AOD expectations that account for nonlinear temporal characteristics of the AOD time series, and 2) to obtain a quantitative estimate of potential weather contributions to the expected AOD (the difference between AODE and AODEclim).
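
To make the two-stage design concrete, the sketch below implements a first-stage climatological model for a single grid cell with the settings given above (four temporal features, 100 trees, maximum depth 2, Huber loss, mean fallback below 200 observations) and then computes a weather contribution as AODE minus AODEclim. The helper names, the synthetic time series and the constant offset that stands in for the second-stage prediction are assumptions for illustration only.

```python
from datetime import date, timedelta

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Chinese New Year dates, needed for the "days before/after" feature.
CNY = {2018: date(2018, 2, 16), 2019: date(2019, 2, 5), 2020: date(2020, 1, 25)}

def temporal_features(d):
    """Year, day of week, day of year, and days relative to Chinese New Year."""
    return [d.year, d.weekday(), d.timetuple().tm_yday, (d - CNY[d.year]).days]

def fit_climatology(dates, log_aod, min_obs=200):
    """Return a function mapping dates to log(AODEclim) for one grid cell."""
    log_aod = np.asarray(log_aod, dtype=float)
    if len(log_aod) < min_obs:
        fallback = float(log_aod.mean())       # too few data: use the mean
        return lambda new_dates: np.full(len(new_dates), fallback)
    model = GradientBoostingRegressor(
        loss="huber", n_estimators=100, max_depth=2, random_state=0)
    model.fit(np.array([temporal_features(d) for d in dates]), log_aod)
    return lambda new_dates: model.predict(
        np.array([temporal_features(d) for d in new_dates]))

# Synthetic JFM record for one hypothetical grid cell (values are made up).
rng = np.random.default_rng(0)
dates = [date(y, 1, 1) + timedelta(days=i)
         for y in (2018, 2019, 2020) for i in range(90)]
doy = np.array([d.timetuple().tm_yday for d in dates])
log_aod = -1.5 + 0.005 * doy + rng.normal(0.0, 0.5, size=len(dates))

climatology = fit_climatology(dates, log_aod)
log_aod_eclim = climatology(dates)

# Stand-in for the second-stage prediction log(AODE); here a constant offset.
log_aod_e = log_aod_eclim + 0.1

# Weather contribution on the original AOD scale.
weather_contribution = np.exp(log_aod_e) - np.exp(log_aod_eclim)
```

Keeping the climatological expectation as a separate, explicit quantity is what allows the weather contribution to be read off directly, rather than being hidden inside a single model.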

## 3. Results and discussion

Figure 2 shows results of the first modeling step to predict climatologically expected AODEclim. In Fig. 2a), log(AODEclim) is shown as a domain average grouped by the parameter day of year. An increase in predicted log(AODEclim) with day of year is apparent, also showing substantial intramonthly changes of AOD. In thorough grid-cell specific analyses, such as shown in Fig. 2b) and Fig. S1, log(AODEclim) is qualitatively observed to capture well the underlying seasonal and trend patterns of AOD. One should note that AOD trends and seasonal variability are observed to differ substantially between grid cells, so that the estimated log(AODEclim) patterns vary similarly. In the example shown in Fig. 2b), the seasonal pattern of AOD in the pixel closest to Chengdu is well captured by the model and combined with the long-term trend, where AOD is observed (and predicted) to rapidly decrease starting ∼2012, after stricter air pollution countermeasures were introduced (Jin et al. 2016). In this example, some extremes of the monthly mean AOD are not captured (around 2012), which makes sense, as these typically are not climate signals but largely driven by meteorology, regional aerosol transport and secondary aerosol production (Huang et al., 2014; Zheng et al. 2015).

The grid-cell specific information of log(AODEclim) is then forwarded to the second machine learning model, where it is combined with information on meteorology and geography (see Sec. 2). On this basis, the model is applied to predict daily log(AOD) at each point in time and space. Fig. 3 provides a summary of the skill of the model in predicting AOD in the study domain. The top row shows averaged spatial patterns of observed AOD (a)), predicted back-transformed AODE (b)), and their difference (c)) in the independent test data (2001–2018). The results show that during the training period, the machine-learning model is able to reproduce the average spatial patterns well, without clear systematic regional biases. The same comparison is done for the year 2019, which was completely withheld from training, and is shown in panel f). A larger absolute bias is apparent in this case, but only a very limited number of scattered pixels feature a significant difference between observed AOD and predicted AODE. This case study for 2019 acts as a guide to interpret the meaningfulness of the differences during 2020, as it shows the limitations of the method when predicting unseen years. Fig. 3d) shows the results of the validation of the day-to-day variability of log(AOD) with the independent test data set. The model is able to capture the largest part of daily log(AOD) variability (R$^2$ = 0.69; 0.57 for back-transformed data; see Table S1). It is evident that some very low aerosol loadings are overestimated by the model (at log(AOD) < −5). This validation, however, represents a mix of spatial and temporal components of variability, as it is performed for the entire spatial domain at once. In terms of explaining day-to-day variability in each grid cell, the model performs slightly worse, with a domain-average R$^2$ of 0.50 (0.46 for back-transformed data) but only minor spatial variation of the model’s skill (Fig. 3e), standard deviation of temporal R$^2$: 0.15). It should be noted that skill and bias tend to vary more where less data exist for training (see Fig. 4a)). On this basis, model predictions are used as a comparison to the observed AOD in early 2020 to attribute anomalies to COVID-19 lockdown-induced emission changes.
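
The distinction between the pooled (spatiotemporal) R$^2$ and the per-grid-cell (temporal) R$^2$ can be reproduced with synthetic data. In the sketch below, each hypothetical grid cell has its own mean level that the prediction reproduces exactly, while day-to-day variability is only partly captured; all numbers are made up for illustration.

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(1)
n_cells, n_days = 50, 200

# Each cell has its own mean level (spatial signal) plus daily variability.
cell_mean = rng.normal(-1.0, 0.8, size=(n_cells, 1))
daily = rng.normal(0.0, 0.4, size=(n_cells, n_days))
obs = cell_mean + daily

# Stand-in prediction: spatial pattern reproduced, daily signal partly captured.
pred = cell_mean + 0.7 * daily + rng.normal(0.0, 0.2, size=(n_cells, n_days))

pooled_r2 = r2(obs.ravel(), pred.ravel())                # space and time combined
temporal_r2 = np.array([r2(obs[i], pred[i]) for i in range(n_cells)])
mean_temporal_r2 = float(temporal_r2.mean())             # domain-average temporal skill
```

Because the pooled score is credited with the variance between grid cells, it exceeds the average per-cell score whenever the spatial pattern is well reproduced, which is consistent with the gap between the two R$^2$ values reported above.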

Fig. 4. Number of valid AOD observations during a) training, b) testing, c) JFM 2020, d) January 2020, e) February 2020, and f) March 2020 in the study domain (20° N–45° N and 100° E–130° E).

In Fig. 5a), differences between observed AOD and predicted AODE for all of JFM 2020 are shown. Since the influences of seasonality, trends, and meteorological conditions are inherent in AODE, differences to observed AOD can be interpreted as differences in aerosol emissions, subject to the limitations of the method (see Fig. 3). No clear, systematic differences between observed AOD and AODE as hypothesized are evident. In the North China Plains, predicted AODE tends to be higher than observed AOD, but this difference is only significant in some pixels. In light of the limitations of the method illustrated by the test results for the year 2019, it is thus hard to attribute these differences to decreased anthropogenic emissions during the lockdown. These results agree well with findings from Diamond and Wood (2020), who also analysed level 3 MODIS AOD at a 1° × 1° resolution, but used monthly averages without explicitly controlling for meteorological variability. In some regions, the observed AOD is actually significantly higher than AODE, contrary to the hypothesis of this study. The clearest of these patterns is in the regions of northern Hebei north of Beijing and Liaoning. Here, dust transport from the Kumutage and Taklamakan deserts is a frequent phenomenon (Tang et al. 2016; Yu et al. 2017), which increases the variability of AOD in this region during winter. While some of the AOD variability related to dust transport should be explained by the model, the current modeling framework certainly has limitations in modeling dust transport. For example, the model has no information on meteorology that controls emissions at the source region (e.g., previous precipitation, soil moisture and winds; Che et al. 2019). Including this information is difficult, as source regions vary, and the transport time and trajectory from source region to the receptor region depend on the synoptic situation controlling the transport. Also, wet scavenging along the trajectory is likely to be a factor limiting the model’s capability to predict such situations. Observed AOD also significantly exceeds AODE in south-western parts of the domain, in the vicinity of the borders with Vietnam and Laos. These positive AOD anomalies are likely caused by the extraordinary number of wildfires occurring in the southeastern Asian Peninsula during February and March 2020 (Jenner 2020). Such AOD anomalies due to wildfire emissions could only be explained by a machine learning model if they also occurred in a similar way in the training data and were captured by the input features, which is not the case here.

Fig. 5. Comparison of observed AOD 2020 minus AODE for a) JFM 2020, b) January 2020, c) February 2020, and d) March 2020 in the study domain (20° N–45° N and 100° E–130° E). Grey ‘+’ symbols show significance at the 0.01 level of an independent two-sided t test. The black box in panel d) shows the region that is further investigated in Fig. 6.

Differences between observed AOD and AODE are analysed for the months of January, February and March (the lockdown in Wuhan started at the end of January (Zhu et al. 2020)) to further investigate the temporal evolution of differences between AOD and AODE during this period. In January (Fig. 5b)), observed AOD is higher than predicted AODE throughout the North China Plains. However, this difference is based on only very few data points (Fig. 4d)) and hence is not significant in most regions. While this pattern should not be over-interpreted, the very low number of samples indicates a prevalence of clouds in this region during this time, and the positive differences could be related to aerosol swelling or cloud-related processes that are not captured by the model. There is a region where observed AOD is significantly higher than predicted AODE in the province of Guizhou in the southwestern parts of the domain. However, as this is one of the more rural and less densely populated regions in China, and the first lockdown measures were put in place at the end of January in Wuhan, these differences cannot be interpreted as related to lockdown-induced changes in anthropogenic emissions. They rather underscore the difficulty in the interpretation of the results. During February (Fig. 5c)), differences between AOD and AODE are similar to those found for the JFM period. It should be noted that the spatial patterns of the differences in February closely match the findings from Diamond and Wood (2020), underscoring that the patterns apparent in Fig. 5 are unlikely to be artefacts created by the machine-learning model. In March 2020 (Fig. 5d)), observed AOD is again lower than predicted AODE in the North China Plains. However, in most regions this difference is not significant (p values > 0.01). This pattern extends over the adjacent Yellow Sea and to the Korean peninsula, where differences are observed to be significant. One should note, though, that the significance measure of the t test is affected by the higher number of samples over the Yellow Sea, rather than by a larger difference between AOD and AODE. With sample sizes varying this much over the considered time period and region, using daily observations, and thereby retaining this sample-size information, is advantageous compared to using monthly average data for such analyses.
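
The influence of the sample size on the significance of a fixed AOD difference can be illustrated with SciPy's independent two-sided t test. The samples below are constructed deterministically, so that the mean difference and the spread are identical in both cases and only the number of observations changes; the function name and all values are hypothetical.

```python
import numpy as np
from scipy import stats

def p_value_for_fixed_difference(n, difference=-0.05, spread=0.2):
    """p value of a two-sided independent t test for a fixed mean difference.

    Both samples share the same evenly spaced scatter around their mean, so
    only the sample size n differs between calls.
    """
    z = np.linspace(-1.0, 1.0, n)
    expected = 0.40 + spread * z        # stand-in for predicted AODE values
    observed = expected + difference    # stand-in for observed AOD values
    return float(stats.ttest_ind(observed, expected).pvalue)

p_few = p_value_for_fixed_difference(10)     # few valid retrievals (cloudy cell)
p_many = p_value_for_fixed_difference(200)   # many valid retrievals

significant_few = p_few < 0.01
significant_many = p_many < 0.01
```

With only ten samples per group the difference is far from significant at the 0.01 level, whereas the same difference becomes significant with 200 samples. This mirrors the contrast between sparsely and densely sampled regions described above.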

To evaluate this AOD anomaly from the expected AODE in the North China Plains and the adjacent Yellow Sea more closely, Fig. 6 shows distributions of observed AOD, and predicted AODE and AODEclim for the test data (2001–2018, a)) and March 2020 (b)). It is apparent that during the test period, the median prediction of AODE and AODEclim is close to the observations, but that only AODE captures the variability of observed AOD. This is expected, as AODEclim is purely a climatological expectation without considering potential weather influences. During March 2020, AODEclim is lower than during the test period, reflecting the decreasing trend in aerosol loadings in this region. While the observed AOD is lower than this climatological expectation, the weather-adjusted AODE is actually higher than AODEclim. It is thus noteworthy that during February and March 2020, AOD was lower than expected in the North China Plains, which may be related to the lockdown measures and potentially points to trans-boundary effects of these measures. In light of the uncertainties associated with the method, though, these AOD reductions cannot be unequivocally attributed to reduced anthropogenic emissions. The results do show that meteorological factors need to be explicitly considered in such analyses, as expected meteorological influences may otherwise partially mask the AOD changes. It should be noted that passive sensor AOD is an integrated columnar observation, which means that changes in the vertical distribution of aerosols cannot be detected. In a recent study using ground-based polarization lidar measurements over Wuhan, Yin et al. (2021) separated the contributions of boundary layer and free-tropospheric aerosols to total AOD and found that during the lockdown, boundary layer AOD was reduced while free-tropospheric AOD was increased and actually dominated the total AOD signal, which is a clear anomaly in the time series. This indicates that some of the lockdown-related aerosol loading signals may not be detectable using column-integrated AOD observations.

Fig. 6. Comparison of observed AOD and predicted AODE and AODEclim distributions of the region shown as a box in Fig. 5d) for test data in the time period of 2001–2018 (a)), and March 2020 (b)). Predicted log(AODE) and log(AODEclim) are back-transformed for the comparison. Thin vertical lines show the median of the distributions.

Spatial patterns of the differences between climatologically expected AODEclim and AODE during JFM 2020 are shown in Fig. 7. It is apparent that AODE is significantly higher than AODEclim for the entire region of the North China Plains and a large region centered around Chengdu. This suggests that meteorological influences have led to a higher model expectation of AODE, underscoring the necessity of controlling for meteorological influences rather than using monthly average data in such analyses. This result is in agreement with recent studies that show that during the initial lockdown in China, increased secondary aerosol production was facilitated by high humidity and low air temperatures and changes in atmospheric chemistry (Huang et al. 2021; Li et al. 2021; Tang et al. 2021), and thus acted to increase the column aerosol loading during this time. One should note that while secondary aerosol production can be modified by meteorological conditions (e.g., Liu et al. 2018b), it also critically depends on the abundance of gaseous precursors (e.g., Liu et al. 2018a). The planetary boundary layer height was observed to have been abnormally low in northern China during the lockdown, which has been linked to strong haze conditions (Su et al. 2020). The results also support Wang et al. (2020a) whose numerical modelling results showed that the magnitude of meteorological influences exceeded the air-pollution signal of emission reductions during the lockdown period in China. While the differences between AODEclim and AODE are not as large as the differences between observed AOD and AODE, they are not as affected by limited sampling (prediction every day, n = 90) and show a clear and spatially coherent structure.

Fig. 7. Comparison of climatologically expected AODEclim minus AODE for JFM 2020 in the study domain (20° N–45° N and 100° E–130° E). Grey ‘+’ symbols show significance at the 0.01 level of an independent two-sided t test.

## 4. Conclusions

In this study, a machine-learning model is applied to predict satellite-observed AOD on the basis of climatologically expected AOD, meteorological factors and geographical information to attribute AOD anomalies to reduced anthropogenic emissions during the COVID-19 period in China. The model is shown to be capable of representing daily AOD variability well in an independent test data set (spatiotemporal R$^2$: 0.69, domain-average temporal R$^2$: 0.50), with no systematic bias. In a separate test conducted on a year that is completely withheld from training (2019), the bias is higher, showing some limitations of the model when predicting an unseen time period. The main findings of comparisons between observed AOD and predicted AODE are that:

• No clear, systematic reduction of AOD with respect to the model expectation is found that can unequivocally be attributed to reduced anthropogenic emissions during the COVID-19 period.
• Observed AOD is lower than expected over the North China Plain during February and March, but in most regions these differences are not significant. In March 2020, this pattern extends over the Yellow Sea and to the Korean Peninsula, where differences are frequently observed to be significant at the 0.01 level. While this finding is noteworthy, in light of the uncertainties related to the method and the limited number of samples, these differences are not interpreted as clear evidence of reduced anthropogenic emissions during the COVID-19 period.
• Climatologically expected AODEclim is compared to the weather-adjusted expectation (AODE) for the months of January, February and March. It is found that AODE is significantly higher than AODEclim for the region of the North China Plains and a region centered around Chengdu. This suggests that meteorological influences have acted to increase AOD during this time, in agreement with recent literature.
• AODE significantly underestimates AOD in northern and southern parts of China, likely due to processes that are not captured by the machine-learning model, e.g., wildfire emissions (southern China), aerosol swelling, meteorological influences or secondary aerosol production.

The findings suggest that extensive anthropogenic aerosol source reductions (e.g., air-pollution measures) do not necessarily lead to a detectable reduction in satellite-retrieved regional columnar aerosol loadings. The findings also highlight the complexity of columnar aerosol variability in a large spatial domain and underscore the challenge of detecting changes in column aerosol loading due to changes in anthropogenic activity and the necessity to account for meteorological influences.