A diagram for evaluating multiple aspects of model performance in simulating vector fields

Vector quantities, e.g., vector winds, play an extremely important role in climate systems. The energy and water exchanges between different regions are strongly dominated by wind, which in turn shapes the regional climate. Thus, how well climate models can simulate vector fields directly affects model performance in reproducing the nature of a regional climate. This paper devises a new diagram, termed the vector field evaluation (VFE) diagram, which is very similar to the Taylor diagram but provides a concise evaluation of model performance in simulating vector fields. The diagram can measure how well two vector fields match each other in terms of three statistical variables, i.e., the vector similarity coefficient (VSC), root-mean-square length (RMSL), and root-mean-square vector difference (RMSVD). Similar to the Taylor diagram, the VFE diagram is especially useful for evaluating climate models. The pattern similarity of two vector fields is measured by the VSC, which is defined as the arithmetic mean of the inner product of normalized vector pairs. Examples are provided, showing that VSC can identify how closely one vector field resembles another. Note that VSC can only describe the pattern similarity, and it does not reflect the systematic difference in the mean vector length between two vector fields. To measure the vector length, RMSL is included in the diagram. The third variable, RMSVD, is used to identify the magnitude of the overall difference between two vector fields. Examples show that the new diagram can clearly illustrate the extent to which the overall RMSVD is attributed to the systematic difference in RMSL and how much is due to the poor pattern similarity.

To construct the VFE diagram, one crucial issue is quantifying the pattern similarity of two vector fields. Over the past several decades, many vector correlation coefficients have been developed by different approaches. For example, some vector correlation coefficients are constructed by combining Pearson's correlation coefficients of the x- and y-components of the vectors (Charles, 1959; Lamberth, 1966). Others are devised based on orthogonal decomposition (Stephens, 1979; Jupp and Mardia, 1980; Crosby et al., 1993) or the regression relationship of two vector fields (Ellison, 1954; Kundu, 1976; Hanson et al., 1992). These vector correlation coefficients usually do not change when one vector field is uniformly rotated by a certain angle or reflected. This is a reasonable and necessary property for a vector correlation coefficient when one detects the relationship between two vector fields. However, in terms of model evaluation, we expect the simulated vectors to resemble the observed ones in both direction and length, with no rotation permitted. Thus, previous vector correlation coefficients are not well suited for the purpose of climate model inter-comparison and evaluation.
To measure how well the patterns of two vector fields resemble each other, a vector similarity coefficient (VSC) is introduced in section 2 and interpreted in section 3. Section 4 constructs the VFE diagram with three statistical variables to evaluate multiple aspects of simulated vector fields. Section 5 illustrates the use of the diagram in evaluating climate model performance. A discussion and conclusion are provided in section 6.
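Consider two vector fields, $\vec{A}_i$ and $\vec{B}_i$ ($i = 1, 2, \ldots, N$). The normalized vector fields are defined as

$$\vec{A}_i^{*} = \frac{\vec{A}_i}{L_A} \qquad (2)$$

$$\vec{B}_i^{*} = \frac{\vec{B}_i}{L_B} \qquad (3)$$

respectively, where

$$L_A = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i\right|^2} \qquad (4)$$

$$L_B = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left|\vec{B}_i\right|^2} \qquad (5)$$

are the quadratic mean of the length, or RMS length (RMSL), of a vector field, which measures the mean length of the vectors in that field. Based on (2) and (3), we have

$$\frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i^{*}\right|^2 = \frac{1}{N}\sum_{i=1}^{N}\left|\vec{B}_i^{*}\right|^2 = 1 \qquad (6)$$

Clearly, the normalization of a vector field only scales the vector lengths without changing their directions (Fig. 1b). The VSC is the arithmetic mean of the inner product of the normalized vector pairs:

$$R_v = \frac{1}{N}\sum_{i=1}^{N}\vec{A}_i^{*}\cdot\vec{B}_i^{*} = \frac{\sum_{i=1}^{N}\vec{A}_i\cdot\vec{B}_i}{\sqrt{\sum_{i=1}^{N}\left|\vec{A}_i\right|^2}\,\sqrt{\sum_{i=1}^{N}\left|\vec{B}_i\right|^2}} \qquad (7)$$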

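The mean square difference between the normalized vectors $\vec{A}_i^{*}$ and $\vec{B}_i^{*}$ (MSDNV) is

$$\mathrm{MSDNV} = \frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i^{*} - \vec{B}_i^{*}\right|^2$$

The MDA is defined as

$$\mathrm{MDA} = \frac{1}{N}\sum_{i=1}^{N}\alpha_i$$

where $\alpha_i$ is the included angle between paired vectors. MDA takes values in the interval $[0, \pi]$ and measures how close the corresponding vector directions of two vector fields are to each other. A mean square difference (MSD) of normalized vector lengths is defined as follows:

$$\mathrm{MSD} = \frac{1}{N}\sum_{i=1}^{N}\left(\left|\vec{A}_i^{*}\right| - \left|\vec{B}_i^{*}\right|\right)^2 \qquad (10)$$

Given equation (6) and the Cauchy-Schwarz inequality:

$$\sum_{i=1}^{N}\left|\vec{A}_i^{*}\right|\left|\vec{B}_i^{*}\right| \le \sqrt{\sum_{i=1}^{N}\left|\vec{A}_i^{*}\right|^2}\,\sqrt{\sum_{i=1}^{N}\left|\vec{B}_i^{*}\right|^2}$$

we find that MSD takes on values in the interval $[0, 2]$.
MSD measures how close the corresponding vector lengths of two normalized vector fields are to each other.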

Relationship of VSC with the MSD
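VSC can be written as follows:

$$R_v = \frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i^{*}\right|\left|\vec{B}_i^{*}\right|\cos\alpha_i$$

If we assume that every included angle between the two vector fields is the same, $\alpha_i = \alpha = \mathrm{const}$ ($i = 1, 2, \ldots, N$), then with the support of (6) and (10) we obtain

$$R_v = \cos\alpha\left(1 - \frac{\mathrm{MSD}}{2}\right)$$

Thus, $R_v$ varies between 0 and $\cos\alpha$ due to the difference in vector length when $\alpha$ is a constant angle. $R_v$ equals 0 when $\alpha$ equals 90° regardless of the value of MSD. MSD plays an increasingly important role in determining $R_v$ when $\alpha$ approaches 0° or 180°.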
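The constant-angle relationship can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the VSC is the mean inner product of RMSL-normalized vector pairs and that MSD is the mean square difference of the normalized vector lengths. The function names, the 40° angle, the random seed, and the random length factors are all illustrative choices.

```python
import math
import random


def rms_length(vectors):
    """Quadratic mean of the vector lengths (RMSL)."""
    return math.sqrt(sum(x * x + y * y for x, y in vectors) / len(vectors))


def vsc(a, b):
    """VSC: mean inner product of the RMSL-normalized vector pairs."""
    dot = sum(xa * xb + ya * yb for (xa, ya), (xb, yb) in zip(a, b))
    return dot / (len(a) * rms_length(a) * rms_length(b))


def msd(a, b):
    """Mean square difference of the normalized vector lengths."""
    la, lb = rms_length(a), rms_length(b)
    return sum(
        (math.hypot(xa, ya) / la - math.hypot(xb, yb) / lb) ** 2
        for (xa, ya), (xb, yb) in zip(a, b)
    ) / len(a)


def rotate_and_scale(v, angle, factor):
    """Rotate v counterclockwise by `angle` (radians), then rescale its length."""
    x, y = v
    c, s = math.cos(angle), math.sin(angle)
    return (factor * (c * x - s * y), factor * (s * x + c * y))


rng = random.Random(0)        # fixed seed, illustrative only
alpha = math.radians(40.0)    # assumed constant included angle
field_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(30)]
# Every vector is rotated by the same angle but stretched by its own factor,
# so only the vector lengths differ between the two fields.
field_b = [rotate_and_scale(v, alpha, rng.uniform(0.2, 3.0)) for v in field_a]

r_v = vsc(field_a, field_b)
predicted = math.cos(alpha) * (1.0 - msd(field_a, field_b) / 2.0)
print(r_v, predicted)  # the two values should agree to rounding error
```

Because the length factors are positive, the included angle of every pair stays equal to the chosen constant, which is the condition under which the relationship is stated.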

Relationship of VSC with MDA
To examine the relationship of $R_v$ with the included angles between two vector fields in a more general case, we produced a number of random vector sequences. Firstly, we constructed a reference vector sequence, $\vec{A}_i$, comprising 30 vectors, i.e., $i = 1, 2, \ldots, 30$. The lengths of the 30 vectors follow a normal distribution, and their arguments follow a uniform distribution between 0° and 360°. Secondly, we produced a new vector sequence, $\vec{B}_i$, by rotating each individual vector of $\vec{A}_i$ by a random angle between 0° and 180° without changing the vector lengths. Such a random production of $\vec{B}_i$ was repeated $1 \times 10^6$ times to produce sufficient random samples of vector sequences. The vector similarity coefficients $R_v$ were computed between $\vec{A}_i$ and each of the $1 \times 10^6$ randomly produced vector sequences. As shown in Figure 3, $R_v$ generally shows a negative relationship with MDA, i.e., a smaller MDA generally corresponds to a larger $R_v$, and vice versa. However, it should be noted that $R_v$ varies within a large range for the same MDA. For example, when MDA equals 90°, $R_v$ can vary from approximately -0.5 to 0.5 depending on the relationship between the vector lengths and the corresponding included angles. A positive (negative) $R_v$ is observed when the 30 vector lengths and included angles are negatively (positively) correlated. This means that the patterns of two vector fields are closer (opposite) to each other when the included angles between the long vectors are small (large). In other words, the longer vectors generally play a more important role than the shorter vectors in determining $R_v$.
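The random experiment described above can be sketched as follows. This is a reduced illustration, not the code in the supplement: the mean and spread of the vector lengths (5 and 1), the seed, the use of 2000 trials instead of $1 \times 10^6$, and the assumption that each rotation angle is drawn independently and uniformly from 0° to 180° are all choices made here for the example. MDA is taken to be the arithmetic mean of the included angles.

```python
import math
import random


def vsc(a, b):
    """VSC of two vector sequences (normalized mean inner product)."""
    dot = sum(xa * xb + ya * yb for (xa, ya), (xb, yb) in zip(a, b))
    norm_a = math.sqrt(sum(x * x + y * y for x, y in a))
    norm_b = math.sqrt(sum(x * x + y * y for x, y in b))
    return dot / (norm_a * norm_b)


def rotate(v, angle):
    """Rotate v counterclockwise by `angle` (radians); the length is unchanged."""
    x, y = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_vectors=30, n_trials=2000, seed=0):
    """Return a list of (MDA in degrees, R_v) pairs, one per random trial."""
    rng = random.Random(seed)
    reference = []
    for _ in range(n_vectors):
        length = abs(rng.gauss(5.0, 1.0))       # assumed length distribution
        arg = rng.uniform(0.0, 2.0 * math.pi)   # argument between 0 and 360 degrees
        reference.append((length * math.cos(arg), length * math.sin(arg)))
    results = []
    for _ in range(n_trials):
        angles = [rng.uniform(0.0, math.pi) for _ in range(n_vectors)]
        rotated = [rotate(v, t) for v, t in zip(reference, angles)]
        mda = math.degrees(sum(angles) / n_vectors)
        results.append((mda, vsc(reference, rotated)))
    return results


results = simulate()
mda_values = [m for m, _ in results]
rv_values = [r for _, r in results]
print(pearson(mda_values, rv_values))  # negative: larger MDA, smaller R_v
```

With independent uniform angles the MDA values cluster around 90°, so this sketch mainly illustrates the spread of $R_v$ near that value and the sign of the overall relationship.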

Application of VSC to 850-hPa vector winds
In this section, we compute the $R_v$ of the climatological mean 850-hPa vector winds in January with that in each month in the Asian-Australian monsoon region (10°S-40°N, 40°-140°E). The purpose of this analysis is to illustrate the performance of $R_v$ in describing the similarity of two vector fields. The wind data used are the NCEP-DOE reanalysis 2 data (Kanamitsu et al., 2002). The climatological mean 850-hPa vector winds show a clear winter monsoon circulation characterized by northerly winds over the tropical and subtropical Asian regions in January and February (Figs. 4a, 4b). The spatial pattern of vector winds in January is very close to that in February, which corresponds to a very high $R_v$ (0.97). The spatial pattern of vector winds in January is less similar to that in April and October, which corresponds to a weak $R_v$ of 0.48 and -0.11, respectively. In August, the spatial pattern of 850-hPa winds is generally opposite to that in January, which corresponds to a negative $R_v$. Figure 4 illustrates that VSC can reasonably measure the pattern similarity of two vector fields. We also computed the VSCs between the January climatological mean vector winds and the winds in each individual month during the period from 1979 to 2005. The VSCs show a smaller spread in winter (January, February, and December) and summer (June, July, and August) months than during the transitional months such as April, May, and October (Fig. 4f). This indicates that the spatial patterns of vector winds have smaller inter-annual variation in summer and winter monsoon seasons than during the transitional seasons.
Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-172, 2016. Manuscript under review for journal Geosci. Model Dev.

Construction of the VFE diagram
To measure the differences between two vector fields, a root-mean-square vector difference (RMSVD) is defined following Shukla and Saha (1974) with a minor modification:

$$\mathrm{RMSVD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i - \vec{B}_i\right|^2}$$

where $\vec{A}_i$ and $\vec{B}_i$ are the original vectors. The RMSVD approaches zero when two vector fields become more alike in both vector length and direction. The square of RMSVD can be written as

$$\mathrm{RMSVD}^2 = \frac{1}{N}\sum_{i=1}^{N}\left|\vec{A}_i\right|^2 + \frac{1}{N}\sum_{i=1}^{N}\left|\vec{B}_i\right|^2 - \frac{2}{N}\sum_{i=1}^{N}\vec{A}_i\cdot\vec{B}_i$$

With the support of equations (4), (5), and (7), we obtain

$$\mathrm{RMSVD}^2 = L_A^2 + L_B^2 - 2 L_A L_B R_v \qquad (12)$$

The geometric relationship between RMSVD, $L_A$, $L_B$, and $R_v$ is shown in Figure 5, which is analogous to Figure 1 in Taylor (2001) but constructed from different quantities. It should be noted that RMSVD is computed from the two original sets of vectors, whereas the MSDNV in section 2 is computed using normalized vectors.
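As a small numerical illustration of this geometric relationship, the sketch below computes the RMSL of each field, the VSC, and the RMSVD for two short vector sequences and compares RMSVD² with the law-of-cosines expression $L_A^2 + L_B^2 - 2 L_A L_B R_v$. The function name `vfe_statistics` and the sample wind values are hypothetical and are not taken from the paper.

```python
import math


def vfe_statistics(a, b):
    """Return (L_A, L_B, R_v, RMSVD) for two sequences of 2-D vectors."""
    n = len(a)
    l_a = math.sqrt(sum(x * x + y * y for x, y in a) / n)
    l_b = math.sqrt(sum(x * x + y * y for x, y in b) / n)
    dot = sum(xa * xb + ya * yb for (xa, ya), (xb, yb) in zip(a, b))
    r_v = dot / (n * l_a * l_b)
    sq_diff = sum((xa - xb) ** 2 + (ya - yb) ** 2
                  for (xa, ya), (xb, yb) in zip(a, b))
    rmsvd = math.sqrt(sq_diff / n)
    return l_a, l_b, r_v, rmsvd


# Hypothetical (u, v) wind pairs in m/s, for illustration only.
reference = [(3.0, 1.0), (-2.0, 4.0), (5.0, -1.5), (0.5, 2.5)]
test_field = [(2.5, 1.5), (-1.0, 3.0), (6.0, -0.5), (1.5, 2.0)]

l_a, l_b, r_v, rmsvd = vfe_statistics(reference, test_field)
law_of_cosines = l_a ** 2 + l_b ** 2 - 2.0 * l_a * l_b * r_v
print(rmsvd ** 2, law_of_cosines)  # the two values should agree
```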
With the above definitions and relationships, we can construct a diagram that statistically quantifies how close two vector fields are to each other in terms of $R_v$, $L_A$, $L_B$, and RMSVD. $L_A$ and $L_B$ measure the mean length of the vector fields $\vec{A}$ and $\vec{B}$, respectively. In contrast, RMSVD describes the magnitude of the overall difference between vector fields $\vec{A}$ and $\vec{B}$.
Vector field $\vec{A}$ can be called the "reference" field, usually representing some observed state. Vector field $\vec{B}$ can be regarded as a "test" field, typically a model-simulated field. The quantities in equation (12) are shown in Figure 6. The half circle represents the reference field, and the asterisk represents the test field. The radial distances from the origin to the points represent the RMSL ($L_A$ and $L_B$), which is shown as dotted contours (Fig. 6). The azimuthal positions provide the vector similarity coefficient ($R_v$). The dashed line measures the distance from the reference point, which represents the RMSVD.
Both the Taylor diagram and the VFE diagram are constructed based on the law of cosines. The differences between the two diagrams are summarized in Table 1. Indeed, the Taylor diagram can be regarded as a specific case of the VFE diagram, which is further interpreted in Appendix A.
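The placement of a point on the diagram can be sketched as below, assuming (as in the Taylor diagram) that the azimuthal angle is $\arccos R_v$ and the radius is the RMSL. The helper name `vfe_point` is hypothetical. The example uses statistics normalized by the reference RMSL, with the summer values quoted in section 5 for model 12 (normalized RMSL 0.78, $R_v$ 0.86).

```python
import math


def vfe_point(rmsl, r_v):
    """Cartesian position on the diagram: radius = RMSL, angle = arccos(R_v)."""
    theta = math.acos(r_v)
    return (rmsl * math.cos(theta), rmsl * math.sin(theta))


# The reference field compared with itself has R_v = 1, so it lies on the
# horizontal axis at a radius equal to its (normalized) RMSL.
ref_xy = vfe_point(1.0, 1.0)
test_xy = vfe_point(0.78, 0.86)

# By the law of cosines, the straight-line distance between the two points
# is the (normalized) RMSVD.
rmsvd = math.dist(ref_xy, test_xy)
print(test_xy, rmsvd)
```

Plotting the returned coordinates on a half disc, with arcs of constant radius for RMSL and circles centered on the reference point for RMSVD, would reproduce the layout described for Figure 6.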

Evaluating vector winds simulated by multiple models
A common application of the diagram is to compare multi-model simulations against observations in terms of the patterns of vector winds. As an example, we assess the pattern statistics of climatological mean 850-hPa vector winds derived from the historical experiments by 19 CMIP5 models (Taylor et al., 2012). VSCs vary from 0.8 to 0.96 among the 19 models, clearly indicating which model-simulated patterns of vector winds resemble observations well and which do not. The diagram also clearly shows which models overestimate or underestimate the mean wind speed (RMSL) (Fig. 7). For example, in comparison with the reanalysis data, some models (e.g., 12, 19, 13, and 15) underestimate wind speed over the Asian-Australian monsoon region in summer. In contrast, some models (e.g., 6 and 10) overestimate wind speed (Fig. 7a). In winter, most models overestimate the 850-hPa wind speed (Fig. 7b).
To illustrate the performance of the VFE diagram in model evaluation, Figure 8 shows the spatial patterns of the climatological mean 850-hPa vector winds over the Asian-Australian monsoon region derived from the NCEP2 reanalysis and three climate models. Models 1 and 4 show a spatial pattern of vector winds very similar to the reanalysis data in summer, and $R_v$ reaches 0.96 and 0.95, respectively (Figs. 8a, 8c, 8e). In contrast, the spatial pattern of the vector winds simulated by model 12 is less similar to the reanalysis data (Figs. 8a, 8g). For example, the reanalysis-based vector winds show stronger southwesterly winds over the southwestern Arabian Sea than over the Bay of Bengal (Fig. 8a). However, the opposite pattern is found in model 12: the southwesterly winds are weaker over the southwestern Arabian Sea than over the Bay of Bengal (Fig. 8g). $R_v$ reflects this lower pattern similarity: it is 0.86 for model 12, clearly lower than the 0.96 for model 1.
Figure 7 suggests that model 12 underestimates wind speed (normalized RMS wind speed is 0.78) in summer. In contrast, model 4 overestimates wind speed (normalized RMS wind speed is 1.35) in winter. These biases in wind speed can be identified in Figure 8. For example, model 12 generally underestimates the 850-hPa wind speed, especially over the Somali region in summer, compared with the reanalysis data (Figs. 8a, 8g). Model 4 overestimates the strength of easterly winds between 5°N and 20°N and westerly winds between the equator and 10°S in winter (Figs. 8b, 8f).

Other potential applications
Similar to the Taylor diagram (Taylor, 2001), the VFE diagram can be applied to the following aspects.

Tracking changes in model performance
To summarize the changes in the performance of a model, the points on the VFE diagram can be linked with arrows.

Indicating the statistical significance of differences between two groups of simulations
One way to assess whether there are apparent differences between two groups of data is to show them on the diagram. Two groups of data are likely to differ significantly if their statistics are clearly separated from each other on the diagram, and vice versa. As an illustration of this point, Figure 9 shows the normalized pattern statistics of the climatological mean 850-hPa vector winds derived from multiple members of models 12, 13, and 14. The symbols representing the same model cluster closely, signifying that the sampling variability has little impact on the statistics of climatological mean vector winds. On the other hand, the symbols representing different models are clearly separated from each other. This suggests that the differences between models are much larger than the sampling variability of individual models. Thus, the differences between models 12, 13, and 14 are likely to be significant. Models 12 and 13 are different versions of the same model. Compared with model 12, model 13 shows a similar RMSL but higher VSCs and smaller RMSVDs, which suggests that the improvement of model 13 over model 12 is primarily due to the improvement of the spatial pattern of vector winds (Fig. 9). It should be noted that a formal test of statistical significance usually requires more than 30 samples. The number of ensemble members available for each model here is fewer than 10, which may not be sufficient to conclude that the three models differ significantly, especially models 12 and 13.

Evaluating model skill
Similar to equations (4) and (5) in Taylor (2001), one can also construct skill scores using VSC and RMSL to evaluate model skill in simulating vector fields, for example two scores $S_{v1}$ and $S_{v2}$ in which $R_0$ is the maximum VSC attainable. $S_{v1}$ and $S_{v2}$ take values between zero (least skillful) and one (most skillful). Both skill scores can be shown as isolines in the VFE diagram, similar to Figures 10 and 11 in Taylor (2001). Both skill scores take the VSC and the RMSL into account. However, $S_{v1}$ places more emphasis on the correct simulation of the vector length, whereas $S_{v2}$ pays more attention to the pattern similarity of the vector fields.
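One possible form of such scores is sketched below. It is an assumption made for illustration: the functional forms are copied from equations (4) and (5) of Taylor (2001), with the correlation coefficient replaced by $R_v$ and the normalized standard deviation replaced by the normalized RMSL ($L_B / L_A$). The exact expressions used by the authors may differ, and the function names are hypothetical.

```python
def skill_v1(r_v, norm_rmsl, r_0=1.0):
    """Assumed Taylor-type score, analogous to Taylor (2001) equation (4)."""
    length_term = (norm_rmsl + 1.0 / norm_rmsl) ** 2
    return 4.0 * (1.0 + r_v) / (length_term * (1.0 + r_0))


def skill_v2(r_v, norm_rmsl, r_0=1.0):
    """Assumed Taylor-type score, analogous to Taylor (2001) equation (5)."""
    length_term = (norm_rmsl + 1.0 / norm_rmsl) ** 2
    return 4.0 * (1.0 + r_v) ** 4 / (length_term * (1.0 + r_0) ** 4)


# Summer statistics quoted in the text for model 12: R_v = 0.86 and a
# normalized RMSL of 0.78. R_0 = 1 is an illustrative choice.
s1 = skill_v1(0.86, 0.78)
s2 = skill_v2(0.86, 0.78)
print(s1, s2)
```

Under these assumed forms, both scores equal one for a perfect simulation ($R_v = R_0$ and normalized RMSL equal to one), and the fourth power in the second score penalizes a low $R_v$ more strongly, which is consistent with the stated difference in emphasis between the two scores.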

Discussion and Conclusions
In this study, we devised a vector field evaluation (VFE) diagram based on the geometric relationship between three scalar variables, i.e., the vector similarity coefficient (VSC), RMSL, and RMS vector difference (RMSVD). The three statistical variables in the VFE diagram are meaningful and easy to compute. VSC is defined as the arithmetic mean of the inner product of normalized vector pairs and measures the pattern similarity between two vector fields. Our results suggest that VSC describes the pattern similarity of two vector fields well. RMSL measures the mean length of a vector field. RMSVD measures the overall difference between two vector fields. The VFE diagram can clearly illustrate how much of the overall RMSVD is attributed to the systematic difference in vector length and how much is due to poor pattern similarity.
As discussed in Appendix A, the three statistical variables can be computed with full vector fields (including both the mean and anomaly) or with anomalous vector fields. One can compute them using full vector fields if the statistics in both the mean state and the anomaly need to be taken into account (Figs. 7, 9). Alternatively, one can compute them using anomalous vector fields if the statistics in the anomaly are the primary concern. The VFE diagram is devised to compare the statistics of two vector fields, e.g., vector winds, which usually comprise 2- or 3-dimensional vectors. One-dimensional vector fields can be regarded as scalar fields. In the one-dimensional case, the VSC, RMSL, and RMSVD computed from anomalous fields become the correlation coefficient, standard deviation, and centered RMSE, respectively, which are the statistical variables in the Taylor diagram. Thus, the Taylor diagram is a specific case of the VFE diagram. The Taylor diagram compares the statistics of anomalous scalar fields. The VFE diagram is a generalized Taylor diagram that can compare the statistics of full or anomalous vector fields.
The VFE diagram can also be easily applied to the evaluation of 3-dimensional vectors; however, we only considered 2-dimensional vectors in this paper. If the vertical scale of a 3-dimensional vector variable is much smaller than its horizontal scale, e.g., vector winds, one may consider multiplying the vertical component by 50 or 100 to accentuate its importance. In addition, as with the Taylor diagram, the VFE diagram can also be applied to track changes in model performance, indicate the significance of the differences between two groups of simulations, and evaluate model skill. More applications of the VFE diagram could be developed based on different research aims in the future.

Code availability
The code used in the production of Figures 3 and 7a is available in the supplement to the article.

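Appendix A

Consider two full vector fields $\vec{A}$ and $\vec{B}$:

$$\vec{A}_i = (x_{ai}, y_{ai}); \quad i = 1, 2, \ldots, N$$

$$\vec{B}_i = (x_{bi}, y_{bi}); \quad i = 1, 2, \ldots, N$$

$\vec{A}_i$ and $\vec{B}_i$ are 2-dimensional vectors. Each full vector field includes $N$ vectors and can be broken into the mean and anomaly:

$$x_{ai} = \bar{x}_a + x'_{ai}, \quad y_{ai} = \bar{y}_a + y'_{ai}, \quad x_{bi} = \bar{x}_b + x'_{bi}, \quad y_{bi} = \bar{y}_b + y'_{bi}$$

The standard deviations of the x- and y-components of vectors $\vec{A}_i$ and $\vec{B}_i$ can be written as follows:

$$\sigma_{xa} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x'^{2}_{ai}}, \quad \sigma_{ya} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} y'^{2}_{ai}}, \quad \sigma_{xb} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x'^{2}_{bi}}, \quad \sigma_{yb} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} y'^{2}_{bi}}$$

The RMSL of vector field $\vec{A}$ is written as follows:

$$L_A = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_{ai}^2 + y_{ai}^2\right)} = \sqrt{\left(\bar{x}_a^2 + \bar{y}_a^2\right) + \left(\sigma_{xa}^2 + \sigma_{ya}^2\right)} \qquad (A1)$$

Similarly, we have

$$L_B = \sqrt{\left(\bar{x}_b^2 + \bar{y}_b^2\right) + \left(\sigma_{xb}^2 + \sigma_{yb}^2\right)} \qquad (A2)$$

The VSC between vector fields $\vec{A}$ and $\vec{B}$:

$$R_v = \frac{1}{L_A L_B}\left[\left(\bar{x}_a\bar{x}_b + \bar{y}_a\bar{y}_b\right) + \frac{1}{N}\sum_{i=1}^{N}\left(x'_{ai}x'_{bi} + y'_{ai}y'_{bi}\right)\right] \qquad (A3)$$

The RMSVD² between vector fields $\vec{A}$ and $\vec{B}$:

$$\mathrm{RMSVD}^2 = \left[\left(\bar{x}_a - \bar{x}_b\right)^2 + \left(\bar{y}_a - \bar{y}_b\right)^2\right] + \frac{1}{N}\sum_{i=1}^{N}\left[\left(x'_{ai} - x'_{bi}\right)^2 + \left(y'_{ai} - y'_{bi}\right)^2\right] \qquad (A4)$$

Based on equations (A1), (A2), and (A4), we can conclude that the $L_A^2$, $L_B^2$, and RMSVD² derived from the full vector fields are equal to those derived from the mean vector fields plus those derived from the anomalous vector fields. The $R_v$ computed from two full vector fields is also determined by contributions from both the mean state and the anomaly (A3). This indicates that the VFE diagram derived from the full vector fields takes the statistics in both the mean state and the anomaly of the vector fields into account. The VFE diagram derived from the full vector fields is recommended if the statistics in both the mean state and the anomaly are of concern. On the other hand, the VFE diagram derived from anomalous vector fields can be used if the statistics in the anomaly are the primary concern. In this case, the anomalous $L_A$, $L_B$, $R_v$, and RMSVD² can be written, respectively, as follows:

$$L_A' = \sqrt{\sigma_{xa}^2 + \sigma_{ya}^2} \qquad (A5)$$

$$L_B' = \sqrt{\sigma_{xb}^2 + \sigma_{yb}^2} \qquad (A6)$$

$$R_v' = \frac{1}{L_A' L_B'}\,\frac{1}{N}\sum_{i=1}^{N}\left(x'_{ai}x'_{bi} + y'_{ai}y'_{bi}\right) \qquad (A7)$$

$$\mathrm{RMSVD}'^2 = \frac{1}{N}\sum_{i=1}^{N}\left[\left(x'_{ai} - x'_{bi}\right)^2 + \left(y'_{ai} - y'_{bi}\right)^2\right] \qquad (A8)$$

The vector fields $\vec{A}$ and $\vec{B}$ can be regarded as two scalar fields if we further assume that the y-component of both vector fields is equal to 0. Under this circumstance, equations (A5)-(A8) can be written as follows:

$$L_A' = \sigma_{xa}, \quad L_B' = \sigma_{xb}, \quad R_v' = \frac{\frac{1}{N}\sum_{i=1}^{N} x'_{ai}x'_{bi}}{\sigma_{xa}\sigma_{xb}}, \quad \mathrm{RMSVD}'^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x'_{ai} - x'_{bi}\right)^2$$

$L_A'$ and $L_B'$ equal the standard deviations of the x-components of vector fields $\vec{A}$ and $\vec{B}$, respectively. $R_v'$ is Pearson's correlation coefficient between the x-components of vector fields $\vec{A}$ and $\vec{B}$, and $\mathrm{RMSVD}'$ is the centered RMS difference between the x-components of vector fields $\vec{A}$ and $\vec{B}$. The Taylor diagram is constructed using the standard deviation, correlation coefficient, and centered RMS difference (Taylor, 2001). Thus, the Taylor diagram can be regarded as a specific case of the VFE diagram (i.e., for 1-dimensional cases). The VFE diagram is a generalized Taylor diagram which can be applied to multi-dimensional variables.
Published: 1 August 2016. © Author(s) 2016. CC-BY 3.0 License.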
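The reduction to the Taylor diagram statistics can be illustrated numerically. The sketch below is illustrative only: the function names and the two short scalar series are hypothetical. It removes the mean from each series, treats the anomalies as vectors whose y-component is zero, and computes the RMSL, VSC, and RMSVD from those vectors.

```python
import math


def anomalies(values):
    """Remove the arithmetic mean from a sequence of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def vfe_statistics(a, b):
    """Return (L_A, L_B, R_v, RMSVD) for two sequences of 2-D vectors."""
    n = len(a)
    l_a = math.sqrt(sum(x * x + y * y for x, y in a) / n)
    l_b = math.sqrt(sum(x * x + y * y for x, y in b) / n)
    dot = sum(xa * xb + ya * yb for (xa, ya), (xb, yb) in zip(a, b))
    r_v = dot / (n * l_a * l_b)
    sq_diff = sum((xa - xb) ** 2 + (ya - yb) ** 2
                  for (xa, ya), (xb, yb) in zip(a, b))
    return l_a, l_b, r_v, math.sqrt(sq_diff / n)


# Hypothetical scalar series (e.g., an observed and a simulated quantity).
obs = [2.0, 4.5, 3.1, 5.8, 1.2, 3.9]
sim = [2.4, 3.9, 3.5, 6.5, 0.8, 3.0]

# One-dimensional anomalous fields: the y-component is set to zero.
field_a = [(x, 0.0) for x in anomalies(obs)]
field_b = [(x, 0.0) for x in anomalies(sim)]

l_a, l_b, r_v, rmsvd = vfe_statistics(field_a, field_b)
print(l_a, l_b)   # population standard deviations of obs and sim
print(r_v)        # Pearson correlation coefficient of obs and sim
print(rmsvd)      # centered RMS difference of obs and sim
```

The three printed quantities satisfy the same law-of-cosines relationship that underlies the Taylor diagram, with the standard deviations in place of the RMSLs.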

Author contribution
Z. Xu and Z. Hou are the co-first authors. Z. Xu constructed the diagram and led the study. Z. Hou and Z. Xu performed the analysis. Z. Xu and Y. Han wrote the paper. All of the authors discussed the results and commented on the manuscript.
... respectively. There are 432 (12×36) "+" symbols. Monthly NCEP-NCAR reanalysis II data were used to produce this figure.
Figure 9: Normalized pattern statistics for climatological mean 850-hPa vector winds over the Asian-Australian monsoon region (10°S-40°N, 40°-140°E) derived from each independent ensemble member of models 12, 13, and 14. Models 12, 13, and 14 include 5, 6, and 9 ensemble simulations, respectively. Symbols of the same type cluster closely, and symbols of different types are clearly separated from each other, which suggests that the differences between models are likely to be significant.