Analysis of errors introduced by geographic coordinate systems on weather numeric prediction modeling

Most atmospheric models, including the Weather Research and Forecasting (WRF) model, use a spherical geographic coordinate system to internally represent input data and perform computations. However, most geographic information system (GIS) input data used by the models are based on a spheroid datum because it better represents the actual geometry of the earth. WRF and other atmospheric models use these GIS input layers as if they were in a spherical coordinate system without accounting for the difference in datum. When GIS layers are not properly reprojected, latitudinal errors of up to 21 km in the midlatitudes are introduced. Recent studies have suggested that for very high-resolution applications, the difference in datum in the GIS input data (e.g., terrain land use, orography) should be taken into account. However, the magnitude of errors introduced by the difference in coordinate systems remains unclear. This research quantifies the effect of using a spherical vs. a spheroid datum for the input GIS layers used by WRF to study greenhouse gas transport and dispersion in northeast Pennsylvania.


Introduction
Geographic information science (GISc) datasets are usually projected on a spheroid geographic coordinate system (GCS) such as World Geodetic System 1984 (WGS84) or North American Datum 1983 (NAD83). The earth is an irregular oblate spheroid, and these datums are used to better approximate the actual shape of the planet, which is flattened at the poles and bulged at the equator. The datums are used in combination with different projections (e.g., Universal Transverse Mercator (UTM), latitude-longitude, Albers equal area) to map a 3-D view of the earth onto a 2-D plane.
Atmospheric models are based on a spherical coordinate system because it usually leads to faster computations and easier representations of data (Monaghan et al., 2013). The GISc layers used as input data for the atmospheric models generally use a spheroid datum, but they are ingested by the models as if they used spherical datums. Using different GCSs can affect the model results because the input data are mapped to different locations. This difference can lead to latitudinal shifts of up to 21 km in the midlatitudes (Monaghan et al., 2013). This paper performs a series of sensitivity studies in which the GIS input layers are reprojected from a spheroid to a spherical datum in order to more correctly represent the input layers used by the atmospheric models.
In a GCS the earth is represented as either an oblate spheroid or a sphere, whereas in a spherical system the earth is always represented as a sphere (Bugayevskiy and Snyder, 1995). This means that when using a spherical coordinate system, the spatial relationships between points on the surface of the earth are altered. The shift in the spatial relationship results in a latitudinal error and is consistent across all data that are used as input layers in the atmospheric models. Consequently, numerical errors are introduced by computations that are a function of latitude, such as the Coriolis force and the incoming solar radiation. As already explained in Monaghan et al. (2013), a minor mismatch between the Weather Research and Forecasting (WRF) model global atmosphere input and static variables will affect the simulation result. Figure 1 shows the latitudinal errors introduced when representing a point on the surface of the earth with a spherical GCS. Point A represents data projected on a spheroid system (red line). When that same point A is represented on a sphere (green line), as in an atmospheric model, its location gets incorrectly shifted to point B. Point C is the true location of point A when correctly projected in the spherical coordinate system. Figure 2 shows that the errors between spheroid and sphere representation for the same point are a function of latitude. The maximum errors occur at midlatitude, precisely at 45° N and S. Identical errors occur in the Southern Hemisphere.
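The latitude dependence of this error can be sketched numerically. The minimal Python example below assumes that the shift equals the difference between the geodetic latitude on the WGS84 spheroid and the corresponding geocentric latitude, converted to a distance on a sphere of radius 6370 km. The function names and the radius are illustrative assumptions and are not taken from the scripts described in this paper.

```python
import math

# WGS84 first eccentricity squared (standard constant).
WGS84_E2 = 0.00669437999014
# Assumed sphere radius in km (illustrative value for a spherical earth model).
SPHERE_RADIUS_KM = 6370.0


def geocentric_latitude(geodetic_lat_deg):
    """Convert a geodetic latitude on WGS84 to geocentric latitude (degrees), at the surface."""
    phi = math.radians(geodetic_lat_deg)
    return math.degrees(math.atan((1.0 - WGS84_E2) * math.tan(phi)))


def latitudinal_shift_km(geodetic_lat_deg):
    """Distance on the sphere between the geodetic and geocentric latitudes."""
    diff_deg = geodetic_lat_deg - geocentric_latitude(geodetic_lat_deg)
    return math.radians(diff_deg) * SPHERE_RADIUS_KM


# Evaluate the shift every 5 degrees from the equator to 85 degrees N.
shifts = {lat: latitudinal_shift_km(lat) for lat in range(0, 90, 5)}
lat_of_max = max(shifts, key=shifts.get)
print(lat_of_max, round(shifts[lat_of_max], 1))
```

Under these assumptions the shift vanishes at the equator and peaks near 45° at a little over 21 km, in line with the magnitude quoted from Monaghan et al. (2013).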
Differences in coordinate systems and the resulting spatial errors, such as the example provided in Fig. 1, have not been a primary focus in atmospheric modeling because of the relatively coarse spatial resolution of the simulation domains (David et al., 2009). More recently, due to improvements in computational resources and technological advances, atmospheric models are routinely run at higher spatial resolution. Yet this trend of running simulations with high-resolution input datasets does not take into account the shift between the coordinate systems, which may cause spatial errors in the model's output. Monaghan et al. (2013) investigated errors caused by different coordinate systems using WRF run with higher resolution topography and land use datasets over Colorado. Multiple WRF simulations were performed to study differences in meteorological parameters such as air temperature, specific humidity and wind speed. They concluded that the GCS transformation from WGS84 GCS to a spherical earth model caused the input data to shift up to 20 km southward in central Colorado. This shift leads to significant localized effects on the simulation results. The root mean square difference (RMSD) for air temperature is 0.99 °C, for specific humidity it is 0.72 g kg⁻¹ and for wind speed it is 1.20 m s⁻¹. It was concluded that for high-resolution atmospheric simulations, the issue resulting from datum and projection errors is increasingly important to solve. All datasets used as input should be in the same GCS (Monaghan et al., 2013).
No study has yet given attention to the impacts of incorrect coordinate systems on the transport of an atmospheric tracer. Sensitivity experiments were conducted to quantify the impact of geographic coordinate systems on the atmospheric mixing ratios of methane (CH4) emitted from the Marcellus shale gas production activities in Pennsylvania. Using a chemistry module to transport passive tracers in the atmosphere, WRF simulates the CH4 mixing ratios in the atmosphere.
Geographic information systems and other geospatial technologies have been increasingly used in atmospheric sciences. GIS provides a scientific framework for observation data, modeling, and scientific deduction to study atmospheric phenomena and processes (Barkley et al., 2017; Hart and Martinez, 2006; Dobesch et al., 2013). However, some barriers between GIS and atmospheric science, such as different data formats and different GCSs, impede collaboration.
This research utilizes the open-source language R to automatically convert the weather numerical-model input, output, and GIS data layers.
The objectives of this study are the following:
1. to quantify the impact of projecting the model input data with different coordinate systems on meteorological variables and simulated atmospheric mixing ratios of a passive tracer;
2. to generate a tool that can automatically convert WRF output to GIS layers and vice versa.

Study area
The atmospheric simulations were performed using three nested domains of decreasing area and increasing spatial resolution. As suggested by Monaghan et al. (2013), we defined several criteria to select a region where errors introduced by GCS are more likely to affect our simulation results. First, the region should have larger elevation gradients. Second, it should contain diverse land use patterns such as forest, urban, and wetland. Third, the simulation period requires convective conditions such as those in summertime, since both the topography and the land cover then have a larger effect on the simulations. Finally, a comparatively small domain should provide a focused study region because a larger domain would ignore the small variations.
The 9 × 9 km grid (domain 1) contains the mid-Atlantic region, the entire northeastern United States east of Indiana, parts of Canada, and a large area of the northern Atlantic Ocean. The 3 × 3 km grid (domain 2) contains the entire state of Pennsylvania and southern New York. The 1 × 1 km grid (domain 3) contains northeastern Pennsylvania and southeastern New York. One-way nesting is used so that information from the coarse domain translates to the fine domain but no information from the fine domain translates to the coarse domain (Barkley et al., 2017). The elevation of domain 3 ranges between 108 and 706 m a.s.l. (above sea level) (Fig. 4).
The analysis of the model results focuses on the innermost domain 3. This region was primarily chosen because there has been an increase of activity in natural-gas fracking since 2008, which is expected to result in significant releases of fugitive greenhouse gas emissions, in particular CH4 (Barkley et al., 2017).

Data
Table 1 shows the input data sources for each of the three scenarios. The variables include topography, land use, Coriolis, leaf area index (LAI), albedo and CH4 emissions.

Digital elevation data
Two types of elevation data are included in the experiments. The WRF DEFAULT elevation data are derived from the US Geological Survey (USGS) global 30 arcsec (roughly 900 m) elevation dataset topography and are used in the DEFAULT case (Gesch and Greenlee, 1996). The HR and HR_SHIFT cases use higher resolution data from the NASA Shuttle Radar Topographic Mission (SRTM; Farr et al., 2007). The data consist of a 90 m resolution digital elevation model (DEM) for over 80 % of the world. The data are projected in a geographic (latitude-longitude) projection with the WGS84 GCS.

Land cover data
The DEFAULT scenario uses the 24 types of land use categories that are derived from satellite data. The HR and HR_SHIFT cases use the latest land cover products available for North America. The 2011 USGS National Land Cover Database (NLCD) covers the continental United States, including the state of Alaska, and is derived from Landsat satellite imagery with a 30 m spatial resolution. Furthermore, the product is modified from the Anderson Land Cover Classification System and is divided into 20 different land cover types. It has a NAD 1983 GCS and is projected using an Albers conic equal area projection (Homer et al., 2007). Due to the extent of the NLCD dataset, the 2010 North American Land Cover (NALC; see footnote 1) is used for the areas of the domain that include Canada. The NALC product is constructed from observations acquired by the Moderate Resolution Imaging Spectroradiometer (MODIS) at a 250 m spatial resolution. This product is produced by Canada, the United States, and Mexico and is represented based on three hierarchical levels using the Food and Agriculture Organization (FAO) land classification system. NALC is based on a sphere GCS with a radius of 6 370 977 m and has a Lambert azimuthal equal-area projection (Latifovic et al., 2012).

Leaf area index
The LAI variable estimates the tree canopy area relative to a unit of ground area (Watson, 1947). Two types of LAI data are used in this experiment. The WRF DEFAULT LAI, which is based on a climatology derived from MODIS, is used in the DEFAULT scenario. LAI in HR was obtained from 8-day-averaged data from MODIS. The level-4 MODIS global LAI product composites data every 8 days at 1 km resolution on a sinusoidal grid (NASA LP DAAC, 2015a). The product we used is MCD15A2 for May 2015, which combines the MODIS data from the Terra and Aqua satellites.

Albedo
Surface albedo is one of the key radiation parameters required for modeling of the earth's energy budget. In the DEFAULT scenario, albedos use the values from MODIS modified by the National Oceanic and Atmospheric Administration (NOAA) according to the green fraction (Chen and Dudhia, 2001).
The HR and HR_RESHIFT cases use the satellite observations that are retrieved from MODIS to produce high-resolution and domain-specific albedo input. A 16-day L3 Global 500 m MCD43A3 product is used for May 2015. The product relies on multiday, clear-sky, atmospherically corrected surface reflectances to establish the surface anisotropy and provide albedo measurements at a 500 m resolution (NASA LP DAAC, 2015b).

CH4 emissions
CH4 emission sources include unconventional wells and conventional wells. Both the well locations and the production rates are provided by the Pennsylvania Department of Environmental Protection (PADEP) Oil and Gas Reporting website, the New York Department of Environmental Conservation, and the West Virginia Department of Environmental Protection (WVDEP). The emissions were calculated by multiplying the production by the emission factors. Omara et al. (2016) indicate that the emission rate is 11 % of the well production for conventional wells and 0.13 % for unconventional wells. The CH4 emission files were converted to input files for the WRF model (Barkley et al., 2017).
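The emission calculation described above is a simple product of production and emission factor. The following Python sketch illustrates it using the two percentages quoted from Omara et al. (2016); the well records, their identifiers and their production values are hypothetical and serve only to show the arithmetic.

```python
# Emission factors as fractions of well production, as quoted above
# from Omara et al. (2016): 11 % conventional, 0.13 % unconventional.
EMISSION_FACTORS = {"conventional": 0.11, "unconventional": 0.0013}


def ch4_emission(production, well_type):
    """Return the CH4 emission in the same units as the production rate."""
    return production * EMISSION_FACTORS[well_type]


# Hypothetical wells; the production values are made up for illustration.
wells = [
    {"id": "well_a", "type": "conventional", "production": 100.0},
    {"id": "well_b", "type": "unconventional", "production": 10000.0},
]

emissions = {w["id"]: ch4_emission(w["production"], w["type"]) for w in wells}
print(emissions)
```

In this made-up example the unconventional well produces 100 times more gas than the conventional well, yet the two emit comparable amounts because of the much smaller emission factor.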

Weather stations
The weather observations are the standard measurements of wind, temperature and moisture fields from World Meteorological Organization (WMO) surface stations at hourly intervals and radiosondes at 12-hourly intervals. The objective analysis program OBSGRID is used for quality control to remove erroneous data (Deng et al., 2009; Rogers et al., 2013). There are eight stations located in the inner domain. Temperature data during the experiment time from each tower are collected to validate the model simulation results.

Methodology
The WRF model (Skamarock and Klemp, 2008) version 3.6.1 is used to generate the numerical weather simulations in this research. It is one of the most widely distributed and used mesoscale numerical weather prediction (NWP) models in existence. It has well-tested algorithms for meteorological data assimilation and meteorological research and forecast purposes. The WRF model carries a complete suite of atmospheric physical processes that interact with the model's dynamics and thermodynamics core (Barkley et al., 2017).
The model physics of the WRF configuration in this research includes the following settings (Barkley et al., 2017). First, the double-moment scheme is used for cloud microphysical processes (Thompson et al., 2004). Second, the Kain-Fritsch scheme is used for cumulus parameterization on the 9 km grid (Kain and Fritsch, 1990; Kain, 2004). Third, the rapid radiative transfer method for general circulation models (GCMs; Mlawer et al., 1997; Iacono et al., 2008) is used for radiation. Next, the level-2.5 TKE-predicting MYNN planetary boundary layer (PBL) scheme (Nakanishi and Niino, 2006) and the Noah four-layer land-surface model (LSM), which predicts soil temperature and moisture in addition to sensible and latent heat fluxes between the land surface and atmosphere, are included (Chen and Dudhia, 2001; Tewari et al., 2004; Barkley et al., 2017).
The WRF model enables the chemical transport option within the model, allowing for the projection of CH4 concentrations throughout the domain. Surface CH4 emissions used as input for the model come from the CH4 emissions inventory. WRF is able to simulate the CH4 transport in the atmosphere.
WRF simulations are performed for a 25 h time period from 07:00 on 14 May 2015 until 07:00 on 15 May 2015 Eastern Standard Time (EST) over the three nested domains described in Sect. 2. Figure 5 shows the experiment workflow. A series of numerical weather simulations were performed using the following input datasets: the WRF DEFAULT input data (DEFAULT); the higher resolution input data (HR); and the higher resolution input data (HR_SHIFT), which are first reprojected onto a spherical coordinate system using the transformation function (Hedgley Jr., 1976).
This is a summary of the comparisons that are performed to assess the hypothesis.
1. DEFAULT is compared to HR to investigate the impacts of the high-resolution input data on model results.
2. HR is compared to HR_SHIFT to investigate the impacts of geographic coordinate system change on model results.
3. HR_RESHIFT is the model output from the HR_SHIFT simulation, shifted back to WGS84. HR_RESHIFT is compared to HR, so that the two outputs are in the same geographic coordinate system. Comparing model outputs such as temperature, wind speed, wind direction and CH4 concentration shows how sensitive the model simulation is to latitude-dependent variables.
The input data include elevation, land use, Coriolis E and F components, LAI, albedo, and maps of CH4 sources. The CH4 sources include conventional wells and unconventional wells. According to Refslund et al. (2013), using high-resolution green-fraction data does not significantly impact the performance of the weather model simulation. Thus, we did not replace green fractions in this experiment.
The first simulation (DEFAULT scenario) uses the WRF DEFAULT setting: US Geological Survey (USGS) Global 30 arcsec elevation dataset topography (GTOPO30; Gesch and Greenlee, 1996), 24 types of land use data, Coriolis parameters E and F, original WRF leaf area index, and albedo. In addition to the above variables, the experiment takes CH4 emissions from unconventional and conventional wells as inputs to the WRF simulation.
The second simulation, HR, uses higher resolution datasets for terrain, land cover, LAI and albedo. The terrain elevation data are derived from the NASA SRTM DEM product at a 90 m resolution. The NALC and NLCD are used for the land cover data. LAI and albedo are retrieved from MODIS for May 2015. All these data are replaced for all three WRF domains. A common approach to resampling land cover categories to a cell is based on the highest number of pixels that represent a class: the class with the highest occurrence is used to assign the land cover type of the cell. For example, if cell A is made up of three different land cover types, (1) "Open Water" 38 %, (2) "Deciduous Forest" 32 %, and (3) "Evergreen Forest" 30 %, then the final class for cell A would be Open Water. However, in this work, a hierarchical classification scheme is used to define the land cover type. First, we determine the most common class of land cover types present inside the cell and create a count order based on the values inside that class. A class corresponds to multiple land cover types. For example, the class "Forest" includes the types Deciduous Forest and Evergreen Forest. We assign the prevalent class, such as Forest, to the given pixel. Second, the grid cell is attributed a land cover type by selecting the type with the largest value present within that class. For example, if the same cell A is made up of the three different land cover types, (1) Open Water 38 %, (2) Deciduous Forest 32 %, and (3) Evergreen Forest 30 %, then the final class for cell A would be Deciduous Forest because Forest is the most common class (62 %) within this cell, and Deciduous Forest has the highest percentage within the Forest class.
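The difference between the two resampling approaches can be illustrated with the cell A example. The Python sketch below is a minimal illustration, not the code used in this study; the type-to-class mapping (including the class name "Water") and the function names are assumptions introduced for the example.

```python
from collections import defaultdict

# Hypothetical mapping from land cover type to its parent class
# (illustrative subset; the class name "Water" is an assumption).
TYPE_TO_CLASS = {
    "Open Water": "Water",
    "Deciduous Forest": "Forest",
    "Evergreen Forest": "Forest",
}


def majority_type(fractions):
    """Common approach: pick the single most frequent land cover type."""
    return max(fractions, key=fractions.get)


def hierarchical_type(fractions):
    """Hierarchical approach: pick the dominant class, then the dominant type within it."""
    class_totals = defaultdict(float)
    for cover_type, share in fractions.items():
        class_totals[TYPE_TO_CLASS[cover_type]] += share
    top_class = max(class_totals, key=class_totals.get)
    candidates = {t: s for t, s in fractions.items() if TYPE_TO_CLASS[t] == top_class}
    return max(candidates, key=candidates.get)


# Cell A from the text, with shares in percent.
cell_a = {"Open Water": 38.0, "Deciduous Forest": 32.0, "Evergreen Forest": 30.0}
print(majority_type(cell_a))      # Open Water
print(hierarchical_type(cell_a))  # Deciduous Forest
```

The hierarchical rule prevents a single type from winning a cell that is dominated, in aggregate, by closely related types of another class.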
The third simulation, HR_SHIFT, uses the same data as the HR scenario; however, the input data are converted from WGS84 to the DEFAULT WRF sphere GCS.
Coriolis is a function of latitude and thus particularly affected by errors in GCS. The Coriolis force has two components, E and F, which are calculated using E = 2Ω cos(ϕ) and F = 2Ω sin(ϕ), where Ω is the rotation rate of the earth and ϕ represents latitude. Coriolis E and F variables are recalculated in the HR_SHIFT scenario by using the reprojected latitude.
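The sensitivity of the Coriolis terms to the latitude shift can be sketched as follows. This Python example assumes that the reprojected latitude is the geocentric latitude corresponding to a WGS84 geodetic latitude, and it uses a standard value for the rotation rate of the earth. The sample latitude and all function names are hypothetical, and the two terms are labeled by their trigonometric form rather than by WRF variable name.

```python
import math

OMEGA = 7.2921e-5  # rotation rate of the earth in rad/s (standard value)
WGS84_E2 = 0.00669437999014  # WGS84 first eccentricity squared


def geocentric_latitude(geodetic_lat_deg):
    """Assumed reprojection: geodetic (WGS84) to geocentric latitude, in degrees."""
    phi = math.radians(geodetic_lat_deg)
    return math.degrees(math.atan((1.0 - WGS84_E2) * math.tan(phi)))


def coriolis_terms(lat_deg):
    """Return (2*Omega*sin(lat), 2*Omega*cos(lat)) in 1/s."""
    phi = math.radians(lat_deg)
    return 2.0 * OMEGA * math.sin(phi), 2.0 * OMEGA * math.cos(phi)


lat_wgs84 = 41.5  # hypothetical latitude inside domain 3
lat_sphere = geocentric_latitude(lat_wgs84)

sin_orig, cos_orig = coriolis_terms(lat_wgs84)
sin_new, cos_new = coriolis_terms(lat_sphere)

# Relative change of the sine term caused by the latitude shift.
rel_change_sin = (sin_new - sin_orig) / sin_orig
print(f"latitude shift: {lat_wgs84 - lat_sphere:.4f} deg")
print(f"relative change of sine term: {rel_change_sin:.4%}")
```

Under these assumptions the relative change stays below 1 %, which suggests that the direct Coriolis effect of the shift is small compared with the displacement of surface fields such as elevation and land cover.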
Table 2 shows the input and output GCS for the topographic, land use, and CH4 data used for the WRF simulations. Specifically, results discuss the output for the DEFAULT and HR, and HR and HR_RESHIFT configurations. A prototype tool is developed to automatically transfer WRF output to GIS layers.

WRF model input and output processing
A series of scripts in R are provided to perform the tasks identified in the current paper. Figure 6 shows the process used to generate new input data based on additional input data and an optional coordinate transformation. This process is performed in the WRF_preprocess.R and WRF_updateNC.R scripts. WRF_process.R takes the original WRF input files as input and shifts the selected WRF layers to a sphere raster format. In addition, users can generate an ESRI Shapefile as an output. The WRF_UpdateNC.R file takes the generated Rdata files and writes them back into the original WRF input file. Detailed descriptions are given in Appendix B.
Additional scripts are provided to perform basic transformation of the input data from their original format to the latitude-longitude WGS84 format that is used by WRF_preprocess.R to generate new model input data. For example, MODIS_LAI.R is used to automatically download and reproject MODIS satellite data into a format that can be written into the WRF input file. These functions are provided to automate the process of downloading and reprojecting MODIS data; the same results can be achieved through several existing alternative methodologies. Essentially, the MODIS functions are wrappers around the MODIS Reprojection Tool, which is provided by NASA (NASA, 2017). The current code assumes standard WRF input data in NetCDF format; however, the script can be easily modified to accept a different input format from a model other than WRF.
Four output parameters were investigated using the WRF simulation: air temperature, mean horizontal wind speed and direction, and CH4 atmospheric mixing ratios. Temperature was selected because it is one of the main drivers of local and large-scale weather. Additionally, historical temperature data are available for comparison purposes. Near-surface temperature also corresponds to areas of higher energy, which relates to turbulent motions near the surface as well as surface-water exchange (evaporation).
Wind speed and wind direction were selected to represent the atmospheric dynamics impacting the weather conditions on small and large scales. Finally, we selected the CH4 mixing ratios to quantify the impact on greenhouse gas transport in the atmosphere.

DEFAULT and HR sensitivity study
Previous studies have investigated the weather simulation performance differences by using higher resolution data. While the comparison between DEFAULT and HR is not the central focus of this work, experiments were performed to confirm previous findings and to quantify changes due to using higher resolution vs. changes due to the different GCSs. Figures 7, 8 and 9 compare the WRF simulations for domain 3 for temperature, wind direction and wind speed, respectively. The figures show that using higher resolution data does not significantly alter the results obtained using the DEFAULT WRF input.

HR and HR_RESHIFT sensitivity study
This section analyzes the main research question of the article, namely what the effect of using a different geographic coordinate system is on the simulations of temperature, wind speed, wind direction, and CH4 mixing ratio.

Results for temperature
The effect of using a different coordinate system on the simulations of temperature is assessed by comparing results between the un-shifted (HR) and shifted (HR_SHIFT) scenarios. Figure 10 shows the difference obtained for 14 May 2015 at 15:00 EST. This particular time and day were chosen because it is one of the hottest times of the day, when temperatures are expected to vary the most. The letters A-H represent the eight weather observation stations located inside the selected domain and are used for validation purposes.
The temperature difference ranges from −5.6 °C, represented by light blue colors, to 6 °C, shown with orange-red colors. When comparing HR and HR_RESHIFT, the most striking spatial pattern is the systematic cooling around the Finger Lakes (roughly bound by points A, B and H). There are several additional areas of increased positive and negative temperature differences around the perimeter of the image, where most extremes are observed. However, these largest differences at the edges of the domain are likely artifacts introduced by the WRF computations where the nested grids meet and change resolution.
Statistical tests were performed using the observed weather data (stations A-H), and both scenarios (HR and HR_RESHIFT) have a root mean square error of 0.91. While this suggests that there are only small temperature variations when using a different GCS, it should be noted that this test was only performed at eight stations throughout the domain where ground data were available. Unfortunately, several of these stations lie close to the edge of the domain, where WRF simulation results are most unreliable. Therefore, the spatial cooling observed around the lakes is the most important result obtained entirely due to the change in GCS.
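For reference, the root mean square error between simulated and observed station values can be computed as in the following Python sketch. The temperatures listed are hypothetical placeholders for eight stations and are not the data used in this study; the function name is likewise an assumption.

```python
import math


def rmse(simulated, observed):
    """Root mean square error between two equally long sequences."""
    if len(simulated) != len(observed):
        raise ValueError("simulated and observed must have the same length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical 2 m temperatures (degrees C) at eight stations A-H.
observed = [21.0, 22.5, 20.1, 23.3, 19.8, 22.0, 21.4, 20.6]
simulated_hr = [21.8, 21.9, 20.9, 24.0, 18.7, 22.6, 22.3, 19.9]

print(round(rmse(simulated_hr, observed), 2))
```

Because the statistic is evaluated only at the station locations, two scenarios can share the same value even when their fields differ elsewhere in the domain, which is the situation described above.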
Both domain 2 and domain 3 show a systematic temperature increase in the HR_RESHIFT scenario when compared to HR (Figs. 11 and 12). The height is represented on the vertical axis while the temperature difference is on the horizontal axis. The variability and mean temperature differences are larger near the surface and below 1 km altitude. This height corresponds approximately to the average boundary layer height, where the impact of the surface on the atmospheric dynamics is maximum. The variability in the midtroposphere decreases significantly, revealing a lower impact of the GCS on the higher altitude model results.

Results for wind speed
Figure 13 shows the wind speed difference for 14 May 2015 at 11:00 EST, which ranges from −5.11 to 3.5 m s⁻¹ between HR and HR_RESHIFT. A wave pattern is found during the 25 h simulation, and it can be explained by the shifted data allowing for a more accurate characterization of the complex terrain along the Appalachian Mountains. The wind speed differences between HR and HR_RESHIFT indicate that the change in GCS affects the results.

Results for wind direction
Figures 18 and 19 show results for wind direction and highlight that, as for the previous cases, the largest differences are found closer to the surface. As explained earlier, changes in GCS affect the interaction in the lower layers of the troposphere the most.
In the northeastern corner of the inner domain, there is a strip-like pattern, with large local wind changes between positive and negative northeast and northwest, and between positive and negative southeast and southwest. In this region the Appalachian Mountains create a complex terrain with a series of valleys and ridges. The GCS changes the spatial distribution of the terrain elevation, leading to these very large changes in wind direction. The strong vertical gradients observed in the figure suggest there is also a combination of influences from both the surface parameters (primarily elevation and land cover) and the Coriolis components. Despite observed changes throughout the vertical column, the near-surface variability is significantly larger than the midtropospheric variances, as was observed for temperature and wind speed.
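Differences in wind direction need to account for the wrap-around at 360°. The paper does not state how the differences were computed; the Python sketch below shows one common convention, the smallest signed angular difference, with a hypothetical function name.

```python
def wind_direction_difference(dir_a_deg, dir_b_deg):
    """Smallest signed difference dir_a - dir_b in degrees, within [-180, 180)."""
    return (dir_a_deg - dir_b_deg + 180.0) % 360.0 - 180.0


# Directions on either side of north differ by 20 degrees, not 340.
print(wind_direction_difference(350.0, 10.0))  # -20.0
print(wind_direction_difference(10.0, 350.0))  # 20.0
```

With this convention, alternating positive and negative values in a difference map correspond to winds backing or veering relative to the reference simulation.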

Results for CH4 atmospheric mixing ratios
WRF was used to simulate CH4 atmospheric mixing ratios that originated from leaks from unconventional and conventional natural-gas production activities during the 25 h simulation. The CH4 mixing ratio is a unique tracer to study atmospheric [...]. The impact of CH4 on climate change is 28 to 36 times greater than that of CO2 over a 100-year period (US EPA, 2015).
CH4 mixing ratios are computed differently than temperature, wind speed and wind direction. Temperature, wind speed and wind direction are computed using global atmospheric input data and are internal variables of the WRF model physics. On the other hand, CH4 mixing ratios are computed solely from the CH4 emissions created using multiple datasets. Thus, CH4 mixing ratios were selected to investigate the impact of differences in GCS on the simulation accuracy aggregated over time, as CH4 accumulates differences along its trajectories in the atmosphere. Overall, we expect a strong sensitivity to transport differences revealed by the long-range transport of CH4 emitted at the surface. Figures 20 and 21 show the mean of the CH4 mixing ratio differences between HR and HR_RESHIFT for conventional and unconventional wells as a function of time. The figures show two radar plots, where the times have been arranged as on a clock. Panel (a) indicates the results for morning time (a.m.) and panel (b) for evening time (p.m.). When the shaded area is larger than 0, CH4 mixing ratios in HR are larger than those in HR_RESHIFT, and vice versa.
For conventional wells (Fig. 20), the differences are often close to 0, with nighttime increases (21:00 to 04:00 EST). For the unconventional wells (Fig. 21), the CH4 mixing ratio in HR is also smaller during nighttime (21:00 to 08:00 EST), but much more so (as much as 1 ppb smaller). The reason for this change is that, during nighttime, the mixing within the boundary layer is smaller (more stable atmosphere) and therefore the magnitude of the concentration of CH4 is higher. Because of the higher concentrations, the impact of the change in GCS is bigger. Furthermore, the explanation for why conventional wells have a smaller variation than unconventional wells is that most of them are located farther away from the tower network, and thus their emission contribution to the simulation is smaller because it is distributed over a wider area. These results show a significant change in the CH4 mixing ratio when using the different GCS.

Conclusions
This paper discusses the impact of different GCSs on weather numerical-model simulations.The main hypothesis is that the error introduced by not taking into account the GCS of the input data, which results in latitudinal errors of up to 21 km in the midlatitudes, can cause significant changes in the output of the model.
A sensitivity study was performed using the WRF numerical model, with input data at different resolutions and different GCSs. Four different output parameters were investigated. Results show that changes are introduced by using different GCSs for the input data. The observed differences were caused by (1) topography shift, including elevation, land use, albedo, and LAI differences, and (2) latitude-dependent physics, such as the Coriolis force and the incoming solar radiation.
A systematic temperature increase was observed in all of the three nested domains used in this study.A spatial pattern showing significant cooling was observed near two lakes included in the inner domain.
Similarly, wind speed and direction show spatial changes that can be traced back to the use of different land cover and elevation. Wind speed, wind direction, and temperature show more variation within the planetary boundary layer, where the interaction between the surface and the atmosphere is greatest. It is expected that changes at the surface will introduce the most significant changes closer to the surface.
It is shown that, without exception, the GCS of the input data affects model results. Sometimes these changes are large and have clear spatial patterns, whereas other times they are small and negligible. It is concluded that while some of these errors might be small, they nevertheless introduce an additional bias in the model output. For very high-resolution simulations in particular, these errors are compounded and can lead to significant errors.
While it is best to properly project all data in the correct representation used by the model, which in the case of WRF is a spherical GCS, it is most important to keep the GCSs and projections among the input layers consistent.In fact, if all layers are in the same GCS, errors in mapping onto the surface of the earth are consistent across the datasets and the effects of using the wrong GCS are minimized.However, mixing GCSs in the input data leads to larger errors.

Figure 1. Equivalent-point comparisons when using a sphere and spheroid. Blue represents the true earth shape. Green represents the sphere that WRF assumes. Red shows the spheroid WGS84 GCS. Point A represents data projected on a spheroid system. When that same point A is represented on a sphere, as in an atmospheric model, its location gets incorrectly shifted to point B. Point C is the true location of point A when correctly projected in the spherical coordinate system (Monaghan et al., 2013).

Figure 2. Errors introduced by the different geographic coordinate systems are a function of latitude. The maximum error of about 21 km is found at 45° latitude. The three shaded areas indicate the latitudinal extents of the three nested WRF domains used in this study.

Figure 3. Map of the study area showing the three nested WRF domains. The inner domain is located in northeastern Pennsylvania and extends into southeastern New York.

Figure 4. In domain 3, the latitude ranges from 40 to 42.67° N and the longitude ranges from 78 to 75.17° W. The figure shows the satellite view of the domain with major roads, cities and landmarks.

1 2010 North American Land Cover at 250 m spatial resolution. Produced by Natural Resources Canada/The Canada Centre for Mapping and Earth Observation (NRCan/CCMEO), United States Geological Survey (USGS), Instituto Nacional de Estadística y Geografía (INEGI), Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (CONABIO) and Comisión Nacional Forestal (CONAFOR).

Figure 5. Workflow of the study showing the three scenarios: DEFAULT, HR and HR_SHIFT.

Figure 6. Flowchart for transforming and generating new model input data.

Figure 7. Temperature differences between HR and DEFAULT in domain 3.

Figure 8. Wind direction differences between HR and DEFAULT in domain 3.

Figure 9. Wind speed differences between HR and DEFAULT in domain 3.

Figure 11. Temperature differences between HR and HR_RESHIFT in domain 2.

Figure 12. Temperature differences between HR and HR_RESHIFT in domain 3.

Figure 14. Wind speed differences between HR and HR_RESHIFT in domain 2.

Figure 15. Wind speed differences between HR and HR_RESHIFT in domain 3.

Figure 16. Wind direction difference between HR and HR_RESHIFT on 14 May 2015 at 15:00 EST, showing a strip pattern in the top-right corner, which is a valley region. The pattern indicates that the WRF model reacts differently on a small-area weather simulation when the GCS changes.

Figure 17. Domain 3 topography map. The elevation ranges from 108 to 761 m above sea level.

Figure 18. Wind direction differences between HR and HR_RESHIFT in domain 2.

Figure 19. Wind direction differences between HR and HR_RESHIFT in domain 3.

Figure 20. CH4 mixing ratio differences between HR and HR_RESHIFT in domain 3 for conventional wells. Panel (a) shows the differences between 00:00 and 12:00 EST on 14 and 15 May. Panel (b) shows the differences between 12:00 and 24:00 EST on 14 May.

Figure 21. CH4 mixing ratio differences between HR and HR_RESHIFT in domain 3 for unconventional wells. Panel (a) shows the differences between 00:00 and 12:00 EST on 14 and 15 May. Panel (b) shows the differences between 12:00 and 24:00 EST on 14 May.

Table 1. Input data sources for each of the three scenarios (DEFAULT, HR and HR_RESHIFT).

Table 2. Input and output GCS for the data used in each of the four analyses that will be performed.