Dynamically Downscaled Projections of Phenological Changes across the Contiguous United States
Phenological indicators (PI) are used to study changes to animal and plant behavior in response to seasonal cycles, and they can be useful to quantify the potential impacts of climate change on ecosystems. Here, multiple global climate models and emission scenarios are used to drive dynamically downscaled simulations using the WRF model over the CONUS. The wintertime dormancy of plants (chilling units or "CU"), timing of spring onset (Extended Spring Indices or "SI"), and frequency of subsequent false springs are calculated from regional climate simulations covering historical (1995-2005) and future periods (2025-2100). Southern parts of the CONUS show projected CU decreases (inhibiting some plants from flowering or fruiting), while the northern CONUS experiences an increase (possibly causing plants to break dormancy too early, becoming vulnerable to disease or freezing). Spring advancement (earlier SI dates) is projected, with decadal trends ranging from approximately 1 to 4 days per decade over the CONUS, comparable to or exceeding those found in observational studies. Projected changes in risk of false spring (hard freezes following spring onset) vary across members of the ensemble and regions of the CONUS, but generally western parts of the CONUS are projected to experience increased risk of false springs. These projected changes to PI imply significant effects on cycles of plants, animals, and ecosystems, highlighting the importance of examining temperature changes during transitional seasons.
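As an illustration of the false-spring definition used above (a hard freeze following spring onset), the following minimal sketch flags a false spring in one year of daily minimum temperatures. The function name, the -2.2 °C hard-freeze threshold, and the toy data are assumptions made for this example; they are not taken from the study, which derives spring onset from the Extended Spring Indices.

```python
def has_false_spring(tmin_c, onset_index, freeze_threshold_c=-2.2):
    """Return True if any day after spring onset has a hard freeze.

    tmin_c: daily minimum temperatures (deg C), ordered by day of year.
    onset_index: index of the spring-onset day (hypothetical input).
    freeze_threshold_c: assumed hard-freeze threshold, not from the study.
    """
    return any(t <= freeze_threshold_c for t in tmin_c[onset_index + 1:])


# Toy series: a hard freeze (-3.5 deg C) occurs at index 5.
tmin = [-5.0, -3.0, 1.0, 4.0, 2.0, -3.5, 3.0]
false_spring = has_false_spring(tmin, onset_index=3)     # freeze follows onset
no_false_spring = has_false_spring(tmin, onset_index=5)  # no freeze after onset
```

Counting such flags over the years of a simulation would give a false-spring frequency for one grid cell under these assumptions.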
Spatially Explicit Correction of Simulated Urban Air Temperatures Using Crowdsourced Data
Urban climate model evaluation often remains limited by a lack of trusted urban weather observations. The increasing density of personal weather sensors (PWSs) makes them a potentially rich source of data for urban climate studies that address the lack of representative urban weather observations. In our study, we demonstrate that carefully quality-checked PWS data not only improve urban climate models' evaluation but can also serve for bias correcting their output prior to any urban climate impact studies. After simulating near-surface air temperatures over London and south-east England during the hot summer of 2018 with the Weather Research and Forecasting (WRF) Model and its Building Effect Parameterization with the Building Energy Model (BEP-BEM) activated, we evaluated the modeled temperatures against 402 urban PWSs and showcased a heterogeneous spatial distribution of the model's cool bias that was not captured using official weather stations only. This finding indicated a need for spatially explicit urban bias corrections of air temperatures, which we performed with an innovative machine-learning method that predicts the model's biases in each urban grid cell. This bias-correction technique is the first to consider that modeled urban temperatures follow a nonlinear spatially heterogeneous bias that is decorrelated from urban fraction. Our results showed that the bias correction improved daily minimum, daily mean, and daily maximum temperatures in the cities. We recommend that urban climate modelers further investigate the use of quality-checked PWSs for model evaluation and derive a framework for bias correction of urban climate simulations that can serve urban climate impact studies.
Comparison of Local, Regional, and Scaling Models for Rainfall Intensity-Duration-Frequency Analysis
Intensity-duration-frequency (IDF) analyses of rainfall extremes provide critical information to mitigate, manage, and adapt to urban flooding. The accuracy and uncertainty of IDF analyses depend on the availability of historical rainfall records, which are more accessible at daily resolution and, quite often, are very sparse in developing countries. In this work, we quantify performances of different IDF models as a function of the number of available high-resolution ( ) and daily ( ) rain gauges. For this aim, we apply a cross-validation framework that is based on Monte Carlo bootstrapping experiments on records of 223 high-resolution gauges in central Arizona. We test five IDF models based on (two) local, (one) regional, and (two) scaling frequency analyses of annual rainfall maxima from 30-min to 24-h durations with the generalized extreme value (GEV) distribution. All models exhibit similar performances in simulating observed quantiles associated with return periods up to 30 years. When >10, local and regional models have the best accuracy; bias correcting the GEV shape parameter for record length is recommended to estimate quantiles for large return periods. The uncertainty of all models, evaluated via Monte Carlo experiments, is very large when ≤ 5; however, if ≥ 10 additional daily gauges are available, the uncertainty is greatly reduced and accuracy is increased by applying simple scaling models, which infer estimates on subdaily rainfall statistics from information at daily scale. For all models, performances depend on the ability to capture the elevation control on their parameters. Although our work is site specific, its results provide insights to conduct future IDF analyses, especially in regions with sparse data.
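To make the link between GEV parameters and return periods concrete, the sketch below evaluates the standard GEV quantile (return level) for a given return period. It assumes the sign convention in which a positive shape parameter gives a heavy upper tail; the function name and the sample parameter values are illustrative assumptions and are not taken from the study.

```python
import math


def gev_return_level(mu, sigma, xi, return_period_years):
    """Quantile of a GEV fitted to annual maxima for a given return period.

    mu, sigma, xi: location, scale, and shape (xi > 0 means heavy tail).
    return_period_years: return period T > 1, so the non-exceedance
        probability is 1 - 1/T.
    """
    y = -math.log(1.0 - 1.0 / return_period_years)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV as the shape parameter tends to zero.
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical parameters for annual maxima of rainfall intensity (mm/h).
level_2yr = gev_return_level(mu=20.0, sigma=6.0, xi=0.1, return_period_years=2)
level_30yr = gev_return_level(mu=20.0, sigma=6.0, xi=0.1, return_period_years=30)
```

Because the shape parameter enters as an exponent, small errors in it grow with the return period, which is consistent with the abstract's recommendation to bias correct the shape parameter for large return periods.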
Evaluating the Ability of Remote Sensing Observations to Identify Significantly Severe and Potentially Tornadic Storms
Remote sensing observations, especially those from ground-based radars, have been used extensively to discriminate between severe and nonsevere storms. Recent upgrades to operational remote sensing networks in the United States have provided unprecedented spatial and temporal sampling to study such storms. These networks help forecasters subjectively identify storms capable of producing severe weather at the ground; however, uncertainties remain in how to objectively identify severe thunderstorms using the same data. Here, three large-area datasets (geostationary satellite, ground-based radar, and ground-based lightning detection) are used over 28 recent events in an attempt to objectively discriminate between severe and nonsevere storms, with an additional focus on severe storms that produce tornadoes. Among these datasets, radar observations, specifically those at mid- and upper levels (altitudes at and above 4 km), are shown to provide the greatest objective discrimination. Physical and kinematic storm characteristics from all analyzed datasets imply that significantly severe [≥2-in. (5.08 cm) hail and/or ≥65-kt (33.4 m s⁻¹) straight-line winds] and tornadic storms have stronger upward motion and rotation than nonsevere and less severe storms. In addition, these metrics are greatest in tornadic storms during the time in which tornadoes occur.
Satellite Estimation of Falling Snow: A Global Precipitation Measurement (GPM) Perspective
Retrievals of falling snow from space-based observations represent key inputs for understanding and linking Earth's atmospheric, hydrological, and energy cycles. This work quantifies and investigates causes of differences among the first stable falling snow retrieval products from the Global Precipitation Measurement (GPM) satellite and Cloud Profiling Radar (CPR) falling snow product. An important part of this analysis details the challenges associated with comparing the various GPM and snow estimates arising from different snow-rain classification methods, orbits, resolutions, sampling, instrument specifications, and algorithm assumptions. After equalizing snow-rain classification methodologies and limiting latitudinal extent, CPR observes nearly 10 (3) times the occurrence (accumulation) of falling snow as GPM's Dual-Frequency Precipitation Radar (DPR). The occurrence disparity is substantially reduced if pixels are averaged to simulate DPR radar pixels and CPR observations are truncated below the 8-dBZ reflectivity threshold. However, even though the truncated CPR- and DPR-based data have similar falling snow occurrences, average snowfall rate from the truncated CPR record remains significantly higher (43%) than the DPR, indicating that retrieval assumptions (microphysics and snow scattering properties) are quite different. Diagnostic reflectivity (Z)-snow rate (S) relationships were therefore developed at Ku and W band using the same snow scattering properties and particle size distributions in a final effort to minimize algorithm differences. CPR-DPR snowfall amount differences were reduced to ~16% after adopting this diagnostic Z-S approach.
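The two operations mentioned above, truncating observations below a reflectivity threshold and converting reflectivity to snow rate, can be sketched as follows. The sketch assumes the common power-law form Z = a S^b, with reflectivity written as Z (mm^6 m^-3) and snow rate as S (mm/h); these symbol choices, the function names, and the coefficients a and b are placeholders for illustration and are not the relationships developed in the study.

```python
def snow_rate_from_dbz(dbz, a, b):
    """Invert an assumed power law Z = a * S**b for the snow rate S.

    dbz: reflectivity in dBZ, converted to linear units before inversion.
    a, b: power-law coefficients (placeholders, band dependent).
    """
    z_linear = 10.0 ** (dbz / 10.0)
    return (z_linear / a) ** (1.0 / b)


def truncate_below(dbz_values, threshold_dbz=8.0):
    """Keep only observations at or above the reflectivity threshold."""
    return [v for v in dbz_values if v >= threshold_dbz]


# Toy profile: two weak echoes fall below the 8-dBZ threshold.
observed = [-10.0, 5.0, 8.0, 12.0]
kept = truncate_below(observed)
rates = [snow_rate_from_dbz(v, a=100.0, b=2.0) for v in kept]
```

Applying one shared pair of coefficients per band to both radars mirrors the abstract's idea of removing algorithm differences, so that any remaining disagreement reflects sampling and instrument sensitivity.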
Severe Hail Fall and Hailstorm Detection Using Remote Sensing Observations
Severe hail days account for the vast majority of severe weather-induced property losses in the United States each year. In the United States, real-time detection of severe storms is largely conducted using ground-based radar observations, mostly using the operational Next Generation Weather Radar network (NEXRAD), which provides three-dimensional information on the physics and dynamics of storms at ~5-min time intervals. Recent NEXRAD upgrades to higher resolution and to dual-polarization capabilities have provided improved hydrometeor discrimination in real time. New geostationary satellite platforms have also led to significant changes in observing capabilities over the United States beginning in 2016, with spatiotemporal resolution that is comparable to that of NEXRAD. Given these recent improvements, a thorough assessment of their ability to identify hailstorms and hail occurrence and to discriminate between hail sizes is needed. This study provides a comprehensive comparative analysis of existing observational radar and satellite products from more than 10 000 storms objectively identified via radar echo-top tracking and nearly 6000 hail reports during 30 recent severe weather days (2013-present). It is found that radar observations provide the most skillful discrimination between severe and nonsevere hailstorms and identification of individual hail occurrence. Single-polarization and dual-polarization radar observations perform similarly at these tasks, with the greatest skill found from combining both single- and dual-polarization metrics. In addition, revisions to the "maximum expected size of hail" (MESH) metric are proposed and are shown to improve spatiotemporal comparisons between reported hail sizes and radar-based estimates for the cases studied.
Expanding the Goddard CSH Algorithm for GPM: New Extratropical Retrievals
The Goddard Convective-Stratiform Heating (CSH) algorithm has been used to retrieve latent heating (LH) associated with clouds and cloud systems in support of the Tropical Rainfall Measuring Mission (TRMM) and Global Precipitation Measurement (GPM) mission. The CSH algorithm required the use of a cloud-resolving model (CRM) to simulate LH profiles to build look-up tables (LUTs). However, the current LUTs in the CSH algorithm are not suitable for retrieving LH profiles at high latitudes or winter conditions that are needed for GPM. The NASA Unified-Weather Research and Forecasting (NU-WRF) model is used to simulate three eastern continental US (CONUS) synoptic winter and three western coastal/offshore events. The relationship between LH structures (or profiles) and other precipitation properties (radar reflectivity, freezing level height, echo-top height, maximum radar reflectivity height and surface precipitation rate) is examined, and a new classification system is adopted with varying ranges for each of these precipitation properties to create LUTs representing high latitude/winter conditions. The performance of the new LUTs is examined using a self-consistency check for one CONUS and one West Coast offshore event by comparing LH profiles retrieved from the LUTs using model-simulated precipitation properties with those originally simulated by the model. The results of the self-consistency check validate the new classification and LUTs. High latitude retrievals from the new LUTs are merged with those from the CSH algorithm to retrieve LH profiles over the GPM domain using precipitation properties retrieved from the GPM combined algorithm.
Examining WRF's Sensitivity to Contemporary Land-Use Datasets across the Contiguous United States Using Dynamical Downscaling
Land-use (LU) representation plays a critical role in simulating air-surface interactions that affect meteorological conditions and regional climate. In the Noah LSM within the WRF Model, LU categories are used to set the radiative properties of the surface and to influence exchanges of heat, moisture, and momentum between the air and land surface. Previous literature examined the sensitivity of WRF simulations to LU using short-term meteorological modeling approaches. Here, the sensitivity to LU representation is studied using continental-scale dynamical downscaling, which typically uses longer temporal and larger spatial scales. Two LU datasets, the U.S. Geological Survey (USGS) dataset and the 2006 National Land Cover Dataset (NLCD), are utilized in 3-yr dynamically downscaled WRF simulations over a historical period. Precipitation and 2-m air temperature are evaluated against observation-based datasets for simulations covering the contiguous United States. The WRF-NLCD simulation tends to produce lower precipitation than the WRF-USGS run, with slightly warmer mean monthly temperatures. However, WRF-NLCD results in more notable increases in the frequency of hot days [i.e., days with temperature >90°F (32.2°C)]. These changes are attributable to reductions in forest and agricultural area in the NLCD relative to USGS. There is also subtle but important sensitivity to the method of interpolating LU data to the WRF grid in the model preprocessing. In all cases, the sensitivity resulting from changes in the LU is smaller than model error. Although this sensitivity is small, it persists across spatial and temporal scales.
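The hot-day metric quoted above can be sketched as a simple threshold count. The abstract does not say which daily temperature is compared with the 90°F threshold, so the use of daily maximum 2-m temperature here is an assumption, as are the function names and the toy data.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def hot_day_count(daily_tmax_c, threshold_f=90.0):
    """Count days whose maximum temperature (deg C) exceeds the threshold."""
    threshold_c = fahrenheit_to_celsius(threshold_f)
    return sum(1 for t in daily_tmax_c if t > threshold_c)


# Toy comparison of two hypothetical runs over the same four days.
run_a = [30.0, 32.2, 32.3, 35.0]
run_b = [29.5, 31.8, 32.0, 34.1]
hot_days_a = hot_day_count(run_a)
hot_days_b = hot_day_count(run_b)
```

A threshold count like this can change noticeably even when mean temperatures differ only slightly, which is one way to read the contrast the abstract draws between mean monthly temperature and hot-day frequency.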
Intercomparison of Surface Temperatures from AIRS, MERRA, and MERRA-2, with NOAA and GC-Net Weather Stations at Summit, Greenland
The surface skin and air temperatures reported by the Atmospheric Infrared Sounder/Advanced Microwave Sounding Unit-A (AIRS/AMSU-A), the Modern-Era Retrospective analysis for Research and Applications (MERRA), and MERRA-2 at Summit, Greenland, are compared with near-surface air temperatures measured at National Oceanic and Atmospheric Administration (NOAA) and Greenland Climate Network (GC-Net) weather stations. The AIRS/AMSU-A Surface Skin Temperature (TS) is best correlated with the NOAA 2-m air temperature (T2M) but tends to be colder than the station measurements. The difference may be the result of the frequent near-surface temperature inversions in the region. The AIRS/AMSU-A Surface Air Temperature (SAT) is also correlated with the NOAA T2M but has a warm bias during the cold season and a larger standard error than the surface temperature. The extrapolation of the temperature profile to calculate the AIRS SAT may not be valid for the strongest inversions. The GC-Net temperature sensors are not held at fixed heights throughout the year; however, they are typically closer to the surface than the NOAA station sensors. Comparing the lapse rates at the two stations shows that the lapse rate is larger closer to the surface. The difference between the AIRS/AMSU-A SAT and TS is sensitive to near-surface inversions and tends to indicate stronger inversions than either station measures. The AIRS/AMSU-A may be sampling a thicker layer than either station. The MERRA-2 surface and near-surface temperatures show improvements over MERRA but little sensitivity to near-surface temperature inversions.
Statistical Modeling of Extreme Precipitation with TRMM Data
This paper improves upon an existing extreme precipitation monitoring system based on the Tropical Rainfall Measuring Mission (TRMM) daily product (3B42) using new statistical models. The proposed system utilizes a regional modeling approach, where data from similar locations are pooled to increase the quality of the resulting model parameter estimates to compensate for the short data record. The regional analysis is divided into two stages. First, the region defined by the TRMM measurements is partitioned into approximately 28,000 non-overlapping clusters using a recursive k-means clustering scheme. Next, a statistical model is used to characterize the extreme precipitation events occurring in each cluster. Instead of applying the block-maxima approach used in the existing system, where the Generalized Extreme Value probability distribution is fit to the annual precipitation maxima at each site separately, the present work adopts the peak-over-threshold method of classifying points as extreme if they exceed a pre-specified threshold. Theoretical considerations motivate using the Point Process framework for modeling extremes. The fitted parameters are used to estimate trends and to construct simple and intuitive average recurrence interval (ARI) maps which reveal how rare a particular precipitation event is. This information could be used by policy makers for disaster monitoring and prevention. The new methodology eliminates much of the noise that was produced by the existing models due to a short data record, producing more reasonable ARI maps when compared with NOAA's long-term Climate Prediction Center ground-based observations. Furthermore, the proposed methodology can be applied to other extreme climate records.
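The difference between the block-maxima and peak-over-threshold selections described above can be illustrated with a toy daily series. This is a minimal sketch of the two sampling rules only, not of the Point Process model itself; the function names, the block size, and the threshold value are assumptions for this example.

```python
def block_maxima(values, block_size):
    """Return the maximum of each consecutive block (e.g., each year)."""
    return [max(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def peaks_over_threshold(values, threshold):
    """Return every value that exceeds the pre-specified threshold."""
    return [v for v in values if v > threshold]


# Two toy "years" of five daily totals each (mm/day).
daily = [1.0, 30.0, 2.0, 28.0, 3.0,   # year 1: two large events
         4.0, 5.0, 6.0, 7.0, 8.0]     # year 2: no large event
annual_max = block_maxima(daily, block_size=5)
exceedances = peaks_over_threshold(daily, threshold=25.0)
```

In this toy case the block-maxima sample keeps an unremarkable value from the second year and discards the second large event of the first year, whereas the threshold sample keeps both large events. Using more of the extreme observations in this way is one reason a threshold approach can help with a short data record.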
Use of cloud radar Doppler spectra to evaluate stratocumulus drizzle size distributions in large-eddy simulations with size-resolved microphysics
A case study of persistent stratocumulus over the Azores is simulated using two independent large-eddy simulation (LES) models with bin microphysics, and forward-simulated cloud radar Doppler moments and spectra are compared with observations. Neither model is able to reproduce the monotonic increase of downward mean Doppler velocity with increasing reflectivity that is observed under a variety of conditions, but for differing reasons. To a varying degree, both models also exhibit a tendency to produce too many of the largest droplets, leading to excessive skewness in Doppler velocity distributions, especially below cloud base. Excessive skewness appears to be associated with an insufficiently sharp reduction in droplet number concentration at diameters larger than ~200 μm, where a pronounced shoulder is found for in situ observations and a sharp reduction in reflectivity size distribution is associated with relatively narrow observed Doppler spectra. Effectively using LES with bin microphysics to study drizzle formation and evolution in cloud Doppler radar data evidently requires reducing numerical diffusivity in the treatment of the stochastic collection equation; if that is accomplished sufficiently to reproduce typical spectra, progress toward understanding drizzle processes is likely.
Pacific Hurricane Landfalls on Mexico and SST
A statistical model of Northeast Pacific tropical cyclones (TCs) is developed and used to estimate hurricane landfall rates along the coast of Mexico. Mean annual landfall rates for 1971-2014 are compared to mean rates for the extremely high Northeast Pacific sea-surface temperature (SST) of 2015. Over the full coast, the mean rate and 5%-95% uncertainty for Saffir-Simpson category one and higher TCs (category-1+ TCs) is for 1971-2014 and for 2015, a difference that is not significant. However, the increase for the most intense landfalls, category-5 TCs, is significant: for 1971-2014 and for 2015. The SST impact on the category-5 TC landfall rate is largest on the northern Mexican coast. The increased landfall rates for category-5 TCs are consistent with independent analysis showing that SST has its greatest impact on the formation rates of the most intense Northeast Pacific tropical cyclones. Landfall rates on Hawaii ( for category-1+ TCs and for category-3+ TCs for 1971-2014) show increases in the best estimates for 2015 conditions, but the changes are insignificant according to our tests.
Retrieval of Snow Properties for Ku- and Ka-band Dual-Frequency Radar
The focus of this study is on the estimation of snow microphysical properties and the associated bulk parameters such as snow water content and water equivalent snowfall rate for Ku- and Ka-band dual-frequency radar. This is done by exploring a suitable scattering model and the proper particle size distribution (PSD) assumption that accurately represent, in the electromagnetic domain, the micro/macro-physical properties of snow. The scattering databases computed from simulated aggregates for small to moderate particle sizes are combined with a simple scattering model for large particle sizes to characterize snow scattering properties over the full range of particle sizes. With use of the single scattering results, the snow retrieval look-up tables can be formed in a way that directly links the Ku- and Ka-band radar reflectivities to snow water content and equivalent snowfall rate without use of the derived PSD parameters. A sensitivity study of the retrieval results to the PSD and scattering models is performed to better understand the dual-wavelength retrieval uncertainties. To aid in the development of the Ku- and Ka-band dual-wavelength radar technique and to further evaluate its performance, self-consistency tests are conducted using measurements of the snow PSD and fall velocity acquired from the Snow Video Imager/Particle Image Probe (SVI/PIP) during the winter of 2014 at the NASA Wallops Flight Facility site in Wallops Island, Virginia.
Daytime Cirrus Cloud Top-of-Atmosphere Radiative Forcing Properties at a Midlatitude Site and their Global Consequence
One year of continuous ground-based lidar observations (2012) is analyzed for single-layer cirrus clouds at the NASA Micro Pulse Lidar Network site at the Goddard Space Flight Center to investigate top-of-atmosphere (TOA) annual net daytime radiative forcing properties. A slight positive net daytime forcing (i.e., warming) is estimated: 0.07-0.67 W m⁻² in relative terms, which reduces to 0.03-0.27 W m⁻² in absolute terms after normalizing to unity based on an approximate 40% midlatitude occurrence frequency estimated from satellite. Results are based on bookend solutions for the lidar extinction-to-backscatter ratio (20 and 30 sr) and corresponding retrievals for 532 nm cloud extinction coefficient. Uncertainties due to cloud undersampling, attenuation effects, sample selection and lidar multiple scattering are described. A net daytime cooling effect is found from the very thinnest clouds (cloud optical depth ≤ 0.01) that is attributed to relatively high solar zenith angles. A relationship between positive/negative daytime cloud forcing is demonstrated as a function of solar zenith angle and cloud top temperature. These properties, combined with the influence of varying surface albedos, are used to conceptualize how daytime cloud forcing likely varies with latitude and season, with cirrus clouds exerting less positive forcing and potentially net TOA cooling approaching the summer poles (non-ice and snow covered) versus greater warming at the equator. The existence of such a gradient would lead cirrus to induce varying daytime TOA forcing annually and seasonally, making it a far greater challenge than presently believed to constrain daytime and diurnal cirrus contributions to global radiation budgets.
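The step from relative to absolute forcing quoted above appears to be a scaling by the cirrus occurrence frequency. The short sketch below reproduces that arithmetic under this reading; the variable names are illustrative, and the interpretation of the normalization as a simple multiplication is an assumption.

```python
# Relative daytime forcing bounds (W m^-2) and occurrence frequency
# as quoted in the abstract.
relative_forcing = (0.07, 0.67)
occurrence_frequency = 0.40

# Scale each bound by the fraction of time cirrus is present.
absolute_forcing = tuple(
    round(bound * occurrence_frequency, 2) for bound in relative_forcing
)
```

Rounded to two decimals, the scaled bounds match the absolute range of 0.03 to 0.27 given in the abstract.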
Evaluating the sensitivity of agricultural model performance to different climate inputs
Projections of future food production necessarily rely on models, which must themselves be validated through historical assessments comparing modeled to observed yields. Reliable historical validation requires both accurate agricultural models and accurate climate inputs. Problems with either may compromise the validation exercise. Previous studies have compared the effects of different climate inputs on agricultural projections, but either incompletely or without a ground truth of observed yields that would allow distinguishing errors due to climate inputs from those intrinsic to the crop model. This study is a systematic evaluation of the reliability of a widely-used crop model for simulating U.S. maize yields when driven by multiple observational data products. The parallelized Decision Support System for Agrotechnology Transfer (pDSSAT) is driven with climate inputs from multiple sources - reanalysis, reanalysis bias-corrected with observed climate, and a control dataset - and compared to observed historical yields. The simulations show that model output is more accurate when driven by any observation-based precipitation product than when driven by un-bias-corrected reanalysis. The simulations also suggest, in contrast to previous studies, that biased precipitation distribution is significant for yields only in arid regions. However, some issues persist for all choices of climate inputs: crop yields appear oversensitive to precipitation fluctuations but undersensitive to floods and heat waves. These results suggest that the most important issue for agricultural projections may be not climate inputs but structural limitations in the crop models themselves.
A Maieutic Exploration of Nudging Strategies for Regional Climate Applications Using the WRF Model
The use of nudging in the Weather Research and Forecasting (WRF) Model to constrain regional climate downscaling simulations is gaining in popularity because it can reduce error and improve consistency with the driving data. While some attention has been paid to whether nudging is beneficial for downscaling, very little research has been performed to determine best practices. In fact, many published papers use the default nudging configuration (which was designed for numerical weather prediction), follow practices used by colleagues, or adapt methods developed for other regional climate models. Here, a suite of 45 three-year simulations is conducted with WRF over the continental United States to systematically and comprehensively examine a variety of nudging strategies. The simulations here use a longer test period than did previously published works to better evaluate the robustness of each strategy through all four seasons, through multiple years, and across nine regions of the United States. The analysis focuses on the evaluation of 2-m temperature and precipitation, which are two of the most commonly required downscaled output fields for air quality, health, and ecosystems applications. Several specific recommendations are provided to effectively use nudging in WRF for regional climate applications. In particular, spectral nudging is preferred over analysis nudging. Spectral nudging performs best in WRF when it is used toward wind above the planetary boundary layer (through the stratosphere) and temperature and moisture only within the free troposphere. Furthermore, the nudging toward moisture is very sensitive to the nudging coefficient, and the default nudging coefficient in WRF is too high to be used effectively for moisture.
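The role of the nudging coefficient can be illustrated with a generic Newtonian relaxation term, in which a model variable is pulled toward the driving data at a rate set by the coefficient. The sketch below is a toy forward-Euler integration and not WRF code; the function names, coefficient values, and time step are assumptions chosen for illustration. Spectral nudging applies the same kind of relaxation only to the longer wavelengths of the field, which this scalar example does not attempt to show.

```python
def nudge_step(value, reference, tendency, coeff_per_s, dt_s):
    """Advance one time step with a Newtonian relaxation (nudging) term.

    value: current model value; reference: driving-data value.
    tendency: model physics/dynamics tendency (units per second).
    coeff_per_s: nudging coefficient (1/s); larger means stronger pull.
    """
    return value + dt_s * (tendency + coeff_per_s * (reference - value))


def relax(value, reference, coeff_per_s, dt_s, n_steps):
    """Apply nudging alone (zero model tendency) for n_steps."""
    for _ in range(n_steps):
        value = nudge_step(value, reference, 0.0, coeff_per_s, dt_s)
    return value


# Same initial departure, two illustrative coefficients, 100 steps of 60 s.
strong = relax(10.0, 0.0, coeff_per_s=3.0e-4, dt_s=60.0, n_steps=100)
weak = relax(10.0, 0.0, coeff_per_s=3.0e-5, dt_s=60.0, n_steps=100)
```

With the larger coefficient the value ends much closer to the reference after the same elapsed time, which shows why a field such as moisture can be over-constrained when the coefficient is too high.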
Assessment of Planetary Boundary Layer parametrizations and urban heat island comparison: Impacts and implications for tracer transport
Accurate simulation of planetary boundary layer height (PBLH) is key to greenhouse gas emission estimation, air quality prediction and weather forecasting. This manuscript describes an extensive performance assessment of several Weather Research and Forecasting (WRF) model configurations where novel observations from ceilometers, surface stations and a flux tower were used to study their ability to reproduce PBLH and the impact that the urban heat island (UHI) has on the modeled PBLHs in the greater Washington, D.C. area. In addition, CO measurements at two urban towers were compared to tracer transport simulations. The ensemble of models used 4 PBL parameterizations, 2 sources of initial and boundary conditions and 1 configuration including the building energy parameterization (BEP) urban canopy model. Results show low biases over the whole domain and period for wind speed, wind direction and temperature with no drastic differences between meteorological drivers. We find that PBLH errors are mostly positively correlated with sensible heat flux errors, and that modeled positive UHI intensities are associated with deeper modeled PBLs over the urban areas. In addition, we find that modeled PBLHs are typically biased low during nighttime for most of the configurations with the exception of those using the MYNN parametrization and that these biases directly translate to tracer biases. Overall, the configurations using the MYNN scheme performed the best, reproducing the PBLH and CO molar fractions reasonably well during all hours, thus opening the door to future nighttime inverse modeling.
