Cycle Simulation
A cycle simulation is a series of WRF-Hydro executions that start at different times. Each start time may have its own restart and forcing files, but most other properties of the run stay the same. Examples are a “forecast” or an “analysis cycle”.
From help(wrfhydropy.CycleSimulation):
Class for a WRF-Hydro CycleSimulation object. The Cycle Simulation object is used to
orchestrate a set of 'N' WRF-Hydro simulations, referred to as 'casts', which only differ
in their 1) restart times and 2) their forcings.
I keep the verbiage to a minimum in this example. Please refer to the first example for gory details.
Preliminary
[1]:
import datetime
import os
import pathlib
import pickle
import subprocess
import sys
import wrfhydropy
/glade/work/jamesmcc/python_envs/368_/lib/python3.6/site-packages/dask/config.py:131: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
data = yaml.load(f.read()) or {}
User Configuration
This section should be all you need to tailor to your own machine.
[2]:
model_repo = pathlib.Path('/glade/u/home/jamesmcc/WRF_Hydro/wrf_hydro_nwm_public')
experiment_dir = pathlib.Path('/glade/scratch/jamesmcc/wrfhydropy_cycle_example')
if not experiment_dir.exists():
    os.mkdir(str(experiment_dir))
os.chdir(str(experiment_dir))
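As an aside, the directory creation above can also be done with pathlib alone. This is a minimal sketch, not part of the original example: the temporary directory is a stand-in for the scratch location so that the snippet runs anywhere.

```python
import pathlib
import tempfile

# Stand-in for the scratch location used above (hypothetical path).
base_dir = pathlib.Path(tempfile.mkdtemp())
example_dir = base_dir / 'wrfhydropy_cycle_example'

# parents=True creates any missing parent directories; exist_ok=True
# makes the call safe to repeat when the directory is already there.
example_dir.mkdir(parents=True, exist_ok=True)
example_dir.mkdir(parents=True, exist_ok=True)  # second call does nothing
```

With exist_ok=True there is no need for the explicit exists() check, which also avoids a race if two processes set up the same directory.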
Simulation Object
First, we build the simulation that serves as the basis for the casts in the cycle.
The domain is pulled from the cloud, just as in the first (end-to-end) example. (Note that this section will not run on a Cheyenne compute node because those nodes are not connected to the internet.)
[3]:
domain_dir = experiment_dir / 'domain'
if not domain_dir.exists():
    sys.path.append(str(model_repo / 'tests/local/utils'))
    from gdrive_download import download_file_from_google_drive
    file_id = '1xFYB--zm9f8bFHESzgP5X5i7sZryQzJe'
    file_target = 'gdrive_testcase.tar.gz'
    download_file_from_google_drive(
        file_id,
        str(experiment_dir.joinpath(file_target)))
    untar_cmd = 'tar -xf ' + file_target + '; mv example_case domain'
    subprocess.run(
        untar_cmd,
        shell=True,
        cwd=str(experiment_dir))
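If you would rather not shell out to tar and mv, the same unpack-and-rename step can be sketched with the standard library. This is only an illustration under stated assumptions: the tiny archive built here is a placeholder for the real test case download, and the temporary directory stands in for experiment_dir.

```python
import pathlib
import tarfile
import tempfile

work_dir = pathlib.Path(tempfile.mkdtemp())

# Build a tiny placeholder archive that contains an 'example_case' directory.
src = work_dir / 'src' / 'example_case'
src.mkdir(parents=True)
(src / 'README.txt').write_text('placeholder domain file\n')
archive = work_dir / 'gdrive_testcase.tar.gz'
with tarfile.open(str(archive), 'w:gz') as tar:
    tar.add(str(src), arcname='example_case')

# Equivalent of: tar -xf gdrive_testcase.tar.gz; mv example_case domain
with tarfile.open(str(archive)) as tar:
    tar.extractall(path=str(work_dir))
(work_dir / 'example_case').rename(work_dir / 'domain')
```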
[4]:
hrldas_model_side_file = model_repo / 'src/hrldas_namelists.json'
hrldas_domain_side_file = domain_dir / 'hrldas_namelist_patches.json'
hydro_model_side_file = model_repo / 'src/hydro_namelists.json'
hydro_domain_side_file = domain_dir / 'hydro_namelist_patches.json'
compile_options_file = model_repo / 'src/compile_options.json'
config = 'nwm_ana'
[5]:
domain = wrfhydropy.Domain(
    domain_top_dir=domain_dir,
    domain_config=config)
[6]:
model = wrfhydropy.Model(
    model_repo / 'src',
    compiler='ifort',
    model_config=config)
[7]:
compile_dir = experiment_dir / 'compile'
if not compile_dir.exists():
    model.compile(compile_dir)
else:
    model = pickle.load(compile_dir.joinpath('WrfHydroModel.pkl').open('rb'))
/glade/work/jamesmcc/python_envs/368_/lib/python3.6/site-packages/wrfhydropy-0.0.18-py3.6.egg/wrfhydropy/core/model.py:193: UserWarning: /glade/scratch/jamesmcc/wrfhydropy_cycle_example/compile directory does not exist, creating
warnings.warn(str(self.compile_dir.absolute()) + ' directory does not exist, creating')
Model successfully compiled into /glade/scratch/jamesmcc/wrfhydropy_cycle_example/compile
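The cell above follows a "build once, then reload from a pickle" pattern: it relies on a WrfHydroModel.pkl being present in the compile directory on later runs. A minimal sketch of that pattern with the standard library only is below. The helper build_or_load and the dictionary standing in for the compiled model are hypothetical and are not part of wrfhydropy.

```python
import pathlib
import pickle
import tempfile


def build_or_load(cache_file, build):
    """Return the cached object if cache_file exists, else build and cache it."""
    if cache_file.exists():
        with cache_file.open('rb') as f:
            return pickle.load(f)
    obj = build()
    with cache_file.open('wb') as f:
        pickle.dump(obj, f)
    return obj


cache_file = pathlib.Path(tempfile.mkdtemp()) / 'WrfHydroModel.pkl'
calls = []


def expensive_build():
    # Stand-in for a slow step such as compiling the model.
    calls.append(1)
    return {'compiler': 'ifort', 'model_config': 'nwm_ana'}


first = build_or_load(cache_file, expensive_build)   # builds and caches
second = build_or_load(cache_file, expensive_build)  # loads from the pickle
```

Opening the pickle in a with block also closes the file handle, which the one-line form in the cell above leaves to the garbage collector.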
[8]:
simulation = wrfhydropy.Simulation()
simulation.add(model)
simulation.add(domain)
Note that we don't need to compose or run the Simulation.
Cycle Object
The documentation is always a work in progress! Please provide feedback on how to continue to improve it. You’ll note that the CycleSimulation documentation actually covers what would be called CycleEnsemble, but it’s not described at the top, just mentioned in the arguments.
[9]:
help(wrfhydropy.CycleSimulation)
Help on class CycleSimulation in module wrfhydropy.core.cycle:
class CycleSimulation(builtins.object)
| Class for a WRF-Hydro CycleSimulation object. The Cycle Simulation object is used to
| orchestrate a set of 'N' WRF-Hydro simulations, referred to as 'casts', which only differ
| in their 1) restart times and 2) their forcings.
|
| Methods defined here:
|
| __init__(self, init_times:list, restart_dirs:list, forcing_dirs:list=[], ncores:int=1)
| Instantiate a Cycle object.
| Args:
| init_times: A required list of datetime.datetime objects which specify the
| restart time of each cast in the cycle. (Same for deterministic
| and ensemble cycle simultions).
| restart_dirs:
| Deterministic: a required list of either strings or pathlib.Path objects.
| Ensemble: a required list of lists. The outer list is for the cycles
| "casts" requested in init_times. The inner list is for each ensemble member
| in the cast.
| The following rules are applied to the individual entires:
| 1) A dot or a null string (are identical pathlib.Path objects and) mean
| "do nothing" with respect to the default path in the domain.
| 2) An existing path/file is used/kept (a non-existent path is not, gives
| an error).
| 3) A negative integer in units hours, pointing to a previous cast in the
| cycle.
| 4) Other wise, value error raised.
| forcing_dirs: optional
| Deterministic: list of either strings or pathlib.Path objects
| Ensemble: A list of lists, as for restart_dirs.
| See restart_dirs for usage rules.
| ncores: integer number of cores for running parallelizable methods (not the
| casts themselves). For an ensemble cycle, setting this value > 1 will
| force the ensemble.ncores = 1.
|
| __len__(self)
|
| add(self, obj:Union[wrfhydropy.core.simulation.Simulation, wrfhydropy.core.ensemble.EnsembleSimulation, wrfhydropy.core.schedulers.Scheduler, wrfhydropy.core.job.Job])
| Add an approparite object to an CycleSimulation, such as a Simulation, Job, or
| Scheduler.
| Args:
| obj: the object to add.
|
| compose(self, symlink_domain:bool=True, force:bool=False, check_nlst_warn:bool=False, rm_casts_from_memory:bool=True, rm_members_from_memory:bool=True)
| Cycle compose (directories and files to disk)
| Args:
| symlink_domain: Symlink the domain files rather than copy
| force: Compose into directory even if not empty. This is considered bad practice but
| is necessary in certain circumstances.
| rm_casts_from_memory: Most applications will remove the casts from the
| ensemble object upon compose. Testing and other reasons may keep them around.
| check_nlst_warn: Allow the namelist checking/validation to only result in warnings.
| This is also not great practice, but necessary in certain circumstances.
|
| pickle(self, path:str)
| Pickle ensemble sim object to specified file path
| Args:
| path: The file path for pickle
|
| rm_casts(self)
| Remove members from memory, replace with their paths.
|
| run(self, n_concurrent:int=1, teams:bool=False, teams_exe_cmd:str=None, teams_exe_cmd_nproc:int=None, teams_node_file:dict=None, env:dict=None, teams_dict:dict=None)
| Run the cycle of simulations.
| Inputs:
| n_concurrent: int = 1, Only used for non-team runs.
| teams: bool = False, Use teams?
| teams_exe_cmd: str, The mpi-specific syntax needed. For
| example: 'mpirun --host {hostname} -np {nproc} {cmd}'
| teams_exe_cmd_nproc: int, The number of cores per model/wrf_hydro
| simulation to be run.
| teams_node_file: dict = None, Optional file that acts like a node
| file. It is not currently implemented but the key specifies the
| scheduler format that the file follows. An example pbs node
| file is in tests/data and this argument is used here to test
| without a sched.
| env: dict = None, optional envionment to pass to the run.
| teams_dict: dict, Skip the arguments if you already have a
| teams_dict to use (backwards compatibility)
| Outputs: 0 for success.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
To initialize (__init__) a CycleSimulation, there are two required arguments. The first, init_times, is a list of times (datetimes) at which the model will restart. We will make this every 6 hours (below we’ll set the job run duration to 12 hours). The second, restart_dirs, is a list of directories where each cast is to find its restart files. In this case we will use the previous cast (at 6 hours prior).
[10]:
init_times = [datetime.datetime(2011, 8, 26, 0) +
              datetime.timedelta(hours=hh) for hh in range(0, 24, 6)]
init_times
[10]:
[datetime.datetime(2011, 8, 26, 0, 0),
datetime.datetime(2011, 8, 26, 6, 0),
datetime.datetime(2011, 8, 26, 12, 0),
datetime.datetime(2011, 8, 26, 18, 0)]
Since the first cast has no previous cast, we just use the existing restarts at that time.
[11]:
restart_dirs = ['.'] + ([-6] * (len(init_times)-1))
restart_dirs
[11]:
['.', -6, -6, -6]
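To make the restart_dirs rules from the help text concrete, here is a rough sketch of how each entry could be interpreted. It is not wrfhydropy's implementation: resolve_restart_dirs is a hypothetical helper, the 'cycle' directory is a placeholder, and the cast directory name pattern is taken from the compose output of this example.

```python
import datetime
import pathlib


def resolve_restart_dirs(init_times, restart_dirs, cycle_dir):
    """Illustrative only: map each restart_dirs entry to a source directory.

    Returns None where the domain's default restart should be kept.
    """
    resolved = []
    for init_time, entry in zip(init_times, restart_dirs):
        if isinstance(entry, int):
            # Rule 3: a negative integer is an offset in hours to an earlier cast.
            if entry >= 0:
                raise ValueError('integer entries must be negative hours')
            source_time = init_time + datetime.timedelta(hours=entry)
            if source_time not in init_times:
                raise ValueError('no cast starts at ' + str(source_time))
            resolved.append(cycle_dir / source_time.strftime('cast_%Y%m%d%H'))
        elif pathlib.Path(entry) == pathlib.Path('.'):
            # Rule 1: '.' and '' are the same Path and mean "do nothing".
            resolved.append(None)
        elif pathlib.Path(entry).exists():
            # Rule 2: an existing path is used as given.
            resolved.append(pathlib.Path(entry))
        else:
            # Rules 2 and 4: anything else is an error.
            raise ValueError('cannot interpret restart entry: ' + repr(entry))
    return resolved


init_times = [datetime.datetime(2011, 8, 26, 0) +
              datetime.timedelta(hours=hh) for hh in range(0, 24, 6)]
restart_dirs = ['.'] + ([-6] * (len(init_times) - 1))
sources = resolve_restart_dirs(init_times, restart_dirs, pathlib.Path('cycle'))
for init_time, source in zip(init_times, sources):
    print(init_time, '<-', source)
```

The first cast keeps the domain's own restart files (None here), and each later cast points at the cast that started 6 hours before it.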
Instantiate the CycleSimulation object with the required arguments.
[12]:
cycle = wrfhydropy.CycleSimulation(
    init_times=init_times, restart_dirs=restart_dirs)
Now we have a CycleSimulation to which we need to add our simulation (above) and a job.
[13]:
exe_cmd = 'mpirun -np 1 ./wrf_hydro.exe'
job_cycle = wrfhydropy.Job(
    job_id='cycle',
    exe_cmd=exe_cmd,
    model_start_time=init_times[0],
    model_end_time=init_times[0] + datetime.timedelta(hours=12),
    restart=True,
    restart_freq_hr=6,
    output_freq_hr=1)
cycle.add(simulation)
cycle.add(job_cycle)
If we consider the job “ready to go”, we can compose it to disk. This compose step follows the same pattern as a Simulation.
[14]:
cycle_dir = experiment_dir / 'cycle'
os.mkdir(cycle_dir)
os.chdir(cycle_dir)
cycle.compose()
Composing simulation into directory:'/glade/scratch/jamesmcc/wrfhydropy_cycle_example/cycle/cast_2011082600'
Getting domain files...
Making job directories...
Validating job input files
cycle
Model already compiled, copying files...
Simulation successfully composed
Composing simulation into directory:'/glade/scratch/jamesmcc/wrfhydropy_cycle_example/cycle/cast_2011082606'
Getting domain files...
Making job directories...
Validating job input files
cycle
Model already compiled, copying files...
Simulation successfully composed
Composing simulation into directory:'/glade/scratch/jamesmcc/wrfhydropy_cycle_example/cycle/cast_2011082612'
Getting domain files...
Making job directories...
Validating job input files
cycle
Model already compiled, copying files...
Simulation successfully composed
Composing simulation into directory:'/glade/scratch/jamesmcc/wrfhydropy_cycle_example/cycle/cast_2011082618'
Getting domain files...
Making job directories...
Validating job input files
cycle
Model already compiled, copying files...
Simulation successfully composed
As with Simulation objects, this time between compose (to disk) and run is a critical point for the user to look at the runs and verify that everything is configured correctly.
Since the casts in this cycle depend on each other, we cannot run them in parallel. We perform a serial, interactive run here. (Other run modes will be shown in subsequent examples.)
[15]:
cycle.run()
Running job cycle:
Wall start time: 2020-04-08 23:18:08
Model start time: 2011-08-26 00:00
Model end time: 2011-08-26 12:00
Running job cycle:
Wall start time: 2020-04-08 23:18:15
Model start time: 2011-08-26 06:00
Model end time: 2011-08-26 18:00
Running job cycle:
Wall start time: 2020-04-08 23:18:20
Model start time: 2011-08-26 12:00
Model end time: 2011-08-27 00:00
Running job cycle:
Wall start time: 2020-04-08 23:18:27
Model start time: 2011-08-26 18:00
Model end time: 2011-08-27 06:00
[15]:
0
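The model start and end times printed in the run above can be reproduced with plain datetime arithmetic: the 12-hour job window is shifted to each cast's init time. The sketch below also checks that each later cast starts at a time when the previous cast would have written a restart. That check assumes restart files are written at multiples of restart_freq_hr after a cast's start, which is my reading of the job settings and not something shown in the output.

```python
import datetime

init_times = [datetime.datetime(2011, 8, 26, 0) +
              datetime.timedelta(hours=hh) for hh in range(0, 24, 6)]
cast_length = datetime.timedelta(hours=12)  # model_end_time - model_start_time
restart_freq = datetime.timedelta(hours=6)  # restart_freq_hr

# Each cast runs from its own init time for the length of the job.
windows = [(start, start + cast_length) for start in init_times]
for start, end in windows:
    print('Model start time:', start.strftime('%Y-%m-%d %H:%M'),
          '| Model end time:', end.strftime('%Y-%m-%d %H:%M'))


def restart_times(start, end, freq):
    """Times at which a cast is assumed to write restart files."""
    times = []
    t = start + freq
    while t <= end:
        times.append(t)
        t += freq
    return times


# Every cast after the first must start at a restart time of the cast before it.
chain_ok = all(
    later_start in restart_times(start, end, restart_freq)
    for (start, end), later_start in zip(windows, init_times[1:]))
print('restart chain is consistent:', chain_ok)
```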
Cycle Collection
I have not yet implemented a syntactic sugar version for objects more complicated than Simulations, so we do this the “unchained” or “old fashioned” way.
[16]:
chanobs_files = sorted(cycle_dir.glob('*/*CHANOBS*'))
cycle_chanobs_ds = wrfhydropy.open_whp_dataset(chanobs_files)
n_files 48
Note that the open_whp_dataset function handles the dimensions of the cycle: the time dimensions are lead_time and reference_time, and valid_time is a variable defined on those two dimensions.
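The relationship between these three time quantities is simply valid_time = reference_time + lead_time. The sketch below rebuilds that grid with plain datetime objects for the four casts and twelve hourly outputs of this example. It only illustrates the arithmetic and does not use open_whp_dataset or xarray.

```python
import datetime

reference_times = [datetime.datetime(2011, 8, 26, 0) +
                   datetime.timedelta(hours=hh) for hh in range(0, 24, 6)]
lead_times = [datetime.timedelta(hours=hh) for hh in range(1, 13)]

# valid_time[i][j] is indexed as (lead_time, reference_time).
valid_time = [[ref + lead for ref in reference_times] for lead in lead_times]

# One hourly output file per (cast, lead time) pair.
n_files = len(reference_times) * len(lead_times)

print('n_files', n_files)
print('first valid time:', valid_time[0][0])
print('last valid time: ', valid_time[-1][-1])
```

The file count and the first and last valid times agree with the n_files message above and with the valid_time range in the dataset summary.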
[17]:
cycle_chanobs_ds
[17]:
<xarray.Dataset>
Dimensions: (feature_id: 4, lead_time: 12, reference_time: 4)
Coordinates:
latitude (feature_id) float32 41.470795 41.473614 41.449814 41.40192
longitude (feature_id) float32 -73.76059 -73.69085 -73.73565 -73.68741
* reference_time (reference_time) datetime64[ns] 2011-08-26 ... 2011-08-26T18:00:00
* feature_id (feature_id) int32 6226948 6226964 6227008 6227150
* lead_time (lead_time) timedelta64[ns] 01:00:00 02:00:00 ... 12:00:00
Data variables:
crs (lead_time, reference_time) |S1 b'' b'' b'' ... b'' b'' b''
order (lead_time, reference_time, feature_id) int32 3 2 4 ... 4 4
elevation (lead_time, reference_time, feature_id) float32 180.48 ... 147.61
streamflow (lead_time, reference_time, feature_id) float32 0.16395454 ... 12.969116
valid_time (lead_time, reference_time) datetime64[ns] 2011-08-26T01:00:00 ... 2011-08-27T06:00:00
Attributes:
featureType: timeSeries
proj4: +proj=lcc +units=m +a=6370000.0 +b=6370000.0 +lat_1=3...
station_dimension: feature_id
Conventions: CF-1.6
Plot
Put valid_time into the coordinates so that it can be plotted against.
[18]:
cycle_chanobs_ds = cycle_chanobs_ds.set_coords('valid_time')
cycle_chanobs_ds
[18]:
<xarray.Dataset>
Dimensions: (feature_id: 4, lead_time: 12, reference_time: 4)
Coordinates:
latitude (feature_id) float32 41.470795 41.473614 41.449814 41.40192
longitude (feature_id) float32 -73.76059 -73.69085 -73.73565 -73.68741
* reference_time (reference_time) datetime64[ns] 2011-08-26 ... 2011-08-26T18:00:00
* feature_id (feature_id) int32 6226948 6226964 6227008 6227150
* lead_time (lead_time) timedelta64[ns] 01:00:00 02:00:00 ... 12:00:00
valid_time (lead_time, reference_time) datetime64[ns] 2011-08-26T01:00:00 ... 2011-08-27T06:00:00
Data variables:
crs (lead_time, reference_time) |S1 b'' b'' b'' ... b'' b'' b''
order (lead_time, reference_time, feature_id) int32 3 2 4 ... 4 4
elevation (lead_time, reference_time, feature_id) float32 180.48 ... 147.61
streamflow (lead_time, reference_time, feature_id) float32 0.16395454 ... 12.969116
Attributes:
featureType: timeSeries
proj4: +proj=lcc +units=m +a=6370000.0 +b=6370000.0 +lat_1=3...
station_dimension: feature_id
Conventions: CF-1.6
Do the easiest plotting thing possible.
[20]:
(cycle_chanobs_ds.streamflow.plot
 .line(x='valid_time', col='reference_time', hue='feature_id',
       col_wrap=2,
       figsize=(12, 4)))
[20]:
<xarray.plot.facetgrid.FacetGrid at 0x2b6f4b0aea20>