Mitigating Worker Failures

collect_NeahBay_ssh Worker Failure

The sea surface height anomaly at the western Juan de Fuca boundary is taken from a NOAA forecast of storm surge at Neah Bay. If this page is not accessible then the nowcast.workers.collect_NeahBay_ssh worker may fail. As the observations for several days are saved in the files on this pages, one can try at a later time and if necessary use yesterday’s files to continue the forecast. If it goes several days, we can recover observed sea surface heights from the NOAA tides and water levels .

make_ssh_files can take a date so that older files can be run. Files from the observation site can also be run with the appropriate flag.

download_weather Worker Failure

The Environment and Climate Change Canada (ECCC) 2.5 km resolution GEM forecast model products from the High Resolution Deterministic Prediction System (HRDPS) are critical inputs for the nowcast system.

The HRDPS products files that we use are downloaded every 6 hours via the nowcast.workers.collect_weather worker. If things are going really badly the collect_weather worker may not finish hours after its expected completion time. That can be determined by reading the info log file or debug log file.

In the rare event that the nowcast automation system fails to download the HRDPS products, it is critical that someone runs the nowcast.workers.download_weather worker to download the files. Confirm that the files exist on the ECCC hpfx server by browsing the the date and forecast of interest; e.g. https://hpfx.collab.science.gc.ca/yyyymmdd/WXO-DD/model_hrdps/continental/2.5km/hh/, where:

  • yyyymmdd is the date of interest

  • hh is the forecast name (00, 06, 12, or 18) of interest

then run the worker as follows:

  1. ssh on to skookum, activate the production nowcast conda environment, and navigate to the nowcast configuration and logging directory:

    $ ssh skookum
    skookum$ conda activate /SalishSeaCast/nowcast-env
    
  2. Run the nowcast.workers.download_weather worker for the appropriate forecast with debug logging, for example:

    (/SalishSeaCast/nowcast-env)skookum$ python3 -m nowcast.workers.download_weather $NOWCAST_YAML 12 2.5km --debug
    

    The command above downloads the 12 forecast. The --debug flag causes the logging output of the worker to be displayed on the screen (so that you can see what is going on) instead of being written to a file. It also disconnects the worker from the nowcast messaging system so that there is no interaction with the manager and the ongoing automation. The (abridged) output should look like:

    2023-02-24 10:18:07,831 INFO [download_weather] running in process 3006
    2023-02-24 10:18:07,831 INFO [download_weather] read config from /SalishSeaCast/SalishSeaNowcast/config/nowcast.yaml
    2023-02-24 10:18:07,831 DEBUG [download_weather] **debug mode** no connection to manager
    2023-02-24 10:18:07,831 INFO [download_weather] downloading 12 2.5 km forecast GRIB2 files for 20230115
    2023-02-24 10:18:07,841 DEBUG [urllib3.connectionpool] Starting new HTTPS connection (1): hpfx.collab.science.gc.ca:443
    2023-02-24 10:18:08,112 DEBUG [urllib3.connectionpool] https://hpfx.collab.science.gc.ca:443 "GET /20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_UGRD_AGL-10m_RLatLon0.0225_PT001H.grib2 HTTP/1.1" 200 1856503
    2023-02-24 10:18:10,748 DEBUG [download_weather] downloaded 1856503 bytes from https://hpfx.collab.science.gc.ca/20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_UGRD_AGL-10m_RLatLon0.0225_PT001H.grib2
    2023-02-24 10:18:10,882 DEBUG [urllib3.connectionpool] https://hpfx.collab.science.gc.ca:443 "GET /20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_VGRD_AGL-10m_RLatLon0.0225_PT001H.grib2 HTTP/1.1" 200 1823132
    2023-02-24 10:18:11,840 DEBUG [download_weather] downloaded 1823132 bytes from https://hpfx.collab.science.gc.ca/20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_VGRD_AGL-10m_RLatLon0.0225_PT001H.grib2
    2023-02-24 10:18:12,135 DEBUG [urllib3.connectionpool] https://hpfx.collab.science.gc.ca:443 "GET /20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_DSWRF_Sfc_RLatLon0.0225_PT001H.grib2 HTTP/1.1" 200 723914
    2023-02-24 10:18:13,021 DEBUG [download_weather] downloaded 723914 bytes from https://hpfx.collab.science.gc.ca/20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_DSWRF_Sfc_RLatLon0.0225_PT001H.grib2
    2023-02-24 10:18:13,116 DEBUG [urllib3.connectionpool] https://hpfx.collab.science.gc.ca:443 "GET /20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_DLWRF_Sfc_RLatLon0.0225_PT001H.grib2 HTTP/1.1" 200 2290020
    2023-02-24 10:18:14,277 DEBUG [download_weather] downloaded 2290020 bytes from https://hpfx.collab.science.gc.ca/20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_DLWRF_Sfc_RLatLon0.0225_PT001H.grib2
    2023-02-24 10:18:14,371 DEBUG [urllib3.connectionpool] https://hpfx.collab.science.gc.ca:443 "GET /20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_LHTFL_Sfc_RLatLon0.0225_PT001H.grib2 HTTP/1.1" 200 1314185
    2023-02-24 10:18:15,966 DEBUG [download_weather] downloaded 1314185 bytes from https://hpfx.collab.science.gc.ca/20230115/WXO-DD/model_hrdps/continental/2.5km/12/001/20230115T12Z_MSC_HRDPS_LHTFL_Sfc_RLatLon0.0225_PT001H.grib2
    

You can use the -h or --help flags to get a usage message that explains the worker’s required arguments, and its option flags:

(nowcast)$ python -m nowcast.workers.download_weather --help
usage: python -m nowcast.workers.download_weather [-h] [--debug] [--run-date RUN_DATE]
       [--no-verify-certs] config_file {12,06,00,18} {1km,2.5km}

SalishSeaCast worker that downloads the GRIB2 files from the 00, 06, 12, or 18
Environment and Climate Change Canada GEM 2.5km HRDPS operational model forecast.

positional arguments:
  config_file          Path/name of YAML configuration file for NEMO nowcast.
  {12,06,00,18}        Name of forecast to download files from.
  {1km,2.5km}          Horizontal resolution of forecast to download files from.

options:
  -h, --help           show this help message and exit
  --debug              Send logging output to the console instead of the log file,
                       and suppress messages to the nowcast manager process.
                       Nowcast system messages that would normally be sent to the
                       manager are logged to the console, suppressing interactions
                       with the manager such as launching other workers. Intended only
                       for use when the worker is run in foreground from the command-line.
  --run-date RUN_DATE  Forecast date to download. Use YYYY-MM-DD format.
                       Defaults to 2023-02-24.
  --no-verify-certs    Don't verify TLS/SSL certificates for downloads.
                       NOTE: This is intended for use *only* when downloading from
                       dd.alpha.meteo.gc.ca which has still uses deprecated TLS-1.0.

The --run-date flag allows you to download a prior date’s forecast files. To determine if the --run-date flag can be used check that the files exist on the ECCC hpfx server by browsing the the date and forecast of interest; e.g. https://hpfx.collab.science.gc.ca/yyyymmdd/WXO-DD/model_hrdps/continental/2.5km/hh/, where:

  • yyyymmdd is the date of interest

  • hh is the forecast name (00, 06, 12, or 18) of interest

Even if the worker cannot be re-run in the nowcast system deployment environment on skookum due to permission issues the forecast products can be downloaded using a Development Environment. That can be accomplished as follows:

  1. Activate your nowcast conda environment, and navigate to your nowcast development and testing environment:

    $ source activate salishsea-nowcast
    (nowcast)$ cd MEOPAR/nowcast/
    
  2. Edit the SalishSeaNowcast/config/nowcast.yaml file to set a destination in your filespace for the GRIB2 files that the worker downloads:

    weather:
      download:
        # Destination directory for downloaded GEM 2.5km operational model GRIB2 files
        # GRIB dir: /results/forcing/atmospheric/GEM2.5/GRIB/
        GRIB dir: /ocean/<your_userid>/MEOPAR/GRIB/
    

    Note

    The directory /ocean/<your_userid>/MEOPAR/GRIB/ must exist. Create it if necessary with:

    $ mkdir -p /ocean/<your_userid>/MEOPAR/GRIB/
    
  3. Set the value of the NOWCAST_YAML environment variable to the absolute path of the SalishSeaNowcast/config/nowcast.yaml file that you edited.

  4. Continue from step 2 above.