
ERCOT NPF - Extended Documentation

ERCOT Nodal Price Forecast

Written by Jose Luis Silva
Updated over 2 months ago

The following section provides a deeper look into the toolkit scripts. These details are intended for users with coding experience, and adjustments should be made only if you’re comfortable working with code.

1. Operation Datetime vs Publication Datetime

By default, the API call in vanilla_prediction_pull fetches forecasts by operation date (operationDate). To retrieve forecasts by publication time instead (i.e., the most recent forecast runs available), modify the base URL as follows. Neither toolkit does this by default, but it is useful when you want the latest run rather than a specific operating day.

  1. In utils.py, navigate to the vanilla_prediction_pull function.

  2. Update the base_url argument to reference publication_datetime instead of operationDate:

def vanilla_prediction_pull(date_range: pd.DatetimeIndex,
                            nodes: list,
                            base_url: str = "https://api2.woodma.com/nodal-price forecast/v1/prediction?node=%s&publication_datetime=%s",
                            timeout: int = 60,
                            max_retries: int = 3) -> pd.DataFrame:

With this change, the API returns forecasts based on publication time rather than operation date, giving you the latest available runs for your nodes of interest.
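To illustrate how the two %s slots in base_url get filled, here is a minimal sketch. The template host below is a placeholder, the helper name build_url is hypothetical, and percent-encoding the values is an assumption about good practice rather than a description of what the toolkit does internally.

```python
from urllib.parse import quote

# Placeholder template: substitute the real base_url from utils.py.
TEMPLATE = "https://example.invalid/prediction?node=%s&publication_datetime=%s"


def build_url(template: str, node: str, when: str) -> str:
    """Fill the two %s slots, percent-encoding each value (hypothetical helper)."""
    return template % (quote(node, safe=""), quote(when, safe=""))


url = build_url(TEMPLATE, "AEEC", "2025-08-22 12:00")
print(url)
```

The first slot takes the node name and the second takes the datetime, so the order of the two values must match the order of the parameters in the template.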

2. Pulling Data from the API

Our toolkits provide a set of convenience functions that simplify pulling nodal forecast and actuals data from the Wood Mackenzie API. This section explains the workflow, function responsibilities, and how to use them together.

2.1 vanilla_prediction_pull_byPubDate

What it does: Pulls predictions for one or more nodes at a specific publication datetime.

Use case: “I want the forecast run that was published at [insert datetime here].”

Key args:

  • pubDate → string like "YYYY-MM-DD HH:MM".

  • nodes → list of node names.

Returns: DataFrame of forecasts (predicted values only) for those nodes at that publication run.
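Because pubDate is passed as a plain string, it can help to check its shape before calling the function. The sketch below is one way to do that; normalize_pub_date is a hypothetical helper and is not part of the toolkit.

```python
from datetime import datetime

# Format implied by "YYYY-MM-DD HH:MM".
PUB_DATE_FORMAT = "%Y-%m-%d %H:%M"


def normalize_pub_date(value: str) -> str:
    """Parse value and re-emit it zero-padded; raises ValueError on a mismatch."""
    return datetime.strptime(value, PUB_DATE_FORMAT).strftime(PUB_DATE_FORMAT)


pub_date = normalize_pub_date("2025-08-22 12:00")

# Assumed call shape, based on the key args listed above:
# df = vanilla_prediction_pull_byPubDate(pubDate=pub_date, nodes=["AEEC"])
```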

2.2 vanilla_prediction_pull

What it does: Pulls predictions for nodes across a range of operation dates.

Use case: “Give me all the latest forecasts for Aug 20–25 for AEEC.”

Key args:

  • date_range → a pd.DatetimeIndex (e.g., pd.date_range("2025-08-20", "2025-08-25")).

  • nodes → list of node names.

Returns: DataFrame of predictions keyed by operation date.
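As a rough sketch of what the Aug 20–25 example covers: pd.date_range includes both endpoints by default, so the range spans six operating days. Pairing each node with each day, and the "%Y-%m-%d" date format, are assumptions made for illustration and may not match how the toolkit iterates internally.

```python
import pandas as pd

# Six daily timestamps: Aug 20 through Aug 25, both endpoints included.
date_range = pd.date_range("2025-08-20", "2025-08-25")
nodes = ["AEEC"]

# Hypothetical expansion into (node, operation date) pairs.
pairs = [(node, day.strftime("%Y-%m-%d")) for node in nodes for day in date_range]

# Assumed call shape, based on the key args listed above:
# df = vanilla_prediction_pull(date_range=date_range, nodes=nodes)
```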

2.3 pull_forecasts_and_actuals

What it does: Pulls both forecasts and the freshest available actuals for a single node at one publication datetime.

Use case: “At the time of the 2025-08-22 12:00 forecast, what were the predictions and the matching actuals?”

Returns: A normalized DataFrame with:

  • Forecast metadata (node_name, forecast_datetime, publication_datetime).

  • Predictions (predicted_lmp, predicted_mcc, predicted_mec).

  • Actuals (actual_lmp, actual_mcc, actual_mec).

  • Confidence and prediction ranges.

This is the recommended entry point if you want side-by-side predictions and actuals.

2.4 add_actuals

What it does: Updates an existing predictions DataFrame by filling in or refreshing actual values.

How it works:

  • Looks at the original publication time (Central time).

  • Decides which additional forecast runs to fetch to capture actuals:

    – 00:00 run → fetch (next day 00:00).

    – 01:00–23:00 run → fetch (next day 00:00) and (day+2 00:00).

  • Ignores any runs at or after the original publication time.

  • Ensures the newest actuals (latest publication timestamp) are kept per (node_name, forecast_hour).

Use case: “I already have a DataFrame of forecasts; just update it with the latest actual LMP/MCC/MEC.”
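The run-selection and newest-wins steps described above can be sketched as follows. This is an illustration and not the toolkit's implementation: both function names are hypothetical, datetimes are assumed to be naive values already in Central time, and rows are plain dicts rather than DataFrame rows.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def extra_runs_for_actuals(pub_dt: datetime) -> List[datetime]:
    """Return the additional publication runs to fetch for one original run."""
    next_midnight = datetime(pub_dt.year, pub_dt.month, pub_dt.day) + timedelta(days=1)
    if pub_dt.hour == 0:
        # 00:00 run: the next day's 00:00 run is enough.
        return [next_midnight]
    # 01:00-23:00 run: fetch the next day's 00:00 run and the one after it.
    return [next_midnight, next_midnight + timedelta(days=1)]


def keep_newest(rows: List[dict]) -> List[dict]:
    """Keep one row per (node_name, forecast_hour): the latest publication."""
    newest: Dict[Tuple[str, datetime], dict] = {}
    for row in rows:
        key = (row["node_name"], row["forecast_hour"])
        current = newest.get(key)
        if current is None or row["publication_datetime"] > current["publication_datetime"]:
            newest[key] = row
    return list(newest.values())


runs = extra_runs_for_actuals(datetime(2025, 8, 22, 12, 0))
```

For a 12:00 run on 2025-08-22, this selects the 00:00 runs on 2025-08-23 and 2025-08-24. When the same node and forecast hour appear in both, the row from the later run is kept.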

2.5 Recommended Usage

  • Use pull_forecasts_and_actuals if you want a ready-to-analyze dataset of predictions and actuals side by side.

  • Use add_actuals if you’re already working with a forecasts DataFrame and need to refresh actuals.

  • Use vanilla_prediction_pull_byPubDate or vanilla_prediction_pull if you only want the raw forecasts without actuals.
