Create a DA-RT Sign Classification Forecast Model

This tutorial will walk you through how to predict DA-RT sign using a classification model.

📘

Tutorial Overview

In this tutorial, we explain how to create and deploy a Network of Time Series (NoTS) that generates a 48-hour forecast of DA-RT sign, that is, whether the real-time (RT) price exceeds the day-ahead (DA) price, using XGBoost with a binary logistic objective. The forecast is regenerated every hour on an ongoing basis.

Note: You could also use the capabilities described below to classify real-time price spikes, for example by using the Numerical Expression connector to flag the hours when the real-time price is above a chosen value (e.g., RT_LMP > 100).
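To make the price spike idea concrete, here is a minimal sketch in plain Python of the kind of binary target that an expression like RT_LMP > 100 would produce. It does not use the client library: the function name `spike_labels`, the handling of missing values, and the sample prices are all assumptions made for illustration.

```python
from typing import List, Optional


def spike_labels(
    rt_lmp: List[Optional[float]], threshold: float = 100.0
) -> List[Optional[int]]:
    """Return 1 where the real-time price is above the threshold, 0 otherwise.

    Missing prices (None) stay missing instead of being labeled.
    """
    return [None if price is None else int(price > threshold) for price in rt_lmp]


# Made-up hourly real-time prices in $/MWh, including one missing hour.
sample_rt_lmp = [24.5, 31.0, 250.0, None, 100.0, 101.3]
sample_labels = spike_labels(sample_rt_lmp)
print(sample_labels)  # [0, 0, 1, None, 0, 1]
```

Because the comparison is strict, a price of exactly 100 is not labeled as a spike.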

Create your NoTS

For this tutorial, you will leverage the client library to create your NoTS. The NoTS that you'll create will use an XGBoost model with time-based and weather-based features to forecast DA-RT sign at the ERCOT North Hub. We will be using Yes Energy DataSignals Cloud to access historical price data.

🚧

Model Performance

Note that the model we are using in this tutorial is not optimized. The goal of this tutorial is to show users how to create and deploy a simple DA-RT sign forecasting model. We do not recommend using this model in production.

Client Library

To create the NoTS in the client library, you can use the following code.

import myst
from myst.connectors.model_connectors import xgboost
from myst.connectors.operation_connectors import numerical_expression
from myst.connectors.source_connectors import time_trends
from myst.connectors.source_connectors import yes_energy
from myst.recipes.time_series_recipes import the_weather_company

myst.authenticate()

# Create a new project.
project = myst.Project.create(title="DA-RT sign ERCOT North Hub")

# Create an hour of day, day of week, and day of year time series from a time trends source.
time_trends_source = project.create_source(
    title="Time Trends",
    connector=time_trends.TimeTrends(
        sample_period=myst.TimeDelta("PT1H"),
        time_zone="UTC",
        fields=[
            time_trends.Field.HOUR_OF_DAY,
            time_trends.Field.DAY_OF_WEEK,
            time_trends.Field.DAY_OF_YEAR,
        ],
    ),
)
hour_of_day_time_series = time_trends_source.create_time_series(
    title="Hour of Day",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.HOUR_OF_DAY,
)
day_of_week_time_series = time_trends_source.create_time_series(
    title="Day of Week",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.DAY_OF_WEEK,
)
day_of_year_time_series = time_trends_source.create_time_series(
    title="Day of Year",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.DAY_OF_YEAR,
)

# Create a temperature time series using The Weather Company recipe.
temperature_time_series = project.create_time_series_from_recipe(
    recipe=the_weather_company.TheWeatherCompany(
        metar_station=the_weather_company.MetarStation.KDFW,
        field=the_weather_company.Field.TEMPERATURE,
    )
)

# Create DA LMP and RT LMP time series using a Yes Energy source.
yes_energy_source = project.create_source(
    title="Yes Energy",
    connector=yes_energy.YesEnergy(
        items=[
            yes_energy.YesEnergyItem(datatype="DALMP", object_id=10000697078),
            yes_energy.YesEnergyItem(datatype="RTLMP", object_id=10000697078),
        ],
        stat=yes_energy.YesEnergyAggregation.AVG,
    ),
)
da_lmp_ts = yes_energy_source.create_time_series(
    title="Historical DA LMP",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer="DALMP_10000697078",
)
rt_lmp_ts = yes_energy_source.create_time_series(
    title="Historical RT LMP",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer="RTLMP_10000697078",
)

# Create a binary target that flags the hours when RT LMP exceeds DA LMP.
rt_over_da_operation = project.create_operation(
    title="RT > DA Target",
    connector=numerical_expression.NumericalExpression(
        variable_names=["da_lmp", "rt_lmp"],
        math_expression="rt_lmp > da_lmp",
    ),
)
rt_over_da_operation.create_input(da_lmp_ts, group_name="da_lmp")
rt_over_da_operation.create_input(rt_lmp_ts, group_name="rt_lmp")
rt_over_da_ts = rt_over_da_operation.create_time_series(
    title=rt_over_da_operation.title,
    sample_period=myst.TimeDelta("PT1H")
)

# Create an XGBoost classification model.
model = project.create_model(
    title="RT > DA Classification Model",
    connector=xgboost.XGBoost(
        objective=xgboost.XGBoostObjective.BINARY_LOGISTIC,
        num_boost_round=500,
        max_depth=8,
        min_child_weight=40,
        learning_rate=0.038,
        fit_on_null_values=True,
        predict_on_null_values=True,
    ),
)

# Add the time series as inputs to the model.
for time_series in [
    hour_of_day_time_series,
    day_of_week_time_series,
    day_of_year_time_series,
    temperature_time_series,
]:
    model.create_input(time_series, group_name=xgboost.GroupName.FEATURES)
model.create_input(rt_over_da_ts, group_name=xgboost.GroupName.TARGETS)

# Add a fit policy to the model.
model.create_fit_policy(
    start_timing=myst.TimeDelta("-P3M"),
    end_timing=myst.TimeDelta("PT1H"),
    schedule_timing=myst.TimeDelta("PT24H"),
)

# Create a time series with the model predictions.
forecast_time_series = model.create_time_series(
    title="RT > DA Forecast", sample_period=myst.TimeDelta("PT1H")
)

# Add a run policy to the time series.
forecast_time_series.create_run_policy(
    start_timing=myst.TimeDelta("PT1H"),
    end_timing=myst.TimeDelta("PT49H"),
    schedule_timing=myst.TimeDelta("PT1H"),
)
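The run policy above uses a start timing of PT1H and an end timing of PT49H, which is how the forecast comes to cover 48 hours. The sketch below works through that arithmetic with the standard library only. It assumes the window is half-open (the start is included and the end is excluded), which is our reading of the 48-hour horizon rather than documented behavior, and `forecast_timestamps` is a hypothetical helper, not part of the client library.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def forecast_timestamps(
    run_time: datetime,
    start_offset: timedelta,
    end_offset: timedelta,
    sample_period: timedelta,
) -> List[datetime]:
    """List the timestamps in [run_time + start_offset, run_time + end_offset)."""
    timestamps = []
    current = run_time + start_offset
    end = run_time + end_offset
    while current < end:  # Assumption: the end of the window is exclusive.
        timestamps.append(current)
        current += sample_period
    return timestamps


# An example run at noon UTC with the offsets used in the run policy.
run_time = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
window = forecast_timestamps(
    run_time,
    start_offset=timedelta(hours=1),
    end_offset=timedelta(hours=49),
    sample_period=timedelta(hours=1),
)
print(len(window))  # 48
print(window[0].isoformat())  # 2023-01-01T13:00:00+00:00
print(window[-1].isoformat())  # 2023-01-03T12:00:00+00:00
```

Under this assumption, each hourly run produces 48 hourly samples, starting one hour after the run time.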

Deploy your NoTS

Once you've finished creating your NoTS, you can deploy your Project.

Web Application

To create a new Deployment, click the Deploy button in the top right corner of the Project Create page. Specify a title for your Deployment and then click the Deploy button.

Once you’ve deployed a project, your Model Fit Policy and Time Series Run Policy will begin to run according to their schedules. This means that your Model will be fitted once a day and your Time Series will be run every hour. Note that these Policies will run indefinitely, until you deactivate your Deployment.

To track and verify the results of your Deployment, navigate to the Project Monitor space by clicking on the Monitor tab at the top of the Project page. The results table shows a list of ongoing results that are being generated by your policy. You can refresh the table by clicking on the refresh icon.

Client Library

The code below will deploy your project, creating a first model fit and time series run immediately.

# Deploy the project.
project.deploy("My Deployment")

# Create an ad hoc time series run job.
time_series_run_job = forecast_time_series.run(
    start_timing=myst.TimeDelta("PT1H"),
    end_timing=myst.TimeDelta("PT49H"),
)

👍

Tutorial Complete

You are now generating 48-hour DA-RT sign forecasts for ERCOT North Hub on an ongoing basis. See the section on Query Time Series Data to learn more about how to query your stored forecasts.
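Once you have queried your forecasts, you may want to turn them into an explicit sign. The sketch below rests on two assumptions that you should verify against your own results: that each forecast value is the probability that RT exceeds DA (a binary logistic objective in XGBoost passes the raw score through a sigmoid to produce a probability), and that DA-RT sign means the sign of DA minus RT. The helper names and the 0.5 threshold are illustrative and not part of the client library.

```python
import math
from typing import List


def sigmoid(raw_score: float) -> float:
    """Map a raw model score to a probability, as a logistic objective does."""
    return 1.0 / (1.0 + math.exp(-raw_score))


def da_rt_sign(prob_rt_over_da: float, threshold: float = 0.5) -> int:
    """Return -1 when RT > DA is predicted (DA - RT negative), otherwise +1.

    The target "rt_lmp > da_lmp" is false when prices are equal, so ties fall
    in the +1 class here. The threshold is an assumption you can tune.
    """
    return -1 if prob_rt_over_da > threshold else 1


# Made-up forecast probabilities for three hours.
sample_probabilities = [0.12, 0.50, 0.73]
signs: List[int] = [da_rt_sign(p) for p in sample_probabilities]
print(signs)  # [1, 1, -1]
```

A threshold other than 0.5 may suit you better if the two outcomes carry different costs.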