Create a DA-RT Sign Classification Forecast Model
This tutorial walks you through predicting the DA-RT sign, that is, whether the real-time (RT) price settles above or below the day-ahead (DA) price, using classification.
Tutorial Overview
In this tutorial, we explain how to create and deploy a Network of Time Series (NoTS) that generates a 48-hour forecast of DA-RT sign using XGBoost with a binary logistic objective. The resulting forecast will be generated every hour on an ongoing basis.
Note: You could also use the capabilities described below to classify real-time price spikes, for example, by using the Numerical Expression connector to predict when real-time prices are above a certain value (e.g., RT_LMP > 100).
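As a rough illustration of the spike-labeling idea in the note above, the sketch below builds a binary spike target in plain Python. It does not use the Myst client library. The threshold of 100 comes from the example above; the function name, the sample prices, and the choice to keep missing prices as missing are assumptions made for illustration.

```python
from typing import List, Optional

SPIKE_THRESHOLD = 100.0  # Taken from the RT_LMP > 100 example above.


def label_price_spikes(
    rt_lmp: List[Optional[float]], threshold: float = SPIKE_THRESHOLD
) -> List[Optional[int]]:
    """Return 1 where the real-time price exceeds the threshold, else 0.

    Missing prices (None) stay missing, which is an assumption of this sketch.
    """
    return [None if price is None else int(price > threshold) for price in rt_lmp]


# Hypothetical hourly real-time prices.
sample_rt_lmp = [24.5, 31.0, 180.2, None, 100.0, 412.9]
spike_labels = label_price_spikes(sample_rt_lmp)
print(spike_labels)
```

Because the comparison is strict, a price of exactly 100 is not labeled as a spike; use `>=` if the boundary value should count.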
Create your NoTS
For this tutorial, you will leverage the client library to create your NoTS. The NoTS that you'll create will use an XGBoost model with time-based and weather-based features to forecast DA-RT sign at the ERCOT North Hub. We will be using Yes Energy DataSignals Cloud to access historical price data.
Model Performance
Note that the model we are using in this tutorial is not optimized. The goal of this tutorial is to show users how to create and deploy a simple DA-RT sign forecasting model. We do not recommend using this model in production.
Client Library
To create the NoTS in the client library, you can use the following code.
import myst
from myst.connectors.model_connectors import xgboost
from myst.connectors.operation_connectors import numerical_expression
from myst.connectors.source_connectors import time_trends
from myst.connectors.source_connectors import yes_energy
from myst.recipes.time_series_recipes import the_weather_company
myst.authenticate()
# Create a new project.
project = myst.Project.create(title="DA-RT sign ERCOT North Hub")
# Create an hour of day, day of week, and day of year time series from a time trends source.
time_trends_source = project.create_source(
    title="Time Trends",
    connector=time_trends.TimeTrends(
        sample_period=myst.TimeDelta("PT1H"),
        time_zone="UTC",
        fields=[
            time_trends.Field.HOUR_OF_DAY,
            time_trends.Field.DAY_OF_WEEK,
            time_trends.Field.DAY_OF_YEAR,
        ],
    ),
)
hour_of_day_time_series = time_trends_source.create_time_series(
    title="Hour of Day",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.HOUR_OF_DAY,
)
day_of_week_time_series = time_trends_source.create_time_series(
    title="Day of Week",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.DAY_OF_WEEK,
)
day_of_year_time_series = time_trends_source.create_time_series(
    title="Day of Year",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.DAY_OF_YEAR,
)
# Create a temperature time series using The Weather Company recipe.
temperature_time_series = project.create_time_series_from_recipe(
    recipe=the_weather_company.TheWeatherCompany(
        metar_station=the_weather_company.MetarStation.KDFW,
        field=the_weather_company.Field.TEMPERATURE,
    )
)
# Create DA LMP and RT LMP time series using a Yes Energy source.
yes_energy_source = project.create_source(
    title="Yes Energy",
    connector=yes_energy.YesEnergy(
        items=[
            yes_energy.YesEnergyItem(datatype="DALMP", object_id=10000697078),
            yes_energy.YesEnergyItem(datatype="RTLMP", object_id=10000697078),
        ],
        stat=yes_energy.YesEnergyAggregation.AVG,
    ),
)
da_lmp_ts = yes_energy_source.create_time_series(
    title="Historical DA LMP",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer="DALMP_10000697078",
)
rt_lmp_ts = yes_energy_source.create_time_series(
    title="Historical RT LMP",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer="RTLMP_10000697078",
)
# Create a binary target that is true when the RT LMP exceeds the DA LMP.
rt_over_da_operation = project.create_operation(
    title="RT > DA Target",
    connector=numerical_expression.NumericalExpression(
        variable_names=["da_lmp", "rt_lmp"],
        math_expression="rt_lmp > da_lmp",
    ),
)
rt_over_da_operation.create_input(da_lmp_ts, group_name="da_lmp")
rt_over_da_operation.create_input(rt_lmp_ts, group_name="rt_lmp")
rt_over_da_ts = rt_over_da_operation.create_time_series(
    title=rt_over_da_operation.title,
    sample_period=myst.TimeDelta("PT1H"),
)
# Create an XGBoost classification model.
model = project.create_model(
    title="RT > DA Classification Model",
    connector=xgboost.XGBoost(
        objective=xgboost.XGBoostObjective.BINARY_LOGISTIC,
        num_boost_round=500,
        max_depth=8,
        min_child_weight=40,
        learning_rate=0.038,
        fit_on_null_values=True,
        predict_on_null_values=True,
    ),
)
# Add the feature time series as inputs to the model.
for time_series in [
    hour_of_day_time_series,
    day_of_week_time_series,
    day_of_year_time_series,
    temperature_time_series,
]:
    model.create_input(time_series, group_name=xgboost.GroupName.FEATURES)
# Add the binary target time series as the model target.
model.create_input(rt_over_da_ts, group_name=xgboost.GroupName.TARGETS)
# Add a fit policy to the model.
model.create_fit_policy(
    start_timing=myst.TimeDelta("-P3M"),
    end_timing=myst.TimeDelta("PT1H"),
    schedule_timing=myst.TimeDelta("PT24H"),
)
# Create a time series with the model predictions.
forecast_time_series = model.create_time_series(
    title="RT > DA Forecast",
    sample_period=myst.TimeDelta("PT1H"),
)
# Add a run policy to the time series.
forecast_time_series.create_run_policy(
    start_timing=myst.TimeDelta("PT1H"),
    end_timing=myst.TimeDelta("PT49H"),
    schedule_timing=myst.TimeDelta("PT1H"),
)
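To make the shape of the training data more concrete, here is a minimal plain-Python sketch of the rows the model is conceptually fitted on: three calendar features plus the binary RT > DA target. It does not call the Myst client library. The function name and sample prices are hypothetical, the temperature feature is left out for brevity, and the feature encodings (for example, Monday as 0) follow Python's `datetime` conventions, which may differ from what the Time Trends connector produces.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def build_training_rows(
    start: datetime, da_lmp: List[float], rt_lmp: List[float]
) -> List[Dict[str, int]]:
    """Build one row per hour with calendar features and the RT > DA label."""
    rows = []
    for offset, (da, rt) in enumerate(zip(da_lmp, rt_lmp)):
        timestamp = start + timedelta(hours=offset)
        rows.append(
            {
                "hour_of_day": timestamp.hour,
                "day_of_week": timestamp.weekday(),  # Monday = 0 in Python.
                "day_of_year": timestamp.timetuple().tm_yday,
                "rt_over_da": int(rt > da),  # Same comparison as "rt_lmp > da_lmp".
            }
        )
    return rows


# Hypothetical hourly prices starting at 22:00 UTC on Sunday, 2023-01-01.
start = datetime(2023, 1, 1, 22, tzinfo=timezone.utc)
rows = build_training_rows(
    start, da_lmp=[30.0, 28.0, 27.5], rt_lmp=[35.0, 28.0, 19.0]
)
for row in rows:
    print(row)
```

Note that an hour where the two prices are equal is labeled 0, because the expression uses a strict comparison.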
Deploy your NoTS
Once you've finished creating your NoTS, you can deploy your Project.
Web Application
To create a new Deployment, click the Deploy button in the top right corner of the Project Create page. Specify a title for your Deployment and then click the Deploy button.
Once you’ve deployed a project, your Model Fit Policy and Time Series Run Policy will begin to run according to their schedules. This means that your Model will be fitted once a day and your Time Series will be run every hour. Note that these Policies will run indefinitely, until you deactivate your Deployment.
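The hourly run schedule can be pictured with a short plain-Python sketch. It assumes that the run policy timings (`PT1H` and `PT49H`) are offsets relative to the time of each run and that the end of the window is exclusive, which is consistent with a 48-hour forecast at an hourly sample period but is not confirmed here. The function name and the example run time are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def forecast_timestamps(
    run_time: datetime,
    start_offset: timedelta = timedelta(hours=1),
    end_offset: timedelta = timedelta(hours=49),
    sample_period: timedelta = timedelta(hours=1),
) -> List[datetime]:
    """List the timestamps covered by one run, assuming a half-open window."""
    timestamps = []
    current = run_time + start_offset
    end = run_time + end_offset
    while current < end:
        timestamps.append(current)
        current += sample_period
    return timestamps


# Hypothetical run at 12:00 UTC on 2023-06-01.
run_time = datetime(2023, 6, 1, 12, tzinfo=timezone.utc)
window = forecast_timestamps(run_time)
print(len(window), window[0].isoformat(), window[-1].isoformat())
```

Under these assumptions, each hourly run produces 48 forecast values, and consecutive runs overlap on all but one of their timestamps.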
To track and verify the results of your Deployment, navigate to the Project Monitor space by clicking on the Monitor tab at the top of the Project page. The results table shows a list of ongoing results that are being generated by your policy. You can refresh the table by clicking on the refresh icon.
Client Library
The code below will deploy your project, creating a first model fit and time series run immediately.
# Deploy the project.
project.deploy("My Deployment")
# Create an ad hoc time series run job.
time_series_run_job = forecast_time_series.run(
    start_timing=myst.TimeDelta("PT1H"),
    end_timing=myst.TimeDelta("PT49H"),
)
Tutorial Complete
You are now generating the 48-hour DA-RT sign forecasts for ERCOT North Hub on an ongoing basis. See the section on Query Time Series Data to learn more about how to query your stored forecasts.
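When reading the forecasts, it helps to remember that XGBoost's binary logistic objective passes the summed tree scores through a sigmoid, so the model output is a probability between 0 and 1 rather than a hard label. The sketch below shows that mapping and one way to turn probabilities into a sign call. It assumes the stored forecast values are these probabilities of RT > DA; the 0.5 threshold, the function names, and the sample scores are illustrative choices, not part of the tutorial's model.

```python
import math
from typing import List


def sigmoid(margin: float) -> float:
    """Map a raw score to a probability, written to avoid overflow."""
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    exp_margin = math.exp(margin)
    return exp_margin / (1.0 + exp_margin)


def to_sign_labels(probabilities: List[float], threshold: float = 0.5) -> List[str]:
    """Convert probabilities of RT > DA into labels using a chosen threshold."""
    return ["RT > DA" if p >= threshold else "RT <= DA" for p in probabilities]


# Hypothetical raw scores for three forecast hours.
raw_scores = [-2.0, 0.0, 1.5]
probabilities = [sigmoid(score) for score in raw_scores]
labels = to_sign_labels(probabilities)
print([round(p, 3) for p in probabilities], labels)
```

A threshold other than 0.5 may be more appropriate when the cost of a wrong call differs between the two directions.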