Convert into Dummy Variables

This tutorial walks you through creating a price forecast model with dummy variables.

📘

Tutorial Overview

In this tutorial, you will create and deploy a Linear Regression price model that uses dummy variables for each hour of the day as features. The resulting forecast will be generated every hour on an ongoing basis.

Create your NoTS

For this tutorial, you will use the client library to create your NoTS. This NoTS uses a Linear Regression model with hour-of-day features encoded as dummy variables. Your graph will forecast the day-ahead locational marginal price (DA LMP) for the DLAP PGAE-APND price node, using historical price data from the Yes Energy DataSignals API.
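To make the encoding concrete, here is a minimal sketch of what "hour of day as dummy variables" means, written in plain Python rather than with the platform's connector. The helper name `hour_dummies` and the sample timestamps are hypothetical and exist only for illustration. The category set 0 to 23 mirrors the `categories=list(range(24))` argument used later in this tutorial.

```python
from datetime import datetime, timedelta

# Same category set as the tutorial's GetDummies connector: one column per hour.
CATEGORIES = list(range(24))


def hour_dummies(hour: int) -> list[int]:
    """Return a 24-element one-hot row for the given hour of day (hypothetical helper)."""
    if hour not in CATEGORIES:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    return [1 if hour == category else 0 for category in CATEGORIES]


# Four consecutive hourly timestamps that cross midnight (illustrative values).
start = datetime(2024, 1, 1, 22)
timestamps = [start + timedelta(hours=offset) for offset in range(4)]
rows = [hour_dummies(ts.hour) for ts in timestamps]

# Each row has exactly one active column, at the position of its hour.
for ts, row in zip(timestamps, rows):
    print(ts.isoformat(), "active column:", row.index(1), "row sum:", sum(row))
```

Each timestamp becomes a row with a single 1 in the column for its hour, so the model receives 24 binary features instead of one numeric hour value.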

🚧

Model Performance

Note that the model we are using in this tutorial is not optimized. The goal of this tutorial is to show you how to create and deploy a price forecasting model with dummy variables. We do not recommend using this model in production.

Client Library

To create the NoTS in the client library, you can use the following code.

import myst
from myst.connectors.model_connectors import linear_regression
from myst.connectors.operation_connectors import get_dummies
from myst.connectors.source_connectors import time_trends
from myst.connectors.source_connectors import yes_energy

myst.authenticate()

# Create a new project.
project = myst.Project.create(title="My Project")

# Create an hour of day time series from a time trends source.
time_trends_source = project.create_source(
    title="Time Trends",
    connector=time_trends.TimeTrends(
        sample_period=myst.TimeDelta("PT1H"),
        time_zone="US/Pacific",
        fields=[time_trends.Field.HOUR_OF_DAY],
    ),
)
hour_of_day_time_series = time_trends_source.create_time_series(
    title="Hour of Day",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer=time_trends.Field.HOUR_OF_DAY,
)

# Create the get dummies operation.
get_dummies_operation = project.create_operation(
    title="Dummy Hour Of Day",
    connector=get_dummies.GetDummies(categories=list(range(24))),
)
get_dummies_operation.create_input(hour_of_day_time_series, group_name=get_dummies.GroupName.OPERANDS)

# Create a time series for each of the hour of day dummies.
dummy_hour_of_day_time_series = [
    get_dummies_operation.create_time_series(
        title=f"Hour of Day ({index})", sample_period=myst.TimeDelta("PT1H"), label_indexer=index
    )
    for index in range(24)
]

# Create a DLAP time series using a Yes Energy source.
yes_energy_source = project.create_source(
    title="Yes Energy",
    connector=yes_energy.YesEnergy(
        items=[
            yes_energy.YesEnergyItem(datatype="DALMP", object_id=20000004194)  # CAISO DLAP PGAE - APND
        ],
        stat=yes_energy.YesEnergyAggregation.AVG,
    ),
)
dalmp_time_series = yes_energy_source.create_time_series(
    title="Historical DALMP",
    sample_period=myst.TimeDelta("PT1H"),
    label_indexer="DALMP_20000004194",
)

# Create a linear regression model.
model = project.create_model(title="DALMP Model", connector=linear_regression.LinearRegression())

# Add the time series as inputs to the model.
for time_series in dummy_hour_of_day_time_series:
    model.create_input(time_series, group_name=linear_regression.GroupName.FEATURES)
model.create_input(dalmp_time_series, group_name=linear_regression.GroupName.TARGETS)

# Add a fit policy to the model.
model.create_fit_policy(
    start_timing=myst.TimeDelta("-P3M"), 
    end_timing=myst.TimeDelta("PT1H"),
    schedule_timing=myst.TimeDelta("PT24H"),
)

# Create a time series with the model predictions.
forecast_time_series = model.create_time_series(
    title="DALMP Forecast", sample_period=myst.TimeDelta("PT1H")
)

# Add a run policy to the time series.
forecast_time_series.create_run_policy(
    start_timing=myst.TimeDelta("PT1H"),
    end_timing=myst.TimeDelta("PT49H"),
    schedule_timing=myst.TimeDelta("PT1H"),
)
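As a rough intuition for what this model learns, the sketch below works through ordinary least squares on one-hot hour features in plain Python. It assumes no separate intercept term; whether the Linear Regression connector fits an intercept is not stated here, although the fitted values would be per-hour means in either case. The observations are made up, and `fit_hour_dummies` and `predict` are hypothetical names, not part of the client library.

```python
from collections import defaultdict

# Made-up (hour_of_day, price) observations, for illustration only.
observations = [(0, 30.0), (0, 34.0), (17, 60.0), (17, 70.0), (17, 80.0), (3, 25.0)]


def fit_hour_dummies(samples):
    """Least-squares coefficients for one-hot hour features, no intercept.

    With one dummy per hour, X^T X is diagonal (the sample count per hour)
    and X^T y holds the per-hour price sums, so solving the normal equations
    gives each coefficient as the mean price for its hour.
    """
    counts = defaultdict(int)
    sums = defaultdict(float)
    for hour, price in samples:
        counts[hour] += 1
        sums[hour] += price
    return {hour: sums[hour] / counts[hour] for hour in counts}


coefficients = fit_hour_dummies(observations)


def predict(hour):
    """The prediction is the coefficient of the single active dummy."""
    return coefficients[hour]


for hour in sorted(coefficients):
    print(f"hour {hour:2d}: predicted price {predict(hour):.2f}")
```

In other words, a model built only on hour-of-day dummies predicts the average historical price for each hour, which is one reason the tutorial model should be treated as a baseline rather than a production forecast.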

Deploy your NoTS

Once you've finished creating your NoTS, you can deploy your Project.

Web Application

To create a new Deployment, click the Deploy button in the top right corner of the Project Create page. Specify a title for your Deployment and then click the Deploy button.

Once you’ve deployed a project, your Model Fit Policy and Time Series Run Policy will begin to run according to their schedules. This means that your Model will be fitted once a day and your Time Series will be run every hour. Note that these Policies will run indefinitely, until you deactivate your Deployment.
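The sketch below illustrates how the run policy's timings (`PT1H` to `PT49H`, sampled every `PT1H`) translate into a 48-hour forecast horizon. It assumes that the timings are offsets relative to the moment the run executes and that the end of the window is exclusive; both are assumptions made for illustration, and `forecast_window` is a hypothetical helper rather than a client library function.

```python
from datetime import datetime, timedelta, timezone


def forecast_window(
    run_time,
    start_offset=timedelta(hours=1),   # mirrors start_timing="PT1H"
    end_offset=timedelta(hours=49),    # mirrors end_timing="PT49H"
    sample_period=timedelta(hours=1),  # mirrors sample_period="PT1H"
):
    """Return the timestamps covered by one run (assumed end-exclusive)."""
    current = run_time + start_offset
    end = run_time + end_offset
    timestamps = []
    while current < end:
        timestamps.append(current)
        current += sample_period
    return timestamps


# Illustrative run time; a deployed policy would trigger a run every hour.
run_time = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
window = forecast_window(run_time)

print("first forecast timestamp:", window[0].isoformat())
print("last forecast timestamp: ", window[-1].isoformat())
print("number of hourly samples:", len(window))
```

Under these assumptions, every hourly run produces a fresh set of 48 hourly values, and consecutive runs overlap on all but one hour.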

To track and verify the results of your Deployment, navigate to the Project Monitor space by clicking the Monitor tab at the top of the Project page. The results table lists the ongoing results generated by your Policies. You can refresh the table by clicking the refresh icon.

Client Library

The code below deploys your project and then creates an ad hoc model fit and time series run, so that you get first results immediately instead of waiting for the policies' schedules.

# Deploy the project.
project.deploy("My Deployment")

# Create an ad hoc model fit job.
model_fit_job = model.fit(
    start_timing=myst.TimeDelta("-P3M"), 
    end_timing=myst.TimeDelta("PT1H")
)

# Create an ad hoc time series run job.
time_series_run_job = forecast_time_series.run(
    start_timing=myst.TimeDelta("PT1H"), 
    end_timing=myst.TimeDelta("PT49H"),
)

👍

Tutorial Complete

You are now generating 48-hour forecasts for the DA LMP for DLAP PGAE-APND using a linear regression model with hour-of-day dummy variables. See the section on Query Time Series Data to learn more about how to query your stored forecasts.