Use Case:

In industry, it is common to use the behavior of upstream process variables to predict how a downstream variable will behave minutes, hours, or days from now.

 

Solution:

A traditional predictive modeling workflow can be applied to solve this problem. 

  • Identify an appropriate training data set
  • Perform any necessary data cleansing
  • Create a predictive model
  • Evaluate the model fit
  • Improve the model
  • Operationalize the model

What differentiates this use case from any other predictive modeling use case is a specific data cleansing step for adjusting signals to remove process lag. 

1. Load Data

Load your target signal and the relevant upstream signals into the display pane. In this example, the target signal is the product viscosity, measured in an analytical lab from a sample taken at a downstream sample point. Three upstream signals (reactor temperature, reactant conversion, and viscosity modifier flow to the reactor) significantly influence the product viscosity measurement and will be used as inputs to the model.

image.png

2. Identify Training Data Set

Identify an appropriate training data set for your regression model. This may involve a longer time window to include variability in product type or seasonality. In this example, we will pan out to 3 months to capture multiple cycles of different product types. 

image.png

With an appropriate training window identified, you can also limit your training data set to the samples present during a particular condition. If this interests you, consult the "advanced options" section of the Prediction Tool KB article for more information. This method is particularly useful if you want to create different models for different modes of operation.

3. Cleanse Signals -- Adjust for Process Lag

We can time-shift our upstream signals using a known constant delay, a known variable delay (like a calculated residence time signal), or an unknown delay chosen to maximize correlation with the target signal. The first two options use the .move() function in Formula (or .delay() in earlier versions of Seeq). The third uses the .correlationOffset() function.
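To illustrate the idea behind an offset of maximum correlation, here is a minimal Python sketch that runs outside Seeq. It is not how .correlationOffset() is implemented; it only tries each candidate lag and keeps the one with the strongest Pearson correlation. The helper names (pearson, best_lag) and the sample data are hypothetical, and the signals are assumed to be evenly sampled so that a lag can be counted in samples.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat signal carries no correlation information
    return cov / sqrt(var_x * var_y)

def best_lag(upstream, target, max_lag):
    """Return (lag, |r|) for the lag in 0..max_lag (in samples) at which
    the delayed upstream signal correlates most strongly with the target."""
    best, best_r = 0, -1.0
    for lag in range(max_lag + 1):
        xs = upstream[: len(upstream) - lag]  # upstream[i] ...
        ys = target[lag:]                     # ... paired with target[i + lag]
        r = abs(pearson(xs, ys))
        if r > best_r:
            best, best_r = lag, r
    return best, best_r

# Hypothetical data: the target repeats the upstream signal 3 samples later.
upstream = [0, 1, 4, 2, 7, 3, 8, 5, 9, 6, 2, 4, 1, 7, 3, 5]
target = [0, 0, 0] + upstream[:-3]
lag, r = best_lag(upstream, target, max_lag=6)
print(lag, round(r, 3))
```

Multiplying the returned lag by the sample period gives a delay that could then be applied as a fixed time shift.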

In this example, we have a known lag of 1.5 hours between the reactor and the product sampling point. We will use the move function with an input scalar of 1.5h, as shown below. 

image.png

The time shift calculation should be applied across all relevant input signals. 

image.png
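For readers who want to see the shift spelled out, the following Python sketch mimics the effect of moving a signal by a fixed delay. It is an illustration outside Seeq, not the Formula implementation; the move helper, the signal names, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta

def move(samples, delay):
    """Shift every (timestamp, value) sample later in time by `delay`."""
    return [(ts + delay, value) for ts, value in samples]

lag = timedelta(hours=1.5)  # known lag between reactor and sample point

# Hypothetical upstream signals as lists of (timestamp, value) samples.
inputs = {
    "reactor_temp": [
        (datetime(2024, 1, 1, 8, 0), 181.2),
        (datetime(2024, 1, 1, 8, 30), 182.0),
    ],
    "conversion": [
        (datetime(2024, 1, 1, 8, 0), 0.91),
        (datetime(2024, 1, 1, 8, 30), 0.92),
    ],
}

# Apply the same shift to every model input.
shifted = {name: move(samples, lag) for name, samples in inputs.items()}
print(shifted["reactor_temp"][0])
```

After the shift, each upstream value lines up with the lab sample it actually influenced, which is what lets a regression relate the two.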

More information on the different options for time shifting signals using fixed, variable, or calculated offsets is available in this forum post: 

4. Cleanse Signals -- Remove signal noise, outliers, abnormal operating data

In this example, we apply an agileFilter to each of our time-shifted model input signals. 

image.png

Apply the same technique to each of the model inputs. 

image.png

Note that steps 3 & 4 could have been combined into a single formula. An example of this would be: $reactor_temp.move(1.5h).agileFilter(1min)
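As a rough picture of what smoothing does to a noisy input, here is a small Python sketch. Note that it uses a plain centered moving average as a stand-in; it is not the agileFilter algorithm, and the function name and data are hypothetical.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical noisy signal with one spike at index 3.
noisy = [10.0, 10.4, 9.7, 30.0, 10.1, 9.9, 10.2]
smooth = moving_average(noisy, window=3)
print([round(v, 2) for v in smooth])
```

The spike is damped but also smeared into its neighbors, which is one reason outliers and abnormal operating data are usually removed before filtering rather than left for the filter to absorb.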

For guidance on additional cleansing techniques, consult the Interactive Training.

5. Build the Predictive Model

Use the Prediction tool panel to create a model of your target signal based on your cleansed, time-adjusted input signals. Ensure your model training window matches the date range that you identified in step 2. 

image.png

You can view the model parameters like coefficients, rSquared, and p-values using the "+ Prediction Model" option. 

image.png
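To make the coefficients and rSquared more concrete, here is a minimal ordinary least squares sketch in plain Python. It is an assumption that a linear model of this kind is a fair picture of the default case; the post does not describe the Prediction tool's internals. The data is synthetic (inputs expressed as deviations from a nominal operating point, viscosity generated from a known linear relationship), and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_linear(rows, y):
    """Least squares with an intercept; returns [intercept, coef_1, ...]."""
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, row):
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

def r_squared(coefs, rows, y):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coefs, row)) ** 2 for row, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic inputs: (temperature, conversion, modifier flow) deviations.
rows = [(0.0, 0.0, 0.0), (2.0, -2.0, -1.0), (5.0, 3.0, 2.0),
        (-1.0, -5.0, -2.0), (3.0, 1.0, 1.0), (1.0, 5.0, 0.0)]
# Synthetic target generated from a known relationship, so the fit is exact.
viscosity = [50.0 + 2.0 * t - 3.0 * c + 0.5 * f for t, c, f in rows]

coefs = fit_linear(rows, viscosity)
print([round(c, 3) for c in coefs], round(r_squared(coefs, rows, viscosity), 3))
```

With real, noisy data the fit would not be exact, and the coefficients and rSquared would be the first things to check.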

6. Evaluate the Model Fit

Use Scatter Plot view and the model parameters to evaluate the goodness of fit of the model. 

image.png

Switch to a time range outside of your training data set to confirm that the model also fits data it was not trained on.
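The same check can be expressed numerically: fit on the training window, then compute R² on a window the model never saw. The sketch below is a minimal single-input illustration in Python with made-up numbers; the function names are hypothetical, and it is not a description of how Seeq computes its statistics.

```python
def fit_line(x, y):
    """Simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical training window and a later validation window.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
valid_x = [7, 8, 9, 10]
valid_y = [14.2, 15.8, 18.1, 20.3]

intercept, slope = fit_line(train_x, train_y)
r2_train = r_squared(train_y, [intercept + slope * x for x in train_x])
r2_valid = r_squared(valid_y, [intercept + slope * x for x in valid_x])
print(round(r2_train, 3), round(r2_valid, 3))
```

A validation R² far below the training R² would suggest the model is overfit or that the process changed between the two windows.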

7. Improve Model (as needed)

If the scatter plot indicates a non-linear relationship, test out additional model scales in the prediction tool panel. Consider eliminating variables with p-values higher than your significance level cutoff (frequently 0.05). Add further variables if relevant.

If distinct modes of operation introduce significant signal variability, consider creating a model for each operating mode and stitching the models together into a single model using the splice() function in Formula.
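Conceptually, stitching per-mode models together means choosing, for each sample, the prediction from the model that matches the active operating mode. The Python sketch below shows that idea only; it is not the splice() function, and the mode names, coefficients, and helpers are hypothetical.

```python
def make_linear(intercept, slope):
    """Build a one-input linear model as a plain function."""
    def model(temp):
        return intercept + slope * temp
    return model

# One hypothetical model per operating mode.
models = {
    "grade_A": make_linear(10.0, 0.5),
    "grade_B": make_linear(40.0, 0.2),
}

def predict_by_mode(samples, models):
    """samples: list of (mode, temp); returns one stitched prediction series."""
    return [models[mode](temp) for mode, temp in samples]

samples = [("grade_A", 180.0), ("grade_A", 182.0), ("grade_B", 175.0)]
stitched = predict_by_mode(samples, models)
print(stitched)
```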

8. Deploy the Model

Because the input signals were shifted forward by the process lag, the model projects into the future by that same amount: here, 1.5 hours ahead of the latest upstream data.

image.png
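The look-ahead can be sketched in a few lines of Python: each prediction is stamped at the upstream sample time plus the lag, so the newest upstream sample yields a value 1.5 hours in the future. The model coefficients, timestamps, and the forecast helper are hypothetical, and this is an illustration outside Seeq.

```python
from datetime import datetime, timedelta

lag = timedelta(hours=1.5)  # process lag used when shifting the inputs

def model(temp):
    """Hypothetical fitted model: viscosity from reactor temperature."""
    return 10.0 + 0.5 * temp

def forecast(samples, model, lag):
    """Stamp each prediction at the upstream sample time plus the lag."""
    return [(ts + lag, model(value)) for ts, value in samples]

now = datetime(2024, 1, 1, 12, 0)
reactor_temp = [(now - timedelta(minutes=30), 180.0), (now, 182.0)]

predictions = forecast(reactor_temp, model, lag)
horizon = predictions[-1][0] - now
print(predictions[-1], horizon)
```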
