Showing results for tags 'time shift'.

Calculating the Correlation Offset Separately for Each Capsule in a Condition

John Cox posted a topic in Tips & Tricks

The correlationOffset() Formula function can be a useful tool for identifying the time shift which maximizes the correlation between two signals: see this post for additional background. In some situations, a user will have a condition defined for time periods of interest (startups, process runs, specific product grades, specific modes of operation, etc.). The user then wants to analyze how the correlation offset varies for each time period of interest (each capsule in the condition). The key to this calculation is applying the transform() function to the condition in Seeq Formula, in combination with the correlationOffset() function.

Let's say we have a temperature sensor in a reactor. At some point, well downstream of this temperature measurement, we have a relative humidity sensor that is sensing the same volume of air, but due to the locations of the two sensors, we know that the inverse correlation between the two signals is offset by a significant amount of time delay (at least 2 hours, as visually estimated with the dashed regions in the trend below).

As a reminder, for this use case, the objective is to calculate the correlation offset separately for the data contained within each capsule in a condition of interest (shown as Time Periods to Calculate Correlation in the trend below):

The formula approach for this is shown below, with comments to describe the details. The transform() function enables the correlationOffset() function to be applied separately to each capsule in the Time Periods to Calculate Correlation condition, and the correlation offset time is stored as a capsule property of the condition generated by the formula:

The resulting calculated offset (in units of seconds) is shown in the capsules pane at the lower right and also at the top of the screen as labels. Optionally, the "Offset" capsule property can be converted to a signal (see Max Correlation Offset in lane 2) for trending purposes, and here the units were converted to hours.

Looking at the final results, the time shift which maximizes the correlation between the 2 signals varies between 2.1 and 2.5 hours over the 3 time periods of interest shown in chain view, and this variation may offer valuable insights to the user. The time shift is a negative value which means that the relative humidity (downstream signal) would need to be shifted to the left by that time amount to maximize its correlation with the temperature signal.

This is the formula to create a signal for the max correlation offset, based on the "Offset" capsule property. In this example the time shift is more meaningful in units of hours, so we convert from seconds to hours:

Note that in this use case we wanted to calculate the correlation offset separately for each capsule in a condition. If the goal is to calculate the correlation offset over rolling window time periods, there are other functions in Formula expressly for this purpose, such as CrossCorrelations_timeShifts().
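The per-capsule idea can be illustrated outside Seeq with a minimal Python sketch. This is not Seeq's correlationOffset() algorithm: the function names (pearson, best_offset, offsets_per_capsule), the brute-force lag search, the index-based "capsules", and the synthetic data are all assumptions made for illustration. It follows the sign convention described above, where a negative offset means the downstream signal must be shifted left.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def best_offset(upstream, downstream, max_lag):
    """Shift (in samples) of `downstream` that maximizes |correlation|.

    A negative result means `downstream` must move left (earlier in time).
    """
    best_lag, best_score = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Moving downstream by `lag` pairs upstream[i] with downstream[i - lag].
        pairs = [(upstream[i], downstream[i - lag])
                 for i in range(len(upstream))
                 if 0 <= i - lag < len(downstream)]
        if len(pairs) < 3:
            continue
        score = abs(pearson([p[0] for p in pairs], [p[1] for p in pairs]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def offsets_per_capsule(upstream, downstream, capsules, max_lag):
    """Compute one offset per (start, end) index window, i.e. one per capsule."""
    return [best_offset(upstream[s:e], downstream[s:e], max_lag)
            for s, e in capsules]


# Synthetic data (hypothetical): humidity mirrors temperature, delayed by
# 5 samples in the first window and 7 samples in the second.
temperature = [float((i * 37) % 101) for i in range(200)]
humidity = [0.0] * 200
for j in range(5, 100):
    humidity[j] = -temperature[j - 5]
for j in range(100, 200):
    humidity[j] = -temperature[j - 7]

offsets = offsets_per_capsule(temperature, humidity, [(0, 100), (100, 200)], 10)
print(offsets)
```

Because each window is searched on its own, a delay that drifts between time periods shows up as a different offset per capsule, which is the variation the post sets out to expose.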

February 22
  Tagged with:
  - correlation
  - time delay
  - (and 3 more)

Aggregated Time Weighted Standard Deviation Across Multiple Signals

Emilio Conde posted a topic in General Seeq Discussions

There are times when you may need to calculate a standard deviation across a time range using the data within a number of signals. Consider the below example.

When a calculation like this is meaningful/important, the straightforward options in Seeq may not give a mathematically representative, comprehensive standard deviation. These straightforward options include:

- Take a daily standard deviation for each signal, then average these standard deviations
- Take a daily standard deviation for each signal, then take the standard deviation of the standard deviations
- Create a real-time standard deviation signal (using stddev($signal1, $signal2, ... , $signalN)), then take the daily average or standard deviation of this signal

While straightforward options may be OK for many statistics (max of maxes, average of averages, sum of totalized values, etc.), a time-weighted standard deviation across multiple signals presents an interesting challenge. This post will detail methods to achieve this type of calculation by time-warping the data from each signal, then combining the individually warped signals into a single signal. Similar methods are also discussed in the following two seeq.org posts:

Two different methods to arrive at the same outcome will be explored. Both of these methods share the same Steps 1 & 2.

Step 1: Gather Signals of Interest

This example will consider 4 signals. The same methods can be used for more signals, but note that implementing this solution programmatically via Data Lab may be more efficient when considering a high number of signals (>20-30).

Step 2: Create Important Scalar Constants and Condition

- Number of Signals: The number of signals to be considered (4 in this case).
- Un-Warped Interval: The interval over which you want to calculate a standard deviation (I am interested in a daily standard deviation, so I entered 1d).
- Warped Interval: The ratio Un-Warped Interval / Number of Signals. This value defines the new time range for each time-warped signal. For example, with 4 signals and one day of data per signal, each signal's day of data will be warped into a 6-hour interval.
- Un-Warped Periods: A condition with capsules spanning the original periods of interest.

    periods($unwarped_interval)

Method 1: Create ONE Time-Shift Signal, and move output Warped Signals

The Time Shift Signal will be used as a counter to condense the data in the period of interest (1 day for this example) down to the warped interval (6 hours for this example).

    0-timeSince($unwarped_period, 1s)*(1-1/$num_of_signals)

The next step is to use this Time Shift Signal to move the data within each signal. Note that the integer in the final .move() of this Formula steps with each signal it is applied to (the 0 shown here is for the first signal). Details can be viewed in the screenshots.

    $area_a.move($time_shift_signal, $unwarped_interval).setMaxInterpolation($warped_interval).move(0*$warped_interval)

The last step is to combine each of these warped signals together. We now have a Combined Output that can be used as an input into a Daily Standard Deviation that will represent the time-weighted standard deviation across all 4 signals within that day.

Method 2: Create a Time-Shift Signal for Each Signal - No Need to move output Warped Signals

This method uses 4 time-shift signals, one per signal. Note that there is also an integer in this Formula that steps with each signal it is applied to. Details can be viewed in the screenshot. These signals take care of the data placement, whereas the data placement was taken care of using .move(N*$warped_interval) above.

    0*$warped_interval-timeSince($unwarped_period, 1s)*(1-1/$num_of_signals)

We can then follow Method 1 to use the time shift signals to arrange our signals. We just need to be careful to use each signal's own time shift signal, as opposed to the single time shift signal that was created in Method 1. As mentioned above, there is no longer a .move(N*$warped_interval) needed at the end of this formula.

    $area_a.move($time_shift_1, $unwarped_interval).setMaxInterpolation($warped_interval)

The last step is to combine each of these warped signals together, similar to Method 1.

Comparing Method 1 and Method 2 & Calculation Outputs

The below screenshot shows how Methods 1 & 2 arrive at the same output. Note the difference in calculated values. The methods reviewed in this post most closely capture the true time-weighted standard deviation per day across the 4 signals.

Caveats and Final Thoughts

While this method is still the most mathematically correct, there is a slight loss of data at the edges. When combining the data in the final step, the beginning of $signal_2 falls at the end of $signal_1, and so on. There are some methods that could possibly address this, but this loss of samples should be negligible to the overall standard deviation calculation.

This method is also heavy on processing, especially as the input signals' data resolution and the overall number of signals being considered increase. It is best suited to cases where real-time results are not of high importance, for example when the calculation outputs feed an Organizer that displays the previous day's/week's/etc. results.
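To make the warping arithmetic and the reason for combining the signals concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the Seeq implementation: the helper names (warp_time, pooled_std) and the toy data are hypothetical, the signals are assumed to be evenly sampled so every sample carries equal time weight, and a population standard deviation is used. The warp mapping follows from the formulas above: shifting a timestamp by -timeSince*(1-1/N) and then by k*warped_interval places it at period_start + k*warped_interval + (t - period_start)/N.

```python
from statistics import mean, pstdev


def warp_time(t, period_start, signal_index, num_signals, unwarped_interval):
    """Map a timestamp into the compressed slot reserved for one signal."""
    warped_interval = unwarped_interval / num_signals
    return (period_start
            + signal_index * warped_interval
            + (t - period_start) / num_signals)


def pooled_std(signals):
    """Standard deviation over all samples of all signals taken together."""
    combined = [value for signal in signals for value in signal]
    return pstdev(combined)


# Two toy signals (hypothetical values) with the same spread but different levels.
signals = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

average_of_stds = mean(pstdev(s) for s in signals)  # ignores the level difference
combined_std = pooled_std(signals)                  # reflects it

# With 4 signals and a 24 h period, hour 12 of the second signal (index 1)
# lands in that signal's 6 h slot, halfway through it.
warped_hour = warp_time(12.0, 0.0, 1, 4, 24.0)

print(average_of_stds, combined_std, warped_hour)
```

The average of the per-signal standard deviations only sees the spread inside each signal, while the combined calculation also captures how far the signals sit from each other, which is why the post combines the warped signals before taking the daily standard deviation.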

July 31, 2023
  Tagged with:
  - aggregate
  - delay
  - standard deviation
  - stddev
  - warp
  - time warp
  - time delay
  - time shift
  - time running
  - time weighted

Aligning Data Values: Common Use Cases Involving Time Delays, Lab Data, and Process Events

John Cox posted a topic in General Seeq Discussions

Accurate data alignment can be the most challenging part of creating process calculations, finding correlations, and developing prediction models. It is an often overlooked data cleansing step that may be even more critical than smoothing noisy signals and removing data outliers. The need for data alignment stems from time delays present in the industrial process (related to physical transport and equipment/piping volumes, as well as lab measured data reported well after process operation completes). Another common data alignment need centers on comparing process metrics before/after process events of interest, which again involves time delays (or time shifts).

There are at least four categories of data alignment use cases prevalent across the process manufacturing industries:

- Known time delay: The time delay resulting from the transportation of material at a given speed or velocity (across some distance in the process operation) can mask strong correlations between upstream/downstream signals or other signals separated in time, resulting in poor modeling results if not accounted for by a data alignment step.
- Variable time delay: Here the time delay is variable and a function of production speed, storage volumes, etc. but can be calculated based on measured/known process parameters.
- Before/after comparisons: Related to process experimentation and optimization, there is a recurring need to calculate and compare process metrics before and after some identified process event, such as a process feed being turned on, a unit restart, equipment maintenance, a process parameter or setpoint being adjusted, etc.
- Process and analytical (LIMS) data: When trying to correlate process signals with lab-measured analytical results, work is often needed to align the process signals and subsequent analytical results. In some cases, the alignment can be based on sequential, consistently reported lab data values. In other cases, more sophisticated logic, such as connecting process operation and lab results by matching id properties, is needed.

Example methods for addressing each of these cases, using Seeq tools and Formula, are included below.

Use Case Category #1: Known Time Delay

In this specific example, the goal is to correlate a process signal (temperature) with a later reported lab result, but this use case also occurs frequently with continuous upstream (earlier in the process) and downstream signals (later in process), and the same time shifting methodology applies. It is known that the lab result is consistently reported 2 hours after the process operation which would most correlate. Therefore, the “Process Temperature” signal needs to be shifted 2 hours to the right in time and then sampled for each unique “Lab Measured Analytical Result” value.

Analysis Steps

1. We shift the raw "Process Temperature" 2 hours to the right using Seeq Formula's move() function to create "Temperature Shifted 2 Hours to Right":

    $Temperature.move(2h)

2. We use Seeq Formula's resample() function to pick off values of the shifted temperature at the exact timestamp that new "Lab Measured Analytical Result" values are reported. The resample() function gives the ability to sample one signal based on the timestamps of the data values from another signal. The result is 1 temperature sample (green, Lane 1) per lab sample (Lane 2).

    $TemperatureShifted.resample($LabDataValues)

3. The value of the data alignment is obvious by comparing two XY plots of the lab result and the original, raw temperature/aligned temperature. The correlation between the process temperature and the lab result, hidden in the XY plot on the left, is obvious in the XY plot on the right, which shows the aligned temperature:

Use Case Category #2: Variable Time Delay

This use case is similar to the previous: correlating two measurements separated by a time delay, but with an additional complication.
Here, the time delay is variable and is based on the physics of a material transport distance: changes in an upstream pressure physically take many hours to work their way through equipment/piping and influence a downstream analyzer signal. The physical time delay varies based on Production Line Speed.

Analysis Steps

1. The transport distance is known (500 feet) and set up as a value using Seeq Formula. The user calculates the Time Delay between the 2 signals based on the Transport Distance / Production Line Speed. The resulting time delay fluctuates between 7 and 13 hours.

Transport Distance Formula:

    500.toSignal().setUnits('ft')

Calculated Time Delay Formula:

    $TransportDistance/$LineSpeed

2. Using Seeq Formula's move() function, the Upstream Pressure can then be shifted (by the variable time delay calculated in Step 1) so that the effect of pressure changes upstream aligns in time with the resulting impact on the Downstream Analyzer.

    // Limit the maximum time delay to 20 hours
    $UpstreamPressure.move($CalculatedTimeDelay,20h)

Changes in the upstream pressure (yellow signal) are now correctly aligned in time with their effect on the downstream analyzer (blue signal):

Note that in this use case, we did not need to use the resample() function as we did with the discretely measured lab data in Use Case Category #1, as here we are working with continuously measured signals.

Use Case Category #3: Before/After Comparisons

An extremely common use case is to compare process metrics before/after some process event. The process event could be anything of interest: a process unit restart, equipment or control strategy modifications, periodic maintenance work, process experimentation, etc. The analysis begins by identifying the process events and then calculating the metrics over appropriate time ranges before and after each event. The alignment step typically involves creating a common time basis spanning before/after operation and moving the before/after calculated metrics to the same point in time for comparison.

Analysis Steps

1. For this example we begin with a Pressure signal that ramps up over time as equipment run life increases and the equipment fouls. The equipment is shut down for Maintenance periodically and the process then restarts at a much lower pressure value. We use the Pressure signal and Seeq Formula to identify Equipment Maintenance periods based on the running delta of the pressure signal < 0:

    $Pressure.runningDelta().isLessThan(0)

2. Use Formula's grow() function to expand Equipment Maintenance to create a Before and After Maintenance time period which includes the desired time period (for example, 4 hrs) for calculating before/after pressures. This also gives a common time basis for a later alignment step.

    $EquipmentMaintenance.grow(4hr)

Results are shown here in Chain View for the Before and After Maintenance capsules, which gives a nice visual of the pressure 4 hours before and after maintenance.

3. Average the Pressure signal over the first 2 hours of the Before and After Maintenance capsules.

    // Do a 2 hour avg pressure over the FIRST 2 hours
    // of the Before and After Maintenance capsules
    $Pressure.aggregate(average(),$BeforeAfterMaintenance.afterStart(2hr),middleKey())

4. Use Signal from Condition to align the average pressure before maintenance (Step 3) at the middle of the Before and After Maintenance capsules. Signal from Condition is commonly used to find a value within a condition and move it to a specific location in time. In this case, we can use the Maximum statistic to find the correct "Pressure Avg (Before)" value, as there is only 1 value for each maintenance capsule. Using Signal from Condition in this way is a key technique for aligning before/after process values/metrics.
We can see that the "Pressure Avg (Before)" has now been moved to the middle of the "Before and After Maintenance" capsules:

5. Repeat steps 3 and 4 to compute the average pressure after maintenance and align it to the middle of the Before and After Maintenance capsules.

6. With the before/after pressure values aligned in time, we now use Formula to calculate the percent pressure reduction resulting from maintenance.

    // Calculate the % change relative to
    // the Pressure Avg (Before)
    (($BeforeAvg-$AfterAvg)/$BeforeAvg*100).setUnits('%')

Use Case Category #4: Process and Analytical (LIMS) Data

In this example, we will align calculated process metric values from each batch (e.g., max temperature, total Chemical A flow added) with later reported lab results (often referred to as LIMS - Laboratory Information Management System data). This is a very common analytics need. Here, the “Product Impurity” lab results are reported at varying time intervals following batch completion, so a constant time delay alignment approach isn’t feasible. Zooming in, we can confirm that the lab results are reported (step to a new value) some time after the Process Batches complete. The reporting time is variable.

Analysis Steps

1. The maximum temperature and totalized chemical added over each Process Batch are thought to correlate with (influence) the amount of final Product Impurity. So, we need to calculate the max T and total chemical A added per Process Batches capsule. Use Signal from Condition to do the calculations and place results at the end of each Process Batch.

2. We now need to work on alignment. To provide a basis for joining the process results to the corresponding lab results, we need to create "Product Impurity Results Capsules" for every value change in the reported Product Impurity. We use the Formula toCondition() function for this and store the numeric lab result in a capsule property named PctImpurity.

    $ProductImpurityData.toCondition('PctImpurity')

Note: in the screenshot below, the highlighted Product Impurity Results capsule contains the lab result for the Process Batches capsule at the top left of the screen.

3. Now, use Composite Condition to "join" the Process Batches condition to the Product Impurity Results Capsules condition, so we have a time period that contains the process and lab results we want to align and correlate. We check the Inclusive options to create capsules from the start of Process Batches to the end of Product Impurity Results Capsules:

The resulting yellow "joined" capsules in the screenshot below now span the time period of process operation and the eventual lab measured impurity result, for each individual batch. Investigating a zoomed time period, we can see the Process Batches start joined to the end of the resulting Product Impurity Results Capsules.

4. With a common capsule established, we align Max T, Total Chemical A, and Product Impurity at the middle of the "Process Batches to Impurity Result (joined)" capsules. For Max T and Total Chemical A, we use the 'Value at Start', and for Product Impurity, we use the 'Value at End'. Using Signal from Condition in this way is a key technique used in aligning process and lab data values.

The results for a short time period look like this, and the "aligned" values can be used for further calculations or for creating a prediction model to predict Product Impurity based on Max T and Total Chemical A:

Joining Process and Lab Values Based on a Matching Capsule Property

With that example finished, let’s look at another common (and more complicated) scenario in this use case category: when analytical results can be reported inconsistently or out of sequence with process batches, a more advanced condition join using Seeq Formula, based on a matching id or other capsule property value, may be needed.
In this example, a numeric batch id is a capsule property shared by the Process Batches and Lab Analytical Batches conditions: see the 437 and 438 capsule property values shown as labels on the blue and green capsules in the screenshot below. Using this batch id linkage, matching id Process/Lab capsules can be joined with a single line formula, and capsule properties from the separate process and lab conditions (e.g., see the 0.99 and 1.32 lab product impurity results shown on the blue capsules) can be preserved, all courtesy of enhanced “capsule matching by property” functionality introduced in Seeq R56 (link).

Starting with the raw data and Process Batches and Lab Analytical Batches conditions and their capsule properties, we have already calculated the "Average Process Ratio for Batch" using Signal from Condition and located it at the start of the Process Batches capsules. We illustrate only the critical steps in this use case under Analysis Steps below:

Analysis Steps - Joining Process and Lab Data on a Matching Capsule Property

1. We join the Process Batches (capsules) to Lab Analytical Batches, based on their BatchID/LabID property matching. We use the join() Formula function. The numeric batch id is stored as the "LabID" capsule property in the Lab Batches condition, so, prior to doing the join, we must rename it to have the same property name as the "BatchID" capsule property on the Process Batches condition. Note: the capsule property matching and keepProperties() are enhanced functionality options introduced in Seeq R56.

    // Join process to lab batches based on matching ID.
    // We need to rename the LabID capsule property to BatchID for the
    // lab capsules.
    // Use keepProperties() so that the resulting condition has
    // the Result capsule property (the lab measured product impurity value).
    $ProcessBatches.join($LabBatches.renameProperty('LabID','BatchID'), 4d, true, 'BatchID', keepProperties())

Inspecting the capsules and the batch id and product impurity capsule property values at the top of the trend, we see the "Process Batches Joined" capsules are linked based on matching ID, and the product impurity "Result" capsule property (the 0.99 and 1.32 values) is retained and now part of a capsule that starts at the beginning of each Process Batches capsule:

2. We now translate the "Process Batches Joined to Lab Batches on ID Match" Result capsule property into a "Lab Measured Product Impurity Aligned to Batch Start" signal, with values moved to the start of the Process Batches, at the exact location we have the Avg Process Ratio for Batch.

    $JoinedBatches.toSignal('Result',startKey()).toDiscrete()

For this short time range, we can confirm the 0.99 value for the resulting impurity signal aligns correctly to BatchID 437, and the 1.32 value aligns correctly to BatchID 438. As a result, we have now connected the lab measured product impurity result to each individual process batch, regardless of lab result timing, and with additional steps have aligned the Average Process Ratio and Lab Measured Product Impurity.
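The logic of joining on a matching id, rather than on time order, can be sketched in a few lines of Python. This is only an illustration of the idea, not how Seeq's join() is implemented: the function name join_on_batch_id, the dictionary layout, and the timestamps (in hours) are hypothetical, while the batch ids and impurity results reuse the 437/438 and 0.99/1.32 values from the example above.

```python
def join_on_batch_id(process_batches, lab_batches, max_duration):
    """Join each process batch to the lab capsule that carries the same id.

    The joined capsule runs from the process batch start to the lab capsule
    end and keeps the lab 'Result' property. Joins longer than `max_duration`
    are dropped, loosely mirroring the 4d limit in the Formula above.
    """
    # Equivalent in spirit to renaming LabID to BatchID before matching.
    lab_by_id = {lab["LabID"]: lab for lab in lab_batches}
    joined = []
    for batch in process_batches:
        lab = lab_by_id.get(batch["BatchID"])
        if lab is None:
            continue  # no lab result reported for this batch
        duration = lab["end"] - batch["start"]
        if 0 < duration <= max_duration:
            joined.append({
                "BatchID": batch["BatchID"],
                "start": batch["start"],
                "end": lab["end"],
                "Result": lab["Result"],
            })
    return joined


process_batches = [
    {"BatchID": 437, "start": 0.0, "end": 8.0},
    {"BatchID": 438, "start": 10.0, "end": 18.0},
]
# Lab results reported out of sequence relative to the process batches.
lab_batches = [
    {"LabID": 438, "start": 20.0, "end": 21.0, "Result": 1.32},
    {"LabID": 437, "start": 24.0, "end": 25.0, "Result": 0.99},
]

joined = join_on_batch_id(process_batches, lab_batches, max_duration=96.0)
# Lab result aligned to each batch start, similar to toSignal('Result', startKey()).
aligned = {capsule["start"]: capsule["Result"] for capsule in joined}
print(aligned)
```

Even though batch 438's lab result arrives before batch 437's, each result ends up at the start of its own process batch, because the match is made on the id and not on the order of the timestamps.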

September 12, 2022
  Tagged with:
  - align signals
  - time delay
  - (and 5 more)

Create a Soft Sensor based on time-shifted signals

Allison Buenemann posted a topic in General Seeq Discussions

Use Case: It is common in industry to seek to use the behavior of upstream process variables to predict what the behavior of a downstream variable might be minutes, hours or days from the present time.

Solution: A traditional predictive modeling workflow can be applied to solve this problem.

- Identify an appropriate training data set
- Perform any necessary data cleansing
- Create a predictive model
- Evaluate the model fit
- Improve the model
- Operationalize the model

What differentiates this use case from any other predictive modeling use case is a specific data cleansing step for adjusting signals to remove process lag.

1. Load Data

Load your target signal and the relevant upstream signals into the display pane. In this example, the target signal is the product viscosity, measured in an analytical lab based on a sample from a downstream sample point. Three upstream signals (the reactor temperature, the reactant conversion, and the viscosity modifier flow to the reactor) significantly influence the product viscosity measurement and will be used as inputs to the model.

2. Identify Training Data Set

Identify an appropriate training data set for your regression model. This may involve a longer time window to include variability in product type or seasonality. In this example, we will pan out to 3 months to capture multiple cycles of different product types. With an appropriate training window identified, you can also limit your training data set to a subset of samples present during a particular condition. If this interests you, consult the "advanced options" section of the Prediction Tool KB article for more information. This method is particularly useful if you want to create different models for different modes of operation.

3. Cleanse Signals -- Adjust for Process Lag

We can time-shift our upstream signals using either a known constant delay, a known variable delay (like a calculated residence time signal), or an unknown delay of maximum correlation to the target signal. The first two of these options will utilize the .move() function in Formula (or .delay() in earlier versions of Seeq). The third will utilize the .correlationOffset() function. In this example, we have a known lag of 1.5 hours between the reactor and the product sampling point. We will use the move function with an input scalar of 1.5h, as shown below. The time shift calculation should be applied across all relevant input signals. More information on the different options for time shifting signals using fixed, variable, or calculated offsets is available in this forum post:

4. Cleanse Signals -- Remove signal noise, outliers, abnormal operating data

In this example, we apply an agileFilter to each of our time-shifted model input signals. Apply the same technique to each of the model inputs. Note that steps 3 & 4 could have been combined into a single formula. An example of this would be:

    $reactor_temp.move(1.5h).agileFilter(1min)

For guidance on additional cleansing techniques, consult the Interactive Training.

5. Build the Predictive Model

Use the Prediction tool panel to create a model of your target signal based on your cleansed, time-adjusted input signals. Ensure your model training window matches the date range that you identified in step 2. You can view the model parameters like coefficients, rSquared, and p-values using the "+ Prediction Model" option.

6. Evaluate the Model Fit

Use Scatter Plot view and the model parameters to evaluate the goodness of fit of the model. Switch to a time range outside of your training data set to ensure your model is a good fit for data throughout time.

7. Improve Model (as needed)

If the scatter plot indicates a non-linear relationship, test out additional model scales in the prediction tool panel. Consider eliminating variables with p-values higher than your significance level cutoff (frequently 0.05). Add additional variables if relevant. If distinct modes of operation introduce significant signal variability, consider creating a model for each operating mode and stitch the models together into a single model using the splice() function in Formula.

8. Deploy the Model

The model should project out into the future by the amount of the process lag between the upstream and target signals.
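The effect of the lag adjustment in step 3 on the model fit can be sketched in plain Python. This is a simplified stand-in, not the Seeq Prediction tool: it assumes regularly sampled data, a single input, a known lag expressed in samples, and an ordinary least squares line. The helper names (shift_right, fit_line) and the synthetic data are hypothetical.

```python
def shift_right(values, lag):
    """Delay a regularly sampled signal by `lag` samples (leading gap is None)."""
    return [None] * lag + values[:len(values) - lag]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


lag = 3  # known process lag, in samples (assumed for this sketch)

# Synthetic data: viscosity responds to reactor temperature `lag` samples later.
reactor_temp = [float((i * 7) % 13) for i in range(40)]
viscosity = [2.0 * reactor_temp[max(i - lag, 0)] + 1.0 for i in range(40)]

# Step 3 equivalent: move the upstream input to line up with the target.
temp_shifted = shift_right(reactor_temp, lag)
pairs = [(x, y) for x, y in zip(temp_shifted, viscosity) if x is not None]
slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])

# Step 8 equivalent: the latest upstream value predicts the target `lag`
# samples into the future.
future_viscosity = slope * reactor_temp[-1] + intercept
print(slope, intercept, future_viscosity)
```

Fitting on the shifted input recovers the underlying relationship, and because the input was moved forward by the lag, the newest upstream sample yields a prediction that far ahead of the present, which is what step 8 describes.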

February 3, 2021
  Tagged with:
  - prediction
  - soft sensor
  - (and 6 more)
