Showing results for tags 'aggregate'.

Found 8 results

  1. There are times when you may need to calculate a standard deviation across a time range using the data within a number of signals. Consider the example below. When a calculation like this is meaningful, the straightforward options in Seeq may not be mathematically representative of a comprehensive standard deviation. These straightforward options include:

    - Take a daily standard deviation for each signal, then average these standard deviations
    - Take a daily standard deviation for each signal, then take the standard deviation of the standard deviations
    - Create a real-time standard deviation signal (using stddev($signal1, $signal2, ... , $signalN)), then take the daily average or standard deviation of this signal

    While straightforward options may be fine for many statistics (max of maxes, average of averages, sum of totals, etc.), a time-weighted standard deviation across multiple signals presents an interesting challenge. This post details methods to achieve this type of calculation by time-warping the data from each signal, then combining the individually warped signals into a single signal. Similar methods are also discussed in the following two seeq.org posts:

    Two different methods to arrive at the same outcome will be explored. Both methods share the same Steps 1 & 2.

    Step 1: Gather Signals of Interest
    This example considers 4 signals. The same methods can be used for more signals, but note that implementing this solution programmatically via Data Lab may be more efficient for a high number of signals (>20-30).

    Step 2: Create Important Scalar Constants and Condition
    - Number of Signals: The number of signals to be considered; 4 in this case.
    - Un-Warped Interval: The interval over which you want to calculate a standard deviation (here a daily standard deviation, so 1d).
    - Warped Interval: A ratio calculation of Un-Warped Interval / Number of Signals. This is the new time range each warped signal will occupy. I.e., given 4 signals each holding a day's worth of data, each signal's day of data will be warped into a 6-hour interval.
    - Un-Warped Periods: A condition with capsules spanning the original periods of interest:

    periods($unwarped_interval)

    Method 1: Create ONE Time-Shift Signal, and Move the Output Warped Signals
    The Time Shift Signal acts as a counter that condenses the data in the period of interest (1 day in this example) down to the warped interval (6 hours in this example):

    0-timeSince($unwarped_period, 1s)*(1-1/$num_of_signals)

    The next step is to use this Time Shift Signal to move the data within each signal. Note the integer in this Formula that steps with each signal it is applied to (0 for the first signal, 1 for the second, and so on). Details can be viewed in the screenshots.

    $area_a.move($time_shift_signal, $unwarped_interval).setMaxInterpolation($warped_interval).move(0*$warped_interval)

    The last step is to combine the warped signals. We now have a Combined Output that can be used as the input to a Daily Standard Deviation, which represents the time-weighted standard deviation across all 4 signals within that day.

    Method 2: Create a Time-Shift Signal per Signal - No Need to Move the Output Warped Signals
    This method uses 4 time-shift signals, one per signal. Note there is also an integer in this Formula that steps with each signal it is applied to. Details can be viewed in the screenshot. These signals take care of the data placement, whereas the data placement was handled above by .move(N*$warped_interval):

    0*$warped_interval-timeSince($unwarped_period, 1s)*(1-1/$num_of_signals)

    We can then follow Method 1 to use the time-shift signals to arrange our signals, being careful to use each signal's own time-shift signal rather than the single time-shift signal created in Method 1. As mentioned above, there is no longer a .move(N*$warped_interval) needed at the end of this formula. The last step is to combine the warped signals, just as in Method 1:

    $area_a.move($time_shift_1, $unwarped_interval).setMaxInterpolation($warped_interval)

    Comparing Method 1 and Method 2 & Calculation Outputs
    The screenshot below shows how Methods 1 & 2 arrive at the same output. Note the difference in calculated values: the methods reviewed in this post most closely capture the true time-weighted standard deviation per day across the 4 signals.

    Caveats and Final Thoughts
    While this method is the most mathematically correct, there is a slight loss of data at the edges: when combining the data in the final step, the beginning of $signal_2 falls at the end of $signal_1, and so on. There are methods that could address this, but the loss of samples should be negligible to the overall standard deviation calculation. This method is also heavy on processing, especially as the input signals' data resolution and the number of signals increase. It is best suited to cases where real-time results are not of high importance, for example when the calculation outputs feed an Organizer that displays the previous day's/week's/etc. results.
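The post notes that Data Lab may be better suited when many signals are involved. As a rough numerical illustration of why the "average of standard deviations" shortcut differs from the pooled result the time-warping method approximates, here is a minimal Python sketch using hypothetical, evenly sampled data (equal time weighting), not the Seeq Formula implementation:

```python
import statistics

# Hypothetical, evenly sampled daily data for 4 signals (equal time weighting).
signals = [
    [10, 12, 11, 13],
    [20, 22, 21, 23],
    [15, 15, 16, 14],
    [30, 28, 29, 31],
]

# Naive option: stddev per signal, then average those stddevs.
avg_of_stddevs = statistics.mean(statistics.pstdev(s) for s in signals)

# Pooled approach (what the time-warping method approximates):
# treat all samples across signals as one population for the day.
combined = [v for s in signals for v in s]
pooled_stddev = statistics.pstdev(combined)

# The naive average of per-signal stddevs ignores the spread *between*
# signals, so it understates the combined variability.
print(round(avg_of_stddevs, 3), round(pooled_stddev, 3))
```

With signals centered at very different levels, the pooled value is much larger than the average of the per-signal values, which is exactly the gap the warp-and-combine method closes.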
  2. Seasonal variation can influence specific process parameters whose values depend on ambient conditions, or raw material makeup may change over the year's seasons based on scheduled orders from different vendors. For these reasons and more, it may not suffice to compare your previous month's process parameters against the current month's. In these situations, it may be best to compare current product runs against runs of the same product during the same month a year ago, in order to assess consistency or deviations. In Seeq, this can be achieved using Condition Properties.

    1. Bring in raw data. This example uses a single parameter (Viscosity) and a grade code signal.

    2. Convert the Product step signal into a condition. Add properties for Product ID, the variable statistic(s), and month start/end times.

    // Create a month condition. Be sure to specify your time zone so that start/end times are at 12:00 AM
    $m = months('US/Eastern')
    // Create signals for start and end times to add into the "Product ID" condition
    $start_signal = $m.toSignal('Start', startKey()).toStep()
    $end_signal = $m.toSignal('End', startKey()).toStep()
    $p.toCondition('Product ID')  // Convert string signal into a condition, with a capsule at each unique
                                  // string. Specifying 'Product ID' ensures the respective values in the
                                  // signal populate a property named 'Product ID'
     .removeLongerThan(100d)      // Bound condition; 100d is an arbitrary limit
     .setProperty('Avg Visc', $vs, average())                 // Avg viscosity over each Product ID run
     .setProperty('Month Start', $start_signal, startValue()) // Which month the Product ID ran (start)
     .setProperty('Month End', $end_signal, startValue())     // Which month the Product ID ran (end)

    3. Create another condition that has a capsule spanning the entire month for each product run within the month. Add similar properties, but note the naming differences of 'Previous Month Start' and 'Previous Month Avg Visc'; in the next step we will move this condition forward by one year.

    $pi.grow(60d)  // Grow capsules in the condition to ensure they consume the entire month
     .transform($capsule ->  // For each capsule (product run) in 'Product ID'...
       capsule($capsule.property('Month Start'), $capsule.property('Month End'))  // Create a capsule spanning the entire month
       .setProperty('Product ID', $capsule.property('Product ID'))                // Add the Product ID property
       .setProperty('Previous Month Start', $capsule.property('Month Start'))     // Month Start, renamed 'Previous Month Start'
       .setProperty('Previous Month Avg Visc', $capsule.property('Avg Visc'))     // Avg Visc, renamed 'Previous Month Avg Visc'
     )

    Notice we now have many overlapping capsules in our new condition spanning an entire month -- one for each product run that occurred within the month.

    4. Move the previous 'Month's Product Runs' condition forward a year and merge it with the existing 'Product ID' condition, aggregating the 'Previous Month Avg Visc' property. This ensures that if a product was run multiple times with different average viscosity values in each run, what is displayed is the average of all the avg visc values for each product.

    $previousYearMonthProductRun = $mspi.move(1y)  // Move condition forward a year
    $pi.mergeProperties($previousYearMonthProductRun, 'Product ID',
       // Merge the properties of both conditions only if their capsules
       // share a common value of 'Product ID'
       keepProperties(),  // keepProperties() preserves all existing properties
       aggregateProperty('Previous Month Avg Visc', average()))
       // aggregateProperty() takes the average of all 'Previous Month Avg Visc'
       // properties if multiple exist, i.e. if there were multiple product runs,
       // each with a different value for 'Previous Month Avg Visc'

    The resulting condition matches our original condition, except with two new properties: 'Previous Month Start' and 'Previous Month Avg Visc'. We can then add these properties to a condition table for a cleaner view. We could also create other statistics of interest, such as the % difference of the current Avg Visc vs. the Previous Month Avg Visc. To do this, we could use a method similar to gathering $start_signal and $end_signal in Step 2: create the calculation using the signals, then add it back to the condition as a property.
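The merge-and-aggregate logic above can be sketched in plain Python. This is an illustration of the idea only, with hypothetical run data and helper names (`previous_year_avg`), not the Seeq property API:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical product runs: (product_id, (year, month), avg_visc).
runs = [
    ("A", (2022, 6), 101.0),
    ("A", (2022, 6), 103.0),  # same product ran twice that month
    ("B", (2022, 6), 250.0),
    ("A", (2023, 6), 110.0),  # current run to compare
]

# Average viscosity per (product, month) -- analogous to
# aggregateProperty('Previous Month Avg Visc', average()).
by_product_month = defaultdict(list)
for pid, ym, visc in runs:
    by_product_month[(pid, ym)].append(visc)

def previous_year_avg(pid, year, month):
    """Avg visc for the same product during the same month one year earlier."""
    vals = by_product_month.get((pid, (year - 1, month)))
    return mean(vals) if vals else None

# For the June 2023 run of product A, look up June 2022:
# the two runs (101.0 and 103.0) are averaged, as mergeProperties does.
print(previous_year_avg("A", 2023, 6))
```

The key design point mirrored here is that the lookup is keyed on the shared 'Product ID' value, and multiple matches collapse to their average rather than producing duplicates.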
  3. Hello, I'm trying to build a trend of yearly rolling averages based on an existing trend. I know that if I use Signal from Condition and input the signal and a yearly bounding condition, it doesn't give me the rolling average for the past year. I need to look at it in a way where, if I select today's date on the trend, it gives me the average for the past year. So basically a trend of yearly rolling averages. Please help, thanks
  4. Chromatography Transition Analysis in Seeq

    Many biopharmaceutical companies use Transition Analysis to monitor column integrity. Transition Analysis works by applying a step change to the input conductivity signal and tracking the conductivity at the outlet of the column. The output conductivity transition can be analyzed via moment analysis to calculate the height of a theoretical plate (HETP) and the Asymmetry factor, as described below.

    Step 1: Load Data and Find Transition Periods

    To perform this analysis in Seeq, we start by loading outlet conductivity and flow rate data for the column. Note: depending on the density of the conductivity data, some filtering may be needed to get consistent results from the differential calculations below. The agileFilter operator in Formula can help if needed:

    $cond.agileFilter(0.5min)

    Once the data is loaded, the next step is to find the transition periods. These can be found using changes in the signal, such as a delta or derivative, with Value Searches applied. Combine the periods where the derivative is positive and the flow is low to identify the transitions. Alternative methods using the Profile Search tool can also be applied.

    Step 2: Calculate HETP

    Since the moment calculations are a function of volume instead of time, the next step is to calculate the volume using the following formula:

    Volume = $flow.integral($Trans)

    The dC/dV function used in the moment formulas can then be calculated:

    dCdV = runningDelta($Cond) / runningDelta($vol)

    Using that function, the moments (M0 through M2) can be calculated:

    M0 = ($dCdV*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())
    M1 = (($vol*$dCdV)*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())
    M2 = (($dCdV*($vol^2))*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())

    The moments are then used to calculate the variance:

    Variance = ($M2/$M0) - ($M1/$M0)^2

    Finally, the HETP can be calculated:

    HETP = (columnLength*$variance)/($M1/$M0)^2

    Here the column length must be entered in the units desired for HETP (e.g. 52mm). The final result should look like the following screenshot. Alternatively, all of the calculations can be performed in a single Formula as shown below:

    $vol = $flow.integral($Trans)
    $dCdV = runningDelta($cond) / runningDelta($vol)
    $M0 = ($dCdV*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())
    $VdCdV = $vol*$dCdV
    $M1 = ($VdCdV*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())
    $V2dCdV = $dCdV*$vol^2
    $M2 = ($V2dCdV*runningDelta($vol)).aggregate(sum(), $Trans, middleKey())
    $variance = ($M2/$M0) - ($M1/$M0)^2
    (52mm*$variance)/(($M1/$M0)^2) // where 52mm is the column length, L

    Step 3: Calculate Asymmetry

    Asymmetry is calculated by splitting the dC/dV peak at its maximum value into a left and right side and comparing the volume change over each side. This section assumes you have already calculated volume and dC/dV as in Step 2.

    The first step is to determine a minimum threshold for dC/dV to begin and end the peaks. This is often a percentage of the difference between the maximum and minimum of the transition period (e.g. 10%):

    $min = $dCdV.aggregate(minValue(), $Trans, durationKey())
    $max = $dCdV.aggregate(maxValue(), $Trans, durationKey())
    $min + 0.1*($max - $min)

    The Deviation Search tool can then identify the times when dC/dV is greater than the 10% value obtained above. Next, the maximum point of the dC/dV peaks can be located by calculating the derivative of dC/dV in the Formula tool:

    $dCdV.derivative()

    The derivative can then be searched for positive values (greater than 0) with the Value Search tool to identify the increasing part of the dC/dV curve. A Composite Condition intersecting the positive dC/dV derivative with the transition values above 10% identifies the left side of the dC/dV curve. The right side can then be determined using Composite Condition with the A minus B operator, subtracting the positive dC/dV derivative from the transition above 10%.

    The change in volume can then be calculated by aggregating the delta in volume over each side of the peak using Formula:

    $Vol.aggregate(delta(), $LeftSide, middleKey()).aggregate(maxValue(), $Trans, middleKey())

    Finally, the Asymmetry ratio is the volume change of the right side of the peak divided by the volume change of the left side:

    $VolRightSide/$VolLeftSide

    The final view should look similar to the following. As with HETP, all of the above Asymmetry formulas can be combined in a single Formula with the code below:

    $vol = $flow.integral($Trans)
    $dCdV = (runningDelta($cond) / runningDelta($vol)).agileFilter(4sec)
    // Calculate the 10% threshold of dCdV
    $min = $dCdV.aggregate(minValue(), $Trans, durationKey())
    $max = $dCdV.aggregate(maxValue(), $Trans, durationKey())
    $dCdV10prc = $min + 0.1*($max - $min)
    // Deviation search for when dCdV is above the 10% threshold
    $deviation1 = $dCdV - $dCdV10prc
    $Above10 = valueSearch($deviation1, 1h, isGreaterThan(0), 0min, true, isLessThan(0), 0min, true)
    // Calculate the filtered derivative of dCdV
    $dCdVderiv = $dCdV.derivative()
    // Value Search for increasing dCdV (positive filtered derivative of dCdV)
    $dCdVup = $dCdVderiv.validValues().valueSearch(40h, isGreaterThan(0), 30s, isLessThanOrEqualTo(0), 0min)
    // Composite Conditions to find the increasing left side above 10% and the right side
    $LeftSide = $Above10.intersect($dCdVup)
    $RightSide = $Above10.minus($dCdVup)
    // Find the change in volume over each side, then divide right by left
    $VolLeftSide = $Vol.aggregate(delta(), $LeftSide, middleKey()).aggregate(maxValue(), $Trans, middleKey())
    $VolRightSide = $Vol.aggregate(delta(), $RightSide, middleKey()).aggregate(maxValue(), $Trans, middleKey())
    $VolRightSide/$VolLeftSide

    Optional Alteration: Multiple Columns

    Oftentimes the conductivity signals are associated with multiple columns in a chromatography system, which may switch between two or three columns all reading on the same signal. To track changes in column integrity for each column individually, the transitions must be assigned to each column prior to performing the Transition Analysis calculations. Multiple methods exist for assigning transitions to each column. Most customers have another signal or signals that identify which column is in use, such as valve positions or differential pressure across each column. These signals enable a Value Search (e.g. "Open" or "High Differential Pressure") followed by a Composite Condition to automatically assign the columns in use to their transitions. Alternatively, if no such signals are present, the columns can be assigned manually through the Custom Condition tool, or via a counting system if the order of the columns is constant. An example of Asymmetry calculated for multiple columns is shown below.

    Content Verified DEC2023
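The moment analysis from Step 2 can be sketched numerically in plain Python. This is an illustration of the math only, using hypothetical volume and conductivity samples rather than real column data or the Seeq implementation: the moments are discrete sums of dC/dV weighted by volume increments, from which the variance and HETP follow.

```python
# Numerical sketch of the HETP moment analysis (hypothetical data).
# vol in mL, cond in mS/cm over one transition; an S-shaped step response.
vol = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cond = [0.0, 0.5, 2.0, 5.0, 8.0, 9.5, 10.0]

# dC/dV via running deltas, as in runningDelta($cond)/runningDelta($vol).
dv = [vol[i] - vol[i - 1] for i in range(1, len(vol))]
dcdv = [(cond[i] - cond[i - 1]) / dv[i - 1] for i in range(1, len(cond))]
v_mid = vol[1:]  # volume associated with each delta

# Moments M0..M2 as sums of dC/dV weighted by the volume increment.
m0 = sum(d * w for d, w in zip(dcdv, dv))
m1 = sum(v * d * w for v, d, w in zip(v_mid, dcdv, dv))
m2 = sum(v**2 * d * w for v, d, w in zip(v_mid, dcdv, dv))

mean_vol = m1 / m0                # first moment: mean transition volume
variance = m2 / m0 - mean_vol**2  # second central moment

column_length_mm = 52.0           # example column length from the post
hetp = column_length_mm * variance / mean_vol**2
print(round(hetp, 3))
```

A sharper transition gives a smaller variance and therefore a smaller HETP, which is why a rising HETP trend flags degrading column integrity.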
  5. Summary

    There are many use cases where a user wants to aggregate over a periodic time frame but only include certain values. Examples abound: for cement, calculate the average daily clinker production only when the kiln is running; for biotech/pharma, the standard deviation of dissolved oxygen only when the batch is running; etc. Here our user is looking into equipment reliability for compressors. She wants to calculate the average daily compressor power to examine its performance over time.

    Steps

    1. Add the signal into Seeq.

    2. Use the Periodic Condition or Formula tool to add a condition for days. This doesn't have to be days; it can be any arbitrary period, but it is usually periodic in nature. To create a custom periodic condition, consider the periods() function. For example, for days the formula is:

    periods(1day)

    To change this to 5-minute time periods, the formula is:

    periods(5min)

    Here is how the days and statistics look in the Capsule Pane for her:

    3. Find only the desired values to be used in the aggregate (e.g. average) statistic. From the values in the Capsule Pane she sees that the average compressor power results are too low. This is because all the times when the compressor was OFF, near 0, are included in the calculation. She wants to include only the times when the compressor is running, because that provides a true picture of how much energy is going into the equipment, and can indicate potential problems. To do this, she creates a Value Search for the times to include in the calculation, in this case when the compressor is ON, i.e. > 0.5 kW.

    4. Create a new signal that only has values *within* the 'Compressor ON' condition, using the aptly named within() function in the Formula tool. Notice how the resulting signal in the bottom lane only has values within the 'Compressor ON' condition. All those 0's that were dragging the time-weighted average down are now gone.

    5. Examine the resulting statistical values. She now sees that the calculations are correct and can use this resulting 'Compressor Power only when ON' signal in other calculations using Signal from Condition, Scorecard Metric, and other Seeq tools!
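The effect of excluding OFF time can be sketched in a few lines of Python. This uses hypothetical hourly power readings, not real compressor data, and mirrors the within() step by filtering out values below the ON threshold before averaging:

```python
from statistics import mean

# Hypothetical hourly compressor power readings (kW) for one day.
# Zeros are periods when the compressor was OFF.
power = [0.0, 0.0, 45.0, 47.0, 46.0, 0.0, 0.0, 44.0, 48.0, 0.0, 46.0, 45.0]

ON_THRESHOLD = 0.5  # kW, as in the Value Search above

daily_avg_all = mean(power)  # includes OFF time, so it is biased low
daily_avg_on = mean(p for p in power if p > ON_THRESHOLD)  # 'within ON' only

print(round(daily_avg_all, 1), round(daily_avg_on, 1))
```

The all-hours average lands far below the ON-only average, which is the discrepancy the user noticed in the Capsule Pane.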
  6. There are many cases where a moving average needs to be monitored. To create any statistic that is associated with a sliding time frame, first use the periods() function in Formula to create the periodic capsules. A few examples are listed below.

    Intent                               Seeq Formula
    30 day average every 1 day           periods(30d,1d)
    24 hour average every 1 hour         periods(1d,1h)
    30 minute average every 5 minutes    periods(30min,5min)

    Once the periodic condition is created, use Signal from Condition to calculate a moving average, max, min, or other statistic. The sliding capsules will appear overlapped in the capsules lane. Refer to the Capsules Pane to verify the capsules meet your expectation of start time and duration, then apply the desired "moving" aggregation (average, standard deviation, etc.) to any signal with the Signal from Condition tool. Below shows a moving 30 minute average sampled every 5 minutes on the Example Temperature data.
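The sliding-capsule pattern can be sketched in plain Python. This is an illustration of the windowing only, with hypothetical per-minute samples; it mirrors periods(30min,5min) followed by Signal from Condition with an average:

```python
from statistics import mean

# Hypothetical temperature samples, one per minute (2 hours of data).
temps = [float(20 + (i % 10)) for i in range(120)]

WINDOW = 30  # minutes per capsule (window length)
STEP = 5     # minutes between capsule starts

# One "capsule" per step: average the samples inside each 30-minute
# window, so consecutive windows overlap just like the sliding capsules.
moving_avg = [
    mean(temps[start:start + WINDOW])
    for start in range(0, len(temps) - WINDOW + 1, STEP)
]

print(len(moving_avg), round(moving_avg[0], 2))
```

Because STEP is smaller than WINDOW, each sample contributes to several windows, which is exactly why the capsules appear overlapped in the capsules lane.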
  7. FAQ: I've got a signal for which the average and standard deviation are believed to be drifting over time. When I view the average and standard deviation in calendar time, it isn't helpful, because they are highly dependent on the production grade that I am running. Is there a better way to view my data to get a sense of the drift of the average and standard deviation by production grade over time?

    Solution 1: Histogram
    1. Add your signal of interest and your production grade code signal to the display.
    2. Create a condition for all production grades using Formula:
    $gradeCode.toCondition()
    3. Use the Histogram tool to calculate the average reactor temperature during each grade campaign and display it aggregated over production grade and time.
    The same method from step 3 can be applied to get a second histogram of the distribution of the standard deviation of the signal of interest by grade over time.

    Solution 2: Chain View
    1. Add your signal of interest and your operating state signal to the display.
    2. Use Formula to create a condition for all operating states:
    $stateSignal.toCondition()
    3. Use the Signal from Condition tool to calculate the average temperature over the all-operating-states condition.
    4. Use the Signal from Condition tool to calculate the standard deviation of temperature over the all-operating-states condition.
    5. Use Formula to calculate two new signals for "Avg + 2 SD" and "Avg - 2 SD".
    6. Filter your all-operating-states condition to only the state that you are interested in viewing. In this example we want to view only the capsules during which the compressor is in stage 2, for which the syntax is:

    $AllOperatingStates.removeLongerThan(7d).removeShorterThan(4h).filter($capsule -> $capsule.getProperty('Value').isEqualTo('STAGE 2'))

    This formula takes our condition for all operating states, keeps only capsules that are between 4h and 7d in length, then filters those capsules to include only the ones whose value is equal to STAGE 2.
    7. Swap to Chain View and view a longer time range.
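Steps 3 through 6 can be sketched in plain Python. This is an illustration only, with hypothetical campaign data: compute per-campaign mean and plus/minus 2 SD bands, keeping only STAGE 2 campaigns within the duration bounds, as the filter formula does:

```python
from statistics import mean, pstdev

# Hypothetical grade campaigns: (grade, duration_hours, temperature samples).
campaigns = [
    ("STAGE 1", 12, [70.0, 71.0, 69.5]),
    ("STAGE 2", 24, [85.0, 86.5, 84.0, 85.5]),
    ("STAGE 2", 2,  [90.0, 91.0]),        # too short: under 4 h, filtered out
    ("STAGE 2", 48, [86.0, 87.0, 85.5]),
]

MIN_H, MAX_H = 4, 7 * 24  # removeShorterThan(4h) / removeLongerThan(7d)

# Keep only STAGE 2 campaigns within the duration bounds,
# like .filter($capsule -> ... isEqualTo('STAGE 2')).
stage2 = [
    (dur, vals) for grade, dur, vals in campaigns
    if grade == "STAGE 2" and MIN_H <= dur <= MAX_H
]

for dur, vals in stage2:
    avg, sd = mean(vals), pstdev(vals)
    # Per-campaign "Avg - 2 SD" and "Avg + 2 SD" bands, as in step 5.
    print(round(avg, 2), round(avg - 2 * sd, 2), round(avg + 2 * sd, 2))
```

Plotting these per-campaign bands in order of campaign start is essentially what Chain View shows once the filtered condition is applied.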
  8. The Histogram tool, when aggregated over time, can cause some confusion as to which hours of the day the bin values reflect. The bin labels generated by Seeq reflect the hour of the day beginning at midnight: for example, 0 = midnight, 2 = 2 AM, 14 = 2 PM, etc. When the display range interval is set to one day beginning at midnight, the histogram bins line up with the signal value quite nicely (see attached screenshot 2). When the display range does not begin at midnight, or spans multiple days, the histogram bins may not appear to line up with the process data (see attached screenshot 1), but the distribution they reflect is correct for the day(s) in the display range.

    histogram tool discussion - screenshot 1.pdf
    histogram tool discussion - screenshot 2.pdf
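The hour-of-day binning can be sketched in a few lines of Python. This is an illustration of the labeling convention only (hypothetical samples, not the Histogram tool's implementation): every sample is assigned to a bin labeled by its hour, regardless of which day it came from or where the display range starts.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical hourly samples over two days.
start = datetime(2023, 1, 1, 0, 0)
samples = [(start + timedelta(hours=i), 50.0 + (i % 24)) for i in range(48)]

# Bin by hour of day: label 0 = midnight, 14 = 2 PM, and so on.
bins = defaultdict(list)
for ts, value in samples:
    bins[ts.hour].append(value)

hourly_avg = {hour: mean(vals) for hour, vals in sorted(bins.items())}
print(hourly_avg[0], hourly_avg[14])  # bins for midnight and 2 PM
```

Because both days contribute to the same 24 bins, the distribution stays correct even when the display range spans multiple days or starts mid-day.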