John Brezovec

Seeq Team
Posts posted by John Brezovec

  1. Doing individual insert calls can get pretty slow when working with larger trees. In these cases I'd suggest doing a single insert of all of your limits with a DataFrame.

    For this we'll have to reshape the DataFrame Kin How suggested into something that looks like:

    [screenshot: DataFrame with Path, Formula, and Name columns]

    I did that with a little pandas manipulation:

    csv["Path"] = csv[["Level 1", "Level 2", "Level 3"]].apply(
        lambda x: ">>".join(x), axis=1
    )
    
    df = pd.concat(
        [
            csv[["Path", "Limits 1", "Limits 1 Name"]].rename(
                columns={"Limits 1": "Formula", "Limits 1 Name": "Name"}
            ),
            csv[["Path", "Limits 2", "Limits 2 Name"]].rename(
                columns={"Limits 2": "Formula", "Limits 2 Name": "Name"}
            ),
        ]
    )
    
    df["Formula"] = df["Formula"].astype(str)

    Note that the Formula column has to be a string so I called astype(str) on it in the last line.
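To make the reshape concrete, here's the same manipulation run end-to-end on toy data (the level names and limit values below are made up for illustration; only the column layout matches the description above):

```python
import pandas as pd

# Toy stand-in for the CSV described above (values are hypothetical)
csv = pd.DataFrame({
    "Level 1": ["Example", "Example"],
    "Level 2": ["Cooling Tower 1", "Cooling Tower 1"],
    "Level 3": ["Area A", "Area B"],
    "Limits 1": [100, 110],
    "Limits 1 Name": ["High Limit", "High Limit"],
    "Limits 2": [10, 20],
    "Limits 2 Name": ["Low Limit", "Low Limit"],
})

# Build the Path column, then stack the two limit sets into a single
# DataFrame with Path / Formula / Name columns
csv["Path"] = csv[["Level 1", "Level 2", "Level 3"]].apply(
    lambda x: ">>".join(x), axis=1
)
df = pd.concat(
    [
        csv[["Path", "Limits 1", "Limits 1 Name"]].rename(
            columns={"Limits 1": "Formula", "Limits 1 Name": "Name"}
        ),
        csv[["Path", "Limits 2", "Limits 2 Name"]].rename(
            columns={"Limits 2": "Formula", "Limits 2 Name": "Name"}
        ),
    ],
    ignore_index=True,
)
df["Formula"] = df["Formula"].astype(str)
```

With two rows and two limit sets, the result is a four-row DataFrame ready to pass to tree.insert.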

     

    At this point I can now do a single call to insert to add all my limits at once:

    tree.insert(children=df)

    This should give the same results as Kin How's method, but should be more performant when working with larger trees.

  2. Since you already have identified:

    • When the equipment is running (Purple Condition)
    • Time periods that you want to aggregate within (Yellow Condition)

    We should be able to accomplish this with a single Signal from Condition, using the 'Equipment Running' condition as the selected condition, total duration as the statistic, and the 'Time Between Replacement' condition as your bounding condition. The setup should look something like this:

    [screenshot: Signal from Condition tool setup]

    If you prefer to do it in formula, the equivalent would be:

    $equipmentRunning.setMaximumDuration(500day).aggregate(totalDuration("h"), $timeBetweenReplacement.setMaximumDuration(500day), durationKey())
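For intuition, here's a rough pandas analogue of that aggregation on hypothetical capsules (all names, timestamps, and durations below are made up): clip each running capsule to the bounding capsule, then total the remaining hours.

```python
import pandas as pd

# Hypothetical 'Equipment Running' capsules as start/end pairs
running = pd.DataFrame({
    "start": pd.to_datetime(["2023-01-01 00:00", "2023-01-03 00:00"]),
    "end": pd.to_datetime(["2023-01-01 12:00", "2023-01-03 06:00"]),
})

# One hypothetical 'Time Between Replacement' bounding capsule
bound_start = pd.Timestamp("2023-01-01")
bound_end = pd.Timestamp("2023-01-04")

# totalDuration("h") analogue: clip each running capsule to the
# bounding capsule and sum the remaining hours
clipped_start = running["start"].clip(lower=bound_start)
clipped_end = running["end"].clip(upper=bound_end)
total_hours = (clipped_end - clipped_start).dt.total_seconds().sum() / 3600
```

Here a 12-hour and a 6-hour running capsule inside the bounding capsule total 18 hours.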

    If you still get an error from that let me know what it is.

  3. If I'm understanding you correctly, you want to find the last N batches that were run and are of the same product as the most recent batch, correct?

    We can do that with some more creative condition logic and filtering 🙂

    Assuming that our batch capsules have a property called 'Product' that tells us what product that batch ran, we can add some filtering to our above formula to only pass in capsules to toCapsulesByCount() that match the last product produced:

    // return X most recent batches in the past Y amount of time
    $numBatches = 3 // X
    $lookback = 3mo // Y
    
    $currentLookback = capsule(now()-$lookback, now())
    
    // find the product of the most recent batch and assign that
    // as a capsule property to the lookback condition
    // this will let us filter the batches condition dynamically
    $lookbackCond = condition($lookback, $currentLookback)
                       .setProperty(
                           'Product',
                           $batchCondition.toSignal('Product', endKey()),
                           endValue(true)
                       )
    
    // filter the batch condition to only find batches of the active product
    $filteredBatches = $batchCondition.removeLongerThan($lookback)
                                      .touches($lookbackCond, 'Product')
    
    // create a rolling condition with capsules that contain X adjacent capsules
    $rollingBatches = $filteredBatches.toCapsulesByCount($numBatches, $lookback)
    
    // find the last capsule in the rolling condition that's within the lookback period
    $batchWindow = condition(
       $lookback,
       $rollingBatches.toGroup($currentLookback, CAPSULEBOUNDARY.ENDSIN).last()
    )
    
    // find all the batches within the capsule identified
    // ensure all the batches are within the lookback period
    $filteredBatches.inside($batchWindow)
                   .touches(condition($lookback, $currentLookback))

    The key here is to pass a capsule property name to touches() to allow filtering with this dynamic property value (the keep() function requires the input comparison scalar to be certain).
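For intuition, the same selection logic expressed over a pandas DataFrame of hypothetical batch capsules (all values below are made up) looks like this:

```python
import pandas as pd

# Hypothetical batch capsules with a 'Product' capsule property
batches = pd.DataFrame({
    "Capsule Start": pd.to_datetime(
        ["2023-01-01", "2023-01-05", "2023-01-09", "2023-01-13", "2023-01-17"]
    ),
    "Product": ["A", "B", "A", "A", "A"],
})
num_batches = 3

# Product of the most recent batch
# (the toSignal('Product', endKey()) / endValue() step)
last_product = batches.sort_values("Capsule Start")["Product"].iloc[-1]

# Keep only batches of that product, then take the last num_batches of them
result = (
    batches[batches["Product"] == last_product]
    .sort_values("Capsule Start")
    .tail(num_batches)
)
```

With this data the most recent product is 'A', so the three most recent 'A' batches are returned and the 'B' batch is skipped.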

    The output will look something like this -- note I'm only showing the 3 most recent batches so you can actually read the property labels 🙂

    [screenshot]

  4. Here's an alternative method to getting the last X batches in the last 30 days:

    // return X most recent batches in the past Y amount of time
    $numBatches = 20 // X
    $lookback = 1mo // Y
    
    // create a rolling condition with capsules that contain X adjacent capsules
    $rollingBatches = $batchCondition.removeLongerThan($lookback)
                                     .toCapsulesByCount($numBatches, $lookback)
    
    // find the last capsule in the rolling condition that's within the lookback period
    $currentLookback = capsule(now()-$lookback, now())
    $batchWindow = condition(
       $lookback,
       $rollingBatches.toGroup($currentLookback, CAPSULEBOUNDARY.ENDSIN).last()
    )
    
    // find all the batches within the capsule identified
    // ensure all the batches are within the lookback period
    $batchCondition.inside($batchWindow)
                   .touches(condition($lookback, $currentLookback))

    This is similar to yours in that it uses toGroup, but the key is in the use of toCapsulesByCount as a way to get a grouping of X capsules in a condition.

    You can see an example output below. All capsules will show up as hollow because by the nature of the rolling 'Last X days' the result will always be uncertain.
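For intuition, the toCapsulesByCount / toGroup(...).last() steps can be sketched in pandas terms (all timestamps below are hypothetical): build every window of X adjacent capsules, then keep the last window that ends inside the lookback.

```python
import pandas as pd

# Hypothetical batch end times, ordered
ends = pd.to_datetime(
    ["2023-06-01", "2023-06-08", "2023-06-15", "2023-06-22", "2023-06-28"]
)
num_batches = 3
now = pd.Timestamp("2023-06-30")
lookback = pd.Timedelta(days=30)

# toCapsulesByCount analogue: every run of num_batches adjacent capsules
windows = [ends[i : i + num_batches] for i in range(len(ends) - num_batches + 1)]

# toGroup(..., CAPSULEBOUNDARY.ENDSIN).last() analogue:
# the last window whose final capsule ends inside the lookback
valid = [w for w in windows if now - lookback <= w[-1] <= now]
batch_window = valid[-1]
```

Here batch_window spans the three most recent batches, which is exactly the capsule the formula uses to filter $batchCondition.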

    [screenshot]

  5. OK great. In the meantime there are some workarounds / alternative methods to achieve what you're looking for. For instance, you could use capsule properties to store information about the recipe that is being run and display that in a column in the condition table instead of using the condition name itself. This video is an introduction to setting and using capsule properties that may be of interest.

  6. It sounds like you're looking for a 'Condition Table' rather than the 'Simple Table' that you're currently using - you can switch between them using the 'Simple' and 'Condition' buttons in the top left of the toolbar. A condition table allows you to see statistics per capsule, rather than for your whole display range. If you use the 'Add column' button you can add statistics for each capsule, including duration. Our knowledge base article has more information on this: https://support.seeq.com/space/KB/1617592515/Tables+Charts#Condition-Tables

    Let me know if that puts you down the right path!

    [screenshot: condition table with per-capsule statistic columns]

  7. The Seeq Python module has recently been split into two packages: seeq and seeq-spy. As a result, when using SPy outside of Seeq Data Lab you have to install both. The seeq module needs to be installed at the version that matches your Seeq server (e.g. pip install -U seeq~=60.1). The PyPI page has additional information on this: https://pypi.org/project/seeq/.

    Let me know if you're still having issues after you install the seeq package.

  8. To accomplish this I would first create a condition that identifies when the machine is off (marking the data you want to remove), and then use the remove function in Formula to remove the data that's present in that condition. In one formula this could look like:

    $thrust.remove($speed < 0)

    You can then find the rate of change on this cleansed signal using something like the derivative() function.
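The remove-then-differentiate idea can be sketched in pandas terms on toy data (the values below are hypothetical and this is only an analogue of the Seeq functions, not their implementation):

```python
import pandas as pd

# Hypothetical thrust and speed samples on a shared timestamp index
idx = pd.date_range("2023-01-01", periods=5, freq="1h")
speed = pd.Series([5, -2, 3, -1, 4], index=idx)
thrust = pd.Series([50.0, 40.0, 55.0, 35.0, 60.0], index=idx)

# remove($speed < 0) analogue: drop thrust samples taken while
# the machine is off
cleansed = thrust[speed >= 0]

# derivative() analogue: rate of change between the remaining samples,
# in units per second
rate = cleansed.diff() / cleansed.index.to_series().diff().dt.total_seconds()
```

Note the rate is computed only between the surviving samples, so the off periods no longer distort the slope.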

    Note that there are two functions commonly used to remove downtime periods in Seeq: remove() and within(). There are subtle differences in their behavior, which you can read about in this post: 

  9. The option Joe suggested gave that error because the calculation changed the return type of the item, which spy.pull isn't expecting. You're passing in a signal to the spy.pull call, while the result of 

    $signal.totalized(capsule('2023-06-06T10:00Z', '2023-06-07T10:00Z'))

    is a scalar, which causes the error here. You can get around that by adding a toSignal() at the end to ensure the formula returns the same type as the input. By default, toSignal() will return a sample every day, which is fine in your case since your search window is a day.

    $signal.totalized(capsule('2023-06-06T10:00Z', '2023-06-07T10:00Z')).toSignal()
  10. Are you just looking for an hourly average of your signal, or are you looking to do averages over more complex conditions? If you're looking for hourly averages, you can use the following as an input to the calculation parameter:

    spy.pull(
        items,
        start='01/01/2023',
        calculation='$signal.aggregate(average(), hours(), startKey())',
        grid=None
    )
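For intuition, the hourly aggregation that calculation performs behaves much like a pandas resample (toy data below; this is an analogue, not what spy.pull does internally):

```python
import pandas as pd

# Hypothetical raw samples at 30-minute spacing
idx = pd.date_range("2023-01-01", periods=6, freq="30min")
signal = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# aggregate(average(), hours(), startKey()) analogue: hourly mean,
# keyed at the start of each hour
hourly = signal.resample("60min").mean()
```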

    Another way to accomplish this is to include both a signal and a condition in your call to spy.pull and specify shape='capsules'. This will result in a DataFrame where rows are capsules and columns are aggregated signals. This can be useful when you want to use more complex conditions to define your aggregations:

    [screenshot]

  11. The challenge with specifying a shared folder by Path is that to the owner of the content, the workbook will show up in their own folder, while to the shared user it will show up in their Shared Folder. This means you'd have to specify a different path based on who was running the code.

    The easiest method would be to specify the workbook by ID. If you really need to specify by name, I'd suggest doing a spy.search or spy.workbooks.search to get the ID, and then use that as the argument in your push:

    [screenshot: spy.search used to look up the workbook ID, which is then passed to spy.push]

    Another alternative would be to have the workbook live in the Corporate folder. The docstring for spy.push shows how you can push to workbooks in the Corporate folder:

    f'{spy.workbooks.CORPORATE} >> MySubfolder >> MyWorkbook'
  12. @TJData You should be able to do this using absolute or relative paths when specifying your formula parameters, using >> and .. to traverse down and up your tree. For example, if we start with a tree that looks like the following:

    Example
    └── Cooling Tower 1
        ├── Shifts
        ├── Area A
        │   └── Temperature
        └── Area B
            └── Temperature

    Traversing Up the Tree: We can insert a calculation under each area that references the Shifts condition with the following (note the usage of .. in the formula parameters):

    tree.insert('Shift Max Temp', 
                formula='$temp.aggregate(maxValue(), $shifts, startKey())',
                formula_parameters={
                    'shifts': '.. >> Shifts',
                    'temp': 'Temperature'
                }, 
                parent='Cooling Tower 1 >> Area *')

    Traversing Down the Tree: We can insert a calculation under Cooling Tower 1 that references an attribute of a child asset with the following:

    tree.insert('Area A High Temp',
                formula='$temp > 100',
                formula_parameters={'temp': 'Area A >> Temperature'},
                parent='Cooling Tower 1'
               )
  13. One way of achieving this would be to compute the aggregate value (maxValue in your case) outside the call to setProperty() and then round the result before you use setProperty:

    // condition you want the aggregate of
    $condition = ($t > 100).removeLongerThan(1d)
    
    // output the maxValue as a signal that we can then round easily
    $maxPerCapsule = $t.aggregate(maxValue(), $condition, startKey())
                       .round(2) // round to two decimal places
    
    // use the rounded maxValue to set the property in the condition
    $condition.setProperty('Max Value', $maxPerCapsule, startValue())

    This results in the aggregates getting computed with the original data and then rounded:
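The compute-first, round-after ordering can be sketched in pandas terms (hypothetical data; an analogue of the Seeq functions, not their implementation):

```python
import pandas as pd

# Hypothetical temperature samples tagged by the capsule they fall in
df = pd.DataFrame({
    "capsule": ["c1", "c1", "c2", "c2"],
    "t": [101.234, 103.987, 110.554, 108.111],
})

# aggregate(maxValue(), $condition, startKey()) analogue: the max is
# computed on the original data, and only the result is rounded
max_per_capsule = df.groupby("capsule")["t"].max().round(2)
```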

    [screenshot]

  14. There are two main ways to construct asset trees using SPy:

    1. spy.assets using python class-based templates (Asset Trees 2 notebook in the documentation)
    2. spy.assets.Tree using simpler function-based construction (Asset Trees 1 notebook in the documentation)

    It sounds like your use case would make better use of the second option, spy.assets.Tree, which does not require you to predefine the possible 'slots' for signals/conditions. Take a read of the linked spy.assets.Tree documentation and see if that will work for you.

  15. Hi Pat,

    While you cannot move files between SDL projects in this manner, there are a couple options that I use to make this process easier:

    1. To simply copy numerous files/folders from one project to another, I like to make a zip archive of the files I want to move so I only have to download/upload one file between projects. You can do this by opening a terminal window in the source SDL project and using the 'zip' command to make a zip file. If you want to zip up the whole project except for the SPy Documentation folder, you can use this command:

    zip -r archive.zip . -x ".*" -x "SPy Documentation/*"

    You can then download the resulting archive.zip file from your source project and upload it into the destination project. In your destination project, open a new terminal window and use the following command to unzip the archive:

    unzip archive.zip

    2. Alternatively, if you've created a common set of scripts that you would like to keep in sync with version control, consider using git to do your tracking / copying. Just be aware that there is a moderate learning curve going down this path. Starting in R58 with JupyterLab, there is a GUI interface for git that lowers the barrier to entry.

  16. A question came up recently that I thought would be of wider interest: how can I prevent interpolation across batch/capsule boundaries?

    In batch processes, lab samples are often taken periodically throughout the course of a batch. When viewing these samples in Seeq, you may encounter times when samples are interpolating between batches rather than just within a batch. In the image below, the periods highlighted in yellow correspond to this unwanted interpolation.

    [screenshot: samples interpolating between batches, with the unwanted interpolation highlighted in yellow]

    The following formula is one way of preventing this interpolation. $signal corresponds to your signal of interest and $condition corresponds to your batches you want to prevent interpolation across.

    combineWith(
        $signal,
        Scalar.Invalid.toSignal().aggregate(startValue(), 
                                            $condition.removeLongerThan(40h),
                                            startKey())
    )

    This results in the following signal that has our desired behavior:

    [screenshot]
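The underlying trick can be sketched in pandas terms (hypothetical data; an analogue of the formula above, not the Seeq implementation): placing an invalid sample at each batch boundary breaks interpolation across it.

```python
import numpy as np
import pandas as pd

# Hypothetical lab samples spanning two batches
samples = pd.Series(
    [10.0, 12.0, 20.0, 22.0],
    index=pd.to_datetime(
        ["2023-01-01 01:00", "2023-01-01 05:00",
         "2023-01-02 01:00", "2023-01-02 05:00"]
    ),
)
batch_starts = pd.to_datetime(["2023-01-01 00:00", "2023-01-02 00:00"])

# combineWith + Scalar.Invalid analogue: add an invalid (NaN) sample at
# each batch start so interpolation cannot bridge the gap between batches
invalid = pd.Series(np.nan, index=batch_starts)
result = pd.concat([samples, invalid]).sort_index()
```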

  17. Glad the grid=None worked for you! Looking back at the behavior you experienced, I do think it would be worth investigating further. Could you send a support ticket to support@seeq.com with a link to this forum post and a screenshot of what those two signals you're trying to pull look like in workbench?

  18. Hi Ruby,

    I don't know what your two signals look like, but this behavior can happen when your signals use linear interpolation and their samples are spaced further apart than their maximum interpolation. You have grid set to 1 day, which means Seeq will try to give you an interpolated value per day. If there is no valid interpolated value at that timestamp, a NaN will be returned, as you see in your first example. In your second example I'm guessing you provided a start time where both signals have valid values, so no interpolation needs to occur.

    Again, I don't know what your use case is or what your signals look like, but one option would be to set the grid=None to return the samples without any gridding applied.
