# Imputing Missing Values


Hi All,

I'm looking for a way to impute missing values. Is there a way to do this, such as filling the missing values with the mean or median?

Below is the screenshot for the same.

Regards,

Jitesh Vachheta


Hello Jitesh,

Yes, this is possible, and you have a few options. From the screenshots above, it looks like you used the remove function to clean up your signal, which left you with a discontinuous signal. You can splice another signal into these gaps by using the splice (or spliceblend) functions in Formula. The first step is to create the signal that you want to splice in (if it is not already coming from the historian), and then use that signal as the input of the splice function.

Hope this helps!


Hey Danilo,

Would you mind sharing a code snippet or an example to demonstrate?

Let's say the signal contains a scalar value like "I/O Error". How can you remove it and replace the resulting missing spots with the mean or median?

Regards,

Jitesh Vachheta

• Seeq Team

Hi Jitesh,

Say that the condition you used to remove the data from "\$signal" is called "\$condition".

If I understand correctly, you want your signal to display the mean or median value during the condition.

First you must calculate the mean or median value. There are multiple ways to do this; I will share two:

1. Zoom out in the display range to a period over which you want to find the mean/median. Use Simple Scorecard Metric to calculate the mean or median during this period. Say the mean was 2.55.
• Then use Formula to create a signal called "Mean", which is:
• `2.55.toSignal()`
2. Find the mean/median value around the capsules where data was removed. To do this, use the grow function to grow the capsules: `$condition.grow(1 hr)`. Then use Signal from Condition to calculate the mean/median value during these times (call this "MeanDuringGrownCapsules"). In Signal from Condition, use step interpolation and place the sample over the duration of the capsules.
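To make option 2 concrete outside of Seeq, here is a minimal Python sketch of the idea: grow a capsule on both sides, then take the mean (or median) of the valid samples that fall inside the grown capsule. This is not Seeq code; the sample data, the `grow` helper and the `stat_during` helper are all hypothetical names made up for illustration, and the assumption is that growing moves the start earlier and the end later by the same amount.

```python
from statistics import mean, median

# Hypothetical data: (hour, value) samples; None marks a removed sample.
samples = [(0, 2.0), (1, 3.0), (2, None), (3, None), (4, 2.0), (5, 3.0), (6, 9.0)]
capsule = (2, 3)  # hypothetical capsule (start hour, end hour) where data was removed


def grow(capsule, margin):
    """Widen a capsule by `margin` on both sides (assumed behaviour)."""
    start, end = capsule
    return (start - margin, end + margin)


def stat_during(samples, capsule, stat=mean):
    """Apply `stat` to the valid sample values that fall inside the capsule."""
    start, end = capsule
    values = [v for t, v in samples if start <= t <= end and v is not None]
    return stat(values)


grown = grow(capsule, 1)                              # capsule widened by one hour
local_mean = stat_during(samples, grown)              # uses the samples at t=1 and t=4
local_median = stat_during(samples, grown, median)
```

Only the samples just outside the gap contribute, so a distant outlier (the 9.0 at hour 6 here) does not affect the fill value.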

Next you must splice in the mean during the gaps. The Formula functions for this are splice() and spliceblend().

`$signal.splice($mean, $condition)`

(If you used option 2, use "MeanDuringGrownCapsules" in place of `$mean`.)

The difference between splice() and spliceblend() is that spliceblend() will smoothly join the samples from the original signal and the replacement signal, while splice() will create a discontinuity.
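As a rough illustration of what a plain splice does conceptually, the Python sketch below replaces every sample that falls inside a capsule with a replacement value and leaves the rest untouched. It is not Seeq's implementation: the `splice` function, the data and the capsule representation are assumptions for this example, and it does not attempt to model the smoothing that spliceblend() performs.

```python
def splice(samples, replacement, capsules):
    """Return samples with values inside any capsule swapped for `replacement`."""
    def in_capsule(t):
        return any(start <= t <= end for start, end in capsules)

    return [(t, replacement if in_capsule(t) else v) for t, v in samples]


# Hypothetical data: (hour, value) samples; None marks a removed sample.
samples = [(0, 2.0), (1, 3.0), (2, None), (3, None), (4, 2.0)]
condition = [(2, 3)]  # hypothetical capsules where the data was removed

filled = splice(samples, 2.5, condition)
```

The jump from 3.0 to 2.5 at the start of the capsule is the kind of discontinuity that a plain splice leaves behind.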

Let me know if this helps!


Jitesh,

I am not 100% sure if this is what you are looking for, but you can just increase the max interpolation and Seeq will draw the line. See the screenshots below for details.

I should mention that you should consider any planned future calculations, along with why the data was removed, before deciding how to fill in missing data. Different methods for filling gaps (e.g. splicing as suggested above, or adjusting interpolation) will give slightly different end results.

Starting point (I removed the spikes around the edges as well for this):

To remove the edges, you can do the following:

`$Time = 1day  // Time needed to grow your condition to exclude the spikes at the edges.`

`$signal.remove($RemoveCondition.grow($Time))`
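For readers who want to see the logic outside of Seeq, here is a small Python sketch of the same idea: widen each capsule of the removal condition, then drop every sample that falls inside the widened capsules. The helper names, the data and the one-hour margin are hypothetical and only illustrate the concept, not how Seeq implements it.

```python
def grow(capsule, margin):
    """Widen a capsule by `margin` on both sides (assumed behaviour)."""
    start, end = capsule
    return (start - margin, end + margin)


def remove(samples, capsules):
    """Drop every sample whose timestamp falls inside any capsule."""
    return [
        (t, v) for t, v in samples
        if not any(start <= t <= end for start, end in capsules)
    ]


# Hypothetical data: a bad reading at hour 3 with spikes on either side of it.
samples = [(0, 1.0), (1, 1.1), (2, 8.0), (3, 50.0), (4, 7.5), (5, 1.2), (6, 1.0)]
remove_condition = [(3, 3)]

cleaned = remove(samples, [grow(c, 1) for c in remove_condition])
```

Without the grow step only the sample at hour 3 would be dropped, and the spikes at hours 2 and 4 would remain in the cleaned signal.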

Once you have your cleaned signal, you can set the max interpolation in the Formula tool. The following syntax will change the interpolation. Note that you may need to adjust the time for your own problem.

`$signal.setMaxInterpolation(3weeks)`
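The effect of a maximum interpolation setting can be sketched in plain Python: a line is drawn between two neighboring samples only when they are no further apart than the allowed gap. This is a simplified illustration under that assumption, not Seeq's implementation; the `interpolate` function and the data are hypothetical.

```python
def interpolate(samples, t, max_gap):
    """Linearly interpolate at time t, or return None if the gap is too wide.

    `samples` is a list of (time, value) pairs sorted by strictly increasing time.
    """
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            if t1 - t0 > max_gap:
                return None  # gap exceeds the maximum interpolation: no line drawn
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return None  # t lies outside the sampled range


# Hypothetical cleaned signal with a four-hour gap between hours 1 and 5.
samples = [(0, 1.0), (1, 2.0), (5, 6.0)]

short_limit = interpolate(samples, 3, max_gap=2)   # gap of 4 is too wide
long_limit = interpolate(samples, 3, max_gap=10)   # gap is bridged by a straight line
```

Raising the limit bridges the gap with a straight line between the surrounding samples, which is a different fill than splicing in a mean value.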

The results

Overlaid

Hope this helps,

Teddy


Hi Teddy,

While applying your solution for removing the edges, I'm getting the error below.

Formula Failed
token recognition error at: '#' at '', line=1, column=0 (GET /api/formulas/compile 400 Bad Request)

I tried it with the same data set that you selected in the example above (ID_FAN_MOTOR, ID_FAN_VIBRATION).

Regards,
Jitesh
• 2 weeks later...

Jitesh,

What version of Seeq are you using?

Teddy


Hi Teddy,

I'm currently using version R21.0.42.08-v201906200037.

Regards,

Jitesh Vachheta


Jitesh,

Can you send me a screenshot of your formula?

Teddy
