Showing results for tags 'pareto'.

Found 2 results

Create Pareto Charts with CDF in Seeq

Allison Buenemann posted a topic in Seeq Data Lab

Users are often interested in creating Pareto charts from conditions they've created in Seeq, sorted by a particular capsule property. The chart below was created using the Histogram tool in Seeq Workbench. For more information on how to create histograms that look like this, check out this article on creating and using capsule properties.

Users often want to see the histogram above with the bars sorted from largest to smallest, as in a traditional Pareto chart. Pareto charts can be built from Seeq conditions using Seeq Data Lab. A preview of the chart that we can create is:

The full Jupyter Notebook documentation of this workflow (including output) can be found in the attached PDF file. If you're unable to download the PDF, the code snippets below can be run in Seeq Data Lab to produce the chart above.

    # Import relevant libraries
    from seeq import spy
    import pandas as pd
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt

Log in to the SPY module using spy.login if running locally, or skip this step if running in Seeq Data Lab.

    # Search for your condition that has capsule properties using spy.search.
    # Use the 'Scoped To' argument to search for items only in a particular workbook.
    # If the item is global, no 'Scoped To' argument is necessary.
    condition = spy.search({
        "Name": "Production Loss Events (with Capsule Properties)",
        "Scoped To": "9E50F449-A6A1-4BCB-830A-8D0878C8C925",
    })
    condition

    # Pull the data from the time frame of interest using spy.pull
    # into a Pandas dataframe called 'my_data'
    my_data = spy.pull(condition, start='2019-01-15 12:00AM', end='2019-07-15 12:00AM', header='Name', grid=None)

    # Remove columns from the my_data dataframe that will not be used
    # in creation of the Pareto/CDF
    my_data = my_data.drop(['Condition', 'Capsule Is Uncertain', 'Source Unique Id'], axis=1)

    # Calculate a new dataframe column named 'Duration' by subtracting
    # the capsule start time from the capsule end time
    my_data['Duration'] = my_data['Capsule End'] - my_data['Capsule Start']

    # Group the dataframe by reason code
    my_data_by_reason_code = my_data.groupby('Reason Code')

    # Check out what the new dataframe grouped by reason code looks like
    my_data_by_reason_code.head()

    # Sum total time broken down by reason code and sort from greatest to least
    total_time_by_reason_code = my_data_by_reason_code['Duration'].sum().sort_values(ascending=False)
    total_time_by_reason_code = total_time_by_reason_code.rename('Total_Time_by_Reason_Code')
    total_time_by_reason_code

    # Plot Pareto of total time by reason code
    total_time_by_reason_code.plot(kind='bar')

    # Calculate the total time from all reason codes
    total_time = total_time_by_reason_code.sum()
    total_time

    # Calculate percentage of total time from each individual reason code
    percent_time_by_reason_code = total_time_by_reason_code.divide(total_time)
    percent_time_by_reason_code

    # Calculate cumulative sum of percentage of time for each reason code
    cum_percent_time_by_reason_code = percent_time_by_reason_code.cumsum()
    cum_percent_time_by_reason_code = cum_percent_time_by_reason_code.rename('Cum_Percent_Time_by_Reason_Code')
    cum_percent_time_by_reason_code

    # Plot cumulative distribution function of time spent by reason code
    cum_percent_time_by_reason_code.plot(linestyle='-', linewidth=3, marker='o', markersize=15, color='b')

    # Convert time units on total time by reason code from the default
    # (nanoseconds) to hours
    total_time_by_reason_code = total_time_by_reason_code.dt.total_seconds() / (60 * 60)

    # Build dataframe for final overlaid chart
    df_for_chart = pd.concat([total_time_by_reason_code, cum_percent_time_by_reason_code], axis=1)
    df_for_chart

    # Create figure with overlaid Pareto + CDF
    plt.figure(figsize=(20, 12))
    ax = df_for_chart['Total_Time_by_Reason_Code'].plot(kind='bar', ylim=(0, 800), fontsize=12)
    ax.set_ylabel('Total Hours by Reason Code', fontsize=14)
    ax.set_title('Downtime Reason Code Pareto', fontsize=16)
    ax2 = df_for_chart['Cum_Percent_Time_by_Reason_Code'].plot(secondary_y=True, linestyle='-', linewidth=3, marker='o', markersize=15, color='b')
    ax2.set_ylabel('Cumulative Frequency', fontsize=14)
    plt.show()
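The core of the workflow above (sum durations per reason code, sort descending, then take a cumulative fraction) can be tried without a Seeq connection. The following is a minimal sketch under stated assumptions: the `capsules` table is made-up sample data standing in for the output of spy.pull, and `pareto_table` is a hypothetical helper name, not part of the SPy module.

```python
import pandas as pd

# Made-up capsule data standing in for a spy.pull result (an assumption,
# not real Seeq output). Column names mirror those used in the post.
capsules = pd.DataFrame({
    "Capsule Start": pd.to_datetime([
        "2019-01-01 00:00", "2019-01-02 00:00",
        "2019-01-03 00:00", "2019-01-04 00:00",
    ]),
    "Capsule End": pd.to_datetime([
        "2019-01-01 06:00", "2019-01-02 02:00",
        "2019-01-03 03:00", "2019-01-04 01:00",
    ]),
    "Reason Code": ["Mechanical", "Electrical", "Mechanical", "Operator"],
})


def pareto_table(df, group_col="Reason Code"):
    """Return total hours per group, sorted descending, with cumulative fraction."""
    # Capsule duration in hours
    hours = (df["Capsule End"] - df["Capsule Start"]).dt.total_seconds() / 3600
    # Sum hours per group and sort from largest to smallest (the Pareto order)
    total = hours.groupby(df[group_col]).sum().sort_values(ascending=False)
    # Running share of the overall total (the CDF line on the chart)
    cumulative = total.cumsum() / total.sum()
    return pd.DataFrame({"Total Hours": total, "Cumulative Fraction": cumulative})


table = pareto_table(capsules)
print(table)
```

The resulting table has one row per reason code, so its two columns can be plotted as the bar and line series of the overlaid chart.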
- April 13, 2020
- 1 reply
- Tagged with: pareto, histogram, cdf, cumulative distribution function

Creating Histograms with Multiple Groupings

Allison Buenemann posted a topic in General Seeq Discussions

FAQ: We have various conditions that are calculated from signals on a variety of different equipment and assets. We would like to view them in a histogram that is broken out by month, where each asset has a separate bar within each month.

Example Solution:

1. For three signals, we want to create a histogram of the total time per month spent above some threshold. In this example, each signal is associated with a different cooling tower Area.

2. We have a condition for when each signal is above its threshold value. These conditions were created using the Value Search tool.

3. The three conditions can be combined into a single condition (here it is called "Combined In High Mode w Area as Property"). In the Formula tool, before combining the conditions, we assign each condition a property called 'Area' and set its value to that particular asset area. Once the properties are set, we use the combineWith() function to combine them into one final condition. The formula syntax below will achieve this:

    //Create a new condition for each original condition that has a property of 'Area'.
    $A=$AHigh.setProperty('Area','Area A')
    $G=$GHigh.setProperty('Area','Area G')
    $I=$IHigh.setProperty('Area','Area I')

    //Combine the new conditions created into a new condition with all of the high power modes where each capsule
    //has a property of 'Area' that describes the signal that was searched to identify that original condition.
    combineWith($A,$G,$I)

***Note: the combineWith() function in Seeq Formula is required here because it retains the capsule properties of the individual conditions when combining them. Using union() or any other composite condition logic will NOT retain the capsule properties of the individual conditions.***

4. Use the Histogram tool and its multiple grouping functionality to aggregate over both time and the capsule property 'Area'.

Final Result: (remove other items from the details pane to view just the histogram)
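For readers who want to check the numbers outside Workbench, the two-level grouping in step 4 can be approximated in pandas. This is only a sketch of the aggregation idea, not how the Histogram tool is implemented. The `capsules` table is made-up sample data, and each capsule is attributed to the month in which it starts, which is a simplifying assumption for capsules that span a month boundary.

```python
import pandas as pd

# Made-up capsules from a combined condition where each capsule carries an
# 'Area' property (sample data, not real Seeq output).
capsules = pd.DataFrame({
    "Capsule Start": pd.to_datetime([
        "2019-01-05 00:00", "2019-01-20 00:00",
        "2019-02-03 00:00", "2019-02-10 00:00",
    ]),
    "Capsule End": pd.to_datetime([
        "2019-01-05 04:00", "2019-01-20 02:00",
        "2019-02-03 05:00", "2019-02-10 01:00",
    ]),
    "Area": ["Area A", "Area G", "Area A", "Area I"],
})

# Capsule duration in hours
capsules["Hours"] = (
    capsules["Capsule End"] - capsules["Capsule Start"]
).dt.total_seconds() / 3600

# First grouping: calendar month of the capsule start (simplifying assumption)
capsules["Month"] = capsules["Capsule Start"].dt.strftime("%Y-%m")

# Second grouping: the 'Area' capsule property. One row per month,
# one column per area, values are total hours in the high mode.
hours_by_month_area = capsules.pivot_table(
    index="Month", columns="Area", values="Hours", aggfunc="sum", fill_value=0
)
print(hours_by_month_area)
```

Calling `hours_by_month_area.plot(kind="bar")` on this table would draw one bar per area within each month, which mirrors the layout described in the final result.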
- November 7, 2019
- Tagged with: setproperty(), capsule properties, condition, grouping, aggregation, histogram, pareto
