
Creating a Formula Parameters string?


patjdixon


I am trying to create formulas in which the formula parameters are dynamically created (see the attached image).  I print the resulting string for the formula parameters and it looks correct.  However, I get an error on the push stating that the format is incorrect.  Is there a way to fix this?

 

 

seeq_formparams_2023-05-16 at 1.52.28 PM.jpg


  • Seeq Team

Formula Parameters needs to be a Python dictionary in which the keys are the variable names and each value is either an ID or a Pandas DataFrame row containing the ID.

So instead of constructing a string on line 60, you'll need to create a dictionary and assign rows from a spy.search() call.

You can see how to do it near the end of the SPy Tutorial:

https://python-docs.seeq.com/user_guide/Tutorial.html#pushing-a-new-condition-to-seeq
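
As a rough illustration of the dictionary shape described above, here is a minimal sketch. The two DataFrames only stand in for what `spy.search()` might return; the names, the IDs, and the `$target` / `$condition` variable names are invented for the example, and no Seeq call is made.

```python
import pandas as pd

# Stand-ins for two spy.search() results (hypothetical names and IDs).
target_search = pd.DataFrame([{'Name': 'Freeness', 'ID': '0EE1-AAAA'}])
condition_search = pd.DataFrame([{'Name': 'Training', 'ID': '0EE1-BBBB'}])

# Keys are the formula variable names; values are the IDs themselves.
formula_parameters = {
    '$target': target_search.iloc[0]['ID'],
    '$condition': condition_search.iloc[0]['ID'],
}

print(formula_parameters)
```

The point of the sketch is that each value is looked up when the dictionary is built, so the dictionary holds the IDs rather than text describing where to find them.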

 

 


for ModelConfig_Column in range(1,ModelConfig_NumColumns):
    TargetTag = ModelConfig_CSV[0][ModelConfig_Column]

    #print(TargetTag_Running)
    # Set the type of regression formula
    if ModelConfig_CSV[2][ModelConfig_Column] == "OLS":
        RegressFormula = "$target.regressionModelOLS("
    elif ModelConfig_CSV[2][ModelConfig_Column] ==  "Ridge":
        RegressFormula = "$target.regressionModelRidge("      
    elif ModelConfig_CSV[2][ModelConfig_Column] ==  "PCR":
        RegressFormula = "$target.regressionModelPCA("
    else:
        RegressFormula = ""
    #print(RegressFormula) 
    RegressFormula = RegressFormula+"$condition.toGroup(capsule("+DataRange+"), CapsuleBoundary.Intersect)"
    #print(RegressFormula) 
    if ModelConfig_CSV[6][ModelConfig_Column]:
        RegressFormula =RegressFormula + ", true,"
    else:
        RegressFormula =RegressFormula + ", false,"
    #print(RegressFormula)
    InputList = ""
    FormVarList = ""
    for ModelConfig_Row in range (7,27):
        # When a cell is empty, it is false
        InputVar = ModelConfig_CSV[ModelConfig_Row][ModelConfig_Column]
        if ModelConfig_CSV[ModelConfig_Row][ModelConfig_Column]:
            InputName = ModelConfig_CSV[ModelConfig_Row][0]
            Scale = ModelConfig_CSV[1][ModelConfig_Column]
            locals()[InputVar + '_ID'] = spy.search({
                'Name': InputVar + '_ID',
                'Scoped To':scopedtoid })
            #print(Scale)
            if "Linear" in Scale: 
                InputList = InputList + " $" + InputName + ","
            elif "Log" in Scale:
                InputList = InputList + " $" + InputName + ", ln($" + InputName + ").validValues(),"
            elif "Polynomial" in Scale:
                Poly_Exponent = re.findall(r'\d', Scale ) 
                #print(Poly_Exponent[0])
                for ExpNum in range(1,int(Poly_Exponent[0])+1):
                    InputList = InputList + " $" + InputName + "^"+ str(ExpNum) +","
            elif "Expanded" in Scale:
                # not fully implemented yet, its rather complex
                InputList = InputList + " $" + InputName + ","
            else:
                InputList =""
            # Set the list of inputs for formula variables
            FormVarList = FormVarList + "'$" + InputName + "':" + InputVar + "_ID['ID'].iloc[0], "
            #print(ModelConfig_CSV[ModelConfig_Row][ModelConfig_Column])
    # remove final ','
    #FormVarList = FormVarList[:-1]
    InputList = InputList[:-1]
    RegressFormula = RegressFormula + InputList + ")"
    VarSelectFlag = ModelConfig_CSV[5][ModelConfig_Column]
    print(VarSelectFlag)
    if VarSelectFlag == "TRUE":
        RegressFormula = RegressFormula + ".variableSelection(" + ModelConfig_CSV[6][ModelConfig_Column]+ ")"
    RegressFormula = RegressFormula + ".predict(" + InputList + ")"
    FormVarList = FormVarList + "'$target':" + TargetTag + "_ID['ID'].iloc[0], '$condition':Training_ID['ID'].iloc[0]"
    #print(RegressFormula)
    print(FormVarList)
    print()
    metadata = list()
    metadata.append({
        'Name': TargetTag +'_PRED',
        'Type': 'Signal',
        'Description': 'Prediction tag',
        'Formula': RegressFormula,
        'Formula Parameters': {str(FormVarList)}
    })
PRED_metadata = pandas_obj.DataFrame(metadata)
spy.push(metadata = PRED_metadata, workbook=scopedtoid)


  • Seeq Team

You'll want to click Show Stack Trace. I believe that error is likely coming from Pandas, indicating that one of your DataFrames is empty.

The reason I say this is an unorthodox method is that it's very hard to debug. It would be better to deal with the variable names in string form rather than assigning them to Python variables and then constructing code via string concatenation. I.e., instead of creating a variable called PulpEye_BlendFreeness_ID, just keep a DataFrame with a row whose Name is PulpEye_BlendFreeness and which has an ID column.
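
One way to picture that suggestion is the sketch below. It assumes a small table of tags with `Name` and `ID` columns; `Refiner_Load` and both IDs are hypothetical, and in practice the rows would come from `spy.search()`.

```python
import pandas as pd

# Hypothetical table of tags; only PulpEye_BlendFreeness is a name from this thread.
tag_table = pd.DataFrame([
    {'Name': 'PulpEye_BlendFreeness', 'ID': 'ID-0001'},
    {'Name': 'Refiner_Load', 'ID': 'ID-0002'},
])

# Index the IDs by Name so a plain string is enough to find one.
ids_by_name = tag_table.set_index('Name')['ID']

# The tag name stays an ordinary string, e.g. read from a CSV cell,
# so no dynamically named Python variable is needed.
tag_name = 'PulpEye_BlendFreeness'
print(ids_by_name[tag_name])
```

Because the name is only ever used as a lookup key, a misspelled or missing tag shows up as a `KeyError` naming the tag, which is easier to trace than a missing local variable.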

ChatGPT may be able to help you understand your code and suggest a better method overall. You can paste it in and ask it to help you structure it in a more supportable way.


I tried ChatGPT, but it didn't understand what Seeq does with Formula Parameters, so its recommendations didn't format it correctly.  The core issue is that I am trying to create the Formula Parameters string dynamically, but I think the argument cannot be a string; it needs to be an ID.  In other words, if I have "$parameter":argument, the argument needs to be something like Training_ID["ID"].iloc[0], not the string "Training_ID["ID"].iloc[0]".  I am not sure how to do that.  Any ideas?

seeq_formparams_2023-05-18 at 9.45.10 PM.jpg
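
The distinction raised in this post, between the text of an expression and the value it produces, can be sketched in plain Python. The `Training_ID` DataFrame below is a hypothetical stand-in for a search result, with an invented ID.

```python
import pandas as pd

# Hypothetical stand-in for a search result such as Training_ID.
Training_ID = pd.DataFrame([{'Name': 'Training', 'ID': 'ID-0003'}])

# Concatenation produces text that merely looks like code; nothing is evaluated.
as_text = "'$condition':" + "Training_ID['ID'].iloc[0]"

# Assigning into a dictionary evaluates the expression and stores the ID.
as_dict = {}
as_dict['$condition'] = Training_ID['ID'].iloc[0]

print(type(as_text).__name__, as_text)
print(type(as_dict).__name__, as_dict)
```

Adding entries to a dictionary one key at a time gives the same dynamic behavior as building up a string, but each value is already the real ID by the time it is stored.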


  • Seeq Team

PM_SIM_SetModels_230416 - MarkD Edits.ipynb

Pat, here's an updated version. I'm not able to test it because I don't have your data sets. But the primary difference is that I replaced this line that dynamically assigns values to local variables:

# locals()[InputVar + '_ID'] = spy.search({

with code that just stores the IDs in a dictionary:

            SearchResults = spy.search({
                'Name': InputVar + '_ID',
                'Scoped To':scopedtoid })
            InputVar_ID = SearchResults.iloc[0]['ID']
            TagIDs[InputVar] = InputVar_ID

You can see that instead of constructing a string, we construct a `FormulaParameters` dictionary where the keys are variable names and the values are IDs.
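
Since the attached notebook is not reproduced here, the following is only a sketch of how such a loop might look. `fake_search`, the input pairs, the formula text, and the generated IDs are all hypothetical; `TagIDs` and `FormulaParameters` follow the names used in this post.

```python
import pandas as pd

def fake_search(name):
    """Hypothetical stand-in for spy.search(); returns a one-row DataFrame."""
    return pd.DataFrame([{'Name': name, 'ID': 'ID-' + name}])

# Hypothetical (formula variable name, tag name) pairs, e.g. read from a CSV.
inputs = [('Consistency', 'Tag_A'), ('Flow', 'Tag_B')]

TagIDs = {}
FormulaParameters = {}
for InputName, InputVar in inputs:
    SearchResults = fake_search(InputVar + '_ID')
    TagIDs[InputVar] = SearchResults.iloc[0]['ID']
    # Key is the formula variable; value is the ID that was just stored.
    FormulaParameters['$' + InputName] = TagIDs[InputVar]

# The dictionary itself, not a string, goes into the metadata row.
metadata = pd.DataFrame([{
    'Name': 'Example_PRED',
    'Type': 'Signal',
    'Formula': '$Consistency + $Flow',
    'Formula Parameters': FormulaParameters,
}])

print(metadata.iloc[0]['Formula Parameters'])
```

Building the metadata from a list of row dictionaries keeps the nested `FormulaParameters` dictionary intact inside a single DataFrame cell.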

If it doesn't work right away, it won't surprise me because I couldn't test it. Please take a little time debugging it by adding print statements and walking through it so you have a feel for what it's doing.


  • Seeq Team

This error is coming from Pandas and indicates that your search result returned zero rows. You'll need to check to see that your spy.search() result includes at least one row.
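
A minimal sketch of such a check follows. Calling `.iloc[0]` on an empty DataFrame raises an `IndexError`, so the hypothetical helper `first_id` tests `.empty` first and reports which search came back with nothing. The DataFrames are stand-ins for search results, with invented names and IDs.

```python
import pandas as pd

def first_id(search_results, description):
    """Return the ID of the first row, or raise a readable error if there is none."""
    if search_results.empty:
        raise LookupError(f'Search for {description!r} returned zero rows')
    return search_results.iloc[0]['ID']

# One search that found a row and one that found nothing (both hypothetical).
found = pd.DataFrame([{'Name': 'Tag_A', 'ID': 'ID-0001'}])
missing = pd.DataFrame(columns=['Name', 'ID'])

print(first_id(found, 'Tag_A'))
try:
    first_id(missing, 'Tag_B')
except LookupError as err:
    print(err)
```

Passing the search description into the helper means the error message names the tag that was not found, which narrows down the failing row in a loop over many inputs.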

I think it's important that you step through your code and really try to understand what each line is doing; it'll be too hard for us to have a back and forth for every error you encounter.

There are a bunch of online resources for understanding Python and Pandas basics:

https://www.google.com/search?q=best+resource+for+learning+python+and+pandas


The issue is that a variable used for Formula Parameters requires some unknown format or syntax, or can't be done at all.

In the first attachment, I use a hard-coded Formula Parameters argument, and it works as expected.

In the next attachment, I use the variable FormulaList with the ID appended, and I get a syntax error.

In the last attachment, I format FormulaList as a string, and the push fails.

Has anyone ever done this successfully? 

 

seeq_formparams_2023-05-20 at 9.49.12 AM.jpg

seeq_formparams_2023-05-20 at 9.50.07 AM.jpg

seeq_formparams_2023-05-20 at 9.50.57 AM.jpg


  • Seeq Team

In the code I gave you, you'll see I use the `FormulaParameters` variable like so:

'Formula Parameters': FormulaParameters

The only thing wrong with the code I gave you (so far) is that you're not handling the case where spy.search() is returning zero rows. You'll need to debug that and put in error handling code.

"Has anyone ever done this successfully?"

Yes. However, I'd say that you have to understand some Python basics: What is a dictionary and how does it work, what is its syntax; what is a set and what is its syntax. I supplied some references above for learning.
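
The set-versus-dictionary distinction mentioned here can be seen in a few lines of plain Python. The string and IDs below are invented; the `{str(FormVarList)}` expression mirrors the one in the code posted earlier in the thread.

```python
# A string that looks like dictionary contents (hypothetical names and IDs).
FormVarList = "'$target': 'ID-0001', '$condition': 'ID-0002'"

# Braces around a single value with no colon make a set with one element.
as_set = {str(FormVarList)}

# Braces around key: value pairs make a dictionary.
as_dict = {'$target': 'ID-0001', '$condition': 'ID-0002'}

print(type(as_set).__name__, len(as_set))
print(type(as_dict).__name__, len(as_dict))
```

Wrapping a string in braces never parses its contents, so the result is a set holding one long string rather than a mapping from variable names to IDs.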


Hurrah, done!  

Mark, thanks for hanging in there.  I know I was slow on the uptake, but when I finally figured out the dictionary list thing it worked.  The problem I had was that I did not know what format Formula Parameters was.

The fixed code is attached.

My next issue is going to be getting model coefficients.  When you create a regression model with a formula instead of using the WorkBench 'Model & Predict' interface, you cannot see the statistics and model coefficients.  The Get() formula does not seem to work unless you re-specify the model formula.  I am going to create a separate topic on that.

seeq_formparams_2023-05-20 at 11.24.12 PM.jpg

seeq_formparams_2023-05-20 at 11.24.48 PM.jpg

seeq_formparams_2023-05-20 at 11.25.10 PM.jpg

PM_SIM_SetModels_230416.ipynb

