Problem with CHAID (PAL)

Former Member
0 Kudos

Hello everybody,

I am having problems with the PAL CHAID algorithm in SAP PA 2.4.

When I connect to HANA online and run the CHAID algorithm on about 30 independent variables,

I don't get any results back; it keeps loading forever.

I don't think I should need to wait longer. I only use the VARCHAR, INTEGER and DOUBLE data types.

The HANA trace gives me the following:

[6154]{200947}[57/59567858] 2016-03-10 21:01:18.295235 i TraceContext TraceContext.cpp(00827) : UserName=HGUESSMANN, ApplicationUserName=hguessmann, ApplicationName=SAPVisualIntelligence

[6154]{200947}[57/59567858] 2016-03-10 21:01:18.295227 e CalcEngine ceRepositoryAccessor.cpp(00066) : RepositoryAccessor::getCalculationScenario(): for scenario 'HGUESSMANN:PAS79_3_PROC' failed

[6125]{219478}[65/59568461] 2016-03-10 21:10:05.844571 i TraceContext TraceContext.cpp(00827) : UserName=HGUESSMANN, ApplicationUserName=hguessmann, ApplicationName=SAPVisualIntelligence

[6125]{219478}[65/59568461] 2016-03-10 21:10:05.844564 e CalcEngine ceRepositoryAccessor.cpp(00066) : RepositoryAccessor::getCalculationScenario(): for scenario 'HGUESSMANN:PAS80_READER_0_PROC' failed

[24678]{219478}[65/59568590] 2016-03-10 21:10:17.362373 i TraceContext TraceContext.cpp(00827) : UserName=HGUESSMANN, ApplicationUserName=hguessmann, ApplicationName=SAPVisualIntelligence

[24678]{219478}[65/59568590] 2016-03-10 21:10:17.362365 e CalcEngine ceRepositoryAccessor.cpp(00066) : RepositoryAccessor::getCalculationScenario(): for scenario 'HGUESSMANN:PAS80_1_PROC' failed

[13937]{-1}[-1/-1] 2016-03-10 21:10:54.410049 e TrexNet Request.cpp(00741) : ERROR: new Request without host!

[13937]{-1}[-1/-1] 2016-03-10 21:10:54.410177 e Executor X2.cpp(04909) : failed to send listPlan request to  an invalid parameter was given

[13937]{-1}[-1/-1] 2016-03-10 21:11:24.411503 e TrexNet Request.cpp(00741) : ERROR: new Request without host!

[13937]{-1}[-1/-1] 2016-03-10 21:11:24.411622 e Executor X2.cpp(04909) : failed to send listPlan request to  an invalid parameter was given

[13937]{-1}[-1/-1] 2016-03-10 21:11:54.412741 e TrexNet Request.cpp(00741) : ERROR: new Request without host!

[13937]{-1}[-1/-1] 2016-03-10 21:11:54.412866 e Executor X2.cpp(04909) : failed to send listPlan request to  an

The last part repeats forever, even after I killed SAP PA in the Task Manager.

Here is the tricky part: when I only use a few variables, I get results.

I would appreciate your help.

Using fewer variables is not really a solution for me.

Additional question: when do the procedures created by SAP PA ("PAS##_PROC") get deleted on HANA?

Accepted Solutions (0)

Answers (1)

achab
Product and Topic Expert
0 Kudos

Hi Heiko,

Are you able to isolate which variable(s) are causing the problem to appear?

Thanks & regards

Antoine

Former Member
0 Kudos

Hi Antoine,

I have not come close to identifying the variables that cause the trouble, because I have to restart SAP PA every time it crashes, and that takes a long time.

An error along the lines of "Execution plan aborted. Transaction rolled back" popped up once.

I don't have the exact message anymore.

Do you have a hint as to what could be wrong with the variables?

Thanks for the quick reply!

Best regards,

Heiko

achab
Product and Topic Expert
0 Kudos

Hi Heiko,

I have no specific clues at this stage. Can you also share the PA logs (specifically after it crashes)?

Thanks & regards

Antoine

achab
Product and Topic Expert
0 Kudos

And maybe in parallel you should raise a SAP support ticket.

Thanks & regards

Antoine

Former Member
0 Kudos

Hi Antoine,

Sorry to ask, but where do I find the SAP PA log file?

achab
Product and Topic Expert
0 Kudos

no problem Heiko

check

Former Member
0 Kudos

When I call the CHAID PAL function directly from SAP HANA Studio, it also stays in progress forever. I think there is a general problem with my data. Are there any known limitations with CHAID?

I attached the log file.

Thank you.

Best regards,

Heiko

achab
Product and Topic Expert
0 Kudos

I see this in the PAL reference guide:

Prerequisites

● The target column of the training data must not have null values, and other columns should have at least one valid value (not null).

● The table used to store the tree model is a column table.

Note: CHAID treats null values as special values.

http://help.sap.com/hana/SAP_HANA_Predictive_Analysis_Library_PAL_en.pdf

Page 148
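The two data prerequisites quoted above can be checked before CHAID is started. The following is only a minimal sketch in plain Python, assuming the training data has already been fetched as a list of dictionaries with None standing in for SQL NULL. The function name and the column names CHURN, AGE and NOTES are made up for illustration and are not part of PAL. The column-table prerequisite is not covered here.

```python
def check_chaid_prerequisites(rows, target, features):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # Prerequisite 1: the target column must not contain null values.
    null_targets = sum(1 for row in rows if row.get(target) is None)
    if null_targets:
        problems.append(f"target column {target} has {null_targets} null value(s)")
    # Prerequisite 2: every other column needs at least one non-null value.
    for col in features:
        if all(row.get(col) is None for row in rows):
            problems.append(f"column {col} has no non-null value")
    return problems


# Hypothetical sample rows, for illustration only.
rows = [
    {"CHURN": "yes", "AGE": 34, "NOTES": None},
    {"CHURN": None, "AGE": 51, "NOTES": None},
    {"CHURN": "no", "AGE": None, "NOTES": None},
]
problems = check_chaid_prerequisites(rows, "CHURN", ["AGE", "NOTES"])
for problem in problems:
    print(problem)
```

In this sample the target column has one null and NOTES is entirely null, so both checks report a problem, while AGE passes because it has at least one value.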

Thanks & regards

Antoine

achab
Product and Topic Expert
0 Kudos

Also spotted this in our release restriction note - might be the reason

In Expert Analytics, HANA CHAID algorithm performance is inversely proportionate to the number of distinct values for categorical features in the training dataset. As a result, when using Expert Analytics, if you do not find the performance of CHAID optimal for your use case, it is recommended to use other decision tree algorithms such as HANA C4.5. Note: Additional configurable parameters will be exposed in a future release of Expert Analytics to allow a threshold parameter that will help stop merging of categories beyond a specified threshold. This will allow use of CHAID algorithm with all kinds of datasets.
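Since the restriction ties CHAID run time to the number of distinct values in categorical features, one way to find the likely culprits is to count distinct values per column and look at the largest ones first. This is a rough sketch in plain Python, assuming the rows were already read from the training table into dictionaries. The helper names, the sample columns and the threshold are assumptions chosen for illustration, not values from the release note.

```python
from collections import defaultdict


def distinct_counts(rows, columns):
    """Return {column: number of distinct non-null values}."""
    seen = defaultdict(set)
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is not None:
                seen[col].add(value)
    return {col: len(seen[col]) for col in columns}


def high_cardinality(rows, columns, threshold):
    """Return (column, count) pairs above the threshold, largest count first."""
    counts = distinct_counts(rows, columns)
    flagged = [(col, n) for col, n in counts.items() if n > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical training rows; the column names are made up.
rows = [
    {"CUSTOMER_ID": "C001", "REGION": "NORTH", "SEGMENT": "A"},
    {"CUSTOMER_ID": "C002", "REGION": "SOUTH", "SEGMENT": "A"},
    {"CUSTOMER_ID": "C003", "REGION": "NORTH", "SEGMENT": "B"},
    {"CUSTOMER_ID": "C004", "REGION": None, "SEGMENT": "B"},
]
flagged = high_cardinality(rows, ["CUSTOMER_ID", "REGION", "SEGMENT"], threshold=2)
print(flagged)
```

Columns that behave like identifiers (one distinct value per row) would show up at the top of such a list and are natural candidates to drop or group before trying CHAID again.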

Former Member
0 Kudos

Hi Antoine,

this is very valuable information. Thank you very much!

This should be the issue here. I tried CHAID via SQLScript and it also ran forever.

achab
Product and Topic Expert
0 Kudos

Hi Heiko,

Can you please give it a try with HANA C4.5? Another very valuable option would be to use Automated Analytics Classification algorithms with delegation to APL.

Thanks & regards

Antoine

Former Member
0 Kudos

I would suggest applying binning to the numerical variables, using the option provided in the CHAID configuration panel. This should help.
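To show what binning does to a numerical variable, here is a small equal-width binning sketch in plain Python. It is only an illustration of the general idea: the binning method used by the configuration panel is not stated in this thread, and the function name, the sample values and the choice of four bins are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1; None stays None."""
    present = [v for v in values if v is not None]
    if not present or n_bins < 1:
        raise ValueError("need at least one non-null value and n_bins >= 1")
    low, high = min(present), max(present)
    width = (high - low) / n_bins
    binned = []
    for v in values:
        if v is None:
            binned.append(None)   # nulls are passed through unchanged
        elif width == 0:
            binned.append(0)      # all values identical: a single bin
        else:
            # Clamp so the maximum value falls into the last bin.
            binned.append(min(int((v - low) / width), n_bins - 1))
    return binned


# Hypothetical DOUBLE column with one null.
incomes = [1200.0, 1850.5, None, 4100.0, 9800.0, 3025.75]
bins = equal_width_bins(incomes, 4)
print(bins)
```

After binning, a column with many distinct numeric values is reduced to at most n_bins categories, which is the effect that should lower the number of distinct values CHAID has to merge.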

Former Member
0 Kudos

Hi,

PAL C4.5, Auto Classification and the R algorithms work without a problem. I noticed that there are still execution threads open in SAP HANA from calling CHAID a few days ago. As far as I can see, some come from SAP PA and some from calls via the AFM, so I need to shut them down manually.

I will move on to other algorithms or try binning with CHAID.

Thank you!

achab
Product and Topic Expert
0 Kudos

Hi Heiko,

Good to know. Can you please kindly flag the thread as Answered if you feel like it?

Thanks & regards

Antoine