cancel
Showing results for 
Search instead for 
Did you mean: 

INSERT INTO TABLE (SELECT ...) Poor Performance

Former Member
0 Kudos

Hi folks,

I have another performance issue on HANA. I have a query Insert Into table (Select...) as follows:


insert into "SYSTEM"."SIMILARITYSCORES"

(select T."id", T."text_data", T.TOTAL_TERM_COUNT, T.SCORE

FROM TM_GET_RELATED_DOCUMENTS (

DOCUMENT 'some text here to find its related documents'

SEARCH "text_data" FROM "SYSTEM"."TABLE_TEST"

RETURN

TOP 100000

"id", "text_data"

) AS T);

This query takes 7-8 minutes to be completed. However, if I just execute select statement, it finishes instantly. Select part returns about 750 rows, so is it normal to take 7-8 minutes to insert them into another table? By the way, there is no primary key nor index on SIMILARITYSCORES table.

I tried to replaced insert into part with create table as follows, but nothing changed with performance


create table "SYSTEM"."SIMILARITYSCORES" AS

(select T."id" as "id", T."text_data" as "text", T.TOTAL_TERM_COUNT as "TOTAL_TERM_COUNT", T.SCORE as "SCORE"

FROM TM_GET_RELATED_DOCUMENTS (

DOCUMENT 'some text here to find its relevant documents '

SEARCH "text_data" FROM "SYSTEM"."TABLE_TEST"

RETURN

TOP 100000

"id", "text_data"

) AS T)

I want to find a way to improve its performance, maybe you can help me with this.

Note: TM_GET_RELATED_DOCUMENTS is a function which is coming from text mining.

Thanks in advance,

Inanc

Message was edited by: Tom Flanagan

Accepted Solutions (0)

Answers (2)

Answers (2)

Former Member
0 Kudos

Hi Inanc,

I see it no way a query optimization question. If we would talk about milliseconds then probably yes. But if insert of 750 records takes 7-8 minutes… then I’d rather think about HANA bug or something is completely wrong in your system. Which HANA revision you are using?

Try to narrow down the problem, e.g.:

  1. Replace
    create table "SYSTEM"."SIMILARITYSCORES" 
    with
    create column table "SYSTEM"."SIMILARITYSCORES
  2. Check if this is a general INSERT problem. E.g. check how much time the following statement will take:
    CREATE COLUMN TABLE TEST_TABLE as (SELECT TOP 1000 * FROM PUBLIC.M_CS_TABLES).
    In my HANA system it took 7.5s.
  3. Limit number of records to be inserted by your statement. Check how long will it take to insert different number of records e.g. 10, 100, 200
  4. Monitor thread(s) executing the statement in Performance\Thread in Studio. Check if there is any evidence of lock situation and anything else suspicious. Look at column Status if there anything like Semaphore Wait etc. Switch on Create Call Stack check box to see if the time is spent in particular piece of code. This may help to search SAP Note for already known problems or give and idea how to analyse the problem further.
  5. During or after execution of the statement check index server trace file and check if there are any messages that may be relevant and indicate potential root cause of the problem.

Regards,

Dmitry

Former Member
0 Kudos

Hi Dmitry and Bojan,

I was trying to apply your suggestions to get some idea about the core problem. Suddenly, it started to work fine. The whole process is now being executed in less than 1 second. I don't know how it fixed, but it's good now.

I am now wondering what was the problem and what changed. Is HANA not stable?  I am using version 1.00.110.00.1447753075 (fa/newdb100_rel).

Maybe (just came to my mind), is it possible that it is because of full text index operation? To be able to use text mining function (TM_GET_RELATED_DOCUMENTS), I need to create a full text index on table "TABLE_TEST" and then execute MERGE DELTA OF "TABLE_TEST". As far as I observe, indexing require some time to be completed. I wasn't able to use TM_GET_RELATED_DOCUMENTS function right after I created the full text index. I need to wait for some time. Maybe it needed more time to be able to execute all processes in normal time. Maybe, there is a bug there. I really don't know.

Anyway, I will test it couple of days. If the same problem occurs again, I will try to come back with some hints about problem as you suggested.

Note: Btw, I cannot get PlanViz, because it says "[PlanViz] unsupported statement type: (prepare) trace-only execution of update statements are not supported"

Thanks

Former Member
0 Kudos

Hi Inan,

But then I'd epxect that performance of SELECT statement would also be affected which, as you stated, was not the case.

BR,

Dmitry

Former Member
0 Kudos

I thought that maybe there was a situation on background which only affects "INSERT INTO" statement, but not "SELECT" statement. But now I think, my thought was ridiculous, you are right. It seems it wasn't the case.

As I said, I will keep testing it. If the query slows down again (hope it won't), I will be here

Inanc

Bojan-lv-85
Advisor
Advisor
0 Kudos

Hi Inanc,

without knowning th exact execution path of the query I'm afraid it will be quite hard to tell where exactly the most time is spent, except anyone sees anything obvious in there. The runtime differences can also be due to the fact that the first one needs to insert the subquery result which is a costly operation.

Anyway, a good starting point to find optimization potential is the following note

2000002 - FAQ: SAP HANA SQL Optimization

Give it a try and let us know how that goes. In any other case, maybe you could generate an executed PlanViz trace which would show where exactly the most time is spent.

Thanks! BR, Bojan