cancel
Showing results for 
Search instead for 
Did you mean: 

Webi variable: Count word frequency in a string

antonette_oberholster
Active Contributor
0 Kudos


Hallo guys

This is a tricky one. The requirement is that there is a free text field in the table, and we need to count the word occurences in those strings all together.

i.e. row 1 - My name is Antonette

      row 2 - Did I tell you my name?

result:

    my : 2

    name : 2

    is : 1

etc.  

The result should be a tag cloud to show which words stand out.

How do we achieve this in Webi?

We are on 4.1, the data is in a SQL db.

Regards

Antonette

Accepted Solutions (1)

Accepted Solutions (1)

antonette_oberholster
Active Contributor
0 Kudos

or any of the reporting tools . . ..

Former Member
0 Kudos

Hello Antonette,

I am not getting you.

you want to know the count("My name is Antonette") or you want to know the count of each word like count ("my") , count("name").

I am assuming the above all text is coming from one field.

kindly clarify your question.

Regards,

S Babu

antonette_oberholster
Active Contributor
0 Kudos

Hallo Kommuri

lets say it is a field where people have to describe their pets. Everyone has a free text field that contains the description.

We want to know, throughout the entire dataset, what was the general feeling round their pets, by counting the words that everyone mentions in their descriptions.

result: ranked top 10:

dog : 321

happy : 236

playfull : 148

laugh : 69

smile : 55

chew : 26

cat : 19

......

so we can then analyze free text field by identifying which words come to mind, in this case, you can make the assumption that most people have dogs and it makes them happy.

in other words, a grouped summation of all the words.

Hope this helps

Regards

Antonette

former_member201488
Contributor
0 Kudos

IMO, you'll be hard pushed to do this with WEBI. The easiest way I would do this is to export your text string to Excel, use the Text to Columns function to strip out each individual word into its own column (Excel 2010 should support 16K columns), paste-special /transpose this range to it's own column, copy this and perform a remove duplicates operation, then do some count-ifs on the individual words versus the original (duplicates) range. If your Excel-Fu is good, should only take 5 minutes at most.

Doing it for many responses might be laborious, but I don't think the SAP BI stack is your best tool for the job.

HTH

NMG

nikhil_joy2
Active Contributor
0 Kudos

Hello Antonette,

Your requirement is a typical case of unstructured data analysis and unfortunately webI is no the right tool for it. Unstructured data analysis is something you have to do either in database level or ETL level as shown in below examples. Once you have the key words with count available, you can use WebI or Lumira to represent it.  SAP HANA supports unstructured data analysis, and most of the ETL tools also support this logic.

check below links for more understanding.

Regards,

Nikhil Joy

antonette_oberholster
Active Contributor
0 Kudos

Thanks Nikhil!

antonette_oberholster
Active Contributor
0 Kudos

Hallo Neil

Yes it should take only 5 minutes, but in that case I would rather use some of the out-of-the-box word frequency counters.

Thanks anyway

Answers (0)