on 02-19-2014 7:21 PM
Hello all,
First I'm going to explain what I'm doing:
I'm using predictive analysis to create product suggestions for the customer. For example: people who buy bread usually also buy milk.
But I'm having a problem running that algorithm (APRIORI), because I need to remove duplicates first.
Then a guy (Bimal) helped me by creating an algorithm that removes duplicates, but when I add more columns (for example, the place of purchase) it gives an error. Then he suggested that I remove the duplicates in HANA! Does anyone here know how I can do that?
Here is a "fake" table as an example:
Product UserID Store Purchase Nº
A 1234 Aa 1
A 1234 Aa 1
B 1234 Aa 1
C 2345 Bb 2
A 1234 Bb 2
C 2345 Aa 3
In this example, you can see that user 1234 bought product A 3 times, but in different stores. Does anyone know how I can remove the duplicates within the same purchase number?
Regards!
Hi,
If you run a SELECT DISTINCT * query on the table, you will get all the records with the duplicates removed.
I'm not sure how you are going to implement the next step. I suppose you can load the non-duplicate records into a new table, assuming you cannot remove records from the original master table.
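To make the idea concrete, here is a minimal sketch of copying the distinct rows into a new table, using the sample data from the question. It runs against an in-memory SQLite database from Python as a stand-in for HANA, and the table names `purchases` and `purchases_dedup` and the column names are made up for illustration. The SELECT DISTINCT itself is standard SQL, but the exact CREATE TABLE syntax on HANA may differ, so treat this as an illustration rather than a ready-made HANA script.

```python
import sqlite3

# In-memory SQLite database standing in for HANA (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(product TEXT, user_id INTEGER, store TEXT, purchase_no INTEGER)"
)

# The "fake" table from the question.
rows = [
    ("A", 1234, "Aa", 1),
    ("A", 1234, "Aa", 1),  # exact duplicate of the row above
    ("B", 1234, "Aa", 1),
    ("C", 2345, "Bb", 2),
    ("A", 1234, "Bb", 2),
    ("C", 2345, "Aa", 3),
]
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", rows)

# Copy only the distinct rows into a new table; the original stays untouched.
conn.execute("CREATE TABLE purchases_dedup AS SELECT DISTINCT * FROM purchases")

deduped = conn.execute(
    "SELECT product, user_id, store, purchase_no FROM purchases_dedup "
    "ORDER BY purchase_no, product"
).fetchall()
original_count = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]

print(original_count, len(deduped))
for row in deduped:
    print(row)
```

Only the second "A 1234 Aa 1" row is dropped. The "A 1234 Bb 2" row stays, because it differs in store and purchase number.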
Regards.
YS
Hi Jurgen,
You can either use DISTINCT or a GROUP BY clause, like below:
SELECT "Product", "UserId", "Store", "PurchaseNo"
FROM "BEST"."Test"
GROUP BY "Product", "UserId", "Store", "PurchaseNo";
Use this in a script-based view on the HANA side and feed that view into your algorithms, so that no duplicate records reach the APRIORI algorithm.
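As a rough check of what the GROUP BY does on the sample data from the question, the sketch below runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in for HANA here, the "BEST" schema is left out, and the extra COUNT(*) column (named "Occurrences" for illustration) is an addition to show which rows were duplicated.

```python
import sqlite3

# In-memory SQLite database standing in for HANA (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Test" '
    '("Product" TEXT, "UserId" INTEGER, "Store" TEXT, "PurchaseNo" INTEGER)'
)
conn.executemany(
    'INSERT INTO "Test" VALUES (?, ?, ?, ?)',
    [
        ("A", 1234, "Aa", 1),
        ("A", 1234, "Aa", 1),  # exact duplicate of the row above
        ("B", 1234, "Aa", 1),
        ("C", 2345, "Bb", 2),
        ("A", 1234, "Bb", 2),
        ("C", 2345, "Aa", 3),
    ],
)

# Grouping by every column collapses identical rows into one;
# COUNT(*) reports how many original rows each group contained.
grouped = conn.execute(
    'SELECT "Product", "UserId", "Store", "PurchaseNo", COUNT(*) AS "Occurrences" '
    'FROM "Test" '
    'GROUP BY "Product", "UserId", "Store", "PurchaseNo" '
    'ORDER BY "PurchaseNo", "Product"'
).fetchall()

# Rows that appeared more than once in the original table.
duplicated = [row for row in grouped if row[4] > 1]

for row in grouped:
    print(row)
print("duplicated:", duplicated)
```

The result has one row per distinct combination of the four columns, so it matches what DISTINCT returns. The count column is only there for inspection and would be dropped before passing the data to APRIORI.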
Regards,
Krishna Tangudu