cancel
Showing results for 
Search instead for 
Did you mean: 

Cleaning Data - How to change content of a column based on values in another column

shishir_kumar3
Explorer
0 Kudos

I have a data set containing the following (sample):

CodeDateNameColorSizeQtyPrice
89021110000110.02.2014XYZ BLACK3911885
89021110000111.02.2014XYZBLACK3911885
89021212100115.02.2014XYZBLACK3911885
1200100XYZ0139180019.03.2014XYZBLACK3911885
1200100XYZ0139180027.04.2014XYZBLACK3911885
8902111000123.05.2014XYZBLACK3921885

The Code column contains incorrect entries. The Correct Code is 890211100001.

How do I replace all incorrect codes based on Column "Name", "Color", "Size". (i.e. if Name = "XYZ" and Color = "BLACK" and Size = 39 then replace all code with "890211100001" ?

I can do it in excel using formulas but how to do this in Lumira?

Regards

Shishir

Accepted Solutions (0)

Answers (1)

Answers (1)

TammyPowlas
Active Contributor
0 Kudos

I looked at Replace - but that won't work because you have multiple columns

The best I can come up with is creating a calculated dimension - in my case I called it "Clean"

Here is the formula:

if {Color} = "BLACK" and {Name} = "XYZ" and {Size} = 39 then "890211100001" else {Code}

Here are the results:

If I have time I'll try to document in a document the steps.

shishir_kumar3
Explorer
0 Kudos

Hi Tammy,

Thanks for the quick response.

My problem is not limited to only one SKU

In a data set containing approximately 1 million rows, around 10k rows have this error for different name/color/size combination. So hardcoding for each sku is not going to be a practical solution.

regards

Shishir

TammyPowlas
Active Contributor
0 Kudos

In that case, moving the cleansing part might be better served by a tool like Data Services, which has lookups, transformations, etc.