on 03-23-2016 4:47 AM
Hi expert,
I would appreciate your help about logic for calculated column,
My requirement is to calculate repeated auto-part failure for the same part within 90 days, I have PART_CD and have RPAR_DATE, I don't have problem calculating the total of repeated parts but in my requirement it will only be counted as repeated if it is within 90 days. My options are only for graphical calculation view.
I would appreciate any help in this regards please let me know if more clarifications needed.
Hi Iamyae,
Please use this Formula in Calculated column.
If ("RPAR_DATE" > (ADDDAYS (now(), -90)), 'REPEAT', 'NOT')
Example,
Source Data:
In Analytic/Calculation View, Create calculated column and Use the formula.
The Output - Product Repaired within 90 Days will be,
PART_CD: 1 repeated 3 Times within 90 Days
PART_CD: 2 repeated 1 Time within 90 Days
Regards,
Muthuram
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Muthuram,
Thank you for your answer, but in that calculation I don't see any relation between the date and PART_CD, this will give me any RPAR_DATE that fall within 90 days independent from the PART_CD. The reason I am saying that because multiple attribute are involved in the repair date not just PART_CD.
Ok, let's try us this way . I shall explain my plan first with SQL and then try to build a model with graphical calculation view .
Step1 : For every product , I shall determine what is the first, second, third and so on failure.
Step 2. I shall join First failure with Second failure to have both the dates in one row. In general n and n+1 failure date would be in the same line.
Step 3: Now, for every failure I have the previous failure date in the same row , so I can easily know what is the number of days between them .
Screenshot clearly shows you , I can have my previous failure ( if exist) in the same line . My base table data is as below.
Now, I shall try to achieve the same with Graphical Calculation view .
Rank_1 : I am generating failure number for every Part ,
Rank_2 : Sames as Rank_1.
Projection_2 : Converting Rank_Number to NVARCHAR.
Projection_3 : Converting Rank_Number to NVARCHAR and then one more calculated attribute with formula = Rank_Number + 1 .
Join_1 : Joining Projection_2 with Projection_3 with PART_CD= PART_CD and FAILURE = PREV_FAILURE . Join Type is Left outer join .
Aggregation: Calculating the difference between two dates .
Providing some screenshot to explain the model .
Rank_2 :
Projection_2 :
Join:
Aggregation:
Result : records highlighted are your intended result
Regards
Anindya
Thank you Anindya for your time and hard work.
It looks like I didn't put it clearly, in your calculation it shows that you are calculating the repeated days, however my requirement is about repeated parts within 90 days.
E.g: if PART_CD = X repaired on 2016-01-01 that's 0 repeat
the same PART_CD came back for repair on 2016-02-01 that's 1 repeat because we don't count first as 1 repeat it's was normal failure. So I know when I come up with the repeat failures for each part I will deduct 1 to get the right result of repeats.
sample of report is:
we ran a report with prompt of begin_date = 09-01-2015 end_date = 02-29-2015
PART_CD Repeat failure # Total Claim # % Claim with repeat failure
11111 3 15 20%
22222 28 247 11.3%
And these failures are within 90 days of consecutive RPAR_DATE
I hope I am clear this time.
Hi
You can easily count the number of records for each part. My calculation shows Repeat failure for every part . If you see carefully you would find five records for Part Code P1 .
Out of five , 4 are repeat failure for P1 because first one we need to omit . But only 2 happened within 90 days .
Regards
Anindya
Hi Anindya,
Thank you again that was very detailed answer, I can take it from here. I will have to just create another calculated column where it count between days =< than 90 that will be my repeat.
One more clarification, the example I gave was for just one calculation that I need, however in my requirement I need to count PART_CD, SER_NO, and another code. Can I achieve that by adding the columns for calculation within the same ranking and projection or I need to create separate ranking and projection for each attribute that I need to count?
In RANK node you can add Column in "Partition By" section , but whether that would answer your business question or not , that would depend on the scenario.
Let's say you have same parts in different country and you add "Country" column to Parition By, then your ranking would change accordingly.
Country Parts Failure_Rank Date
India P1 1 20150101 ( YYYYMMDD)
India P1 2 20150530
Morocco P1 1 20160204
Morocco P1 2 20160301
Netherlands P2 1 20150725
Regards
Anindya
Yes based on my requirement,
I have PART_CD that is the same in every country, so we are calculating the repeated parts for the same serial number within 90 days, from all dealers no matter where the dealer is located (city, state and country). So it looks like I will get the same repeat even if I add country or other attribute, because my concern is to calculate whenever the repair done for a specific serial number for the same part. And other question what was the reason of using ranking instead of aggregations.
We need to join every failure with its previous failure. So, first we are getting the rank in one node and in another node we are again generating rank but then adding 1 to it.
Let's say part P1 fail on 1st Jan, 2nd Feb, and 3rd march. Now, we want to join 1st with 2nd and 2nd with 3rd .
RANK NODE 1 | RANK NODE 2 | ||||
Date | Rank | Date | Rank | Rank 2 | |
1-Jan | 1 | 1-Jan | 1(+1) | 2 | |
2-Feb | 2 | 2-Feb | 2(+1) | 3 | |
3-Mar | 3 | 3-Mar | 3(+1) | 4 |
If you join these two data set with rank , then 2nd record on the left side would come together with first record on the right side and so on. Then we can determine the difference between 2nd Feb and 1st of Jan .
Hope this helps.
Regards
Anindya
Thank you again.
I did get your point, I am trying the same logic you suggested, I have created 2 rank nodes in one of them created rank_Column and in projection used it in calculated column as PREV_FAILURE = string("rank_Column" + 1)
on the second rank node created same rank_column and projection created calculated column as
FAILURE = "RANK_COLUMN" and both are NVARCHAR, and the rank properties: Ascending (Bottom N) Threshold = 5000, order by RPAR_DATE, partition by PART_CD.
and joined both projection from one projection I have renamed RPAR_DATE to PREV_FAIL_DATE and joined it to RPAR_DATE in the second projection and joined FAILURE to PREV_FAILURE but none of them is selected for output
I am getting DAYS_BETWEEN, and PREV_FAIL_DATE with PART_CD.
I am sure this is in the right direction.
At this point we are checking days in between RPAR_DATE and PREV_FAIL_DATE and if it's =< 90 days it will count it as a repeat, and this only counting consecutive days not all days in any given 90 days. Also we are calculating the repeat failure of this part which is universal, we need to calculate the repeat of a given part related to a specific serial number that's the only time it will count as repeat, when the same engine has same part replaced multiple times within 90 days.
So in this case do I add serial number as an attribute and join it between the 2 projection to achieve what I am looking for or you have a different approach?
Please advise and I thank you for your time.
Thank you Anindya,
This ranking procedure took care of the issue I am getting the correct repeats,
one more issue that I am facing is that in my requirement besides the repeats I am calculating the total of all the failures including the first visits, so now I have 2 different aggregations from one I am getting the repeats for each part and Serial number and from the other aggregation I am getting the total of all failures, once I join them then and include all the columns then the total get split, and when I try to get the percentage I get wrong percentage please advised how I can overcome this issue.
Thank you again.
This is My model, so in the rank1 I am concatenating Seag_cd with Ser_No to use distinct combination (Seag_cd = part_cd), and order by RPAR_DATE
same thing rank 2, projection 2 converting rank_column to nvarchar, projection 3 converting to rank_column to nvarchar + 1
from the total_Seag I am using just Seag_Cd, DLR_CD and the concatenated column to get the total, after I join them if I do data preview using all the columns I am getting good numbers, not the exact ones though, the reason because I am calculating all the failures, but I will need to deduct 1 from every cycle to get the exact number of repeats, I tried using failures - 1 but my results all change to 0.
another issue I am facing if I do analysis instead of data preview using only SEAG_CD, Failures and total_Failures, the Failures becomes more than the total_Failures which is wrong.
I do appreciate your time and dedication. the bellow is my result set.
Few suggestions here:
1. Projection_1 giving date to both Rank nodes . Use 2 separate projection , model would be easier to read and less complex. You can easily copy project 1 and name it something else.
2. Calculate total number of failure at desired level before joining them at Join_2 node.
If you are using dummy key figure ( =1 ) , you cannot make -1 for every row. Essentially that would make 1-1 = 0 and sum will be zero. You have to sum it first for all failures and then subtract 1 from the total .
You can also take data from Rank_1 node with a filter of Rank Not Equal to 1 or Greater than 1 . This way, you will always omit the first failure . So, you can derive total number of repeat failure.
Also explore "Counter" in Calculation View .
Regards
Anindya
Thank you Anindya,
It looks like I am about to get the final product, my only challenge now is that at the top layer as you can see from my model design, I have one aggregation where I am getting repeats and another one to get the total, if I test each one separate I am getting the right numbers, however when I joined them or union them I am getting wrong results, let me explain,
when I do left join to the total I am getting lot of null values, if I do left join to the repeats I am getting less total, if I do union I am getting lot of null values, I tried replacing nulls with 0 but I get wrong values when display results. Please advise.
Thank you much for your time.
Hi,
Try to create two aggregate column and take max and min from date.
then create union on top of these two aggregate column and merge
on top of union, put aggregated node and take all fields as ' Add aggregated columns'
then add following formula to calculate date
I hope, this will solve our problem
-Vikram Divekar
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi lamyae
Can you please elaborate more with examples?
Shirish.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Shirish,
My requirement is to calculate the repeat failure for the same PART_CD within 90 days, so if the same part is repaired 2 times within 90 days that's considered as one repeat because we don't consider the first fix as repeat only the second time, and if it's repeated 3 times then that's 2 repeats. The columns I have available are RPAR_DATE, PART_CD among other attributes.
User | Count |
---|---|
83 | |
10 | |
10 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.