Application Development Discussions

MDMP Unicode conversion - Vocabulary maintenance

Former Member
0 Kudos

Hello,

We have upgraded a 4.5B MDMP system to ECC4 EHP4 and are now in the process of Unicode conversion.

Please advise on the issue described below.

In SPUMG Scan process, we have completed the following scans:

a) Tables without language Info

b) Tables with Ambiguous language Info

c) INDX Analysis

To maintain the vocabulary we have implemented notes 756534, 756535 and 871541, and the tables with language information have also been scanned. In the vocabulary, the system displays 539158 words with duplicates and 189314 once duplicates are discarded, none of which have a language assigned. We have 9 active languages.

According to the Unicode conversion document, I am left with only 2 more options: a) Hint management b) Manual assignment

How should we handle this? Is this a normal situation, or can anything be corrected?

Thanks and Regards

Vinay

1 ACCEPTED SOLUTION

nils_buerckel
Product and Topic Expert
0 Kudos

Hi Vinay,

This thread belongs in the Unicode forum.

In my opinion, one of the best and most efficient ways to reduce the number of words is to make use of hints. Please follow the Unicode Conversion Guide and SAP note 1034188.

Application colleagues should be contacted in order to find proper hints.

Please also note that report umg_vocabulary_statistic or um4_vocabulary_statistic is the best tool to evaluate tables for hint processing.
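To illustrate the idea behind hints, here is a minimal Python sketch (this is not SPUMG code). It assumes that a hint is essentially a rule of the form "rows of a table where field F has value V are in language L", so that all words from matching rows can be assigned in one go. The `Hint` class, the field names `COUNTRY` and `NAME`, and the sample rows are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hint:
    """Hypothetical hint: rows where `field` equals `value` are in `language`."""
    field: str
    value: str
    language: str


def apply_hints(rows, text_field, hints):
    """Return {word: language} for words from rows matched by exactly one hint language."""
    assigned = {}
    for row in rows:
        langs = {h.language for h in hints if row.get(h.field) == h.value}
        if len(langs) != 1:
            # No hint matched, or hints conflict: leave the words for manual assignment.
            continue
        lang = next(iter(langs))
        for word in row[text_field].split():
            # Keep the first assignment if the same word shows up again.
            assigned.setdefault(word, lang)
    return assigned


# Made-up sample data: a country field is a typical candidate for a hint condition.
rows = [
    {"COUNTRY": "DE", "NAME": "Müller Straße"},
    {"COUNTRY": "RU", "NAME": "Москва"},
    {"COUNTRY": "US", "NAME": "Main Street"},
]
hints = [Hint("COUNTRY", "DE", "DE"), Hint("COUNTRY", "RU", "RU")]

vocab = apply_hints(rows, "NAME", hints)
print(vocab)
```

Words from rows that no hint covers (the US row here) stay unassigned, which is exactly the remainder that native speakers have to handle.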

The rest of the vocabulary, which cannot be assigned by hints or by the SAP notes you mentioned, needs to be assigned by native speakers (manual effort!).

As you can see, SPUMG is NOT just a tool to be executed by Basis staff.

It does indeed need collaboration between Basis, Application and native speakers.

It should also be clear that the scans can take a long time, so a trial-and-error approach will be slow (especially for large systems).

Best regards,

Nils Buerckel

SAP AG

10 REPLIES

0 Kudos

Hi,

Thanks for the guidance.

We finally reduced the count to less than 10000 from a total of around 5 lakh (500,000). Hint management was very effective, combined with heavy manual effort.

We expect that the count can be reduced further to around 5000.

We have some queries:

a) If we leave the remaining words unassigned and proceed, will it cause any issue?

b) The Unicode conversion document says that data corruption can occur if the vocabulary is not maintained. What exactly does data corruption mean here?

c) Can I assign English as the language for all remaining words?

d) Is it possible to assign/maintain words during or after the Unicode conversion?

Kindly suggest.

Regards

Vinay

nils_buerckel
Product and Topic Expert
0 Kudos

Hi,

a) This will cause additional entries in the reprocessing scan. If you do not maintain these, the corresponding entries will come up in SUMG for manual repair. However, I would strongly recommend maintaining the vocabulary as much as possible. If you want to maintain the data via manual repair in SUMG, this usually needs to be done during downtime in the PRD system!

b) Data corruption here means that a wrong code page might be used during the conversion for a certain DB entry, so the data might not be readable anymore.

c) You can assign EN. However, the data is then converted based on the code page assigned to EN in SPUMG/SPUM4, and it will not appear in SUMG anymore. Hence, if this is the wrong code page, the data is converted wrongly but you will not be able to find out. Users will have to find it later in the Unicode system, and only then is manual repair possible. I would not assign EN if you do not know the code page.

d) Please have a look at the Guide, section on SUMG. However, this is a REPAIR, as the data is already converted.
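Points b) and c) can be illustrated with a small Python sketch. It is not what the conversion tools run; it only assumes that converting to Unicode means interpreting the stored bytes with the code page of the assigned language. The helper `convert_to_unicode`, the sample word and the two code pages (ISO 8859-5 as the true one, Latin-1 as the code page of a wrongly assigned language) are chosen for illustration.

```python
def convert_to_unicode(raw: bytes, code_page: str) -> str:
    """Interpret stored non-Unicode bytes with the code page of the assigned language."""
    return raw.decode(code_page)


original = "Москва"
# Bytes as they would be stored in a non-Unicode system using a Cyrillic code page.
raw = original.encode("iso8859_5")

# Correct language assignment: the text survives the conversion.
right = convert_to_unicode(raw, "iso8859_5")

# Wrong language assignment: no error is raised, but the result is unreadable.
wrong = convert_to_unicode(raw, "latin_1")

# A repair is possible only if someone knows which code page was the right one.
repaired = wrong.encode("latin_1").decode("iso8859_5")

print(right, wrong, repaired)
```

The wrong conversion raises no error, which is why a wrongly assigned language goes unnoticed until a user reads the data.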

Best regards,

Nils Buerckel

SAP AG

0 Kudos

Hi Vijay,

Please share the cookbook or document which you followed for the MDMP Unicode conversion. I have done a single code page Unicode conversion before, but MDMP is very complex and I have not done one yet.

If you can share your document and steps, it will be a great help.

Big regards

nils_buerckel
Product and Topic Expert
0 Kudos

Hi Nikhil,

please have a look at:

[original link is broken]

in addition, this link might give you some hints:

/people/chris.kernaghan/blog/2010/07/05/starting-an-mdmp-unicode-conversion

Best regards,

Nils Buerckel

SAP AG

Edited by: Nils Buerckel on Jul 14, 2010 10:10 AM

0 Kudos

Hi Nils,

We are currently doing an MDMP conversion and are trying to understand what happens if I assign "Unknown Language" to words in the vocabulary. The customer also does not know the language of a few hundred words. In this case we are thinking of assigning "Unknown Language" in the vocabulary.

But we would like to make sure that there will not be any data corruption and that, at the least, the words will still appear in SUMG for processing.

If I leave them with no language assignment and also do not do the manual repair in SUMG after the conversion (as I do not know which language these entries belong to), would that lead to any data loss or corruption?

Please advise and share your thoughts.

Thanks & Regards,
N.Amarnath

nils_buerckel
Product and Topic Expert
0 Kudos

Hi N.Amarnath,

The difference between language '?' <unknown> and language <empty> in the vocabulary is rather small.

The unknown language was invented to achieve the following: words with unknown language do not appear in the selection “language = <empty>”. This is a typical selection in the vocabulary view to identify those words which have not been checked at all. Otherwise (e.g. in reprocessing and SUMG) there is no difference between language <empty> and language <unknown>.

Hence the goal would be that all words in the vocabulary are assigned, either to a supported language or to language <unknown>.
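The distinction can be sketched in a few lines of Python. The dictionary below is only a stand-in for the vocabulary; the words, the `EMPTY` and `UNKNOWN` constants and the two helper functions are hypothetical and just mirror the behaviour described above.

```python
UNKNOWN = "?"   # word was checked, but nobody could determine its language
EMPTY = ""      # word has not been checked at all

# Hypothetical vocabulary: word -> assigned language
vocabulary = {
    "Straße": "DE",
    "Main": "EN",
    "xq7#": UNKNOWN,
    "Москва": EMPTY,
}


def not_yet_checked(vocab):
    """Mimics the selection 'language = <empty>': only unchecked words show up."""
    return {word for word, lang in vocab.items() if lang == EMPTY}


def needs_reprocessing(vocab):
    """For reprocessing and SUMG, <empty> and <unknown> are treated alike."""
    return {word for word, lang in vocab.items() if lang in (EMPTY, UNKNOWN)}


print(not_yet_checked(vocabulary))
print(needs_reprocessing(vocabulary))
```

Marking a word as unknown only removes it from the "still to be checked" work list; it does not change how the word is handled later.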

Best regards,

Nils Buerckel

0 Kudos

Hi Nils,

Thank you very much for the response. Though it is an old thread, it was the only relevant one I found, so I gave it a try.

From all your responses in this thread, I understand that if we assign a language to words we are not sure about, they may be converted wrongly and will not appear in SUMG (and this data will not be in a readable format).

It makes more sense to leave such words without any language assignment (or rather with the "unknown" language) so that they still appear in SUMG for repair after the Unicode conversion.

This brings another challenge: as you suggested earlier in this thread, all repairs need to be done before we go live in the production environment. What if we do not repair them and go live? I am afraid that transactions depending on this data might fail. What is the potential impact?

The reason I am asking is that the team is not able to identify the language these words belong to.

Please provide your valuable suggestions that would help us to define the path forward.

Thanks & Regards,

N.Amarnath

nils_buerckel
Product and Topic Expert
0 Kudos

Hi Amarnath,

I have actually not heard of any case where a program or a transaction stopped working due to a wrong conversion of a string. I think this risk is rather low ...

One example of a problematic situation would be a user finding some master data, e.g. address data, that was converted incorrectly. The user corrects one word in this address, let's assume the street name. However, the city name could also be incorrect, and the user did not correct it. Now it is no longer possible to repair this table entry via SUMG (because some words were already corrected manually). All words in this entry have to be repaired manually. This does not sound very dangerous, but if it happens with many entries, it can take a lot of effort to fix. And of course there is the risk that the erroneous address is used in forms which are sent to customers.
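A rough Python sketch of why a partially corrected entry is a problem. How SUMG repairs entries internally is not described in this thread; the sketch only assumes that an automatic repair treats the whole entry as converted with one wrong code page and re-interprets every field. The field names, the sample address and the helpers `mis_convert` and `blanket_repair` are made up.

```python
def mis_convert(text: str) -> str:
    """Simulate a wrong conversion: Cyrillic bytes interpreted as Latin-1."""
    return text.encode("iso8859_5").decode("latin_1")


def blanket_repair(row: dict, wrong_cp: str, right_cp: str) -> dict:
    """Re-interpret every field, assuming all of them were converted with wrong_cp."""
    return {field: value.encode(wrong_cp).decode(right_cp) for field, value in row.items()}


# Both fields of the entry were converted with the wrong code page.
row = {"STREET": mis_convert("Ленина"), "CITY": mis_convert("Москва")}

# While the entry is untouched, the whole entry can be repaired in one step.
untouched_result = blanket_repair(row, "latin_1", "iso8859_5")

# A user corrects only the street by hand; the city stays wrong.
row["STREET"] = "Ленина"

# The same repair no longer fits: the entry now mixes correct and incorrect text.
try:
    blanket_repair(row, "latin_1", "iso8859_5")
    repair_ok = True
except UnicodeEncodeError:
    repair_ok = False

print(untouched_result, repair_ok)
```

In this model the remaining wrong field (the city) has to be fixed by hand, field by field.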

Hence, if you are unable to find out the language of a given word, I would first try to find out in which table the data is stored and then determine the consequences of erroneous data in the corresponding application.

Please note that in some cases, e.g. in table INDX, there is binary data (which has no meaning).

Best regards,

Nils

0 Kudos

Hi Nils,

Thank you very much for your valuable information. I will take this to the business/native speakers to fix and/or to analyze the risk and define the path forward.

Thanks & Regards,

N.Amarnath