
Problem with HANA System Replication Configuration

Former Member
0 Kudos

Hi everyone,

I am trying to set up a system replication configuration between two developer HANA instances. (This is meant as a dry run of system replication before we upgrade our HANA One Rev 74 to Rev 80 via replication takeover.)

The main specs

SITE_A :

     # Developer HANA Rev 80 (1.00.80.00.391861) hosted on AWS (via SAP CAL)

SITE_B :    

     # Developer HANA Rev 80 (1.00.80.00.391861) hosted on Azure (via SAP CAL)

Main Problem

On registering SITE_B as the secondary system (hdbnsutil -sr_register) I get the following error:


error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary] failed

I think I am missing something very basic, because the message is about a replication chain, while I am only creating a simple instance-to-instance replication.

Any help or hint would be great.

The whole story

I followed the documentation:

* http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/9049e009-b717-3110-ccbd-e14c277d8...

Enabling replication on SITE_A went without problems:


sid-hdb:/usr/sap/HDB/HDB00> hdbnsutil -sr_state

checking for active or inactive nameserver ...

System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~

mode: primary

site id: 1

site name: AWSDEV80

Host Mappings:

~~~~~~~~~~~~~~

done.

The second step is to register SITE_B as the secondary system. I therefore stopped the system and then ran this on the SITE_B machine:


vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=async --name=SITEB

adding site ...

checking for inactive nameserver ...

nameserver vhcalhdbdb:30001 not responding.

collecting information ...

error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]

failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.

The log contains:


[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764448 i Basis            TraceStream.cpp(00396) : ==== Starting hdbnsutil, version 1.00.80.00.391861 (NewDB100_REL), build linuxx86_64 not set 2014-05-23 12:00:36 ld7272.wdf.sap.corp gcc (SAP release 20130806, based on SUSE gcc47-4.7.2_20130108-0.15.45) 4.7.2 20130108 [gcc-4_7-branch revision 195014]

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764562 i Basis            TraceStream.cpp(00401) : MaxOpenFiles: 1048576

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764592 i Memory           MallocProxy.cpp(01181) : Installed malloc hooks

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764596 w Basis            Timer.cpp(00660) : Fallback to system call for HR timer

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764599 i Memory           AllocatorImpl.cpp(01219) : Allocators activated

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764601 i Memory           AllocatorImpl.cpp(01235) : Using big block segment size 16777216

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764604 e Configuration    ConfigStoreManager.cpp(00693) : Configuration directory does not exist.

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764607 e Configuration    ConfigStoreManager.cpp(00693) : Configuration directory does not exist.

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764609 i Basis            ProcessorInfo.cpp(00746) : Using GDT segment limit to determine current CPU ID

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764611 w Environment      Environment.cpp(00286) : Changing environment set IMSLERRPATH=/usr/sap/HDB/HDB00/exe//

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764613 w Environment      Environment.cpp(00286) : Changing environment set IMSLSERRPATH=/usr/sap/HDB/HDB00/exe//

[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764617 w Environment      Environment.cpp(00286) : Changing environment set SSL_WITH_OPENSSL=0

Again, any help or hint would be great.

Cheers,

Mathias

Accepted Solutions (0)

Answers (7)


greg_niecka
Participant
0 Kudos

Hi all,

When setting up replication, must the logical names be the real Linux host names, or can they be chosen freely, like SITEA and SITEB?

Is it enough to put static entries in /etc/hosts, or must the names resolve via nslookup?

Regards

GN

Former Member
0 Kudos

Hi Greg,

hdbnsutil -sr_register --remoteHost=<!!! name on the DB !!!> --remoteInstance=00 --mode=async --name=<just a logical name to indicate the location (SITE_B, SECSITE, SITESID...)>

The important name is "remoteHost": it must match the host name as it is known on your DB.

Regards.

Mohamed

0 Kudos

Hi all,

we are getting the issue below while registering the secondary,

even after opening all the ports starting with 300**:

error: remoteHost does not match with any host of the source site. all hosts of source and target site must be able to resolve all hostnames of both sites correctly

failed. trace file nameserver_10.104.152.11.00000.000.trc may contain more error details.

greg_niecka
Participant
0 Kudos

same here

Primary system - jdpsap1

Secondary system - jdpsap2

The hosts can reach each other at the Linux level (ping by name and by IP).

I have to shut down HANA on the "secondary" system, jdpsap2, to set up replication.

When I then run the replication command on jdpsap2, I get the error message that nameserver jdpsap2:30001 is not responding. That is to be expected, since the HANA engine is stopped on jdpsap2 and the nameserver is shut down with it.

How do I get this to work anyway?

Former Member
0 Kudos

Hallo Mathias,

We did set up replication, with one difference: we did it via hostnames. Did you check whether your routing tables need to be updated? SITE_A has to be able to talk to SITE_B.

We have not done it on the developer instances, only on the standard appliances.

0 Kudos

Hi all,

I am also getting the error below while setting up replication. Can you help with this?

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271524 e Configuration    ConfigStoreManager.cpp(00693) : Configuration directory does not exist.

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271525 e Configuration    ConfigStoreManager.cpp(00693) : Configuration directory does not exist.

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271526 i Basis            ProcessorInfo.cpp(00746) : Using GDT segment limit to determine current CPU ID

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271527 w Environment      Environment.cpp(00286) : Changing environment set IMSLERRPATH=/usr/sap/xxx/HDB00/exe//

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271528 w Environment      Environment.cpp(00286) : Changing environment set IMSLSERRPATH=/usr/sap/xxx/HDB00/exe//

[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271530 w Environment      Environment.cpp(00286) : Changing environment set SSL_WITH_OPENSSL=0

********> hdbnsutil -sr_register --remoteHost=vsa323634.ash.od.sap.biz  --remoteInstance=00 --mode=syncmem --name=vadb**

adding site ...

checking for inactive nameserver ...

nameserver *******:30001 not responding.

collecting information ...

unable to contact primary site host vadb****:30102. connection refused,location=vadb***:30102. Trying old-style port...

error: unable to contact primary site; to vadb***:30001; original error: connection refused,location=vadb***:30001

failed. trace file nameserver_vadb***.00000.000.trc may contain more error details.

Regards,

Srini

Former Member
0 Kudos

Hi Srinivas,

There seems to be a communication problem between your primary and secondary sites. Can you check whether ping and telnet to the name server port and the index server port work?
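A minimal sketch of such a port check, in case telnet is not installed. It assumes the two nameserver ports follow the pattern visible in the hdbnsutil output above for instance 00 (30001 as the old-style port, 30102 as the one tried first), and it assumes nc (netcat) is available. The function names sr_ports and check_sr_ports are made up for this illustration.

```shell
#!/bin/sh
# Derive the two nameserver ports hdbnsutil tried in the output above.
# Assumption: 3<nn>01 (old-style) and 3<nn+1>02, inferred from 30001 and
# 30102 for instance 00 in this thread.
sr_ports() {
    nn=${1#0}      # drop one leading zero so "08" is not read as octal
    nn=${nn:-0}
    printf '3%02d01 3%02d02\n' "$nn" "$((nn + 1))"
}

# Try a plain TCP connect to each port (hypothetical helper, needs nc).
check_sr_ports() {
    host=$1
    for port in $(sr_ports "$2"); do
        if nc -z -w 5 "$host" "$port"; then
            echo "open   $host:$port"
        else
            echo "closed $host:$port"
        fi
    done
}
```

Run from the secondary it would be called as check_sr_ports <primary_hostname> 00. A "closed" result only says the TCP connect failed; it does not tell a firewall apart from a stopped service.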

Regards,

Pavan Gunda

saphanabcn
Newcomer
0 Kudos

Hi Srinivas,

Have you tried a netstat from the primary server "SITEA"?

netstat -na | grep <ip_SITEB>

Check if you can see any signal from the other site.

0 Kudos

Hi Sergio Gonzalez Hitachi

Have you tried a netstat from the primary server "SITEA"?

netstat -na | grep <ip_SITEB>

It is not showing anything,

but telnet and ping are working fine.

0 Kudos

Hi Hitachi,

finally it is working.

Issue:

The statistics server is embedded on the primary site.

Both hosts have the same hostname.

Solution:

Make the statistics server embedded on the target side as well.

Change the hostname on the target side.

With that, the issue is resolved.
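A small sketch of the hostname check, assuming that what matters is the short host name (without domain) and that case is ignored. short_name and same_short_name are hypothetical helper names, not part of any SAP tool.

```shell
#!/bin/sh
# Reduce a host name to its short, lower-case form.
# Note: do not pass IP addresses, they would be cut at the first dot.
short_name() {
    printf '%s\n' "${1%%.*}" | tr '[:upper:]' '[:lower:]'
}

# Succeeds (exit 0) when both names reduce to the same short host name.
same_short_name() {
    [ "$(short_name "$1")" = "$(short_name "$2")" ]
}

# Example use on one site, with the other site's name filled in by hand:
# if same_short_name "$(hostname)" "<other_site_hostname>"; then
#     echo "host names collide; rename one side before sr_register"
# fi
```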

thank you for your response

Regards,

Srini

Former Member
0 Kudos

Hi Mathias

Did you manage to find a solution? I have the same issue on a newer revision, Rev 85.

- I followed the same guideline.

- Both systems have exactly the same configuration (different OS parameters, of course).

[PRIMARY] known as PRIME >> $ HDB version

HDB version info:

  version:             1.00.85.00.397590

  branch:              NewDB100_REL

  git hash:            not set

  git merge time:      not set

  weekstone:           0000.00.0

  compile date:        2014-11-12 12:39:22

  compile host:        ld7272.wdf.sap.corp

  compile type:        opt

[SECONDARY] known as CONTINGENCIA >> $ HDB version

HDB version info:

  version:             1.00.85.00.397590

  branch:              NewDB100_REL

  git hash:            not set

  git merge time:      not set

  weekstone:           0000.00.0

  compile date:        2014-11-12 12:39:22

  compile host:        ld7272.wdf.sap.corp

  compile type:        opt

--- X --

STEPS TO ENABLE REPLICATION

--- X ---

From PRIME

hdbnsutil -sr_enable --name=PRIME

From CONTINGENCIA

hdbnsutil -sr_register --remoteHost=<PRIME_hostname> --remoteInstance=00 --mode=sync --name=CONTINGENCIA

adding site ...

checking for inactive nameserver ...

nameserver <CONTINGENCIA_hostname>:30001 not responding.

collecting information ...

// This command runs forever....

--- X --

Checking from PRIME

--- X ---

$ hdbnsutil -sr_state

checking for active or inactive nameserver ...

System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~

mode: primary

site id: 1

site name: PRIME

Host Mappings:

~~~~~~~~~~~~~~

<PRIME_hostname> -> [PRIME] <PRIME_hostname>

<PRIME_hostname> -> [CONTINGENCIA] <CONTINGENCIA_hostname>

$ hdbcons -e hdbindexserver "replication info"

SAP HANA DB Management Client Console (type '\?' to get help for client commands)

Try to open connection to server process 'hdbindexserver' on system '<SID>', instance '<instance>'

SAP HANA DB Management Server Console (type 'help' to get help for server commands)

Executable: hdbindexserver (PID: 37303)

[OK]

--

Dumping replication statistics ...

Replication Primary Information

===============================

System Replication Primary Configuration

[system_replication] logshipping_timeout                      = 30

[system_replication] enable_full_sync                         = false

[system_replication] preload_column_tables         = true

[system_replication] ensure_backup_history         = true

[system_replication] enable_ssl                    = off

[system_replication] datashipping_snapshot_max_retention_time = 7200000000

- lastLogPos               : 0x2bd106c0

- lastLogPosTimestamp      : 06.02.2015-12.10.55 (1423224655944076)

- lastSavepointVersion     : 14684

- lastSavepointLogPos      : 0x2bd0fb02

- lastSavepointTimestamp   : 06.02.2015-12.07.58 (1423224478129835)

0 session registered.

[OK]

--

[EXIT]

--

[BYE]

So no replication can be established.

I've checked at SAP HANA STUDIO >> LANDSCAPE >> SYSTEM REPLICATION from PRIME and both servers are shown, but nothing else.

- Do I have to check anything else on both systems?

- Which LOG file do I have to review?

Thanks!

Former Member
0 Kudos

Hi Sergio,

I'm sorry, but I couldn't resolve my problem yet and I have no good advice for yours.

Maybe the administration guide can help you. (But I'm pretty sure that you have already read through it.) http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

Two points I missed for some time:

# Did you configure the hostname resolution as described in section 4.1.3.16 of the guide?

# Are the hostnames of both systems really different? On my machines, the DEV HANA hostname is always vhcalhdbdb and the HANA One hostname is always hanaserver. And I think that


collecting information ...

// This command runs forever....

indicates that the hostnames could not be resolved correctly.

Cheers,

Mathias

Former Member
0 Kudos

Hi again Mathias.

I'm sorry to hear that you are stuck as well.

As you guessed, I had already read about this in the Admin Guide, with no success.

And yes, the hostnames are different on both systems.

Although the command runs forever and it looks like a hostname resolution problem, netstat shows "ESTABLISHED" connections on both systems, so there is some kind of communication!

I'm still stuck on that....

THANKS!

Former Member
0 Kudos

Hello Sergio,

How are you ?

I have exactly the same issue.... after registering the secondary site.

checking for inactive nameserver ...

nameserver lr002:30001 not responding.

collecting information ...


"// This command runs forever...."


and I have an established connection as well.

1st Site

hdbnamese 61026 pr0adm   31u  IPv4 3445012  0t0  TCP lp002pr0:30102->lr002:64334 (ESTABLISHED)

2nd site

hdbnsutil 28773 pr0adm   13u  IPv4 89708833  0t0  TCP lr002:64334->lp002pr0:30102 (ESTABLISHED)

Did you solve the issue?

Cheers

Mohamed

Former Member
0 Kudos

Hi Guys,

The sr_register command appearing to hang can also indicate an MTU size problem on the network connection between the nodes. If you are using Jumbo frames then drop down to an MTU of 1500 and see what happens.
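One way to test this, sketched for Linux with the iputils ping, whose -M do option forbids fragmentation. For IPv4 the ICMP payload is the MTU minus 28 bytes (20 for the IP header, 8 for the ICMP header). icmp_payload and probe_mtu are made-up helper names.

```shell
#!/bin/sh
# ICMP payload size that exactly fills a given IPv4 MTU.
icmp_payload() {
    echo "$(( $1 - 28 ))"
}

# Send three unfragmentable pings sized for the given MTU (default 1500).
# Assumes Linux iputils ping: -M do = don't fragment, -s = payload size.
probe_mtu() {
    host=$1
    mtu=${2:-1500}
    ping -M do -c 3 -s "$(icmp_payload "$mtu")" "$host"
}
```

If probe_mtu <other_site_hostname> 9000 fails while probe_mtu <other_site_hostname> 1500 succeeds, something on the path is not passing jumbo frames. This assumes ICMP is allowed between the sites.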

Sander

Bojan-lv-85
Advisor
0 Kudos

Hello Mathias,

did you try it via HANA Studio?

What is the hostname of your primary site?

Could you provide the content of global.ini -> [system_replication] from both sites?

Thx!

BR, Bojan

Former Member
0 Kudos

Hi Bojan,

I have to say I'm a bit confused by the term hostname in this context, but in my mind the hostnames are

  • SITE_A (primary) : 54.77.xx.yy (so my AWS elastic IP) - I also tried the public DNS (ec2-54-77-xx-yy.eu-west-1.compute.amazonaws.com)
  • SITE_B (secondary) : hanadevrev8-han-4c7xxxx-xxxx-xxxx-abad-1c2xxxxxx3d3.cloudapp.net

Content of the global.ini

  • SITE_A (primary) :

[auditing configuration]

.... < five entries >

[persistence]

.... < five entries >

[system_replication]

mode = primary

actual_mode = primary

site_id = 1

site_name = AWSDEV80

  • SITE_B (secondary)

[persistence]

.... < two entries >

     So, there is no system replication entry at all.

I tried it with the studio as well, but it reports the same error.

Thank you for your quick response.

Cheers

Bojan-lv-85
Advisor
0 Kudos

Hello Mathias,

can the hosts reach each other via IP? I also tried it on my two machines deployed in our cloud and faced connectivity issues when using hosts under certain domains. It was not exactly the error you got, but I resolved it by adding an entry to the /etc/hosts file on each host that maps the IP to the hostname (without domain).
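As an illustration, such a line could be built like this. The IP address and host name in the usage note are placeholders from the documentation ranges, and hosts_entry is a hypothetical helper, not an existing tool.

```shell
#!/bin/sh
# Build an /etc/hosts line that maps an IP to the host name without its
# domain part.
hosts_entry() {
    ip=$1
    fqdn=$2
    printf '%s %s\n' "$ip" "${fqdn%%.*}"
}

# After adding the line on each host (as root), confirm that the name
# resolves locally:
# getent hosts <short_hostname>
```

For example, hosts_entry 192.0.2.10 hana-a.example.com prints "192.0.2.10 hana-a", which would then be appended to /etc/hosts on the other host.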

Maybe it's worth a try. Otherwise I am running out of ideas here, as the error you are reporting is quite exotic and definitely misleading.

Let me know your results!
BR, Bojan

Former Member
0 Kudos

Hi Mathias,

I know this may sound a little strange, but can you try setting up System Replication from SITE_B to SITE_A instead of from SITE_A to SITE_B?

It would be useful to see whether SITE_A reports the same errors we see on SITE_B when System Replication is being set up.

Otherwise, the error you see is not something we have come across so far, and it may need SAP's attention to dig deeper.

Former Member
0 Kudos

Hi Mathias,


1. Can you check the result of hdbnsutil -sr_state on SITE_B?

2. Try the sr_register on SITE_B with --mode=sync or --mode=syncmem.

We need to see whether the other modes produce a different error than async mode.

Send me the results and we can check accordingly.

Former Member
0 Kudos

Running on SITE_B:


vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_state

checking for active or inactive nameserver ...

nameserver vhcalhdbdb:30001 not responding.

nameserver vhcalhdbdb:30001 not responding.

System Replication State

~~~~~~~~~~~~~~~~~~~~~~~~

mode: none

done.

Testing sync and syncmem shows no difference.


vhcalhdbdb:/vap/usr/sap/HDB/HDB00> HDB info

USER       PID  PPID %CPU    VSZ   RSS COMMAND

hdbadm    5208  5207  0.0  12404  2612 -sh

hdbadm    3519  3515  0.0  12408  2636 -sh

hdbadm    7278  3519  0.0  11440  1592  \_ /bin/sh /usr/sap/HDB/HDB00/HDB info

hdbadm    7301  7278  0.0   4664   572      \_ ps fx -U hdbadm -o user,pid,ppid,pcpu,vsz,rss,args

hdbadm    2964     1  0.0 153560 89336 /usr/sap/HDB/HDB00/exe/sapstartsrv pf=/usr/sap/HDB/SYS/profile/HDB_HDB00_vhcalhdbdb -D -u hdbadm

vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=sync --name=SITEB

adding site ...

checking for inactive nameserver ...

nameserver vhcalhdbdb:30001 not responding.

collecting information ...

error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]

failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.

vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=syncmem --name=SITEB

adding site ...

checking for inactive nameserver ...

nameserver vhcalhdbdb:30001 not responding.

collecting information ...

error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]

failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.

The trace files do not differ either.

So, I would say there is no difference between the modes.

Thanks a lot.

Cheers.

Former Member
0 Kudos

Hi Mathias,

I got exactly the same problem today.

Replicating

B1X/hosta 97.04 to B1X/hostb 112.02

When I ran hdbnsutil -sr_register on hostb, I got the same error as you.

I managed to resolve the problem.

My issue was that B1X was initially a copy of our production system, which has replication switched on.

So when I ran hdbnsutil -sr_state on my primary server, it showed the replication config from my production system!

To fix this, I shut down HANA on both systems.

I ran "hdbnsutil -sr_cleanup --force" on both systems.

Next I started HANA on hosta, enabled replication on hosta, and then ran hdbnsutil -sr_register on hostb.

This time I did not get the error.
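The order of those steps can be sketched as a dry run that only prints the commands and never executes them. The site names SITEA and SITEB, the default mode and the function name sr_reset_plan are placeholders chosen for this illustration. The HDB and hdbnsutil calls are the ones used in this thread and would be run by hand as the <sid>adm user on the site named at the start of each line.

```shell
#!/bin/sh
# Print (not run) the cleanup and re-registration order described above.
# Arguments: primary host name, replication mode, instance number.
sr_reset_plan() {
    primary_host=$1
    mode=${2:-async}
    instance=${3:-00}
    cat <<EOF
both sites: HDB stop
both sites: hdbnsutil -sr_cleanup --force
primary:    HDB start
primary:    hdbnsutil -sr_enable --name=SITEA
secondary:  hdbnsutil -sr_register --remoteHost=$primary_host --remoteInstance=$instance --mode=$mode --name=SITEB
EOF
}
```

Calling sr_reset_plan hosta prints the five steps for review. Since sr_cleanup removes the existing replication configuration, it seems wise to look at hdbnsutil -sr_state first and confirm that the configuration shown really is stale.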

Replication still had problems because I was replicating SPS9 to SPS11 and there is a bug, see

"2312539 - Near Zero Downtime Upgrade To SAP HANA Database SPS11 With ASYNC Replication Mode Doesn't Work"


On siteB I added


[system_replication]

enable_send_ack_in_async_mode = false


to  global.ini


I then restarted HANA on siteB, and now the replication started!


Regards


Tom