on 02-03-2015 1:18 PM
Hi everyone,
I am trying to set up a system replication configuration between two developer HANA instances. (This is meant as a dry run of system replication before we upgrade our HANA One Rev 74 to Rev 80 via replication takeover.)
SITE_A :
# Developer HANA Rev 80 (1.00.80.00.391861) hosted on AWS (via SAP CAL)
SITE_B :
# Developer HANA Rev 80 (1.00.80.00.391861) hosted on Azure (via SAP CAL)
On registering SITE_B as the secondary system (hdbnsutil -sr_register) I get the following error:
error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary] failed
I think I am missing something very basic, because the message is about a replication chain although I am only setting up a basic instance-to-instance replication.
Any help or hint would be great.
I followed the documentation:
Enabling replication mode on SITE_A went without problems:
sid-hdb:/usr/sap/HDB/HDB00> hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: primary
site id: 1
site name: AWSDEV80
Host Mappings:
~~~~~~~~~~~~~~
done.
Step 2 is to register SITE_B as the secondary system. Therefore I stopped the system and afterwards ran this on the SITE_B machine:
vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=async --name=SITEB
adding site ...
checking for inactive nameserver ...
nameserver vhcalhdbdb:30001 not responding.
collecting information ...
error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]
failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.
The log contains:
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764448 i Basis TraceStream.cpp(00396) : ==== Starting hdbnsutil, version 1.00.80.00.391861 (NewDB100_REL), build linuxx86_64 not set 2014-05-23 12:00:36 ld7272.wdf.sap.corp gcc (SAP release 20130806, based on SUSE gcc47-4.7.2_20130108-0.15.45) 4.7.2 20130108 [gcc-4_7-branch revision 195014]
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764562 i Basis TraceStream.cpp(00401) : MaxOpenFiles: 1048576
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764592 i Memory MallocProxy.cpp(01181) : Installed malloc hooks
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764596 w Basis Timer.cpp(00660) : Fallback to system call for HR timer
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764599 i Memory AllocatorImpl.cpp(01219) : Allocators activated
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764601 i Memory AllocatorImpl.cpp(01235) : Using big block segment size 16777216
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764604 e Configuration ConfigStoreManager.cpp(00693) : Configuration directory does not exist.
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764607 e Configuration ConfigStoreManager.cpp(00693) : Configuration directory does not exist.
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764609 i Basis ProcessorInfo.cpp(00746) : Using GDT segment limit to determine current CPU ID
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764611 w Environment Environment.cpp(00286) : Changing environment set IMSLERRPATH=/usr/sap/HDB/HDB00/exe//
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764613 w Environment Environment.cpp(00286) : Changing environment set IMSLSERRPATH=/usr/sap/HDB/HDB00/exe//
[6748]{-1}[-1/-1] 2015-02-03 12:57:01.764617 w Environment Environment.cpp(00286) : Changing environment set SSL_WITH_OPENSSL=0
Again, any help or hint would be great.
Cheers,
Mathias
Hi all,
When setting up replication, must the logical names be the real Linux host names, or can they be chosen freely, like SITEA and SITEB?
Is it enough to put static entries in /etc/hosts, or must the names be resolvable via nslookup?
Regards
GN
Hi all,
We are getting the issue below while registering the secondary,
even after opening all the ports starting with 300**:
error: remoteHost does not match with any host of the source site. all hosts of source and target site must be able to resolve all hostnames of both sites correctly
failed. trace file nameserver_10.104.152.11.00000.000.trc may contain more error details.
same here
Primary system - jdpsap1
Secondary system - jdpsap2
There is communication between the hosts (ping by name and by IP) at the Linux level.
I have to shut down HANA on the "secondary" system, jdpsap2, to run replication.
Now when I try to run the replication command on the "secondary" system jdpsap2, I get an error message that nameserver jdpsap2:30001 is not responding. That is expected, since the HANA engine is stopped on jdpsap2 and the nameserver is shut down there as well.
How can I get this to work anyway?
Hello Mathias,
We did set up replication. One point: we did it via hostnames. Did you check whether your routing tables need to be updated, since SITE_A has to be able to talk to SITE_B?
We have not done it on the developer instances, only on the standard appliances.
Hi all,
I am also getting the error below while setting up replication. Can you help with this?
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271524 e Configuration ConfigStoreManager.cpp(00693) : Configuration directory does not exist.
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271525 e Configuration ConfigStoreManager.cpp(00693) : Configuration directory does not exist.
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271526 i Basis ProcessorInfo.cpp(00746) : Using GDT segment limit to determine current CPU ID
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271527 w Environment Environment.cpp(00286) : Changing environment set IMSLERRPATH=/usr/sap/xxx/HDB00/exe//
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271528 w Environment Environment.cpp(00286) : Changing environment set IMSLSERRPATH=/usr/sap/xxx/HDB00/exe//
[4207]{-1}[-1/-1] 2015-02-21 22:42:05.271530 w Environment Environment.cpp(00286) : Changing environment set SSL_WITH_OPENSSL=0
********> hdbnsutil -sr_register --remoteHost=vsa323634.ash.od.sap.biz --remoteInstance=00 --mode=syncmem --name=vadb**
adding site ...
checking for inactive nameserver ...
nameserver *******:30001 not responding.
collecting information ...
unable to contact primary site host vadb****:30102. connection refused,location=vadb***:30102. Trying old-style port...
error: unable to contact primary site; to vadb***:30001; original error: connection refused,location=vadb***:30001
failed. trace file nameserver_vadb***.00000.000.trc may contain more error details.
Regards,
Srini
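A "connection refused" on 30102 and 30001 can be checked independently of hdbnsutil. The following is only a minimal sketch in Python, assuming the port numbers from the output above; the helper name port_reachable and any host name passed to it are hypothetical and not part of any SAP tool.

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name resolution failures
        return False

# Example call (host name is a placeholder, ports are from the error above):
#   for port in (30102, 30001):
#       print(port, port_reachable("primary-host.example", port))
```

Run it from the secondary host against the primary. A False result only says that the TCP connection could not be opened; it does not tell whether a firewall, a stopped service, or name resolution is the cause.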
Hi Mathias
Did you find a solution? I have the same issue in a newer version, SP85.
- I followed the same guideline.
- Both systems have exactly the same configuration (different OS parameters, of course).
[PRIMARY] known as PRIME >> $ HDB version
HDB version info:
version: 1.00.85.00.397590
branch: NewDB100_REL
git hash: not set
git merge time: not set
weekstone: 0000.00.0
compile date: 2014-11-12 12:39:22
compile host: ld7272.wdf.sap.corp
compile type: opt
[SECONDARY] known as CONTINGENCIA >> $ HDB version
HDB version info:
version: 1.00.85.00.397590
branch: NewDB100_REL
git hash: not set
git merge time: not set
weekstone: 0000.00.0
compile date: 2014-11-12 12:39:22
compile host: ld7272.wdf.sap.corp
compile type: opt
--- X --
STEPS TO ENABLE REPLICATION
--- X ---
From PRIME
hdbnsutil -sr_enable --name=PRIME
From CONTINGENCIA
hdbnsutil -sr_register --remoteHost=<PRIME_hostname> --remoteInstance=00 --mode=sync --name=CONTINGENCIA
adding site ...
checking for inactive nameserver ...
nameserver <CONTINGENCIA_hostname>:30001 not responding.
collecting information ...
// This command runs forever....
--- X --
Checking from PRIME
--- X ---
$ hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: primary
site id: 1
site name: PRIME
Host Mappings:
~~~~~~~~~~~~~~
<PRIME_hostname> -> [PRIME] <PRIME_hostname>
<PRIME_hostname> -> [CONTINGENCIA] <CONTINGENCIA_hostname>
$ hdbcons -e hdbindexserver "replication info"
SAP HANA DB Management Client Console (type '\?' to get help for client commands)
Try to open connection to server process 'hdbindexserver' on system '<SID>', instance '<instance>'
SAP HANA DB Management Server Console (type 'help' to get help for server commands)
Executable: hdbindexserver (PID: 37303)
[OK]
--
Dumping replication statistics ...
Replication Primary Information
===============================
System Replication Primary Configuration
[system_replication] logshipping_timeout = 30
[system_replication] enable_full_sync = false
[system_replication] preload_column_tables = true
[system_replication] ensure_backup_history = true
[system_replication] enable_ssl = off
[system_replication] datashipping_snapshot_max_retention_time = 7200000000
- lastLogPos : 0x2bd106c0
- lastLogPosTimestamp : 06.02.2015-12.10.55 (1423224655944076)
- lastSavepointVersion : 14684
- lastSavepointLogPos : 0x2bd0fb02
- lastSavepointTimestamp : 06.02.2015-12.07.58 (1423224478129835)
0 session registered.
[OK]
--
[EXIT]
--
[BYE]
So no replication can be established.
I've checked at SAP HANA STUDIO >> LANDSCAPE >> SYSTEM REPLICATION from PRIME and both servers are shown, but nothing else.
- Do I have to check anything else on both systems?
- Which LOG file do I have to review?
Thanks!
Hi Sergio,
I'm sorry, but I couldn't resolve my problem yet and I have no good advice for yours.
Maybe the administration guide can help you. (But I'm pretty sure that you have already read through it.) http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf
Two points I missed for some time:
# Did you configure the hostname resolution as described in 4.1.3.16 of the guide?
# Are the hostnames of both systems really different? On my machines the DEV HANA hostnames are always vhcalhdbdb and the HANA One instances are always hanaserver. And I think
collecting information ...
// This command runs forever....
indicates that the hostnames could not be resolved correctly.
Cheers,
Mathias
Hi again Mathias.
I'm sorry to hear that as well.
Well, as you said, I read about this in the Admin Guide, with no success.
And yes, hostnames are different in both systems.
The command runs forever and it does look like a hostname resolution problem. However, netstat shows "ESTABLISHED" connections on both systems, so there is some kind of communication!
I'm still stuck on that....
THANKS!
Hello Sergio,
How are you ?
I have exactly the same issue.... after registering the secondary site.
checking for inactive nameserver ...
nameserver lr002:30001 not responding.
collecting information ...
"// This command runs forever...."
and as well I have established connection.
1st Site
hdbnamese 61026 pr0adm 31u IPv4 3445012 0t0 TCP lp002pr0:30102->lr002:64334 (ESTABLISHED)
2nd site
hdbnsutil 28773 pr0adm 13u IPv4 89708833 0t0 TCP lr002:64334->lp002pr0:30102 (ESTABLISHED)
Did you solve the issue?
Cheers
Mohamed
Hello Mathias,
did you try it via HANA Studio?
What is the hostname of your primary site?
Could you provide the content of global.ini -> [system_replication] from both sites?
Thx!
BR, Bojan
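To compare the [system_replication] section on both sites, the file can also be read with a small script instead of by eye. This is a rough sketch, assuming global.ini can be parsed as a plain INI file; the function name read_section and the placeholder entries in the sample are hypothetical, while the four system_replication values are the ones quoted later in this thread. Since a listing can show the same section header more than once, the parser is created with strict=False so that repeated sections are merged instead of raising an error.

```python
import configparser

def read_section(ini_text: str, section: str = "system_replication") -> dict:
    """Return the key/value pairs of one section, or {} if it is absent."""
    # strict=False merges repeated section headers instead of raising
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return {}
    return dict(parser.items(section))

# Sample text; the [persistence] entries are placeholders, not real settings.
SAMPLE = """\
[persistence]
placeholder_a = 1

[system_replication]
mode = primary
actual_mode = primary
site_id = 1
site_name = AWSDEV80

[persistence]
placeholder_b = 2
"""

print(read_section(SAMPLE))
```

An empty dict for one site means that the file has no [system_replication] section there at all.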
Hi Bojan,
I have to say I'm a bit confused by the term hostname in this context, but in my mind the hostnames are
Content of the global.ini
[auditing configuration]
.... < five entries >
[persistence]
.... < five entries >
[system_replication]
mode = primary
actual_mode = primary
site_id = 1
site_name = AWSDEV80
[persistence]
.... < two entries >
So, there is no system replication entry at all.
I tried it with the studio as well, but it reports the same error:
Thank you for your quick response.
Cheers
Hello Mathias,
Are the hosts reaching each other via IP? I also tried it on my two machines deployed in our cloud and faced connectivity issues when using hosts under certain domains. It was not exactly the error you got, but I resolved it by adding an entry to the /etc/hosts file on each host that maps the IP to the hostname (without domain).
Maybe it's worth a try. Otherwise I am running out of ideas here, as the error you are reporting is quite exotic and definitely misleading.
Let me know your results!
BR, Bojan
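Whether every short hostname of both sites has an /etc/hosts entry can be checked with a few lines of Python. This is only a sketch under assumptions: the helper names parse_hosts and missing_hosts are made up, the host names jdpsap1 and jdpsap2 are borrowed from an earlier post, and the IP addresses are placeholders from the documentation range.

```python
def parse_hosts(text: str) -> dict:
    """Map every host name and alias in /etc/hosts-style text to its IP."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping

def missing_hosts(text: str, required: list) -> list:
    """Return the required host names that have no entry in the text."""
    known = parse_hosts(text)
    return [name for name in required if name not in known]

# Sample content; addresses are placeholders, not real systems.
HOSTS_SAMPLE = """\
127.0.0.1   localhost
# replication hosts
192.0.2.11  jdpsap1.example.com jdpsap1
192.0.2.12  jdpsap2.example.com
"""

print(missing_hosts(HOSTS_SAMPLE, ["jdpsap1", "jdpsap2"]))  # prints ['jdpsap2']
```

In the sample, jdpsap2 is listed only with its domain, so the short name is reported as missing. To check a real host, pass the content of /etc/hosts read with open("/etc/hosts").read(), on each host of both sites.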
Hi Mathias,
I know this could be a little strange, but can you try setting up System Replication from SITE_B to SITE_A instead of from SITE_A to SITE_B?
It would be useful to see whether SITE_A reports the same errors we see on SITE_B when System Replication is being set up.
Otherwise, the error you see is not something we have come across so far, and it may need SAP's attention to dig deeper.
Hi Mathias,
1. Can you check the result of hdbnsutil -sr_state on SITE_B?
2. Try sr_register on SITE_B with --mode=sync or --mode=syncmem.
We need to see whether the other modes produce a different error than async mode.
Send me the results and we can check accordingly.
Running on SITE_B:
vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_state
checking for active or inactive nameserver ...
nameserver vhcalhdbdb:30001 not responding.
nameserver vhcalhdbdb:30001 not responding.
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: none
done.
Testing sync and syncmem shows no difference.
vhcalhdbdb:/vap/usr/sap/HDB/HDB00> HDB info
USER PID PPID %CPU VSZ RSS COMMAND
hdbadm 5208 5207 0.0 12404 2612 -sh
hdbadm 3519 3515 0.0 12408 2636 -sh
hdbadm 7278 3519 0.0 11440 1592 \_ /bin/sh /usr/sap/HDB/HDB00/HDB info
hdbadm 7301 7278 0.0 4664 572 \_ ps fx -U hdbadm -o user,pid,ppid,pcpu,vsz,rss,args
hdbadm 2964 1 0.0 153560 89336 /usr/sap/HDB/HDB00/exe/sapstartsrv pf=/usr/sap/HDB/SYS/profile/HDB_HDB00_vhcalhdbdb -D -u hdbadm
vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=sync --name=SITEB
adding site ...
checking for inactive nameserver ...
nameserver vhcalhdbdb:30001 not responding.
collecting information ...
error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]
failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.
vhcalhdbdb:/vap/usr/sap/HDB/HDB00> hdbnsutil -sr_register --remoteHost=54.77.xxx.yyy --remoteInstance=00 --mode=syncmem --name=SITEB
adding site ...
checking for inactive nameserver ...
nameserver vhcalhdbdb:30001 not responding.
collecting information ...
error: only system replication chains are allowed with an aditional async secondary site: [primary] <------- [secondary] <---async--- [additional secondary]
failed. trace file nameserver_vhcalhdbdb.00000.000.trc may contain more error details.
The trace files don't differ either.
So, I would say there is no difference between the modes.
Thanks a lot.
Cheers.
Hi Mathias,
I got exactly the same problem today.
Replicating
B1X/hosta 97.04 to B1X/hostb 112.02
When I ran hdbnsutil -sr_register on hostb, I got the same error as you.
I managed to resolve the problem.
My issue was that B1X was initially a copy of our production system, which has replication on.
So when I ran hdbnsutil -sr_state on my primary server, it showed replication config from my Production system!
To fix this, I shut down HANA on both systems.
Ran "hdbnsutil -sr_cleanup --force" on both systems.
Next I started HANA on hosta, enabled replication on hosta, then ran hdbnsutil -sr_register on hostb.
This time I did not get the error.
Replication still had problems because I was replicating SPS9 to SPS11 and there is a bug, see
"2312539 - Near Zero Downtime Upgrade To SAP HANA Database SPS11 With ASYNC Replication Mode Doesn't Work"
On siteB I added
[system_replication]
enable_send_ack_in_async_mode = false
to global.ini
then restarted HANA on siteB. Now the replication started!
Regards
Tom
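The leftover configuration Tom describes shows up in the output of hdbnsutil -sr_state, so it can be spotted by comparing the reported site name with the one you expect. A minimal sketch, assuming the plain-text output format shown earlier in this thread; the function names parse_sr_state and looks_stale are hypothetical, and the script only reads text that you captured yourself.

```python
def parse_sr_state(output: str) -> dict:
    """Extract mode, site id and site name from captured sr_state output."""
    wanted = ("mode", "site id", "site name")
    state = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            state[key.strip()] = value.strip()
    return state

def looks_stale(state: dict, expected_site: str) -> bool:
    """True if replication is configured under an unexpected site name."""
    mode = state.get("mode")
    if mode in (None, "none"):
        return False  # no replication configured, nothing left over
    return state.get("site name") != expected_site

# Output copied from the first post of this thread.
SR_STATE_SAMPLE = """\
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: primary
site id: 1
site name: AWSDEV80
Host Mappings:
~~~~~~~~~~~~~~
done.
"""

state = parse_sr_state(SR_STATE_SAMPLE)
print(state, looks_stale(state, "AWSDEV80"))
```

If looks_stale returns True on a system that was created as a copy, the cleanup described above may be worth trying before registering again.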