
HANA Replication Error

Former Member

Hello Experts,

After enabling replication, we found that it gets stuck at 2% and the indexserver shows the following errors:

TrexNet          EndPoint.cpp(00260) : ERROR: failed to open channel <2ndry_HostIP>:33102! reason: (connection refused)

TrexNet          EndPoint.cpp(00260) : details:

TNS              TNSClient.cpp(00671) : sendRequest dr_secondaryactivestatus to <2ndry_Hostname>:33102 failed with NetException. data=(S)host=<2ndry_Hostname>|service=statisticsserver|(I)drsender=1|port=33005|

TNS              TNSClient.cpp(00671) : sendRequest dr_secondaryactivestatus to <2ndry_Hostname>:33001 failed with NetException. data=(S)host=<2ndry_Hostname>|service=statisticsserver|(I)drsender=1|port=33005|

sr_nameserver    TNSClient.cpp(06915) : error when sending request 'dr_secondaryactivestatus' to <2ndry_Hostname>:33102: connection refused,location=<2ndry_Hostname>:33001

The same text appears for nameserver, indexserver, etc.
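For anyone triaging similar traces, a rough sketch of how the "failed to open channel" lines could be condensed into one line per endpoint and reason. The function name `summarize_channel_errors` is hypothetical, and the pattern only assumes the line format shown in the excerpts above; it is not an SAP tool.

```shell
# Hypothetical helper: reads trace text on stdin and prints
# "<count>x <host:port> <reason>" for each distinct failing endpoint.
# Assumes lines shaped like:
#   ... failed to open channel <host>:<port>! reason: (<reason>)
summarize_channel_errors() {
  # Keep only matching lines, reduced to "<host:port> <reason>".
  sed -n 's/.*failed to open channel \([^!]*\)! reason: (\([^)]*\)).*/\1 \2/p' |
    # Count identical endpoint/reason pairs.
    awk '{ count[$0]++ } END { for (k in count) print count[k] "x " k }' |
    sort
}
```

It could be fed a trace file via stdin, for example `summarize_channel_errors < some_trace_file.trc`, where the file name is a placeholder. A mix of "connection refused" and timeout reasons for the same host would already hint at the network path rather than at HANA itself.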

The end of the trace shows the following text:

Stream NetworkChannelCompletion.cpp(00524) : NetworkChannelCompletionThread #0 NetworkChannel FD 173 [0x00007ff18449e158]  {refCnt=6, idx=0} 172.28.90.185/33103_tcp-><2ndry_HostIP>/56197_tcp Connected,[-w--]

: Error in asynchronous stream event: exception  1: no.2110001  (Basis/IO/Stream/impl/NetworkChannelCompletion.cpp:450)

    Generic stream error: getsockopt, Event=EPOLLERR - , rc=110: Connection timed out

$NetworkChannelBase$=

NetworkChannel FD 173 [0x00007ff18449e158]  {refCnt=6, idx=0} <1ry_HostIP>/33103_tcp-><2ndry_HostIP>/56197_tcp Connected,[-w--]

exception throw location:

1: 0x00007ff6fed5d6ca in Stream::NetworkChannelCompletionThread::run(void*&)+0x686 at NetworkChannelCompletion.cpp:450 (libhdbbasis.so)

To me it looks like a network issue, but I am not able to reach any conclusion. The HANA revision used is Rev91.

Thanks

Dev

Accepted Solutions (0)

Answers (3)

Former Member

Hello All,

We had a problem at the network switch. Our network team has corrected it, and replication seems to be fine now.

Thanks

Dev

Former Member

Hi Dev,

Thanks for giving feedback to the community.

Please mark the thread as resolved.

KR,

Amerjit

0 Kudos

Hi Dev,

Regarding the solution your network team implemented, could you elaborate on what was changed at the network layer?

Regards,

Edgar

Former Member

Hi,

Did you update the HANA client on SLT and the source system along with the HANA server?

This looks to me like a client compatibility issue. Although the HANA client is upward and downward compatible, it may cause issues in particular cases. Please check along these lines.

Best Regards

Sachin

Former Member

Hi Devpriy,

According to the log, this is a network issue. Could you please try reactivating the replication?

The Note below may not apply to your case, but the log mentioned in the Note shows the same error.

2145902 - SAP HANA DB: XSEngine Does Not Start After Configuring System Replication on Multitenant D...


Regards,

Pavan Gunda

Former Member

Thanks, Pavan. I tried reactivation too, but no luck.

Former Member

Hello Dev,

I will just assume that you have religiously followed the System Replication Setup guides and have also respected the network requirements for replication.

Please check your secondary.

As <sidadm> on the secondary:

1. HDB status (you should see running HDB processes)

2. hdbnsutil -sr_state (this will show you your replication setup and state)

3. Check on your secondary that port 33102 is being listened on. I believe it's the hdbpreprocessor that listens on this port.

netstat -an | grep 33102 | grep LISTEN

lsof -i :33102 | grep LISTEN
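The commands above only show that something is listening locally on the secondary. As a complementary sketch, the same port could be probed from the primary to see whether the connection is refused or times out, which are the two reasons that appear in the traces in this thread. The function name `check_port` and the 5-second default are assumptions, and the probe relies on bash's `/dev/tcp` feature plus the coreutils `timeout` command being available.

```shell
# Hypothetical helper: classify TCP reachability of host:port from this machine.
# Prints one of: open, timeout, refused-or-unreachable.
check_port() {
  local host=$1 port=$2 wait_s=${3:-5} rc=0
  # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
  # `timeout` bounds the attempt and exits with 124 if the limit is hit.
  timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" \
    2>/dev/null || rc=$?
  case $rc in
    0)   echo open ;;
    124) echo timeout ;;
    *)   echo refused-or-unreachable ;;
  esac
}
```

Run on the primary, a call might look like `check_port "$SECONDARY_HOST" 33102`, where `SECONDARY_HOST` is a placeholder variable. A "timeout" result while the secondary reports the port as listening would point toward something in between (firewall, routing, switch) rather than toward the HANA services.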

Kind Regards,

Amerjit

Former Member

Hello Amerjit -

Thanks -

1. Both the setup steps and the network settings were taken care of; we had a working system, which we cleaned up in order to build the new one.

2. All the processes are running fine on the secondary.

3. hdbnsutil -sr_state indicated that we should set up the replication again, but at 4% the same message appears again because of a timeout, as the log below shows:

sr_nameserver    TNSClient.cpp(06915) : error when sending request 'dr_secondaryactivestatus' to <2ndryhostname>:33102: timeout occured,location=<2ndryhostname>:33102

4. hdbnameserver is listening on port 33102.

Thanks

Dev

Former Member

Hi Dev,

You spotted my error (preprocessor as opposed to nameserver).

Check your nameserver (and other) trace files on the secondary.

Just as a side comment:

1. Make sure that column preload is false and that you have set a global allocation limit.

2. Make sure that your /etc/hosts is correct on both the PRI and SECO nodes.
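For the /etc/hosts point, one possible way to compare name resolution on both nodes is to print what each host name resolves to and diff the output from PRI and SECO. This is only a sketch: the function name `show_resolution` is hypothetical, and it assumes `getent` is available (it consults /etc/hosts and DNS according to the system's resolver configuration).

```shell
# Hypothetical helper: print "<name> -> <ip>" for each host name given,
# or "<name> -> UNRESOLVED" if the lookup returns nothing.
show_resolution() {
  local name ip
  for name in "$@"; do
    # Take the first address reported by the system resolver.
    ip=$(getent hosts "$name" 2>/dev/null | awk 'NR == 1 { print $1 }')
    echo "$name -> ${ip:-UNRESOLVED}"
  done
}
```

Running it with the primary and secondary host names on each node and comparing the two outputs should make a mismatched or missing entry easy to spot.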

Amerjit

Former Member

Hello Amerjit -

Could you please help me with column preload and the global allocation limit?

I have only one system running, so I am not sure whether I really need a global allocation limit.

Thanks

Dev

Former Member

Hello Dev,

The column preload and GAL were just side comments.

You need to check in your nameserver traces on the SECO to see what's happening during the failed replication.

It could be a bug but I can't really comment as I'm not on rev91.

KR,

Amerjit