on 04-11-2008 8:13 AM
Hi folks
I am trying to start sapinst. It starts, and even the GUI comes up correctly, but after about 10 seconds sapinst terminates, stating that the GUI did not log in properly. The strange thing is that I was able to start sapinst correctly once or twice.
host1:/Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64 # ./sapinst
[==============================] | extracting... done!
guiengine: no GUI connected; waiting for a connection on host migzm210, port 21212 to continue with the installation
guiengine: login in process.
..............................
guiengine: login timeout; the client was unable to establish a valid connection
CSynEvent::~CSynEvent: an error occured;: Success
Suse, sapinst and java versions:
host1:/Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64 # cat /etc/SuSE-release
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 1
host1:/Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64 # ./sapinst -v
[==============================] | extracting... done!
This is SAPinst, version 642, build 917371
compiled on Jul 30 2007, 02:42:28
host1:/Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64 # java -version
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 2.2)
IBM J9SE VM (build 2.2, J2RE 1.4.2 IBM J9 2.2 Linux amd64-64 j9xa64142ifx-20070808 (JIT enabled)
J9VM - 20070807_1500_LHdSMr
JIT - r7_level20070315_1745)
Has anyone had this before?
Regards
Michael
I am not sure, but port 21212 on host migzm210 might still be in use by another process, e.g. one left over from a previous, aborted run of sapinst.
So look for such a process and stop it. If everything else fails, a reboot could sort it out.
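One way to look for such a leftover process is sketched below. This is only an illustration: `listeners_on_port` is a hypothetical helper name, and it assumes the usual column layout of `netstat -tlnp` output (local address in column 4, state in column 6, PID/program in column 7).

```shell
#!/bin/sh
# listeners_on_port PORT
# Hypothetical helper: reads `netstat -tlnp` style output on stdin and prints
# the local address and PID/program name of every LISTEN socket on PORT.
listeners_on_port() {
    awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $4, $7 }'
}

# Example (run as root so that netstat can show the owning process):
#   netstat -tlnp | listeners_on_port 21212
```

If this prints nothing, no process is listening on the port, and a stale sapinst is unlikely to be the cause.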
hope this helps
Nope, when the port is in use, this message appears instead:
guiengine: call to bind() for socket 3 failed. Address already in use
ERROR 2008-04-11 12:22:40 [iaxxgenimp.cpp:532]
init()
FGE-00006 Attempt to open a communication port connection failed. Check whether the port 21212 is already in use.
ERROR 2008-04-11 12:22:40
FCO-00034 An error occurred during the installation. Problem: error in GUI server subsystem.
But I tried different ports as well, and they behave the same. I can even see the GUI connect, but the server still disconnects after 10 seconds. I also started the server with --nogui and connected from another server: same problem.
Regards
Michael
Hi again
I checked iptables: all chains are on the ACCEPT policy. I can also telnet to port 21212 without problems, and SELinux is not active either.
I also got the latest sapinst from SWDC (which was actually older than the one on the master DVD): same problem. Even a NW 7.0 sapinst is not working. I am going to open an OSS message now. Thanks so far, but I am still waiting for input.
Best regards
Michael
This is REALLY strange.
I just installed yesterday a system where everything worked as expected...
Another "guess":
- check the ulimits for root (/etc/security/limits.conf) + ulimit -a
- check JAVA_HOME for the root user
- use a "remote" sapinst
By the last point I mean starting sapinst on the Linux system without a DISPLAY variable set, then running the sapinst GUI on another machine and connecting to the Linux system. That's how I do the installation if I have no possibility to run X remotely...
Markus
Hi,
I'm getting exactly this problem on a RHEL cluster.
Can you give me more info on the fix from SUSE so I can try to match it to RHEL?
Many Thanks
Hi,
the SAP LinuxLab is currently investigating what is going wrong here. We were able to reproduce the problem and are working together with Novell to fix the issue. When we're done, I'm going to post the solution. Until then, I have locked the topic.
Thanks,
Hannes
Hi everybody
The SAP LinuxLab and SUSE were able to reproduce the error. The problem turned out to be a Linux kernel bug. If you are having the problem, please check whether you see strange signal queue values:
cat /proc/<pid>/status
Here, pid is the process ID of the sapinst process. You most probably have the bug if you see something like this: SigQ: 18446744073709545431/71679
Normally the first number is smaller than the second. This, for example, is OK: SigQ: 0/71679
SUSE will include the fix in the next maintenance update of the kernel, so be sure to apply it if you run into the issue.
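The SigQ check can be scripted. The following is a minimal sketch, not an official tool: `check_sigq` is a hypothetical helper name, and it assumes only the `SigQ: queued/limit` format shown above.

```shell
#!/bin/sh
# check_sigq
# Hypothetical helper: reads the contents of /proc/<pid>/status on stdin and
# reports whether the queued-signal count exceeds its limit.
# Exit status: 0 = ok, 1 = suspicious, 2 = no SigQ line found.
check_sigq() {
    awk '
        BEGIN { found = 0; bad = 0 }
        $1 == "SigQ:" {
            found = 1
            split($2, q, "/")
            # awk compares these as floating point, which is precise enough
            # to tell a huge wrapped-around value from a small limit.
            if (q[1] + 0 > q[2] + 0) {
                bad = 1
                print "suspicious: " $2
            } else {
                print "ok: " $2
            }
        }
        END {
            if (!found) {
                print "no SigQ line found"
                exit 2
            }
            exit bad
        }
    '
}

# Example, with <pid> replaced by the process ID of sapinst:
#   check_sigq < /proc/<pid>/status
```

The value 18446744073709545431 is just below 2^64, which suggests a counter that went negative and wrapped around. That reading is an interpretation, not something stated in the fix description.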
Thanks to everybody for your valuable input! Best regards
Michael
Hi,
check /etc/hosts for the localhost entry and try setting it to the server's IP address (not 127.0.0.1). I had a similar problem and solved it somehow by changing the JAVA_HOME variable and editing /etc/hosts.
Regs,
FS
Michael
the remote sapinst GUI was started from a Win2k PC.
If I start sapinst with SAPINST_START_GUI=false as either root or my adm user, my remote sapinst connection fails with the SSL error.
I'm able to continue using sapinst running locally on the SLES10 server, so although it is a pain, it is not a showstopper.
regards,
Stephen
Hi
I'm also having the same problem when using the nogui option.
When I start my local sapinst GUI and connect to the remote server, it times out after 10 seconds with:
Network input/output exception has occurred: Remote host closed connection during handshake
javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
I also had an issue when starting sapinst on my Linux SLES10 server as root without setting the nogui option.
However, that works now.
I still have the SSL handshake issue.
Well, I can update you with my status:
After several contacts with the developer, he provided a sapinst that deliberately skips the handshake check. They are still figuring out what is going wrong. I will post here when the issue is finally resolved.
@Stephen: I am not fully convinced that we have exactly the same problem, although we obviously both get an error during the handshake of the GUI-to-sapinst connection. In our case the problem occurs whether or not -nogui is specified. Even worse, if we start sapinst as sidadm, it always works; if we start it as root, the handshake works only very rarely. Did you start the remote GUI on a SLES10 box as well, or from a Windows client?
Regards
Michael
Hey Michael,
this might not be related to the error, but it is worth mentioning. Instead of calling sapinst this way
host1:/Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64 # ./sapinst
please use the following method instead:
host1:~ # mkdir /tmp/sapinst_install
host1:~ # cd /tmp/sapinst_install
host1:/tmp/sapinst_install # /Installation_Master_6.20_6.40_07_07/IM_LINUX_X86_64/SAPINST/UNIX/LINUXX86_64/sapinst
Thanks,
Hannes
@Markus, Hannes: I tried your suggestions, no luck so far. I checked the limits and JAVA_HOME, and tried the mkdir /tmp/sapinst_install approach. I also tried with a remote GUI (it is possible to start sapinst with --nogui). The problem seems to be sapinst itself, not the GUI. As soon as I connect with the GUI, sapinst terminates after a short time.
I have not received any answer from SAP support so far, but I am waiting for some trace and debug options for sapinst.
Thanks so far, i will post any new findings, regards
Michael
Hi Markus
We had to add -f so that the forked processes are traced as well:
strace -fF <sapinst> 2> strace_sapinst.log &
The problem seems to be a child process or thread that dies. Here is the relevant part:
[pid 9199] clone(Process 9200 attached
child_stack=0x42003260, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x420039d0, tls=0x42003940, child_tidptr=0x420039d0) = 9200
[pid 9199] recvfrom(5, <unfinished ...>
[pid 9200] write(2, "\nguiengine: login in process.", 29
guiengine: login in process.) = 29
[pid 9200] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 9200] rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
[pid 9200] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 9200] nanosleep({1, 0}, <unfinished ...>
[pid 9199] <... recvfrom resumed> "\0s<?xml version=\"1.0\"?><sapinstg"..., 1000, 0, NULL, NULL) = 117
[pid 9199] getpeername(5, {sa_family=AF_INET, sin_port=htons(21730), sin_addr=inet_addr("146.67.64.232")}, [841813590032]) = 0
[pid 9199] futex(0x2adcd4965f50, FUTEX_WAKE, 2147483647) = 0
[pid 9199] sendto(5, "\0`<?xml version=\"1.0\" encoding=\""..., 98, 0, NULL, 0) = 98
[pid 9199] tgkill(9160, 9200, SIGRTMIN) = -1 EAGAIN (Resource temporarily unavailable)
We suspect a problem with NPTL. Under SLES9 (which always works) we always set LD_ASSUME_KERNEL; under SLES10 this is no longer possible.
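To pull failing calls such as the tgkill above out of a long trace, a small filter can help. This is a sketch only: `failed_syscalls` is a hypothetical helper name, and it assumes the log file name strace_sapinst.log from the command above.

```shell
#!/bin/sh
# failed_syscalls
# Hypothetical helper: reads strace output on stdin and keeps only the lines
# where a system call returned -1 together with an errno name.
failed_syscalls() {
    awk '/= -1 E[A-Z]+/'
}

# Example:
#   failed_syscalls < strace_sapinst.log
```

Many failures in a trace are harmless (for example ENOENT while searching paths), so the output still needs to be read in context.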
Regards, Michael
Dear Michael,
please do not set LD_ASSUME_KERNEL on SLES10. Its meaning, or rather what it does, changed from SLES9 to SLES10 (actually, it changed with the kernel version). Please remove any LD_ASSUME_KERNEL settings when using SLES10. SAP's sapinst should be completely aware of this and will not set LD_ASSUME_KERNEL. To verify: do you set LD_ASSUME_KERNEL in any of the profiles you are using (you mentioned that sapinst works for sidadm but not for root)?
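A quick way to check the profiles is sketched below. `find_ld_assume_kernel` is a hypothetical helper name, and the file list in the example is an assumption; adjust it to the profiles actually used by root and sidadm.

```shell
#!/bin/sh
# find_ld_assume_kernel FILE...
# Hypothetical helper: prints file:line:text for every line that mentions
# LD_ASSUME_KERNEL in the given files; missing or unreadable files are skipped.
find_ld_assume_kernel() {
    for f in "$@"; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        # /dev/null as a second file makes grep always print the file name.
        grep -n 'LD_ASSUME_KERNEL' "$f" /dev/null || true
    done
}

# Example (file names are assumptions):
#   find_ld_assume_kernel /etc/profile /etc/profile.local ~/.profile ~/.bashrc ~/.cshrc
# Value in the current shell, if any:
#   echo "${LD_ASSUME_KERNEL:-not set}"
```

Running this once as root and once as sidadm shows whether the two environments differ.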
Thanks,
Hannes
Do not use the SuSE provided Java installation but the one on the site given in
note 861215 - Recommended Settings for the Linux on AMD64/EM64T JVM
It contains special fixes for SAP installations.
Are you installing on the console or remotely?
Markus