Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
802 views
in Technique[技术] by (71.8m points)

centos7 - Gahp server (failure issues ) exited with status 1 unexpectedly

I am working on a web-based tool (named cloudcopasi) which take jobs from a user and submit it to bosco resources (compute nodes). I am using a bosco version (condor 8.8.12) on Linux CentOS 7. The web interface allows a user to add a bosco pool which user can use to submit jobs. However, when I try to submit a job, it fails. I tried to test the pool as well by using the following command:

bosco_cluster --test

It gives me the following GAHP error:

…..
Testing bosco submission...Passed!
Submission and log files for this job are in /home/cloudcopasi/bosco/local.bosco/bosco-test/boscotest.LTA07r
Waiting for jobmanager to accept job...Passed
Checking for submission to remote slurm cluster (could take ~30 seconds)...Failed
Showing last 5 lines of logs:
01/06/21 13:34:03 [3800] Gahp Server (pid=3815) exited with status 1 unexpectedly
01/06/21 13:34:08 [3800] gahp server not up yet, delaying ping
01/06/21 13:34:08 [3800] No jobs left, shutting down
01/06/21 13:34:08 [3800] Got SIGTERM. Performing graceful shutdown.
01/06/21 13:34:08 [3800] **** condor_gridmanager (condor_GRIDMANAGER) pid 3800 EXITING WITH STATUS 0

I am not sure what I am missing but I don’t understand how to solve this “Gahp server” issue.

Any help is highly appreciated.

Thank you.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This a probably an ssh failure (network, authentication, or authorization). Bosco runs the following command to access the remote cluster submit host:

<sbin>/remote_gahp <user>@<hostname> batch_gahp

You can run it on the command line to get more details about what's going wrong. remote_gahp is a bash script, so you can dig in further, if necessary.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...