Hello Michael, thank you for the fast answer...
Michael Creel wrote:
So, the setup process goes as it appears in the Tutorial or the screencast, and seems to complete normally. The setup process reports finding the compute nodes, as in this shot:
but when you finish the setup process and call "lamnodes", it says that lamd is not running? Is that an accurate summary of the problem?
That's correct. The screens are the same as in the tutorial, and at the end the screen says: Found 8 nodes (we're running 8 at the moment).
Michael Creel wrote:
What version of PelicanHPC are you using?
From this mirror:
http://download.mi.hs-heilbronn.de/pelicanhpc/
pelicanhpc-v1.9.1-32bit.iso
Michael Creel wrote:
After you re-run pelican_restarthpc things seem to work correctly?
Interestingly, the command is "pelican_restart_hpc" (with an underscore). As for whether it works: yes and no. I did the following:
1.) lamboot
2.) lamnodes shows only the host
3.) pelican_restart_hpc
4.) the host finds 8 nodes, but lamnodes still shows only 1
5.) pelican_setup again
6.) the host finds 8 nodes, lamnodes shows all 8 IPs of the nodes
7.) startx and then I tried some examples
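To make the check in steps 2, 4 and 6 less manual, a small script along these lines could count the nodes that lamnodes reports. This is only a sketch: it assumes lamnodes prints one line per node that starts with n<index> followed by host:cpus:flags. The sample text and its IP addresses are made up, and parse_lamnodes and cluster_ok are hypothetical helper names, not part of PelicanHPC or LAM.

```python
import re

# Hypothetical sample of `lamnodes` output; the addresses are made up
# and the real output on a given cluster may look different.
SAMPLE = (
    "n0\t10.11.12.1:1:origin,this_node\n"
    "n1\t10.11.12.2:1:\n"
    "n2\t10.11.12.3:1:\n"
)

# Assumed line shape: "n<index>", whitespace, then "host:cpus:flags".
NODE_LINE = re.compile(r"^n(\d+)\s+(\S+)")


def parse_lamnodes(text):
    """Return the host part of every line that looks like a node entry."""
    hosts = []
    for line in text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            # Keep only the host, dropping the ":cpus:flags" suffix.
            hosts.append(match.group(2).split(":")[0])
    return hosts


def cluster_ok(text, expected):
    """True if the number of node lines equals the expected node count."""
    return len(parse_lamnodes(text)) == expected


if __name__ == "__main__":
    hosts = parse_lamnodes(SAMPLE)
    print(len(hosts), "nodes:", hosts)
```

Fed with the text that lamnodes actually prints, cluster_ok(text, 8) would tell whether all 8 nodes are registered (step 6) or only the host (steps 2 and 4).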
Michael Creel wrote:
kernel_example has a bug, thanks for the report. I'll fix that for the next release. (EDIT: actually, this is already fixed in the v1.99.0 release.). I think that pea_example is running fine. This is an indication that re-running pelican_restarthpc in fact worked. If 2 MPI ranks are used, it is normal that top only shows 1 running on the frontend, because the other is running on a compute node. To see it, ssh into the compute node and run top there (or set up a cluster monitor - see the homepage for a link how to do it).
OK, regarding v1.99: we have AMD Sempron 3000+ CPUs, and I'm not sure whether they support 64-bit, so maybe that is where our problem lies? As I mentioned, we tried running pelicanhpc-v1.9.1-32bit.iso.
Michael Creel wrote:
sorry, I don't know anything about that.
No problem, but MPI programs should be supported by Pelican, shouldn't they? I couldn't find any "howto" for running MPI programs on a cluster. It's quite new for me, so maybe you have a hint.