Archive for March, 2006

PVM Trouble

March 4, 2006

Recently, a friend of mine and I decided to have some fun with distributed computing… The Computer Science contest is pretty near (at the time of writing, maybe even too near for my taste :)), we have a large enough CS classroom, and we have enough will to go and study that darn thing. We were given one Linux server running SuSE 9.1 and a whole network of Windows XP computers that were being used as ordinary CS class computers. The problem: solving N linear equations with N variables using the Gaussian elimination method.
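For reference, here is a minimal single-machine sketch of the method in Python. It is not our contest code; the function name `solve_linear_system` and the singularity threshold are my own choices for illustration, and it uses partial pivoting, which the plain textbook method doesn't strictly require.

```python
def solve_linear_system(a, b):
    """Solve a*x = b for a square system using Gaussian elimination
    with partial pivoting. `a` is a list of N rows, `b` a list of N values."""
    n = len(a)
    # Build the augmented matrix [a | b]; copying leaves the inputs untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: zero out everything below the diagonal.
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # threshold is an arbitrary assumption
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Each row below the pivot is updated independently of the others.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution: solve from the last equation upwards.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


if __name__ == "__main__":
    print(solve_linear_system([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]],
                              [8, -11, -3]))
```

The row updates inside one elimination step don't depend on each other, which is the part that can be handed out to several machines.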

Our CS teacher told us about PVM (Parallel Virtual Machine), an application that can be used to connect several *nix/Windows computers into one large parallel problem-solving network. A sigh of relief was heard, because time was running out and we were happy that we wouldn't have to write the client-server routines ourselves. I started studying the PVM API, and finally the time came to test the thing.

So, we compiled and ran the PVM daemon on the Linux machine, installed RSHD and PVM on the Windows machines, and the fun was about to start… Not. First of all, we couldn't connect the Windows machines to the Linux machine. "add [hostname]" or "add [ipaddr]" wouldn't work: PVM would simply say "can't start pvmd", run some diagnostic tests and say something about a malformed %PVM_ROOT% environment variable on the remote machine. And yes, we checked the value of the variable several times, and it was okay.

The other way around (adding the Linux machine from the Windows machine) worked, more or less. The Windows machine saw the Linux machine, but the Linux machine didn't see the Windows machine. So we turned off every Windows firewall we could think of. I turned off the SuSE firewall and flushed all iptables entries, but nothing helped. I asked Google for a solution, but all I got was a bunch of similar questions with no answers. Some of the people I asked told me it was a firewall problem or a configuration problem, but I couldn't find an error in either.

The only strange thing I noticed was the lack of the PVM_ROOT variable when I ran the "set" command in the Windows command prompt through RSH. It was there when I listed all the environment variables locally, but it wasn't there when I tried that remotely, even though the RSHD program had the "export environment variables" option checked. Strange :/.
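The comparison I did by eye could be done with a small script. This is only a sketch: it assumes you have saved the output of "set" once locally and once through RSH as text, and the function names and the sample values below (including the PVM_ROOT path) are made up for illustration.

```python
def parse_set_output(text):
    """Turn the NAME=value lines printed by `set` into a dict.
    Names are upper-cased because Windows treats them case-insensitively."""
    variables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip blank lines and lines without a name
            variables[name.strip().upper()] = value
    return variables


def missing_remotely(local_text, remote_text):
    """Return the names present in the local dump but absent in the remote one."""
    local = parse_set_output(local_text)
    remote = parse_set_output(remote_text)
    return sorted(name for name in local if name not in remote)


if __name__ == "__main__":
    # Hypothetical sample dumps, not the real output from our machines.
    local_dump = "Path=C:\\WINDOWS\nPVM_ROOT=C:\\pvm3.4\nTEMP=C:\\Temp"
    remote_dump = "PATH=C:\\WINDOWS\nTEMP=C:\\Temp"
    print(missing_remotely(local_dump, remote_dump))
```

Anything it lists is a variable that the remote shell never received, which is exactly what PVM_ROOT looked like in our case.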

The only thing we could do was give up on PVM and start writing some simple client / server communication protocol to be used with our application. I'm well aware it will lack some of the more advanced features PVM has, but then the work will be 100% ours and we'll know exactly what it does, and why it does that (hopefully :)). Wish us luck.
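To give an idea of what "simple" could mean here, below is a rough Python sketch of one possible protocol. It is not what we ended up writing; the message layout (a 4-byte length prefix followed by JSON), the field names and every function name are assumptions picked for the example. The worker receives a pivot row and one other row, eliminates one column, and sends the updated row back.

```python
import json
import socket
import struct

# Every message is a 4-byte big-endian length followed by that many JSON bytes.
HEADER = struct.Struct("!I")


def send_message(sock, payload):
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(HEADER.pack(len(data)) + data)


def recv_exact(sock, count):
    """Read exactly `count` bytes; recv() alone may return fewer."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def eliminate_row(pivot_row, row, col):
    """Subtract a multiple of pivot_row from row so that row[col] becomes zero."""
    factor = row[col] / pivot_row[col]
    return [r - factor * p for r, p in zip(row, pivot_row)]


def handle_one_task(sock):
    """Worker side: read one task, do the elimination, send the row back."""
    task = recv_message(sock)
    result = eliminate_row(task["pivot_row"], task["row"], task["col"])
    send_message(sock, {"row": result})


if __name__ == "__main__":
    # A connected socket pair stands in for a real master/worker connection.
    master, worker = socket.socketpair()
    send_message(master, {"pivot_row": [2, 1, -1, 8],
                          "row": [-3, -1, 2, -11],
                          "col": 0})
    handle_one_task(worker)
    print(recv_message(master))
    master.close()
    worker.close()
```

The length prefix is there because TCP is a byte stream with no message boundaries of its own, so the receiver needs some way to know where one message ends.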