Hans Svensson and I have been investigating how nodes in Erlang behave
across restarts. We wanted to know, for instance, whether a pid created
on a node named "n" is comparable to a pid created on a node with the
same name after a restart.
(All, of course, for the noble purpose of eventually really
understanding the detailed semantics of distributed Erlang :-)
Anyway, after some experimenting, Hans came up with the following
program, which behaves a bit unexpectedly (attached as
"strangeCommunication.erl").
Three nodes are used (n1, n2, n3); n2 is restarted automatically
whenever it halts (by the shell script "restartingErlangshell.sh").
When strangeCommunication:run() is started on node n1, it repeats the
following two steps three times:
- starts a process on n2
- halts node n2
The result is a list of three (dead) process identifiers. We are sure
they are dead, since we have received exit messages for them.
We then spawn a new process on n2, which just echoes received messages
back to the sender.
The three pids (of the dead processes) are sent to a newly spawned
process on node n3, which tries to communicate with each of the dead
processes. And, rather surprisingly, one of the communications succeeds!
(test@REDACTED)1> strangeCommunication:start().
Killing and restarting node n2@REDACTED
Killing and restarting node n2@REDACTED
Killing and restarting node n2@REDACTED
Got: {'EXIT',<4981.41.0>,normal}
Got: {'EXIT',<4981.41.0>,normal}
Got: {'EXIT',<4981.41.0>,normal}
Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,3>>)
Recieved 6 from <4981.41.0> (!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!)
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,3>>)
Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,1>>)
No reply!
Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,2>>)
No reply!
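The exchange behind the "Recieved 6" and "No reply!" lines can be
sketched roughly as below. This is not the attached program; the module
name, the {From, Msg} message shape, and the 500 ms timeout are all
assumptions made for illustration, and the demo runs on a single node.
The point is that the prober matches the reply on the pid it sent to, so
a new process whose pid compares equal to a dead one will satisfy it.

```erlang
-module(echo_probe_sketch).
-export([echo/0, probe/3, demo/0]).

%% Echo server: send every {From, Msg} straight back to From,
%% tagged with our own pid so the caller can match on it.
%% (The message shape is an assumption, not taken from the attachment.)
echo() ->
    receive
        {From, Msg} when is_pid(From) ->
            From ! {self(), Msg},
            echo()
    end.

%% Send Msg to Pid and wait up to Timeout ms for an echo tagged with Pid.
probe(Pid, Msg, Timeout) ->
    Pid ! {self(), Msg},
    receive
        {Pid, Reply} -> {reply, Reply}
    after Timeout ->
        no_reply
    end.

%% Single-node demonstration: one live echo process, one dead process.
demo() ->
    Live = spawn(fun echo/0),
    {Dead, Ref} = spawn_monitor(fun() -> ok end),
    %% Wait for the 'DOWN' message so we know the process is really dead.
    receive {'DOWN', Ref, process, Dead, _Reason} -> ok end,
    Result = {probe(Live, 6, 500), probe(Dead, 6, 500)},
    exit(Live, kill),
    Result.
```

Calling echo_probe_sketch:demo() should give a reply from the live
process and a timeout for the dead one. In the run above, the surprise
is that one "dead" pid behaves like the live case.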
We all know that pids eventually wrap around, but it seems that when
nodes are restarted, pids are going to be reused much earlier than one
would think. Maybe it would be a good idea to permit more than three
restarts before reuse?
(It seems that three is the magic number used in the runtime system.)
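The binaries printed in the transcript make the reuse visible: they are
identical except for the last byte (3, 1, 2). A minimal sketch for
pulling the fields apart is given below, assuming the binaries use the
PID_EXT layout of the external term format (tag 103, node name as
ATOM_EXT, 32-bit id, 32-bit serial, 8-bit creation). The module and
function names are made up for illustration, and newer releases may emit
a different pid encoding with a wider creation field, which this sketch
does not handle.

```erlang
-module(pid_ext_sketch).
-export([decode/1, demo/0]).

%% Decode a pid serialised in the assumed PID_EXT layout:
%%   131 (version), 103 (PID_EXT), 100 (ATOM_EXT), 16-bit name length,
%%   node name, 32-bit id, 32-bit serial, 8-bit creation.
decode(<<131, 103, 100, Len:16, Node:Len/binary,
         Id:32, Serial:32, Creation:8>>) ->
    {binary_to_list(Node), Id, Serial, Creation}.

%% The three binaries from the transcript, in the order they were tried.
demo() ->
    Bins = [<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,
              0,0,0,41,0,0,0,0,3>>,
            <<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,
              0,0,0,41,0,0,0,0,1>>,
            <<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,
              0,0,0,41,0,0,0,0,2>>],
    %% Drop the node name and keep {Id, Serial, Creation} for comparison.
    [begin
         {_Node, Id, Serial, Creation} = decode(B),
         {Id, Serial, Creation}
     end || B <- Bins].
```

pid_ext_sketch:demo() returns [{41,0,3},{41,0,1},{41,0,2}]: same id and
serial each time, with only the creation byte telling the incarnations
of n2 apart. The pid that replied is also printed with creation 3, so it
is indistinguishable from the first dead pid.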
Lars-Åke and Hans
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: strangeCommunication.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20070309/09d70fb6/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: restartingErlangshell.sh
Type: application/x-shellscript
Size: 132 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20070309/09d70fb6/attachment.bin>