While writing a GenServer as a singleton on a multi-node system, it’s easy to come up with something like this:
def start_link(_number, _opts \\ []) do
  case GenServer.start_link(__MODULE__, :ok, name: {:global, __MODULE__}) do
    {:ok, pid} ->
      {:ok, pid}

    {:error, {:already_started, pid}} ->
      {:ok, pid}
  end
end
It works perfectly until it doesn’t. When our singleton process goes down, the only supervision tree aware of it is the one that started it. No other node gets any message, so while the process is restarting, callers fail with “Process is not alive”.
…
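To make the failure concrete, here is a minimal single-node sketch of what a caller sees while the globally registered process is down. The module name `SingletonDemo` and the `ping/0` helper are hypothetical and exist only for this illustration; the `start_link` logic mirrors the snippet above.

```elixir
defmodule SingletonDemo do
  use GenServer

  # Same shape as the snippet above: reuse the globally registered pid
  # if another caller (or node) has already started the process.
  def start_link(_opts \\ []) do
    case GenServer.start_link(__MODULE__, :ok, name: {:global, __MODULE__}) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
    end
  end

  # Hypothetical helper: calls the singleton through its global name and
  # turns the :noproc exit into an error tuple instead of crashing the caller.
  def ping do
    GenServer.call({:global, __MODULE__}, :ping)
  catch
    :exit, {:noproc, _} -> {:error, :not_alive}
  end

  @impl true
  def init(:ok), do: {:ok, %{}}

  @impl true
  def handle_call(:ping, _from, state), do: {:reply, :pong, state}
end
```

Calling `SingletonDemo.ping()` after the process has been stopped (for example with `GenServer.stop/1`) and before anything restarts it hits the `:noproc` branch. Without the `catch`, the caller would exit with the "process is not alive" message instead.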