Fault Tolerance in Distributed Erlang
How can I have fault tolerance in a distributed application? As far as I understand, a supervision tree works by supervising local processes (is that right?). How can I supervise remote processes spawned on remote nodes? I need to supervise them and restart them in case of failure.
Look at the OTP Design Principles, chapter 9 "Distributed Applications", and its subchapters 9.4 "Failover" and 9.5 "Takeover".
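To give an idea of what that chapter covers, here is a minimal sketch of the kernel configuration used for distributed applications. The application name `myapp` and the node names `a@host1`, `b@host2` and `c@host3` are made up for illustration; the `distributed`, `sync_nodes_mandatory` and `sync_nodes_timeout` parameters belong to the kernel application.

```erlang
%% a.config, loaded on node a@host1 (hypothetical names throughout).
[{kernel,
  [%% myapp runs on a@host1. If that node goes down, it is restarted
   %% after 5000 ms on b@host2 or c@host3 (no preference between them).
   {distributed, [{myapp, 5000, ['a@host1', {'b@host2', 'c@host3'}]}]},
   %% Nodes this node waits for at startup before starting applications.
   {sync_nodes_mandatory, ['b@host2', 'c@host3']},
   %% How long (ms) to wait for those nodes.
   {sync_nodes_timeout, 30000}
  ]}
].
```

Each node would need its own config file with the same `distributed` entry and with `sync_nodes_mandatory` listing the other two nodes. When `a@host1` comes back, the application moves back to it, which is the takeover case.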
If you are interested in this topic, you should look at the famous thesis "Making reliable distributed systems in the presence of software errors" and the ton of published books on the topic. Some of the materials are online, such as the 3 free e-books and the tutorial on Erlang; for example, the chapter on distribution, "Distribunomicon".
TL;DR? Long story short: as you wrote, the nodes have to monitor each other with a supervisor tree and restart things in case of failure. You can reinvent the wheel, because Erlang provides great tools for doing so, or use an existing solution, from bare OTP to riak_core.
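If you want to try the "reinvent the wheel" route, a rough sketch of a watcher that keeps one process alive on a remote node could look like this. The module name `remote_watch` and the one-second back-off are my own choices for illustration, and there is no restart limit like a real supervisor has. It relies only on `spawn/4` and `erlang:monitor/2`, both of which work across connected nodes.

```erlang
-module(remote_watch).
-export([start/4, watch/4]).

%% Start a local watcher process that keeps M:F(A) running on Node.
start(Node, M, F, A) ->
    spawn(?MODULE, watch, [Node, M, F, A]).

%% Spawn the worker on Node, monitor it, and respawn it if it dies
%% abnormally (including when the connection to Node is lost).
watch(Node, M, F, A) ->
    Pid = spawn(Node, M, F, A),
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, normal} ->
            %% The worker finished on its own, so stop watching.
            ok;
        {'DOWN', Ref, process, Pid, _Reason} ->
            %% Crash, noproc or noconnection: wait a bit, then retry.
            timer:sleep(1000),
            watch(Node, M, F, A)
    end.
```

For example, `remote_watch:start('b@host2', my_worker, loop, [])` would keep `my_worker:loop/0` running on `b@host2` (both names hypothetical), provided the module is loaded there. While the node is unreachable the watcher simply retries every second, so for anything serious the OTP mechanisms above or riak_core are the better choice.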