R - Passing information between threads (foreach with %dopar%)
I'm using the doSNOW package to parallelize tasks that differ in length. When one thread has finished, I want
- some information generated by the old threads to be passed to the next thread
- the next thread to start immediately (load balancing as in clusterApplyLB)
It works single-threaded (see makeCluster(spec = 1)):
    # register snow and doSNOW
    require(doSNOW)
    # change spec to 4 or more to see the problem
    registerDoSNOW(cl <- makeCluster(spec = 1, type = "SOCK", outfile = ""))

    numbersProcessed <- c()  # init processed vector

    # .export takes the names of the variables to export, as strings
    x <- foreach(i = 1:10, .export = "numbersProcessed") %dopar% {
      # do working stuff
      cat(format(Sys.time(), "%X"), ": ", "starting", i,
          "(numbers processed so far:", numbersProcessed, ")\n")
      Sys.sleep(time = i)
      # append the number to the general vector
      numbersProcessed <- append(numbersProcessed, i)
      cat(format(Sys.time(), "%X"), ": ", "ending", i, "\n")
      cat("--------------------\n")
    }

    # end
    stopCluster(cl)
Now change spec in makeCluster to 4. The output is something like this:
    [..]
    Type: EXEC
    18:12:21 : starting 9 (numbers processed so far: 1 5 )
    18:12:23 : ending 6
    --------------------
    Type: EXEC
    18:12:23 : starting 10 (numbers processed so far: 2 6 )
    18:12:25 : ending 7
At 18:12:21 thread 9 knew that threads 1 and 5 had been processed. Two seconds later thread 6 ends. The next thread has to know about at least 1, 5 and 6, right? But thread 10 only knows about 6 and 2.
I realized that this has to do with the number of cores specified in makeCluster: 9 knows 1, 5 and 9 (1 + 4 + 4), while 10 knows 2, 6 and 10 (2 + 4 + 4).
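The arithmetic above (task numbers spaced by the worker count) can be reproduced in a few lines of base R. This is only a sketch of the grouping seen in the output, assuming 10 tasks and 4 workers as in the example; the variable names are made up for illustration, and it does not claim to describe how doSNOW actually schedules tasks.

```r
# Hypothetical illustration: group 10 task numbers by "task number modulo 4".
n_tasks <- 10
n_workers <- 4

# Worker label for each task: 1 2 3 4 1 2 3 4 1 2
worker_of_task <- rep_len(seq_len(n_workers), n_tasks)

# Tasks that end up sharing one worker, and therefore one copy of the vector
tasks_per_worker <- split(seq_len(n_tasks), worker_of_task)
print(tasks_per_worker)
```

Each group shares one R process, so a vector that is appended to inside the loop body only ever sees the numbers of its own group.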
Is there a better way to pass the "processed" stuff on to further generations of threads?
Bonus points: Is there a way to "print" to the master node during parallel processing, without getting these "Type: EXEC" etc. messages from the snow package? :)
Thanks! Marc
My bad. Damn.
I thought foreach with %dopar% was load-balanced. This isn't the case, and it makes my question obsolete, because nothing can be executed on the host side during parallel processing. That also explains why global variables are only manipulated on the client side and never reach the host.
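Since values assigned on a worker stay on that worker, one way around it is to let every task return its result and collect the results on the host. Here is a minimal sketch, assuming the parallel package that ships with R (not doSNOW) and its clusterApplyLB function; the worker count, the sleep times and the task body are placeholders.

```r
library(parallel)

# Two local workers; the count is an arbitrary choice for this sketch.
cl <- makeCluster(2)

# Each task returns its own result instead of modifying a global variable.
# clusterApplyLB hands the next task to whichever worker becomes free first.
results <- clusterApplyLB(cl, 1:4, function(i) {
  Sys.sleep(i / 10)  # stand-in for work of varying length
  i                  # value sent back to the host
})

stopCluster(cl)

# Collected on the host, in the order of the inputs.
numbersProcessed <- unlist(results)
print(numbersProcessed)
```

The host only sees the results once the call returns, so this does not let a running task learn what other tasks have finished in the meantime.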