apache spark - Avoid "Task not serialisable" with nested method in a class


I understand that the usual "Task not serializable" issue arises when accessing a field or a method that is out of the scope of a closure.

To fix it, you can define a local copy of these fields/methods, which avoids the need to serialize the whole class:

    class MyClass(val myField: Any) {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")
        val myField = this.myField
        println(f.map( _ + myField ).count)
      }
    }
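The difference in what the closure captures can be checked without Spark at all, using plain Java serialization, which is essentially what Spark requires of a closure. A minimal sketch, assuming Scala 2.12+ (the `Holder` class and the `serializes` helper are hypothetical names for illustration, not from the question):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper: true if `obj` survives plain Java serialization.
object SerCheck {
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}

// Deliberately NOT Serializable, like a typical driver-side class.
class Holder(val myField: String) {
  // Referring to this.myField inside the lambda captures the whole Holder.
  def badClosure: String => String = line => line + this.myField

  // Copying the field to a local val first captures only the String.
  def goodClosure: String => String = {
    val myField = this.myField
    line => line + myField
  }
}

object Demo extends App {
  val h = new Holder(";suffix")
  println(SerCheck.serializes(h.badClosure))  // false: drags in the non-serializable Holder
  println(SerCheck.serializes(h.goodClosure)) // true: captures only a String
}
```

This is exactly why the local copy trick in the snippet above works: the closure ends up referencing a serializable local value instead of `this`.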

Now, if I define a nested function in the run method, it cannot be serialized:

    class MyClass() {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")
        def mapFn(line: String) = line.split(";")
        println(f.map( mapFn( _ ) ).count)
      }
    }

I don't understand this, since I thought "mapFn" would be in scope here. Even stranger, if I define mapFn as a val instead of a def, then it works:

    class MyClass() {
      def run() = {
        val f = sc.textFile("hdfs://xxx.xxx.xxx.xxx/file.csv")
        val mapFn = (line: String) => line.split(";")
        println(f.map( mapFn( _ ) ).count)
      }
    }

Is this related to the way Scala represents nested functions?

What's the recommended way to deal with this issue? Should I avoid nested functions?

Isn't it working in such a way that in the first case f.map(mapFn(_)) is equivalent to f.map(new Function() { override def apply(...) = mapFn(...) }), while in the second one it is simply f.map(mapFn)? When you declare a method with def, it is really a method of an anonymous class with an implicit $outer reference to the enclosing class. But map requires a Function, so the compiler needs to wrap the method. In that wrapper you refer to the method of the anonymous class, but not to the instance itself. If you use val, you have a direct reference to the function, which you pass to map. I'm not sure about this, just thinking out loud...
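This explanation can be probed with plain Java serialization instead of Spark. A minimal sketch, assuming Scala 2.12+ lambda compilation (the `Outer` class and the `serializes` helper are made-up names for illustration): the nested def is lifted to a method on the enclosing class, so the wrapper lambda needs a reference to the enclosing instance, while the val is a self-contained function value.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper: true if `obj` survives plain Java serialization.
object SerCheck {
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}

// Deliberately NOT Serializable, like the MyClass in the question.
class Outer {
  def withDef: String => Array[String] = {
    def mapFn(line: String) = line.split(";") // lifted to a method on Outer
    line => mapFn(line)                       // wrapper lambda keeps a reference to `this`
  }

  def withVal: String => Array[String] = {
    val mapFn = (line: String) => line.split(";") // self-contained Function1 value
    mapFn                                         // passed along directly, no outer reference
  }
}

object Demo extends App {
  val o = new Outer
  println(SerCheck.serializes(o.withDef)) // the wrapper drags in the non-serializable Outer
  println(SerCheck.serializes(o.withVal)) // a plain serializable lambda
}
```

Under these assumptions the def version fails and the val version passes, which matches the behavior reported in the question; the usual workarounds are the val form shown here or moving the function into a (serializable) object.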

