Pig Script: STORE command not working
This is my first time posting on Stack Overflow, and I'm hoping you can assist. I'm new at Pig scripts and have encountered a problem I can't solve.
The Pig script below fails when I attempt to write the results to a file:
register 'myudf.py' using jython as myfuncs;
a = load '$file_nm' using PigStorage('$delimiter') as ($fields);
b = filter a by ($field_nm) is not null;
c = foreach b generate ($field_nm) as fld;
d = group c all;
e = foreach d generate myfuncs.theresult(c.fld);
--dump e;
store e into 'myoutput/theresult';
exec;
I can see the results of e when I dump them to the screen. However, I need to store the results temporarily in a file. After the store command, the error I receive is: Output Location Validation Failed.
I've tried numerous workarounds, such as removing the theresult folder and removing the earlier contents of theresult, but none of the commands I've used work. They have been along the lines of:
hdfs dfs -rm myoutput/theresult
and
hadoop fs -rm myoutput/theresult
...using both the shell (hs) and file system (fs) commands. I've also tried calling a function (shell script, Python function, etc.) to clear out the earlier results stored in the myoutput/theresult folder. I've read every website I can find and nothing is working. Any ideas?
The output location of a MapReduce job is a directory, and it must not already exist. So you must remove it this way:
hadoop fs -rmr myoutput/theresult
and then run your Pig script. That should work. "rmr" (remove recursively) deletes the folder along with the files inside it; "rm" (remove) removes only a file.
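The rm-versus-rmr distinction mirrors plain Unix rm versus rm -r, which is easy to verify locally. A small sketch (the directory names here are just for illustration):

```shell
# Local-filesystem analogy; the HDFS flags make the same distinction.
mkdir -p myoutput/theresult
touch myoutput/theresult/part-00000

# Plain rm fails because the target is a directory:
rm myoutput/theresult 2>/dev/null \
  || echo "plain rm refuses a directory"

# rm -r deletes the directory and everything inside it:
rm -r myoutput/theresult \
  && echo "rm -r deleted the directory tree"
```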
Every time you run the script, you need to either change the output path or delete the old output and reuse the same path, since HDFS follows a WORM (write once, read many) storage model.
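If you would rather not delete the directory by hand before each run, Pig can also execute HDFS shell commands from inside a script, so the old output can be cleared just before the store. A minimal sketch assuming the paths from the question (on older Hadoop releases the flag is spelled fs -rmr rather than fs -rm -r):

```
-- clear the previous run's output, then store
fs -rm -r myoutput/theresult;
store e into 'myoutput/theresult';
```

Note that the fs command will fail if the directory does not exist yet, so this only works once the first run has produced output (or if your Hadoop version supports a force flag on rm).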