[ https://issues.apache.org/jira/browse/BEAM-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886971#comment-15886971 ]
Davor Bonaci commented on BEAM-1556:
------------------------------------
It certainly is an issue in other runners as well.
I wouldn't do it in the context of a {{FileBasedSource}}. Users should be able to call the {{FileSystem}}
API from, say, the {{@ProcessElement}} method of a {{DoFn}}. So, I think the registration should
be done before any "user code" is invoked.
Doing it in worker startup might not be ideal -- the constructor takes {{PipelineOptions}}
as an argument. Since jobs could have different options, it probably needs to happen on a
per-task basis, likely at the point the worker receives the task from the master and deserializes
{{PipelineOptions}}.
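
To make the ordering concrete, here is a minimal, self-contained sketch of per-task registration. It is only an illustration under stated assumptions: {{Options}}, {{IoRegistry}} and {{runTask}} are hypothetical stand-ins, not Beam or Spark classes. In a real runner the registration step would be the call to {{IOChannelUtils.registerIOFactories(options)}} with the deserialized {{PipelineOptions}}, and the exact hook point depends on the runner.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these types are Beam or Spark classes. */
public class PerTaskRegistrationSketch {

  /** Stand-in for the PipelineOptions deserialized from a task. */
  static final class Options {
    final Map<String, String> values;

    Options(Map<String, String> values) {
      this.values = new HashMap<>(values);
    }
  }

  /** Stand-in for the state that registering IO factories would populate. */
  static final class IoRegistry {
    private static final ThreadLocal<Options> CURRENT = new ThreadLocal<>();

    static void register(Options options) {
      CURRENT.set(options);
    }

    static void clear() {
      CURRENT.remove();
    }

    /** Pretends to resolve a path; fails if nothing was registered first. */
    static String open(String path) {
      Options options = CURRENT.get();
      if (options == null) {
        throw new IllegalStateException("IO factories not registered");
      }
      return options.values.getOrDefault("project", "?") + ":" + path;
    }
  }

  /**
   * Runs one task: registers with this task's options first, then invokes
   * user code, so any user code (not only a source) can use the registry.
   * Clearing afterwards is a choice made for this sketch only.
   */
  static <T> T runTask(Options options, Function<String, T> userCode, String element) {
    IoRegistry.register(options);
    try {
      return userCode.apply(element);
    } finally {
      IoRegistry.clear();
    }
  }

  public static void main(String[] args) {
    Map<String, String> values = new HashMap<>();
    values.put("project", "job-a");
    // The lambda plays the role of user code such as a DoFn's @ProcessElement.
    String result = runTask(new Options(values), IoRegistry::open, "gs://bucket/in.txt");
    System.out.println(result);
  }
}
```

Because the options are passed per task, two jobs with different options that share a worker would each see their own configuration in this sketch, which is the concern with registering once at worker startup.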
> Spark executors need to register IO factories
> ---------------------------------------------
>
> Key: BEAM-1556
> URL: https://issues.apache.org/jira/browse/BEAM-1556
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Frances Perry
> Assignee: Jean-Baptiste Onofré
>
> The Spark executors need to call IOChannelUtils.registerIOFactories(options) in order
> to support GCS files and make the default WordCount example work.
> Context in this thread: https://lists.apache.org/thread.html/469a139c9eb07e64e514cdea42ab8000678ab743794a090c365205d7@%3Cuser.beam.apache.org%3E