Re: Sharing SparkContext

The fair scheduler merely reorders tasks. I think he is looking to run multiple pieces of code on a single context, on demand from customers... if the code & order is decided, then the fair scheduler will ensure that all tasks get equal cluster time :)
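
For what it's worth, a minimal sketch of that pattern, i.e. several threads submitting jobs against one shared SparkContext, might look like the following (the master URL, app name, thread count, and data are placeholders made up for illustration; this assumes Spark 0.9+ where SparkConf is available):

  import org.apache.spark.{SparkConf, SparkContext}

  object SharedContextDemo {
    def main(args: Array[String]): Unit = {
      // One SparkContext shared by all request-handling threads.
      val sc = new SparkContext(new SparkConf()
        .setMaster("local[4]")              // placeholder master
        .setAppName("shared-context-demo")) // placeholder app name

      // Each thread submits its own job (action) on the same context.
      // Spark's scheduler is thread-safe, so the jobs run concurrently.
      val threads = (1 to 3).map { i =>
        new Thread {
          override def run(): Unit = {
            val total = sc.parallelize(1 to 10000).map(_ * i).reduce(_ + _)
            println("job " + i + " finished with result " + total)
          }
        }
      }
      threads.foreach(_.start())
      threads.foreach(_.join())
      sc.stop()
    }
  }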

Re: Sharing SparkContext

Thank you, Mayur.

I will try the Ooyala job server to begin with. Is there a way to load an RDD created via SparkContext into Shark? The only reason I ask is that my RDD is being created from Cassandra (not Hadoop; we are trying to get Shark to work with Cassandra as well, and are having trouble with it when running in distributed mode).

Re: Sharing SparkContext

"Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from
separate threads. By “job”, in this section, we mean a Spark action
(e.g. save, collect) and any tasks that need to run
to evaluate that action. Spark’s scheduler is fully thread-safe and
supports this use case to enable applications that serve multiple
requests (e.g. queries for multiple users).

By default, Spark’s
scheduler runs jobs in FIFO fashion. Each job is divided into
“stages” (e.g. map and reduce phases), and the first job gets
priority on all available resources while its stages have tasks to
launch, then the second job gets priority, etc. If the jobs at the
head of the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head of the
queue are large, then later jobs may be delayed significantly.

Starting in Spark 0.8, it
is also possible to configure fair sharing between jobs. Under
fair sharing, Spark assigns tasks between jobs in a “round robin”
fashion, so that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a long job
is running can start receiving resources right away and still get
good response times, without waiting for the long job to finish.
This mode is best for multi-user settings.

To enable the fair
scheduler, simply set the spark.scheduler.mode to FAIR before
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi wrote:

The fair scheduler merely reorders tasks. I think he is looking to run multiple pieces of code on a single context, on demand from customers... if the code & order is decided, then the fair scheduler will ensure that all tasks get equal cluster time :)

"Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from
separate threads. By “job”, in this section, we mean a Spark action
(e.g.save,collect) and any tasks that need to run
to evaluate that action. Spark’s scheduler is fully thread-safe and
supports this use case to enable applications that serve multiple
requests (e.g. queries for multiple users).

By default, Spark’s
scheduler runs jobs in FIFO fashion. Each job is divided into
“stages” (e.g. map and reduce phases), and the first job gets
priority on all available resources while its stages have tasks to
launch, then the second job gets priority, etc. If the jobs at the
head of the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head of the
queue are large, then later jobs may be delayed significantly.

Starting in Spark 0.8, it
is also possible to configure fair sharing between jobs. Under
fair sharing, Spark assigns tasks between jobs in a “round robin”
fashion, so that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a long job
is running can start receiving resources right away and still get
good response times, without waiting for the long job to finish.
This mode is best for multi-user settings.

To enable the fair
scheduler, simply set thespark.scheduler.modetoFAIRbefore
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi
wrote:

fair scheduler merely reorders tasks .. I think he
is looking to run multiple pieces of code on a single context on
demand from customers...if the code & order is decided then
fair scheduler will ensure that all tasks get equal cluster time
:)

"Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from
separate threads. By “job”, in this section, we mean a Spark action
(e.g.save,collect) and any tasks that need to run
to evaluate that action. Spark’s scheduler is fully thread-safe and
supports this use case to enable applications that serve multiple
requests (e.g. queries for multiple users).

By default, Spark’s
scheduler runs jobs in FIFO fashion. Each job is divided into
“stages” (e.g. map and reduce phases), and the first job gets
priority on all available resources while its stages have tasks to
launch, then the second job gets priority, etc. If the jobs at the
head of the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head of the
queue are large, then later jobs may be delayed significantly.

Starting in Spark 0.8, it
is also possible to configure fair sharing between jobs. Under
fair sharing, Spark assigns tasks between jobs in a “round robin”
fashion, so that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a long job
is running can start receiving resources right away and still get
good response times, without waiting for the long job to finish.
This mode is best for multi-user settings.

To enable the fair
scheduler, simply set thespark.scheduler.modetoFAIRbefore
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi
wrote:

fair scheduler merely reorders tasks .. I think he
is looking to run multiple pieces of code on a single context on
demand from customers...if the code & order is decided then
fair scheduler will ensure that all tasks get equal cluster time
:)

"Inside a given Spark application (SparkContext instance), multiple
parallel jobs can run simultaneously if they were submitted from
separate threads. By “job”, in this section, we mean a Spark action
(e.g.save,collect) and any tasks that need to run
to evaluate that action. Spark’s scheduler is fully thread-safe and
supports this use case to enable applications that serve multiple
requests (e.g. queries for multiple users).

By default, Spark’s
scheduler runs jobs in FIFO fashion. Each job is divided into
“stages” (e.g. map and reduce phases), and the first job gets
priority on all available resources while its stages have tasks to
launch, then the second job gets priority, etc. If the jobs at the
head of the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head of the
queue are large, then later jobs may be delayed significantly.

Starting in Spark 0.8, it
is also possible to configure fair sharing between jobs. Under
fair sharing, Spark assigns tasks between jobs in a “round robin”
fashion, so that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a long job
is running can start receiving resources right away and still get
good response times, without waiting for the long job to finish.
This mode is best for multi-user settings.

To enable the fair
scheduler, simply set thespark.scheduler.modetoFAIRbefore
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi
wrote:

fair scheduler merely reorders tasks .. I think he
is looking to run multiple pieces of code on a single context on
demand from customers...if the code & order is decided then
fair scheduler will ensure that all tasks get equal cluster time
:)

"Inside a given Spark application (SparkContext instance),
multiple parallel jobs can run simultaneously if they were
submitted from separate threads. By “job”, in this section,
we mean a Spark action (e.g.save,collect) and
any tasks that need to run to evaluate that action. Spark’s
scheduler is fully thread-safe and supports this use case to
enable applications that serve multiple requests (e.g.
queries for multiple users).

By default, Spark’s scheduler runs jobs in FIFO fashion.
Each job is divided into “stages” (e.g. map and reduce
phases), and the first job gets priority on all available
resources while its stages have tasks to launch, then the
second job gets priority, etc. If the jobs at the head of
the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head
of the queue are large, then later jobs may be delayed
significantly.

Starting in Spark 0.8, it is also possible to configure
fair sharing between jobs. Under fair sharing, Spark
assigns tasks between jobs in a “round robin” fashion, so
that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a
long job is running can start receiving resources right
away and still get good response times, without waiting
for the long job to finish. This mode is best for
multi-user settings.

To enable the fair scheduler, simply set thespark.scheduler.modetoFAIRbefore
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi wrote:

fair scheduler merely reorders tasks .. I
think he is looking to run multiple pieces of code on a
single context on demand from customers...if the code
& order is decided then fair scheduler will ensure
that all tasks get equal cluster time :)

"Inside a given Spark application (SparkContext instance),
multiple parallel jobs can run simultaneously if they were
submitted from separate threads. By “job”, in this section,
we mean a Spark action (e.g.save,collect) and
any tasks that need to run to evaluate that action. Spark’s
scheduler is fully thread-safe and supports this use case to
enable applications that serve multiple requests (e.g.
queries for multiple users).

By default, Spark’s scheduler runs jobs in FIFO fashion.
Each job is divided into “stages” (e.g. map and reduce
phases), and the first job gets priority on all available
resources while its stages have tasks to launch, then the
second job gets priority, etc. If the jobs at the head of
the queue don’t need to use the whole cluster, later jobs
can start to run right away, but if the jobs at the head
of the queue are large, then later jobs may be delayed
significantly.

Starting in Spark 0.8, it is also possible to configure
fair sharing between jobs. Under fair sharing, Spark
assigns tasks between jobs in a “round robin” fashion, so
that all jobs get a roughly equal share of cluster
resources. This means that short jobs submitted while a
long job is running can start receiving resources right
away and still get good response times, without waiting
for the long job to finish. This mode is best for
multi-user settings.

To enable the fair scheduler, simply set thespark.scheduler.modetoFAIRbefore
creating a SparkContext:"

On 2/25/14, 12:30 PM, Mayur Rustagi wrote:

fair scheduler merely reorders tasks .. I
think he is looking to run multiple pieces of code on a
single context on demand from customers...if the code
& order is decided then fair scheduler will ensure
that all tasks get equal cluster time :)