> On March 21, 2016, 3:57 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 339
> > <https://reviews.apache.org/r/44706/diff/2/?file=1307515#file1307515line339>
> >
> > This is just a thought. Maybe its better to use `await` over here, and
> > use a lambda or `.onAny` on the await to parse the list of futures returned
> > on await and clean up any checkpointed data for networks that we were not
> > able to join due to plugin errors ?
>
> Qian Zhang wrote:
> I think any clean up works should be handled in the `cleanup` method,
> that's should be the design idea of isolator. Here we are doing all or
> nothing, so if there is a call to a CNI plugin fails for any reasons, we will
> return a `Failure` which should be eventually handled in
> `Slave::executorLaunched`, and it will call `MesosContainerizer::destroy`
> which will eventually call each isolator's `cleanup` method to do the cleanup.

Advertising

Sounds reasonable.
> On March 21, 2016, 3:57 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 391
> > <https://reviews.apache.org/r/44706/diff/2/?file=1307515#file1307515line391>
> >
> > Shouldn't we be cleaning up the check pointed data for this ifname if
> > there is an error while launching the plugin for this ifname ?
>
> Qian Zhang wrote:
> I think we should do the cleanup in the `cleanup` method.
Agreed.
> On March 21, 2016, 3:57 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 445
> > <https://reviews.apache.org/r/44706/diff/2/?file=1307515#file1307515line445>
> >
> > Why the `CHECK_READY` here ? If the future is not READY it could be in
> > error, which is fine we should just return an ERROR ?
>
> Qian Zhang wrote:
> Because when `NetworkCniIsolatorProcess::__connect` is called, we expect
> `output` must be ready otherwise `NetworkCniIsolatorProcess::__connect`
> should not be called. I think it is a common way to use `then`, please take a
> look at:
> https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/isolators/network/port_mapping.cpp#L3173:L3186
> as a reference.
The fact that a plugin might misbehave should not cause the agent to crash. I
think this defensive check doesn't serve much purpose apart from causing a
potential panic in the Agent.
> On March 21, 2016, 3:57 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, line 471
> > <https://reviews.apache.org/r/44706/diff/2/?file=1307515#file1307515line471>
> >
> > Again why the CHECK ? The result might not have an IPv4 or an IPv6
> > allocation due to an IPAM error in which case we should return a failure.
>
> Qian Zhang wrote:
> I think the output of IPAM plugin has no IPv4 address and no IPv6 address
> is an unexpected result, so that's why I use `CHECK` here, but I agree with
> you returning a `Failure` is more appropriate, will update the patch
> accordingly later.
Why is it an unexpected result? What happens if the IPAM is oversubscribed and
runs out of addresses? That should definitely not cause the Agent to crash. We
should not be able to launch containers, but the Agent should not be crashing.
IPAM running out of addresses is not a terminal behavior, it will recover when
containers are destroyed and IP addresses are freed.
> On March 21, 2016, 3:57 p.m., Avinash sridharan wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/cni.cpp, lines 328-336
> > <https://reviews.apache.org/r/44706/diff/2/?file=1307515#file1307515line328>
> >
> > why do we need this dispatch ? The dispatch is to itself, so seems a
> > bit odd. Can we invoke the `subprocess` for the plugin in _connect directly
> > ?
> >
> > Basically not sure why we need connect & _connect.
>
> Qian Zhang wrote:
> The idea is similar with how `MesosContainerizer` call each isolator:
> https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/containerizer.cpp#L1171:L1181
>
> I'd like to handle each call to a CNI plugin in a separate libprocess
> dispatch event, so that's why I call `connect` via `dispatch`.
In the example you gave the `isolator` does a dispatch on an isolator process.
So the `MesosContainerizer` effectively does a dispatch on a separate
libprocess thread, which is the intended behavior. Here it seems a bit odd that
dispatch is scheduling an event on itself. Functionally nothing wrong, but this
is not idiomatic and we should avoid this.
- Avinash
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44706/#review124554
-----------------------------------------------------------
On March 20, 2016, 4:27 p.m., Qian Zhang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44706/
> -----------------------------------------------------------
>
> (Updated March 20, 2016, 4:27 p.m.)
>
>
> Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu.
>
>
> Bugs: MESOS-4759
> https://issues.apache.org/jira/browse/MESOS-4759
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Implemented isolate() method of "network/cni" isolator.
>
>
> Diffs
> -----
>
> src/slave/containerizer/mesos/isolators/network/cni/cni.hpp PRE-CREATION
> src/slave/containerizer/mesos/isolators/network/cni/cni.cpp PRE-CREATION
>
> Diff: https://reviews.apache.org/r/44706/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Qian Zhang
>
>