Matthew Jacobs has posted comments on this change.
Change subject: IMPALA-3902: Scheduler improvements for running multiple
fragment instances on a single backend
......................................................................

Advertising

Patch Set 9:
(22 comments)
Focused mostly where I left off previously, from the Coordinator.
http://gerrit.cloudera.org:8080/#/c/4054/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:
Line 229:
.. total # fragments -1?
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:
PS9, Line 227: /// Index of 'this' in Coordinator::fragment_instance_states_
: int state_idx_;
not used
PS9, Line 485: int32_t
inconsistent use of int/int32_t
PS9, Line 486:
nit: space
PS9, Line 488: num_remaining_fragment_instances_ = num_remote_instances;
move this above near other usage of num_remote_instances
PS9, Line 494: // to keep things simple, make async Cancel() calls wait until
plan fragment
: // execution has been initiated, otherwise we might try to
cancel fragment
: // execution at Impala daemons where it hasn't even started
I don't understand why this comment belongs here. Am I missing something or
should it be moved/removed?
PS9, Line 542: FinishQueryStartup
Seems weird that we wouldn't have to call a function called
'FinishQueryStartup()' when there are no remote finstances. I'd suggest
changing the name but I guess it's fine since I assume this will go away with
Henry's upcoming change.
PS9, Line 637: // apparently this was never called for the coordinator
fragment
: //exec_state->ComputeTotalSplitSize(
: //rpc_params.fragment_instance_ctx.per_node_scan_ranges);
Why leave this as a comment? I'd imagine it will be called for the coord
fragment after Henry's change to make the coord fragment start like any other.
PS9, Line 645: GetInstanceIdx(coord_state->fragment_instance_id())
this and other cases where we index into fragment_instance_states_ would be
nice to use FragmentInstanceState::state_idx_
It looks to me like state_idx_ is not set or used yet. If we're not going to
use it, let's remove it.
PS9, Line 735: // DCHECK(filter_mode_ == TRuntimeFilterMode::OFF);
why is this a comment?
PS9, Line 746: instance_state_idx
same thing about using state_idx_
PS9, Line 1555: const DebugOptions* debug_options
It'd be nice to avoid passing these through the fn. I don't see any reason they
can't just be set on FragmentInstanceState*.
PS9, Line 1564: int instance_state_idx =
GetInstanceIdx(exec_params.instance_id);
: FragmentInstanceState* exec_state =
fragment_instance_states_[instance_state_idx];
same thing about the state_idx_ parameter
PS9, Line 1934: &fragment_profiles_[instance_state->fragment_id()];
was removing the bounds checks intentional? Here and below.
http://gerrit.cloudera.org:8080/#/c/4054/6/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:
Line 101:
nit: extra line
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:
PS9, Line 547: Also sets up and
: /// registers the state for the single coordinator fragment
instance.
Looks to me like it sets up the state for all non-coordinator fragment
instances.
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/query-resource-mgr.cc
File be/src/scheduling/query-resource-mgr.cc:
PS9, Line 53: void ResourceResolver::GetResourceHostport(const TNetworkAddress&
src,
: TNetworkAddress* dest) const {
: if (ExecEnv::GetInstance()->is_pseudo_distributed_llama()) {
: auto entry = impalad_to_dn_.find(src);
: if (entry != impalad_to_dn_.end()) {
: *dest = entry->second;
: } else {
: *dest = TNetworkAddress();
: }
: } else {
: dest->hostname = src.hostname;
: dest->port = 0;
: }
I take it you'll rebase on the llama removal patch?
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:
PS9, Line 46: boost::unordered_map
Do you know if we're avoiding std::unordered_map?
PS9, Line 176:
extra spaces
http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:
PS4, Line 541: // total_assigned_bytes won't hit total_size until
everything is assigned
: while (params_idx < params_list.size()
> i'm not sure what to add.
Ignore this, I misread this code before.
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:
PS9, Line 942: // TODO-MT: call AdmitQuery()
any reason you can't do this now?
http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/service/fragment-mgr.cc
File be/src/service/fragment-mgr.cc:
PS9, Line 148: VLOG_QUERY
This might be noisy for QUERY
--
To view, visit http://gerrit.cloudera.org:8080/4054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-HasComments: Yes