Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "HedWig/TopicManagement" page has been changed by ErwinTam.
http://wiki.apache.org/hadoop/HedWig/TopicManagement?action=diff&rev1=1&rev2=2
--------------------------------------------------
- ---+ Topic management in Hedwig
+ == Topic management in Hedwig ==
- %TOC%
+ === ZooKeeper data structure ===
- ---++ !ZooKeeper data structure
-
- Metadata about topics, subscribers, and hubs will be stored in !ZooKeeper. For a given Hedwig
region, we will store the following structure:
+ Metadata about topics, subscribers, and hubs will be stored in ZooKeeper. For a given Hedwig
region, we will store the following structure:
-
- <img src="%ATTACHURLPATH%/hedwigzk.png" alt="hedwigzk.png"/>
The rectangles in this diagram are znodes; rectangles with dashed borders are ephemeral
znodes.
- * <B>hedwig</B> is the root. If the !ZooKeeper instance is shared between
regions, then there would be a region-specific root.
+ * '''hedwig''' is the root. If the !ZooKeeper instance is shared between regions, then
there would be a region-specific root.
- * <B>regionname</B> is the region this Hedwig cluster lives in.
+ * '''regionname''' is the region this Hedwig cluster lives in.
- * <B>topics</B> is the root for the topics subtree. All topics that
have been created live under this root.
+ * '''topics''' is the root for the topics subtree. All topics that have been created
live under this root.
- * <B>T1, T2 ... </B> are nodes for each topic. The nodes are named
by the topic name. The fact that a topic has a node under the <B>Topics</B> node
means that the topic exists. If we had any other topic metadata (like permissions) that we
wanted to store, we could store them as the content of the node.
+ * '''T1, T2 ... ''' are nodes for each topic. The nodes are named by the topic
name. The fact that a topic has a node under the <B>Topics</B> node means that
the topic exists. If we had any other topic metadata (like permissions) that we wanted to
store, we could store them as the content of the node.
- * <B>Ti.hub</B> is the hub that is currently assigned to the
topic. This is an ephemeral node, so that if the hub fails, another hub can take over. The
name of the node is "Hub" and the content is the hostname of the hub.
+ * '''Ti.hub''' is the hub that is currently assigned to the topic. This is
an ephemeral node, so that if the hub fails, another hub can take over. The name of the node
is "Hub" and the content is the hostname of the hub.
- * <B>hubscribers</B> is the root of the tree of subscribers to
the topic.
+ * '''hubscribers''' is the root of the tree of subscribers to the topic.
- * <B>Sub1, Sub2 ...</B> are the subscribers. Each node is
named with the subscriber ID (which should be descriptive enough for the hub to be able to
deliver messages to the subscriber.) The content of the node includes the subscriber's current
consume mark for the topic.
+ * '''Sub1, Sub2 ...''' are the subscribers. Each node is named with the
subscriber ID (which should be descriptive enough for the hub to be able to deliver messages
to the subscriber.) The content of the node includes the subscriber's current consume mark
for the topic.
- * <B>hosts</B> is the root of the tree of hubs in the region.
+ * '''hosts''' is the root of the tree of hubs in the region.
- * <B>S1:port, S2:port ... </B> are the hubs. Each node is named
with the hostname:port of the hub. The fact that a node exists for the hub means that the
hub is in the cluster, and is eligible to be assigned topics.
+ * '''S1:port, S2:port ... ''' are the hubs. Each node is named with the hostname:port
of the hub. The fact that a node exists for the hub means that the hub is in the cluster,
and is eligible to be assigned topics.
- * <B>Alive</B> is an ephemeral node that indicates whether the
hub is currently alive.
+ * '''Alive''' is an ephemeral node that indicates whether the hub is currently
alive.
- ---++ Topic creation
+ === Topic creation ===
Topic creation and assignment to a hub is a lazy process. Topics are created on demand (e.g.,
when there is a subscriber) and assigned to a hub on demand (e.g., when there is a new subscriber
or a message published.) When a hub responsible for a topic fails, we reassign the topic on
demand; e.g. when the connected subscribers reconnect.
- ---+++ Subscription process
+ === Subscription process ===
The subscribe call takes three parameters:
* The subscriber ID (string)
@@ -39, +35 @@
Upon receiving a <i>subscribe</i> message for topic T, the hub H1 will follow
these steps:
* The hub H1 will check in !ZooKeeper to see if the topic T exists as a child of <b>Topics</B>.
If the topic T does not exist:
- * H1 will create the node <B>Topics.T</B>, and the node <B>Topics.T.Subscribers</B>.
+ * H1 will create the node '''Topics.T''', and the node '''Topics.T.Subscribers'''.
- * The hub H1 will read <b>T.Hub</b>, the current hub assigned to the topic
(say, H2).
+ * The hub H1 will read '''T.Hub''', the current hub assigned to the topic (say, H2).
- * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then H1
will add C to the list of subscribers (under <b>T.Subscribers</B>), and set up
its internal bookkeeping to begin delivering messages to C.
+ * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then H1
will add C to the list of subscribers (under '''T.Subscribers'''), and set up its internal
bookkeeping to begin delivering messages to C.
* If a hub exists, and is a different hub than the one the client contacted (e.g.
H1!=H2), then H1 will return to the client a <I>redirect</I> message, requesting
that the client retry its subscription at H2.
* If no hub exists, H1 will check the flag of the <i>subscribe</I> call
to see if the client was redirected.
* If false (the client has not yet been redirected) H1 will choose a random hub
H3 (possibly itself) to manage the topic.
@@ -102, +98 @@
* We may want to optimize this procedure so that the hub does not have to contact !ZooKeeper
on every publish. This could be done using leases, or by depending on the disconnect exception...the
proper way to do this is an open question.
* Instead of redirecting the client, we could forward the published message to the correct
hub. But redirecting seems cleaner, since we would really like the publisher to be directly
connected to the correct hub.
- ---+++ Topic redistribution
+ === Topic redistribution ===
Occasionally, we should shuffle topics between hubs to ensure load balancing. For example,
when a new hub joins, we want topics to be assigned to it. Similarly, if some topics are hotter
than others, the hub should be able to shed load. Since all of the persistent state about
a topic is in !ZooKeeper or !BookKeeper, shuffling a single topic can be easy: the hub just
stops accepting publishes and deletes its ephemeral node. The next time a client tries to
subscribe or publish to the topic, it will get assigned to a random hub.
@@ -113, +109 @@
The constant shuffling of topics should help to keep load evenly balanced across hubs, without
human intervention. Moreover, lazily abandoning topics will help the shuffling to occur in
an incremental fashion spread out over time.
- ---++ Open questions
+ === Open questions ===
* How to handle the lease of the hub's primary status for a topic
* How to handle access control - who is allowed to create topics, who is allowed to subscribe
to them, who is allowed to publish to them