If not, please bind the host to an address other than localhost.

Just as the Flask documentation says:

If you run the server you will notice that the server is only accessible from your own computer, not from any other in the network. This is the default because in debugging mode a user of the application can execute arbitrary Python code on your computer.

If you have debug disabled or trust the users on your network, you can make the server publicly available simply by changing the call of the run() method to look like this:

# runs on its own IP address
app.run(host='192.168.1.100')

or

# This tells your operating system to listen on all public IPs.
app.run(host='0.0.0.0')

Remember to restart the service manually after each change to your code.
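Putting the pieces together, here is a minimal sketch of a complete, externally visible app (the route and port are illustrative):

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello from the network!'

if __name__ == '__main__':
    # listen on all public IPs; keep debug disabled when exposing the server
    app.run(host='0.0.0.0', port=5000)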

If the port is blocked by a firewall, allow the traffic using iptables:

$ iptables -I INPUT -p tcp --dport 5000 -j ACCEPT

Tutorial on ZooKeeper (Part 4): Use Ansible to Setup ZooKeeper Servers

From the last tutorial, we know how to install and configure a ZooKeeper cluster starting from scratch.

In order to simplify the deployment, I’ve written an ansible-zookeeper role to deploy ZooKeeper servers. It was originally forked from AnsibleShipyard and modified to support both standalone mode and replicated mode (also known as cluster mode).

Note: this role requires Java support. You can install Java manually or use the ansible-java role as a dependency. This role assumes that the required software is already installed on all the servers you are deploying to.

Clone the Role

Before the deployment, you first have to clone the role into your Ansible playbooks folder (just for easier management); of course, you can put it in another folder.
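For example (the GitHub path is an assumption based on the role’s name; adjust the target folder to wherever you keep your roles):

$ cd /path/to/your/playbooks/roles
$ git clone https://github.com/dixudx/ansible-zookeeper.git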

Prepare your own inventory

Just as the last tutorial said, you have to be clear about your ZooKeeper servers. Create an inventory file to store these servers. To differentiate the two modes for better demonstration, I create two sections: zookeeper_servers for standalone mode and cluster_hosts for replicated mode. You can also give them other names; take your choice.

$ vim myinventory
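A minimal sketch of such an inventory, reusing the two section names above (the IP addresses are illustrative):

[zookeeper_servers]
192.168.0.100

[cluster_hosts]
192.168.0.100
192.168.0.101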

Start the Deployment

The default mode of this role is standalone, which means you can omit the parts below in your YAML file.

vars:
  install_mode: "standalone"

Standalone Mode

Once you have your own inventory file, you can create a playbook named zookeeper_standalone.yml and insert the contents below.
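A minimal sketch of such a playbook (the role name and privilege escalation are assumptions; adjust them to your setup):

---
- hosts: zookeeper_servers
  become: yes
  vars:
    install_mode: "standalone"
  roles:
    - ansible-zookeeper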

Tutorial on ZooKeeper (Part 3): Setup ZooKeeper Cluster

In the last tutorial, how to install and configure ZooKeeper in standalone mode and replicated mode was introduced. Now I will give an explicit example of how to set up replicated mode (also known as a ZooKeeper cluster) starting from scratch.

Pre-requisites

Before the deployment, please see System Requirements in the Admin guide to install some required software, especially Java.

ZooKeeper Deployment Topology

A few things have to be clear before we start the deployment of ZooKeeper.

The number of ZooKeeper servers you want in your ZooKeeper cluster.

From the first tutorial, we know that an odd number of machines is best for a cluster. In my example, I use 5 servers to form the cluster, with a tolerance of 2 failures.

About the boxes/physical machines that you want to deploy on.

Of course, you can install all your ZooKeeper servers on a single box. But for better high availability (aka HA), you should not put all the ZooKeeper servers into one single basket, because two is better than one and more is better than less.

To achieve the highest probability of tolerating a failure you should try to make machine failures independent. For example, if most of the machines share the same switch, failure of that switch could cause a correlated failure and bring down the service. The same holds true of shared power circuits, cooling systems, etc.

In my example, I deploy the cluster on 2 boxes with IPs 192.168.0.100 and 192.168.0.101.

Where to put the data (attribute dataDir in zoo.cfg) for each ZooKeeper server?

dataDir is the location where ZooKeeper will store the in-memory database snapshots (persistent copies of the znodes) and, unless specified otherwise, the transaction log of updates to the database.

As changes are made to the znodes, they are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem. This snapshot supersedes all previous logs.

In my example, I use /var/lib/zookeeper/zookeeper-[id]/data as the data home.

If ZooKeeper has to contend with other applications for access to resources like storage media, CPU, network, or memory, its performance will suffer markedly. ZooKeeper has strong durability guarantees, which means it uses storage media to log changes before the operation responsible for the change is allowed to complete. You should be aware of this dependency, and take great care if you want to ensure that ZooKeeper operations aren’t held up by your media. Here are some things you can do to minimize that sort of degradation:

ZooKeeper’s transaction log must be on a dedicated device. (A dedicated partition is not enough.) ZooKeeper writes the log sequentially, without seeking. Sharing your log device with other processes can cause seeks and contention, which in turn can cause multi-second delays. For this setting, please refer to the attribute dataLogDir (which allows a dedicated log device to be used, and helps avoid competition between logging and snapshots) in zoo.cfg. You can find more here.

Do not put ZooKeeper in a situation that can cause a swap. In order for ZooKeeper to function with any sort of timeliness, it simply cannot be allowed to swap. Therefore, make certain that the maximum heap size given to ZooKeeper is not bigger than the amount of real memory available to ZooKeeper. For more on this, see Things to Avoid.

Having a dedicated log device has a large impact on throughput and stable latencies. It is highly recommended to dedicate a log device and set dataLogDir to point to a directory on that device, and then make sure to point dataDir to a directory not residing on that device.

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# Choose appropriately for your environment
dataDir=/var/lib/zookeeper/zookeeper-1/data/
# the port at which the clients will connect
clientPort=2181
# the directory where the transaction log is stored.
# this parameter provides a dedicated log device for ZooKeeper
dataLogDir=/var/lib/zookeeper/zookeeper-1/log/
# ZooKeeper servers and their port numbers.
# The ZooKeeper ensemble should know about every other machine in the ensemble.
# Specify the server id by creating a 'myid' file in the dataDir.
# Use hostnames instead of IP addresses for convenient maintenance.
server.1=192.168.0.100:2888:3888
server.2=192.168.0.100:2889:3889
server.3=192.168.0.100:2890:3890
server.4=192.168.0.101:2888:3888
server.5=192.168.0.101:2889:3889
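Each server also needs its myid file inside its dataDir, containing its server number; a minimal sketch for the layout above (run the matching lines on each box):

# on 192.168.0.100
$ echo 1 > /var/lib/zookeeper/zookeeper-1/data/myid
$ echo 2 > /var/lib/zookeeper/zookeeper-2/data/myid
$ echo 3 > /var/lib/zookeeper/zookeeper-3/data/myid
# on 192.168.0.101
$ echo 4 > /var/lib/zookeeper/zookeeper-4/data/myid
$ echo 5 > /var/lib/zookeeper/zookeeper-5/data/myid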

Tutorial on ZooKeeper (Part 2): Installation and Configuration

In the last tutorial, some concepts and terminologies were introduced. To further investigate and use ZooKeeper, we move to the next step: installing and configuring ZooKeeper.
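To get a server running, you typically download a stable release, extract it, and create a configuration file; a minimal sketch (the 3.4.6 version number and mirror URL are illustrative):

$ wget http://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
$ tar -xzf zookeeper-3.4.6.tar.gz
$ cd zookeeper-3.4.6
$ cp conf/zoo_sample.cfg conf/zoo.cfg
$ bin/zkServer.sh start

The zoo_sample.cfg shipped with the release looks like this: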

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# for example's sake.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

Change the value of dataDir to specify an existing (empty to start with) directory.

For standalone mode, only three fields are needed and meaningful: tickTime, dataDir, and clientPort.

Replicated Mode

Running ZooKeeper in standalone mode is convenient for evaluation, some development, and testing. But in production, you should run ZooKeeper in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file.

You can find the meanings of the configuration settings in the section Configuration Parameters. A word, though, about a few here:

Every machine that is part of the ZooKeeper ensemble should know about every other machine in the ensemble. You accomplish this with the series of lines of the form server.id=host:port:port. The parameters host and port are straightforward. You assign the server id to each machine by creating a file named myid, one for each server, which resides in that server’s data directory, as specified by the configuration file parameter dataDir.

The replicated configuration file is similar to the one used in standalone mode, but with a few differences. Here is an example:
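A minimal sketch of such a replicated zoo.cfg, consistent with the cluster example from Part 3 (the IP addresses and paths are illustrative):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/zookeeper-1/data/
clientPort=2181
server.1=192.168.0.100:2888:3888
server.2=192.168.0.100:2889:3889
server.3=192.168.0.100:2890:3890
server.4=192.168.0.101:2888:3888
server.5=192.168.0.101:2889:3889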

For the meanings of the new entries, initLimit and syncLimit, please refer to the comments in the file zoo.cfg of Standalone Mode.

The entries of the form server.X list the servers that make up the ZooKeeper service. When the server starts up, it knows which server it is by looking for the file myid in the data directory. This file, whose usage I will show in the next tutorial, is quite IMPORTANT and INDISPENSABLE. It contains the server number, a cluster-unique ZooKeeper instance id (1-255) in ASCII, and it should match the X in server.X on the left-hand side of this setting.

The list of ZooKeeper servers used by the clients must match the list of ZooKeeper servers that each ZooKeeper server has.

Finally, note the two port numbers after each server name: 2888 and 3888. Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.

In the next tutorial, I will give an explicit example of how to set up replicated mode, a cluster of ZooKeeper servers (also known as an ensemble), starting from scratch.

Tutorial on ZooKeeper (Part 1): Concepts and Terminologies

Overview

Apache ZooKeeper is a highly reliable distributed coordination service that lets distributed applications coordinate with each other through a shared hierarchical name space, which is organized similarly to a standard file system path. The name space consists of data registers - called znodes, in ZooKeeper parlance - and these are similar to files and directories; ZooKeeper provides strictly ordered access to the znodes. Unlike a typical file system, which is designed for storage, ZooKeeper keeps its data in memory, which means it can achieve high throughput and low latency. It is especially fast in “read-dominant” workloads. ZooKeeper applications run on thousands of machines, and it performs best where reads are more common than writes, at ratios of around 10:1.

ZooKeeper is ordered. ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions. Subsequent operations can use the order to implement higher-level abstractions, such as synchronization primitives.

The performance aspects of ZooKeeper allow it to be used in large distributed systems. The reliability aspects prevent it from becoming the single point of failure in big systems. Its strict ordering allows sophisticated synchronization primitives to be implemented at the client.

From this part on, I will write a series of tutorials on ZooKeeper. Some concepts and terminologies are introduced here first.

What is ZooKeeper?

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented, there is a lot of work that goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

ZooKeeper runs in Java and has bindings for both Java and C.

ZooKeeper is very fast and very simple. Since it is designed to be a basis for the construction of more complicated services, such as synchronization, it provides a set of guarantees. These are:

Sequential Consistency - Updates from a client will be applied in the order that they were sent.

Atomicity - Updates either succeed or fail. No partial results.

Single System Image - A client will see the same view of the service regardless of the server that it connects to.

Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.

Timeliness - The clients’ view of the system is guaranteed to be up-to-date within a certain time bound.

Some Concepts and Terminology

Data model and the hierarchical namespace

Just as mentioned in the Overview, the name space provided by ZooKeeper is much like that of a standard file system. A name is a sequence of path elements separated by a slash (“/”). Every node in ZooKeeper’s name space is identified by a path, which always needs to start with the root znode (“/”).

Znode

Unlike standard file systems, each node in a ZooKeeper namespace can have data associated with it as well as children; you can create sub-znodes (children znodes) under a znode. It is like having a file system that allows a file to also be a directory. (ZooKeeper was designed to store coordination data: status information, configuration, location information, etc., so the data stored at each node is usually small, in the byte-to-kilobyte range.) We use the term znode to make it clear that we are talking about ZooKeeper data nodes.

Note: every znode must have a parent whose path is a prefix of the znode’s path with one less element; the exception to this rule is the root znode (“/”), which has no parent. Also, exactly like in standard file systems, a znode cannot be deleted if it has any children.

Znodes maintain a stat structure that includes version numbers for data changes and ACL changes. The stat structure also has timestamps. The version number, together with the timestamp, allows ZooKeeper to validate its cache and to coordinate updates. Each time a znode’s data changes, the version number increases. For instance, whenever a client retrieves data, it also receives the version of the data. And when a client performs an update or a delete, it must supply the version of the data of the znode it is changing. If the version it supplies doesn’t match the actual version of the data, the update will fail.
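For instance, a conditional update in the ZooKeeper CLI can be sketched like this (assuming the znode’s current data version is 0):

set /znode_test new_data 0

If another client has changed /znode_test in the meantime, its version is no longer 0 and the set fails.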

The command syntax for creating a znode is as follows:

create [-options] /[znode-name] [znode-data]

Example 1: Create a new znode named “znode_test” with data “znode_test_data”

The path consists of the root znode (“/“) and the name of the znode you want to create. Here you can write /znode_test for its path.
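Following the syntax above, the command looks like this:

create /znode_test znode_test_data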

Different Types of Znodes

In ZooKeeper there are 3 types of znodes: persistent, ephemeral, and sequential.

Persistent Znodes (Default)

These are the default znodes in ZooKeeper. They will stay in the ZooKeeper server permanently, as long as the clients (including the creator) leave them alone.

create /znode mydata

Ephemeral Znodes

Ephemeral znodes (also referred to as session znodes) are temporary znodes. Unlike persistent znodes, they are destroyed as soon as the creator client logs out of the ZooKeeper server. For example, let’s say client1 created eznode1. Once client1 logs out of the ZooKeeper server, eznode1 gets destroyed.

create -e /eznode mydata

Sequential Znodes

A sequential znode is given a 10-digit, zero-padded sequence number appended to its name. Let’s say client1 created a sequential znode named sznode. In the ZooKeeper server, it will be named like this:

sznode0000000001

If client1 creates another sequential znode, it would bear the next number in a sequence. So the next sequential znode will be called [znode-name]0000000002.

create -s /sznode mydata

ACL

ACL (Access Control List) is basically an authentication mechanism implemented in ZooKeeper. It makes znodes accessible to users, depending on how it is set. This part will be introduced in the following tutorials.

Ensemble and Quorum

The ZooKeeper service can be replicated over a set of hosts called an ensemble. As long as a majority of the ensemble is up, the service will be available. For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. Failure in this context means a machine crash, or some error in the network that partitions a server off from the majority. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines. Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can only handle two failures, since three machines is not a majority. For this reason, ZooKeeper deployments are usually made up of an odd number of machines. Three ZooKeeper servers is the minimum recommended size for an ensemble, and we also recommend that they run on separate machines.

A replicated group of servers in the same application is called a quorum. All servers in the quorum have copies of the same configuration file. QuorumPeers will form a ZooKeeper ensemble. A quorum is represented by a strict majority of nodes. You can have one node in your ensemble, but it won’t be a highly available or reliable system. If you have two nodes in your ensemble, you would need both to be up to keep the service running because one out of two nodes is not a strict majority. If you have three nodes in the ensemble, one can go down, and you would still have a functioning service (two out of three is a strict majority).

If a quorum of nodes are not available in an ensemble, the ZooKeeper service is nonfunctional.

Everyone knows Jenkins, right? And I think nobody doesn’t love Jenkins. Maybe it’s not the fastest or the fanciest, but it’s really easy to start using, even for rookies, thanks to its short learning curve. What’s more, it has a great ecosystem of plugins and add-ons, which has significantly improved its capability. It is also optimized for easy customization. It can be configured to build code, create Docker containers, run tons of tests, push to staging/production, and so on.

However, there are some issues regarding scaling and performance, which isn’t so unusual. Jenkins is built as a CI tool, yet it also needs CI for itself.

There are other cool solutions such as Travis CI and Circle CI, which are both hosted solutions that don’t require any maintenance on our side.

Reflatus

Ordinarily, when a build flow is running, we want to track and dynamically show the real-time status of the Jenkins Build Flow. There is already a plugin named Build Graph View Plugin, which computes a graph of related builds starting from the first job to the current one, and renders it as a graph.

However, that plugin is a full-fledged Jenkins plugin with no standalone daemon, which makes it hard to customize and integrate into your own dashboard. What’s more, the plugin cannot fully display the whole flow graph until all the subjobs/pipelines finish. So it is quite hard for developers, testers, and operations engineers to maintain/monitor the overall progress of the current flow.

So I wrote a standalone web service named reflatus, short for real-time Jenkins flow status.

What it can NOT do

Reflatus only has a static parser, which can NOT parse the dedicated DSL defined by Build Flow. For the reasons, please refer to the FAQ.

Instead, an extra YAML file is needed to explicitly define the build workflows (aka build pipelines). More info can be found in the Configuration section.
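Purely as a hypothetical illustration of the idea (the real schema is documented in the Configuration section; the keys below are made up):

# flows.yml (hypothetical layout)
nightly-flow:
  - job: build
  - job: unit-tests
  - job: deploy-staging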

FAQ

Why not add/use a parser to handle the dedicated DSL defined by Build Flow?

If so, there would be no need to manually add an extra YAML file. But it would actually be quite complex to implement this feature. Leaving aside the complicated build flow combinations, the name of a build job/pipeline can be acquired dynamically from trigger parameters, environment variables, or an explicit name, and the same applies to build job/pipeline parameters. This all adds more workload and complexity to the tool. It is for this reason that I discarded this feature.

IBM® Rational Team Concert™ is built on the Jazz platform, allowing application development teams to use one tool to plan across teams, code, run standups, plan sprints, and track work. For more info, please refer to the official website.

Currently there are no lightweight, easy-to-use clients for Rational Team Concert (aka RTC). There is indeed an official RTC client, which is quite powerful, fully fledged, and can be integrated with Eclipse as a plugin, but it is a GUI-based client. It is very hard to integrate with or call from other programs. It is also neither easy to install nor lightweight.

The most common scenario that I want to use such a client is to open a new RTC defect/change request/story when a Jenkins pipeline finishes.

I’ve searched all over the Internet, but found nothing. I think such a client should at least have the characteristics below:

easy to install;

lightweight;

simple to use;

a command-line tool without a GUI;

supports at least some basic RTC usage, such as creating all kinds of Workitems and adding comments to retrieved Workitems.

So I wrote a Python-based client to implement these basic RTC scenarios.

A Python-based Client: RTCClient

This client is named rtcclient, and it has already been published to PyPI. You can install it using pip:

$ pip install rtcclient

Actually, rtcclient is more like a library: it provides some basic classes and methods to interact with the RTC server.
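A minimal sketch of typical usage (the class and method names follow my recollection of the rtcclient documentation; treat them as assumptions and check the project docs):

from rtcclient import RTCClient

# connect to the RTC server (URL and credentials are placeholders)
url = "https://your_domain:9443/ccm"
myclient = RTCClient(url, "your_username", "your_password")

# retrieve an existing workitem by its id and add a comment
workitem = myclient.getWorkitem("123456")
workitem.addComment("Build failed; see the Jenkins console log.")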