Dots and Brackets: Code Blog
Blog about DevOps, distributed applications and microservices

Automating GCP Infrastructure with Deployment Manager
Thu, 09 Aug 2018

I don’t know how or why, but even though for the last couple of years I’ve been spending at least a few hours a week doing something with Google Cloud Platform, I never managed to notice that it has its own tool for automating infrastructure creation. You know, creating VMs, networks, storage, accounts and other resources. But it’s there, right in the main menu.

The tool is called Deployment Manager and it can build and provision virtually everything that Google Cloud Platform provides. All in one command. Like any other tool from Google, it has a slightly mind-bending learning curve and not-always-up-to-date documentation, but it works and gets the job done. Most of the time I was automating everything from the host and up, using Vagrant, Ansible, docker-compose or kubectl. But automating everything from the host and down – the actual infrastructure – that’s going to be interesting.

How it works

I find it somewhat similar to docker-compose and Ansible. We describe the desired state of the infrastructure – VMs, networks, firewalls, etc. – in a file and then tell DM to make that happen. DM treats the whole configuration as a single deployable unit, so we can check its status later, update it or even delete the whole thing.
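As a minimal sketch of the workflow (the deployment name and zone are placeholders; resource paths are relative to the current project):

```shell
# Hypothetical minimal deployment: a single f1-micro VM on the default network
cat > config.yaml <<'EOF'
resources:
- name: my-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/f1-micro
    disks:
    - deviceName: boot
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default
EOF

# create the deployment; DM provisions everything described in the file
gcloud deployment-manager deployments create my-deployment --config config.yaml
```

Afterwards `gcloud deployment-manager deployments describe my-deployment` shows its status, and `deployments delete` tears the whole unit down at once.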

An interesting, though not surprising, thing about DM is that it won’t create the requested resources – e.g. virtual machines – if they already exist. That’s the default behaviour; it’s configurable and plays well with the idea of desired state configuration. However, it can also be confusing at times, when, for instance, one deployment accidentally acquires another one’s VM just because they use identical host names. It actually happens.

Small improvement: set ephemeral external IP

There’s a minor improvement we can make. The current configuration creates a VM without an external IP address, which among other things means you won’t be able to SSH into it. It’s very easy to fix, though. If we add the following accessConfigs entry to the networkInterfaces section, all firewall-permitted connections from the outside world become possible. As we’re using the ‘default’ network, that includes SSH.
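The fragment in question looks roughly like this (reconstructed from the Compute Engine API; the surrounding `networkInterfaces` entry is assumed to be the one from the earlier configuration):

```yaml
networkInterfaces:
- network: global/networks/default
  accessConfigs:
  # an ephemeral external IP via one-to-one NAT
  - name: External NAT
    type: ONE_TO_ONE_NAT
```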

More complex scenario

Creating a single virtual machine from a configuration file is a nice example for a blog post, but in real life things will be more complex. E.g. a configuration file will contain multiple resources, some of which might depend on the others and therefore should be created after their dependencies. By default Deployment Manager will try to create everything in parallel, so we need an extra trick to introduce some sort of resource hierarchy.

There are at least two ways to do that: through metadata and through references.

Explicit dependency through metadata

That’s both the easiest and, to my taste, a slightly suspicious way to set up a resource hierarchy. For instance, if our configuration has two resources – a VM and a network – and the VM must be created after the network, we could do something like this:
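A sketch of the metadata approach (resource names are made up here) – `metadata.dependsOn` lists the resources that must exist first:

```yaml
resources:
- name: my-network
  type: compute.v1.network
  properties:
    autoCreateSubnetworks: true
- name: my-vm
  type: compute.v1.instance
  metadata:
    # explicit dependency: DM waits for my-network before creating the VM
    dependsOn:
    - my-network
  properties:
    # ... zone, machineType, disks as before
```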

However, if one resource depends on the other, it probably also uses some attributes of that resource: a link, an IP address or a name. It’s much more logical to add references to those attributes, which in turn will introduce the dependencies and the hierarchy.

Dependency through references

Let’s have a look at the following example. Assume we need a VM connected to a custom network with SSH allowed. In terms of GCP resources that means we need to create a network, a firewall rule and the VM itself. Obviously, the network should come first, and only when it’s ready – continue with the VM and the rule. We can make sure DM does the right thing by adding references to the network to both the VM and the firewall rule, like here:
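Roughly like this (a sketch showing only the reference-related parts). A `$(ref.NAME.PROPERTY)` expression both substitutes the value and tells DM to create `NAME` first:

```yaml
resources:
- name: my-network
  type: compute.v1.network
  properties:
    autoCreateSubnetworks: true
- name: allow-inbound-ssh
  type: compute.v1.firewall
  properties:
    # reference: implies the network must be created before the rule
    network: $(ref.my-network.selfLink)
    sourceRanges: ["0.0.0.0/0"]
    allowed:
    - IPProtocol: tcp
      ports: [22]
- name: my-vm
  type: compute.v1.instance
  properties:
    networkInterfaces:
    # same reference, so the VM also waits for the network
    - network: $(ref.my-network.selfLink)
    # ... zone, machineType, disks as before
```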

When we deploy it, it’s pretty obvious that not everything is created in parallel now:

A few moments later:

We can also confirm that the VM is indeed connected to my-network and that the allow-inbound-ssh firewall rule is there:

Conclusion

By the end of experiments like this I usually develop some sort of Stockholm syndrome and start to like the tool, no matter how good it actually is. This time is not an exception.

Deployment Manager is fine. It’s slightly hard to get started with, as there are not many clear examples out there. Ironically, the ones that do exist are sometimes not compatible with the latest APIs. Plus, it’s not obvious where to look for some information, like what configuration properties there are, or how exactly a template schema should look. But so far I was able to find almost everything I needed.

I wonder how Deployment Manager compares to its seemingly main competitor – HashiCorp’s Terraform. My gut feeling says that DM is probably less mature, but who knows. I should probably get to know Terraform better to be sure, but that will happen later. So far there are a few more interesting DM features to explore.

Drones and possible blog theme shift
Thu, 26 Jul 2018

When I started this blog my initial impulse was to write about things that I usually work with, or at least have some relation to. I even came up with a list of 50 topics or so. Half of them were about JavaScript, as that was my main focus area back then. And the other half was about NoSQL, as… well, that was the book I was reading.

However, almost immediately I ended up writing about weirdly unrelated stuff: micro-services, distributed apps and DevOps. It had some crossovers with what I do for a living, and sometimes blog topics did become related to my work. But most of the time I simply stumbled upon an interesting name or concept in the realm of distributed applications, learned about it and came up with a blog post afterwards. DevOps and distributed apps became a hobby, and therefore it was relatively easy to sacrifice a noticeable amount of sleep hours to them.

So here’s the problem. It’s not a hobby anymore. I made a shift into the cloud consulting area, so the current blog theme overlaps 100% with my job. It looks like I need to find a new hobby, right?

And oh boy, how many interesting things are happening in the industry. I’m actually thinking about picking up something from the machine learning area. I was avoiding the subject at first: even though it is the real thing for years to come, there was too much hype around it, and under such circumstances it’s very easy to start doing what everyone else does, not what’s interesting or makes sense to do. Plus, ML is huge.

However, there’s a small part of it which fascinates me: visual navigation for flying robots. Quadcopters, mainly. The first time I saw those flying beasts I thought: “Man, I wonder if we could put two cameras on them and let them get around not by GPS, but rather using visual cues. You know, like humans do.” It should be fun. After all, it has all the right ingredients: robots, flying, ML and a rather challenging task.

Even though only a subset of the task is ML, the subject is still huge. Where would I even start? But apparently there’s already much work going on in that area. I found a whole course at edX about autonomous navigation for flying robots. The course is almost 5 years old, but it should still provide a nice foundation for the subject. What’s cool, the same lecturer who taught that course also published a set of lectures about visual navigation for drones specifically – a very good place to start.

Another huge thing is the quadcopter itself. If I’m ever going to play with one, it shouldn’t be just a consumer model, which flies, follows a pilot and does a stunt or two. As a bare minimum it should come with some sort of API client, which will give me a way to programmatically lift, rotate and navigate the drone. Ideally it should also have a way to attach more sensors and still be able to read their data through the same API.

That edX course I mentioned referred to at least two models that seem to be compatible with what I described: Parrot AR.Drone 2.0 and Bitcraze Crazyflie 2.0.

Parrot AR.Drone 2.0

Both models are slightly old, but the Parrot still seems to be the standard model for autonomous flying research, and the Crazyflie simply provides lots of fun to play with. It’s a palm-sized and therefore very safe drone, which can be accompanied by a USB-pluggable radio that turns your laptop into a programmable flight control centre. At the same time, the drone’s size is also a downside. With such dimensions the little guy can lift somewhere around 20 grams, which I believe is less than a pair of video cameras for stereo optical navigation would weigh. Not to mention the additional computing power and a battery for that.

Crazyflie 2.0

Theoretically, if I dug deeper into flight controller firmware, I could probably build my own drone, add a radio to it and control it via whatever whatever-to-USB adapter may come. I checked the online documentation and it seems very doable. However, it’s such a long shot that my enthusiasm might actually expire right after the soldering. Way before the actual machine learning starts.

So that’s what’s going on. I’m fully in the cloud now, I can’t keep the hobby topic and the full-time job topic the same, so the blog theme might change soon. Possibly to ML.

Quick intro to helm – a package manager for Kubernetes
Thu, 12 Jul 2018

I suddenly realized that I haven’t blogged about Kubernetes for quite a while. But there’s so much happening in that area! For instance, even though creating Kubernetes objects from YAML configuration was the true way, it never felt that convenient. So here’s the solution – use helm, the package manager for Kubernetes.

What’s helm

Helm allows installing Kubernetes apps in the same manner as we’d install TypeScript via npm or nginx via apt-get. It actually comes as two components: a command line client called helm and its companion service hosted inside of Kubernetes called tiller. Together they can search, install, remove, upgrade and create application packages called charts.

‘Chart’ is not the only potentially confusing choice of words. For example, an instance of a chart running in Kubernetes is called a release. Two instances of the same chart would become two releases, and so forth.

Fortunately, the source of packages has a more conventional name – a repository. Obviously, there can be more than one of them, including privately maintained ones.

With the terminology sorted out, let’s install helm and see it in action.

Install
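Getting the client itself is a one-liner (the script URL is the one the helm project advertised at the time; check the current docs before running):

```shell
# download and run helm's installer script, then verify the client
curl -sL https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash
helm version --client
```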

This, however, only installs the command line client. In order to make the whole thing work we also need to install the server component – tiller – into the Kubernetes cluster. I’ve started my cluster locally via minikube start, and helm init will take care of the rest.

As a side note, it’s actually possible to see what exactly gets installed: helm init --output yaml spits out the pretty trivial Deployment YAML that tiller is made of:

Basic operations

Search

First, let’s find something to install. helm search is the command to show all known packages. In fact, I already know what I want to try, so helm search prometheus narrows the search down to just a handful of packages:
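The search-then-install sequence is a couple of one-liners (chart names as they were in the stable repository in mid-2018):

```shell
helm search prometheus          # list charts matching the name
helm install stable/prometheus  # install one; helm assigns a random release name
helm ls                         # list the releases now running in the cluster
```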

It’s actually very convenient. The Notes section at the end shows some info about what was installed and where to go next. For instance, there are some suggested commands at lines 17-18 which will forward local port 9093 to the exposed port of the Alertmanager service, so we can point the browser at it and check what’s inside.

Btw, we can see the same notes again with the help of the helm status %release name% command.

Apply options

As you can see (actually you can’t, as I truncated the output, but believe me on this one), it has lots and lots of customizable values. If I wanted to disable the Alertmanager during installation, I’d probably put alertmanager.enabled=false into a separate config file and pass it as an additional argument to the install command:

helm install -f config.yaml stable/prometheus

Alternatively, it’s also possible to pass the value directly, without a file at all:

helm install --set alertmanager.enabled=false stable/prometheus

However, as we have already installed prometheus, it would be much simpler to just upgrade it.

Upgrade and rollback

Upgrade

If we replace install with upgrade, we can pass new chart settings to an existing release:
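Assuming the install above produced a release called e.g. `kindred-angelfish` (helm generates random names; this one is made up), the upgrade would look like:

```shell
# reuse the config file from above, or --set the value directly
helm upgrade -f config.yaml kindred-angelfish stable/prometheus
helm ls   # same release name, incremented revision number
```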

So Chart.yaml obviously is the chart definition with the name and other stuff, whereas charts seems to be the folder for the chart’s dependencies. values.yaml is quite interesting: all those parameters we saw in helm inspect values, and which we could change in helm install/upgrade, actually come from here.
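For reference, the layout described above is what `helm create` scaffolds (the `demo-chart` name is the one used later in the post):

```shell
helm create demo-chart
# demo-chart/Chart.yaml   – chart name, version, description
# demo-chart/values.yaml  – default configurable values
# demo-chart/charts/      – dependency charts
# demo-chart/templates/   – Kubernetes object templates
```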

Templates

Checking the contents of the templates folder reveals one important secret: all those deployments, services and other Kubernetes goodies that we’re going to ship in our package – they can all be templates.

Here’s how that could be useful. For instance, if we don’t want to hardcode some Deployment’s name in YAML, but rather derive it from the release name plus some suffix, here’s how we could do that:
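A sketch of such a template (the file name and the suffix are made up):

```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  # "my-release" becomes "my-release-demo", "other-release" – "other-release-demo", etc.
  name: {{ .Release.Name }}-demo
spec:
  # ... the rest of the Deployment as usual
```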

.Release (as well as .Values) is one of helm’s built-in objects that templates can use. There are a few other sources of values, but simple value substitution is not the only thing templates are capable of. They also have control structures like if or range, functions and even pipelines, so {{ .Release.Name | upper }} is a perfectly valid template entry.

Packing and installing the chart

In fact, the folder with the chart is already a perfectly installable entity, so helm install ./demo-chart would actually work. However, if we’re going to distribute the chart or upload it to our own chart repository, it would make more sense to pack it first:
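Packing, and then installing from the resulting archive, looks like this (assuming version 0.1.0 in Chart.yaml, which is what `helm create` puts there by default):

```shell
helm package ./demo-chart            # produces demo-chart-0.1.0.tgz
helm install ./demo-chart-0.1.0.tgz  # archives are installable too
```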

Conclusion

So this is helm. It’s actually pretty neat: easy to use, quite easy to understand and definitely more powerful than I need for the foreseeable future. If only it didn’t have to exist and could simply be part of some already existing package manager. After all, how can I possibly keep remembering all of them? Even JavaScript has three. Can we have, maybe, just one to rule them all? Please?

The mystery of package downgrade issue
Thu, 28 Jun 2018

In the last six or so weeks Microsoft managed to release a whole bunch of .NET Core 2.1 SDKs (Preview 2, Release Candidate 1, Early Access, RTM) and we tried all of them. By the end of those weeks my cluster of CI servers looked like a zoo. As everything was done in a hurry, there were servers with RC1 pretending to be Early Access ones. EA servers pretended to be RTM compatible, and the only RTM host we had was pretending to support everything. Don’t look at me funny. It happens.

The problem happened when I tried to clean up the mess: I removed the P2, RC1 and EA SDK tags from release branches, deleted the prerelease servers, forced the remaining servers to tell exactly who they are and finally rolled out new VMs with the latest and greatest .NET Core SDK 2.1 installed. Naturally, the very first build failed.

The issue

The compilation error said Detected package downgrade: Microsoft.NETCore.App from 2.1.1 to 2.1.0. In fact, it wasn’t even compilation – the build failed during the package restore phase.

There was also a chance that it was a one-off issue caused by some mysterious race condition, so I could retry the build job and, in case of success, pretend the error never happened (can’t deal with it now). But nope, I retried it twice and the build failed twice as well. It looked like I’d have to use that brainy thing again.

Troubleshooting

Poking around

It’s quite interesting that the project builds locally just fine. It’s the same Ubuntu 16.04, the same code, the same SDK… or is it? A quick dotnet --version on both hosts shows that the local SDK’s version is 2.1.300, whereas the one on the build server is 2.1.301. So Microsoft released a patch a few days ago? Interesting. After the newer version finds its way to my workstation, the project no longer builds on it either. Well, that’s a good sign.

I checked the project files, but they looked pretty much as usual and nothing suggested the cause of the conflict. What’s interesting, it took me some time to notice that the dotnet build command actually works. It’s only dotnet publish -r ubuntu-x64 that doesn’t. What’s even more interesting, if I skipped the runtime parameter (-r), even publish worked. Not sure how that helps me now, but who knows.

Getting the logs

Having zero ideas about where to look for more hints, I had no option but to enable diagnostics output in the build/publish commands and try to find out at what point they start to behave differently.

If you’ve never used the -v diag parameter with MSBuild or dotnet build, you should probably know that it produces a lot of output. No, like this: A LOT. For our ~90-project solution it emits tens and tens of megabytes of unstructured text. But if there’s something to find, it should be there.
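Capturing both logs for comparison boils down to two commands (a sketch; the exact build-job invocations may have carried more flags):

```shell
# diagnostic verbosity for the working and the failing command, captured to files
dotnet build -v diag                  > build.log   2>&1
dotnet publish -r ubuntu-x64 -v diag  > publish.log 2>&1
```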

That’s… a lot of colours. And lines. Because the publish build failed right during the NuGet package restore, its log is about 20 times smaller. That’s also a good thing – I can remove everything after Done executing task 'RestoreTask', which separates the restore phase from the rest of the build, and significantly reduce the amount of text to deal with.

Chasing the differences

The error message was saying something about a package downgrade and version 2.1.1. Let’s look for it then.

About a dozen matches later I do find a place where the RuntimeFrameworkVersion property becomes different: 2.1.1 in the faulty build vs 2.1.0 in the successful one. As a side note, 2.1.0 is the version of the runtime shipped by default with .NET Core SDK 2.1.300. The latest SDK at the moment – 2.1.301 – comes with the patched runtime, 2.1.1. It’s very easy to check:
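For instance, by asking each host what it has installed (the exact output shape here is assumed from the 2.1-era CLI):

```shell
dotnet --version         # 2.1.300 locally, 2.1.301 on the build server
dotnet --list-runtimes   # lists Microsoft.NETCore.App 2.1.0 vs 2.1.1 respectively
```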

The error message stated that Microsoft.NETCore.App 2.1.1 – the runtime – was conflicting with its 2.1.0 counterpart, so it really looked like something in our solution caused one part of it to target the latest runtime and the other one to stick with the base. OK, but where does RuntimeFrameworkVersion get its value? Nowhere. In these particular log files its value seems to come from outer space and never gets explicitly assigned.

OK, another try. All these build properties come from the .props and .targets files which are part of the .NET SDK. What if I search for the property assignment among them?

That’s interesting. If the TargetLatestRuntimePatch property is set to true, then RuntimeFrameworkVersion will use LatestNetCorePatchVersion, which I believe is what’s happening in our case. Here, it’s even in the build logs:

OK, I think I see the picture here. The last question is: when does TargetLatestRuntimePatch become true?

Again, there’s nothing in the logs, but in the SDK itself I was able to find this:
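The logic can be sketched like this (a paraphrase of the SDK’s .props/.targets, not the verbatim files):

```xml
<!-- Paraphrased sketch of the .NET Core SDK property logic, not verbatim -->
<PropertyGroup>
  <!-- self-contained publishing opts in to the latest runtime patch -->
  <TargetLatestRuntimePatch Condition="'$(TargetLatestRuntimePatch)' == '' and '$(SelfContained)' == 'true'">true</TargetLatestRuntimePatch>
  <!-- which in turn picks the patched runtime version -->
  <RuntimeFrameworkVersion Condition="'$(RuntimeFrameworkVersion)' == '' and '$(TargetLatestRuntimePatch)' == 'true'">$(LatestNetCorePatchVersion)</RuntimeFrameworkVersion>
</PropertyGroup>
```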

Thinking part

It makes total sense now. When we compile using dotnet build, SelfContained is false and so is TargetLatestRuntimePatch, leaving RuntimeFrameworkVersion with its default version of 2.1.0. However, it all changes for dotnet publish with -r: SelfContained is true, TargetLatestRuntimePatch is also true, and therefore for .NET Core SDK 2.1.301 RuntimeFrameworkVersion becomes 2.1.1. For some reason at least one of our test projects still requires 2.1.0, thus causing the conflict. We didn’t have the issue with SDK 2.1.300, as that was the first one to come out, so the base and latest runtime versions were the same.

So what’s next? How do I fix it? Well, there are actually three choices: the true one and two temporary remedies.

1. Find the package causing the runtime version downgrade and fix it.

2. When SelfContained, explicitly set TargetLatestRuntimePatch in the problematic projects to true, thus eliminating the conflict.

3. Explicitly set TargetLatestRuntimePatch to false for the ‘main’ project, so we always use the base version.

Eventually we decided to go with a fourth solution: ignore the SelfContained flag altogether and always require the latest runtime patch. After all, why would we want to stick with an old one?
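In project-file terms that amounts to a single property (a hypothetical fragment; putting it in e.g. a shared Directory.Build.props would apply it solution-wide):

```xml
<PropertyGroup>
  <!-- always roll forward to the latest runtime patch, publish or not -->
  <TargetLatestRuntimePatch>true</TargetLatestRuntimePatch>
</PropertyGroup>
```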

Conclusion

Even though I’m not a fan of digging through dotnet and MSBuild internals, there’s some guilty pleasure in exercises like this. A long time ago I dealt a lot with XSLT, which, being XML, was also a perfectly valid functional language with functions, recursion, pattern matching, etc. And it’s really something to see a program written in functional XML. Bizarre, but something. CSPROJ files, along with the .props and .targets files from MSBuild, are also XML-based and also carry logic with them – assignments, conditionals, code imports and some form of functions with parameters. While a little archaic nowadays, it’s still kind of cute. Ah, good old medieval days…

Service mesh implemented via iptables
Thu, 14 Jun 2018

Imaginary distributed app with services plugged into the service mesh

So, last time I mentioned that another Kubernetes-compatible service mesh – Conduit – has chosen a different approach to the problem. Instead of enabling the mesh at machine level via e.g. the http_proxy env variable, it connects k8s pods or deployments to it one by one. I really like these kinds of ideas that make a 180° turn on solving a problem, so naturally I wanted to see how exactly they did that.

So, what’s Conduit again?

As I said, Conduit is a service mesh for Kubernetes. Instead of letting application services talk directly to each other, it forces them to send all traffic through the mesh. In return this allows us to collect all sorts of statistics and logs, adds more control over traffic routing and provides all the other goodies that a self-respecting mesh should provide.

Installation

It’s also fairly easy to install. Basically, downloading the binaries via the provided shell script and adding them to PATH is more than half of the deal:
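The whole procedure, as far as I can reconstruct it (script URL and install path are from the 0.4-era project docs; verify before running):

```shell
# install the CLI and put it on PATH
curl -sL https://run.conduit.io/install | sh
export PATH=$PATH:$HOME/.conduit/bin

# render the control-plane YAML and feed it straight to the cluster
conduit install | kubectl apply -f -
```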

Cool! So among the whole bunch of accounts, services and deployments, they also use Prometheus and Grafana. I think I know how they collect and display the metrics now.

Checking the dashboard

Sure, there will be no data yet, but it’s still interesting to see how it looks. conduit dashboard does the trick:

That’s nice. Everything is up and running, all green. The most interesting part is at the bottom. It says that the cluster has 4 k8s namespaces, but only one of them – conduit – is fully connected to the service mesh.

So they’re monitoring conduit with conduit? Nice. It also means that there will be some network activity to look at after all. Clicking on the conduit namespace and…

They do have metrics! Per deployment and per pod. OK, going further – clicking on the grafana deployment:

And we see Grafana, displaying traffic data about Grafana. Almost a recursion.

Wiring up more pods

Let’s head back to the main page, to the list of wired/unwired namespaces.

conduit is completely wired, default and kube-public are empty, but kube-system is both unwired and has 9 pods in it. I don’t think it would be a good idea to do this in production, but for minikube – what if we try to connect some system pods to the mesh?

Connecting to the mesh is done via the command line (I think they also have an API for that): conduit inject. It takes the YAML configuration for a Kubernetes object (e.g. a pod or deployment) as input and returns a new one, slightly modified, with its traffic configured to go through the mesh. Before I actually connect anything, let’s check what exactly gets injected into, let’s say, a deployment.
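A dry run of the injection is easy to sketch (the deployment name here is hypothetical):

```shell
# export an existing deployment, inject the proxy, and diff the two
kubectl -n kube-system get deploy kube-dns -o yaml > before.yaml
conduit inject before.yaml > after.yaml
diff before.yaml after.yaml

# when happy with the diff, actually apply it:
# conduit inject before.yaml | kubectl apply -f -
```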

Comparing before and after YAMLs

The kube-system namespace – the only one in a default minikube setup that has any pods in it – has two deployments:

Look at that beauty. Apparently, they’re adding two containers: one that seems to accept the traffic and resend it to controller pods (gcr.io/runconduit/proxy:v0.4.2, line 189), and the other – an Init Container, gcr.io/runconduit/proxy-init:v0.4.2, line 211 – that runs before anything else does and sets the whole thing up. I wonder what’s inside.

So it’s written in Go. The source files are also right there, and the great reveal is that, unlike Linkerd with its routing of traffic via the http_proxy env variable, Conduit does it via iptables. At least that’s what my understanding of Go and the word iptables tells me. As a side note, the code for the second Docker image – proxy – is written in Rust. The right language for the right job, cool.
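Stripped to its core, the idea is ordinary NAT redirection (an illustration of the technique, not Conduit’s exact rules; the port numbers are made up):

```shell
# inside the pod's network namespace, roughly what an init container would do:

# anything arriving at the pod is diverted to the proxy's inbound port first
iptables -t nat -A PREROUTING -p tcp -j REDIRECT --to-port 4143

# anything the application sends out goes through the proxy's outbound port
iptables -t nat -A OUTPUT -p tcp -j REDIRECT --to-port 4140

# (the proxy's own traffic must be exempted via uid/port match rules, omitted here)
```

Because the rules live in the pod’s own network namespace, the application needs no changes at all – it keeps dialing the original addresses and the kernel reroutes the packets.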

Attaching kube-system deployments to the mesh

That satisfies my curiosity for now, so it’s time to connect the only other namespace with pods to the mesh: kube-system.

Conclusion

So this is Conduit. As with Linkerd and service meshes in general, I can’t say that playing with it was an eye-opener for me. But I do like nice concepts and interesting implementations, and a service mesh with Conduit is definitely one. I really love how much one can learn just from looking at their source code: mixing languages works not only in theory but in real products; Rust is alive; Init Containers are cool; so are iptables. I wonder, however, has anyone done any benchmarking of how using a service mesh affects performance? That is naturally the first question anyone will ask me if I try to ‘sell’ the idea.

Playing with a service mesh
Thu, 31 May 2018

I was looking for something new to play with the other day and somehow ended up with the thing called a service mesh. A pretty interesting concept, I can tell you. Not game-changing or world-peace-bringing, but still a nice intellectual concept, with several scenarios where it can make life much simpler. Let’s have a look.

So, what is a service mesh?

An average micro-service application will use some sort of service discovery to find its services and a network to communicate with them. While the service discovery part is more or less actively maintained (e.g. via Consul), the network just magically works. The same application in Docker or Kubernetes becomes even more magical, as even service discovery and load balancing get handled for us. In essence, the crucial components of many a self-respecting distributed application aren’t actively controlled.

Imaginary distributed app where components talk directly to each other

But what if I want to reroute the traffic from service A to service B, which is like A, but newer? How do I monitor network latency? View request success rates? Find services abusing the network? What if I need to retry a failed request? How many times should I do that? All of these are real-life problems, which either end up hardcoded into every service or simply ignored.

In contrast, a service mesh extracts service-to-service communication into a separate component (or layer), so the previously implicit infrastructure becomes a manageable and measurable entity. Because of that, a service mesh can decide which particular service the traffic should go to, record success/error rates, perform circuit breaking, load balancing, request retries, eviction of failing or slow services from the load balancing pool – and all of that will be configured from one place and may even have its own UI.

What’s behind the magic

At first I thought the authors of the idea had somehow reinvented the network, but apparently service meshes like Linkerd or Conduit work very much like regular proxies. Instead of making a direct call, a service makes a request to a proxy, which decides where that call should go (by examining the HTTP “Host” header, for instance). And in order to start using a service mesh I don’t even need to recompile the app and replace its hardcoded URLs. The proxy URL can go into the http_proxy environment variable, which works almost everywhere, so the application can remain as is. It still might sound a little abstract, so let’s have a closer look at a few real service meshes – Linkerd and Conduit.

Running service mesh on bare OS

Let’s start with Linkerd, a service mesh that can run in Docker, k8s or on a bare OS. Looking at the bare-OS example will actually shed some light on why the whole idea works. As of today, Linkerd 1.4.1 needs Java 8, which I don’t want to install, so the easiest way for me to get started is to launch the openjdk Docker image and download Linkerd right there.
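Something like this (the download URL and file names inside the archive are assumed from Linkerd 1.x’s getting-started guide; check the releases page):

```shell
# throwaway JDK container, exposing Linkerd's default proxy port
docker run -it --rm -p 4140:4140 openjdk:8 bash

# inside the container: fetch, unpack and start Linkerd with its bundled config
curl -sLO https://github.com/linkerd/linkerd/releases/download/1.4.1/linkerd-1.4.1.tgz
tar xzf linkerd-1.4.1.tgz && cd linkerd-1.4.1
./linkerd-1.4.1-exec config/linkerd.yaml &
```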

OK, I won’t go into the details of Linkerd configuration, mainly because I don’t know them. However, I do know that Linkerd uses file-based service discovery, so if I wanted to register a service, e.g. search, I would simply create a search file in the disco folder with the service’s IP address and port number:

echo "172.217.1.14 80" > disco/search

172.217... is Google, btw. Now, in order to send a request to the search service, I’ll simply send the request directly to Linkerd, which by default listens on port 4140, and indicate in the HTTP Host header what service I’m looking for. Something like curl -H "Host: search" http://localhost:4140/ will do:

And voila! Google responded with 404, most likely because it doesn’t know what the search service is. But Linkerd did successfully proxy the request to another service, while the caller thought it was talking to the real thing. Moreover, because Linkerd understands HTTP(S) and its status codes, it most likely logged that request as unsuccessful because of the 404.

Imaginary distributed app where components communicate via proxy

Running Linkerd in Kubernetes

Checking out examples in Kubernetes usually looks more impressive, since containers let us bring in huge blocks of ready-to-use functionality with one command. I’ll use minikube for running a local k8s cluster. There was an older post describing how to get it up and running, so let’s assume that’s done and move straight to the interesting part.

minikube start
# Starting local Kubernetes v1.10.0 cluster...

Linkerd on k8s works as a DaemonSet, meaning it runs as exactly one pod per host. There’s a ready-to-use configuration file, which we can feed directly to kubectl and get it all installed:

Adding some demo services

However, with some client services these pages will make more sense.

Distributed apps have their own equivalent of “hello-world”, and one is even adapted for k8s. It’s a two-pod application: pod hello and pod world. Whenever a user sends a message to the hello service, the service makes a sub-call to world and then they both respond with “hello world”.

In order to make them talk through Linkerd we need to do one simple thing: set an http_proxy environment variable pointing to the proxy service, and the job is done. hello-world.yml, hosted among the other Linkerd examples, does exactly that: in addition to the two application pods it sets the http_proxy env variable, which simply points to the current host (via host name), where Linkerd listens for incoming connections:

But here’s the problem. For Kubernetes running on minikube, the host name will resolve to minikube, which makes zero sense from inside the cluster itself. It would make more sense to use the node IP address instead, and that’s actually fairly easy to do. We just need to replace hello-world.yml‘s spec.nodeName with status.hostIP, and the problem is solved. Those values come from the k8s Downward API, which is beyond the scope of this post.
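For reference, the relevant part of the env section might then look roughly like this (the NODE_NAME variable name is illustrative, and 4140 is Linkerd’s default incoming port; treat the exact shape as an approximation of what hello-world.yml does):

```yaml
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP   # was spec.nodeName; hostIP resolves from inside the cluster
  - name: http_proxy
    value: $(NODE_NAME):4140       # point every outgoing HTTP call at the node-local Linkerd
```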

Conduit’s approach to service meshes

There’s another service mesh that works with, and was designed specifically for, Kubernetes: Conduit. It’s still in its early versions, so I’d probably avoid using it in production, but the interesting part is that it takes a slightly different approach to deploying a service mesh. Instead of adding a service proxy to every host, Conduit adds a sidecar container to every pod. In a way, it’s not injecting a service mesh into a host, but rather attaching individual pods to a service mesh.

Imaginary distributed app with services plugged into the service mesh

Other than that, Conduit also has a web UI dashboard and even a CLI for checking current stats and viewing HTTP logs in real time. As this post is already turning out to be quite long, I won’t go into the details right now, but I probably will in two weeks, in the next post. After all, it’s interesting to see how exactly their service mesh works.

Conclusion

So that’s service meshes. My world didn’t shatter after I got to know them, as most of the mesh functionality was already available in one form or another: nginx for reverse proxying (+lua extension for dynamic routing), Consul for service discovery and health checks, etc. But I do like the idea that something as invisible and implicit as the network can and should be made more manageable, turned into a component. What’s more, I can actually see a real-life task at my job where a service mesh could help. We need a dynamic proxy, and nginx+lua, HAProxy or something built in-house were the candidates so far. A service mesh brings another one.

Debugging .NET Core app from a command line on Linux
https://codeblog.dotsandbrackets.com/command-line-debugging-core-linux/
Thu, 17 May 2018 01:27:46 +0000

Million years ago, way before the ice age, I was preparing small C++ project for “Unix Programming” university course and at some point had to debug it via command line. That was mind blowing. And surprisingly productive. Apparently, when nothing stands in the way, especially UI, debugging can become incredibly focused.

Since .NET Framework got its cross-platform twin brother .NET Core, I was looking forward to repeating the trick and debugging a .NET Core app on Ubuntu via the command line. A few days ago it finally happened, and even though it wasn’t a smooth ride, it was quite an interesting experience. So, let’s have a look.

Setup

We’ll need Ubuntu, the .NET Core SDK, the lldb debugger and a sample app. Late April 2018 was a month of updates, so now we have a shiny new Ubuntu 18.04 and .NET Core SDK 2.1 RC1, which finally got its libsosplugin.so compiled against lldb-3.9, so the v3.6 we had to use with previous .NETs can finally rest in peace. As for the demo project, any .NET Core hello-world wannabe with local variables and call stacks will do.

The tools

I’ll install all of that in a VM and here’s Vagrantfile with its provision.sh file to do so:

The project

It’s very simple. Let’s have a function that returns the number of ticks passed since the last measurement. It’s absolutely useless except for the fact that it has local variables and some arguments to examine later.
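The original listing isn’t reproduced here, so this is a hedged sketch of what such a program could look like; the names match the ones used by the debugging commands later (Program.GetTicksElapsed in a console.dll assembly), but the body is a guess:

```csharp
using System;
using System.Threading;

class Program
{
    static long _lastTicks = DateTime.UtcNow.Ticks;

    // Returns the number of ticks passed since the last measurement.
    // The argument and locals exist purely to give the debugger something to show.
    static long GetTicksElapsed(string label)
    {
        long now = DateTime.UtcNow.Ticks;
        long elapsed = now - _lastTicks;
        _lastTicks = now;
        Console.WriteLine($"{label}: {elapsed}");
        return elapsed;
    }

    static void Main()
    {
        while (true)
        {
            GetTicksElapsed("ticks");
            Thread.Sleep(1000);
        }
    }
}
```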

Setting up a breakpoint

For that we can use the bpmd command introduced by the SOS plugin. The only parameter it needs is a method name or its method descriptor address, so in order to set a breakpoint in a method called GetTicksElapsed, which is a member of the Program class inside the console.dll assembly, I’d use SOS’s usual module-plus-method form: bpmd console.dll Program.GetTicksElapsed

As a side note, sometimes it might actually be easier to add a breakpoint by method descriptor address instead. Those are all over the place: in call stacks, in instruction pointers, in class method tables. For instance, this is the call stack of the currently selected thread:

The output is pretty straightforward. Even line numbers and file names are there. But clrstack can do more than that. For instance, it also has a -p parameter, which will include function arguments in the output, and that’s something really, really useful:

Examining local variables

Bad news here. clrstack -i is the command to see local variables and their values. However, with lldb-3.9 and the libsosplugin.so that comes with .NET Core 2.1 RC1, this command immediately causes a segmentation fault and crashes the lldb process. I haven’t checked earlier versions, but here it happens 100% of the time.

Stepping in/over/out

Unlike with WinDBG, it doesn’t look like there are SOS commands for stepping in, out or over the next statement or function. However, there are still native commands, which step over assembly instructions rather than C# statements, but that’s better than nothing. Especially when we can call clrstack and check where in the managed realm we currently are:

Stepping out of the current procedure could’ve been easy, if it didn’t jump two levels up instead of one every other time I used it. Maybe it has something to do with the fact that the real call stack is actually a mixture of managed and unmanaged entries, whereas the clrstack command shows only the managed ones (the -f argument would show all of them). On top of that, I couldn’t find an alias for the step-out command and had to use its full form instead: thread step-out.

Conclusion

So this is what debugging a .NET Core app from a command line on Linux feels like. It’s surprisingly hard to find any documentation for it, so I probably missed a command or two. The faulting clrstack -i also doesn’t make debugging easier. But it’s still really cool to be able to add a breakpoint, see how execution goes and examine call stack parameters, all from a command line, on any machine and for any project. It’s also surprising to see how thin the layer between managed code and assembly language is. If I was able to convert a callq instruction argument to a managed method descriptor, maybe I’ll be able to convert CPU register values to managed objects as well. Who knows.

Sending proactive messages with Microsoft Bot Framework
https://codeblog.dotsandbrackets.com/proactive-messages-bot-framework/
Thu, 03 May 2018 02:17:54 +0000

I was thinking again about that bot, which supposedly will monitor unreliable tests for me, and suddenly realized one thing. All the examples I dealt with were dialog-based. You know, the user sends the first message, the bot responds, etc. But the bot I’m thinking about is different. The initial conversation indeed starts like a dialog. But once the bot starts monitoring unit test statistics and finds something that I should take a look at, it needs to talk first! Microsoft calls this scenario sending proactive messages, and there are a few tricks to making that possible.

Saving conversation context

So the idea they suggest is the following. Whenever a conversation with the user starts, we can save the session‘s message.address property and then use it to send proactive messages whenever we need to.

However, there are a few things to keep in mind. Firstly, when the bot sends such a proactive message, it might already be in the middle of an ongoing conversation, so interrupting it with an unrelated message might feel weird. Then, and this is quite obvious, as the delayed message doesn’t have access to the session and its send method, we have to use this weirdly-looking bot API.
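In botbuilder 3.x terms, that pattern might look roughly like this; storing the address in a plain variable is for illustration only, and a real bot would persist it somewhere:

```javascript
var builder = require('botbuilder');
var connector = new builder.ConsoleConnector().listen();

var savedAddress;

var bot = new builder.UniversalBot(connector, function (session) {
  // Save the conversation address for later proactive messages
  savedAddress = session.message.address;
  session.send('Noted. I will ping you when something happens.');
});

// Later, when the monitor finds a misbehaving test, the bot talks first,
// using that "weirdly looking" bot-level API instead of session.send:
function notify(text) {
  var msg = new builder.Message().address(savedAddress).text(text);
  bot.send(msg);
}
```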

Finally, how do I get access to the state? userData and conversationData are properties of the session object, and I don’t have one. Apparently, there’s a slightly different way to send proactive messages, which takes all of that into account.

Getting the session back

The Node.js part of the Bot Framework documentation is not that complete, but the source code is, so I noticed that the bot object has a nice method called loadSession, which takes a saved conversation address as input and returns the session back through a callback.

As soon as we’ve got the session object back, not only do we get a familiar way of sending messages, but also full access to user data and a nice function called curDialog, which returns the ongoing dialog’s name, if any. That comes in handy when we want to wait until the ongoing conversation finishes before starting to bombard the user with new messages.
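A sketch of that, reusing the savedAddress from the previous section; the test name and message text are made up, and the curDialog check assumes it’s callable on the restored session, as described above:

```javascript
bot.loadSession(savedAddress, function (err, session) {
  if (err) {
    return console.error('Could not restore the session:', err);
  }
  // The familiar API is back: user data and session.send
  if (!session.curDialog()) {
    // No dialog in progress, so it's safe to interrupt
    session.send('Test SomeFixture.SomeTest failed %d%% of the time',
      session.userData.threshold);
  }
});
```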

Conclusion

As you can see, sending proactive messages is quite easy. It all revolves around getting access to the conversation context, and then there are multiple ways to send messages with it: via the bot object or by restoring the session, sending individual messages or initiating a whole dialog. That’s convenient. I wonder, however, how this approach works when the user explicitly ends the conversation. I’d bet on the saved conversation context becoming invalid, but who knows. Maybe it depends on the channel.

Playing with Microsoft Bot Framework
https://codeblog.dotsandbrackets.com/playing-microsoft-bot-framework/
Thu, 19 Apr 2018 03:17:34 +0000

Part of my job description is our CI/CD and it kind of implies that I’m interested in keeping the build green. It doesn’t mean that I immediately jump in whenever some unit test fails, but I’m definitely keeping an eye on unreliable ones.

Whenever the master branch stays red long enough, this is what starts to happen to each failed test in it:

1. If the test behaves like a random results generator, create a case for it.
2. Skip the test in the master branch and put the case number as the reason.
3. Find out who created the test (git blame) and assign the case back to the author.

Pretty simple. And boring. I could automate that, but it’s not always clear who the author of the test is. After all, people resign, update each other’s tests, refactor and destroy git history on special occasions. I was thinking about doing something with machine learning to solve that, but it feels like overkill. Creating a bot, on the other hand, which would ask me to double-check when it’s uncertain, sounds more interesting and actually doable. Even if I’m never going to finish it.

However, I’ve never written any bots before, so for starters I’d like to check what it actually feels like.

Choosing a bot framework

Easy googling reveals that Microsoft Bot Framework seems to be the default choice for bot writing. There’s also BotKit, but having Microsoft behind the tool might mean there will be more examples for it. Plus, Bot Framework supports both C# and JavaScript, the two languages I’m most comfortable with, so the choice was easy.

On the surface, developing with Bot Framework seems simple. For instance, in its object model there’s a Bot object, which has Conversations with Users through different Channels. A single Conversation consists of one or more Dialogs, and Channels are basically the tools a User uses to interact with the bot: chats, Facebook, Skype, Slack, etc. Quite logical. There are also some other object types and concepts behind it, but you get the idea.

However, even though the object model does look logical, writing with it is not quite intuitive. A few times I took an existing bot example, made a tweak or two, and then dived into the documentation trying to understand why exactly it didn’t work. And I can tell you, the level of detail I had to get into was incomparable to the triviality of the change I’d made. But maybe it’s just me.

Writing “Hello World” bot

The “Hello World” equivalent in the bot world is a bot echoing every message back to its sender. For this one I used Node.js, but C# with .NET Framework or .NET Core is also a valid option.

This is how it looks. First, initialize the project and install botbuilder package:
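After npm init and npm install --save botbuilder, a minimal console echo bot might look like this (a hedged sketch against the botbuilder 3.x API):

```javascript
var builder = require('botbuilder');
var connector = new builder.ConsoleConnector().listen();

var bot = new builder.UniversalBot(
  connector,
  function (session) {
    // Echo every message back, prefixed with "Hello world and"
    session.send('Hello world and %s', session.message.text);
  }
);
```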

Because the bot is going to talk to the user via the console, we create a ConsoleConnector for that. Then we create the bot itself, providing it with the connector to a channel and a default handler for any conversation the user initiates. In this one, our bot simply echoes user messages back, prepending them with the words Hello world and.

$ node bot.js
> Hello-bot has started
I want some cash
> Hello world and I want some cash

Kinds of bot conversations

Even though the word “Conversation” never came up in the previous example, that one-line arrow function was actually the conversation. In fact, MS Bot Framework has two types of them: waterfall and dialog-based.

Waterfall

A waterfall conversation is simply a linear sequence of functions which guide the user through a set of questions and answers. For instance, suppose we want to write a pacifist bot, a guardian of a pacifist chat, who asks a set (of one) of questions to anyone who wants to join. Without changing the previous hello-world example that much, here’s how we could do that.

The whole conversation can be put into two steps: a question and a reaction to its answer. The only question the bot is going to ask is “Are you against the war?”. Because that’s a question with a limited number of answers, I used the builder.Prompts.choice function to make it as hard as possible to answer anything other than yes or no.
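Keeping the same console setup, the two-step waterfall might be sketched like this (botbuilder 3.x API; the wording is illustrative):

```javascript
var builder = require('botbuilder');
var connector = new builder.ConsoleConnector().listen();

// Waterfall: step 1 asks the question, step 2 reacts to the answer.
var bot = new builder.UniversalBot(connector, [
  function (session) {
    builder.Prompts.choice(session, 'Are you against the war?', ['yes', 'no']);
  },
  function (session, results) {
    if (results.response.entity === 'yes') {
      session.send('Welcome to the pacifist chat!');
    } else {
      session.send("Sorry, this chat probably isn't for you.");
    }
  }
]);
```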

What’s cool is that, depending on the Channel and Connector type, Prompts.choice will try to use the UI and tools that the given Channel provides. For instance, a ChatConnector with the Bot Emulator attached could’ve rendered that as two buttons, eliminating the need to type anything at all.

Dialog-based conversations

Creating a conversation based on Dialog objects is more powerful, as with this approach the bot starts behaving like a state machine. As it moves between its dialogs in any direction, to the end user it might feel like a non-linear conversation.

A Dialog itself is an object with a miniature waterfall conversation in it. Dialogs can start each other, or stay idle, waiting for particular keywords in the ongoing conversation and kicking in when they notice something relevant.

Let’s get back to the original reason why I started to think about writing a bot. If my bot is going to help me with monitoring unreliable tests, these are the kinds of dialogs I might need to have with it:

1. See the list of commands the bot understands.
2. Tell it to start monitoring unreliable tests.
3. Configure a threshold: how often a test is allowed to fail before it’s considered unreliable.
4. Tell it to stop monitoring tests.
5. Check current status.

What’s interesting is that in some scenarios dialog #2 will automatically lead to dialog #3. After all, it’s hard to start monitoring for unreliable tests without being configured first. Another interesting fact is that this bot implies some sort of state: firstly, a threshold (dialog #3), which can be associated with the current user, and secondly, whether or not monitoring is running, which we can associate with the conversation itself.

This is how we can do that.

Default conversation handler

Let’s add a default conversation handler to kick in when I type messages that the bot does not understand. Some sort of help message in response would be enough.
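A hedged sketch of what that could look like with botbuilder 3.x, together with a ‘status’ dialog reading state from userData and conversationData; the wording and trigger regex are guesses based on the transcript further down:

```javascript
var builder = require('botbuilder');
var connector = new builder.ConsoleConnector().listen();

// Default handler: anything the bot doesn't understand gets a help message.
var bot = new builder.UniversalBot(connector, function (session) {
  session.send("Hello! I'm Judge-bot. Please say 'status' to get current status, " +
    "'watch' to start monitoring for unreliable tests, or 'stop' to stop it.");
});

// 'Status' dialog: per-user threshold lives in userData, the running flag
// in conversationData; triggerAction specifies when the dialog should start.
bot.dialog('/status', function (session) {
  var configured = session.userData.threshold !== undefined;
  var started = session.conversationData.isStarted === true;
  session.endDialog("This is what's happening at the moment:\n" +
    'Unit test watcher is ' + (configured ? '' : 'not ') + 'configured\n' +
    'Unit test watcher is ' + (started ? '' : 'not ') + 'started');
}).triggerAction({ matches: /status/i });
```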

As you can see, this dialog looks like a miniature version of a waterfall conversation, where the only interesting parts are reading the values from userData and conversationData and specifying when the dialog should start.

Checking:

$ node bot.js
# > Judge-bot has started
# Status
# > This is what's happening at the moment:
# Unit test watcher is not configured
# Unit test watcher is not started

‘Stop’ dialog

This one is pretty straightforward. If conversationData‘s flag is not set, do nothing. Otherwise, turn it off. A real implementation would also stop some background processes.
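Sketched against the same botbuilder 3.x API (trigger regex and wording are guesses matching the transcript below):

```javascript
bot.dialog('/stop', function (session) {
  if (!session.conversationData.isStarted) {
    session.endDialog("Hmm. I haven't really been doing anything to begin with. " +
      'Anyway, problem solved.');
  } else {
    // A real implementation would also stop the background monitoring here
    session.conversationData.isStarted = false;
    session.endDialog('Stopped monitoring unreliable tests.');
  }
}).triggerAction({ matches: /stop/i });
```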

‘Configure’ dialog

This one is a little bit more interesting. I want to ask the user to enter a number between 1 and 100, representing the percentage of false positives after which a test is considered unreliable. I don’t want to do the reading/parsing myself, so let’s allow builder.Prompts.number to handle that.
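A sketch of the configure dialog, assuming a botbuilder version whose Prompts.number accepts minValue/maxValue options (those produce the built-in out-of-range message seen in the transcript below):

```javascript
bot.dialog('/configure', [
  function (session) {
    builder.Prompts.number(session,
      'How often should a test fail before I treat it as unreliable? (1..100%)',
      { minValue: 1, maxValue: 100 });  // Prompts.number does the range checking
  },
  function (session, results) {
    session.userData.threshold = results.response;
    session.endDialog('Got it. Setting the threshold to %d%%', results.response);
  }
]);
```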

Here we instruct the bot to end the conversation if the isStarted flag is already set, or to begin a new dialog if configuration is needed. Then, when the child dialog finishes, the WatchDialog regains control and continues to the second step.
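The WatchDialog just described might be sketched like this (again botbuilder 3.x; names, wording and the trigger regex are guesses):

```javascript
bot.dialog('/watch', [
  function (session, args, next) {
    if (session.conversationData.isStarted) {
      return session.endDialog("I'm already watching the tests.");
    }
    if (session.userData.threshold === undefined) {
      session.send("I'm sorry, it looks like this is the first time you asked " +
        'me to do that. Let me ask you a few questions first.');
      session.beginDialog('/configure');  // step 2 resumes when this finishes
    } else {
      next();  // already configured, skip straight to the second step
    }
  },
  function (session) {
    session.conversationData.isStarted = true;
    session.endDialog('Starting to monitor unreliable tests');
  }
]).triggerAction({ matches: /watch/i });
```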

Having a conversation

$ node bot.js
#> Judge-bot has started
#Hey
#> Hello! I'm Judge-bot. Please say 'status' to get current status, 'watch' to start monitoring for unreliable tests, or 'stop' to stop it.
#Status
#> This is what's happening at the moment:
# Unit test watcher is not configured
# Unit test watcher is not started
#
#Stop
#> Hmm. I haven't really been doing anything to begin with. Anyway, problem solved.
#Watch
#> I'm sorry, it looks like this is the first time you asked me to do that. Let me ask you a few questions first.
#
#How often a test should fail before I treat it as unreliable? (1..100%)
#-100
#The number you entered was outside the allowed range of 1 to 100. Please enter a valid number.
#50
#> Got it. Setting the threshold to 50%
#
#> Starting to monitor unreliable tests
#Status
#> This is what's happening at the moment:
# Unit test watcher is configured
# Unit test watcher is started

Pretty cool.

As a bonus, here’s one more trick. Without a single change in the conversation flow, I can replace the ConsoleConnector with a ChatConnector, add a small web server to handle HTTP calls, and use the Bot Framework Emulator to have a nice chat with my bot.

Conclusion

Apparently, writing bots is fun. The biggest thing I haven’t touched yet is connecting a bot to Microsoft Cognitive Services, which would allow me to do some pretty cool stuff.

For instance, you noticed that the dialog triggers are pretty rigid, right? If I type Start instead of Watch, the corresponding dialog won’t kick in, even though it’s obvious in the given context what the intent was. As a solution, I could’ve connected a user-intent recognition module to LUIS, the MS service for understanding language. I’ve seen a few examples of how it’s used, and it’s really impressive. For instance, LUIS was able to guess that ‘Hello’, ‘Howdy’, ‘Hey’ and ‘Wazzup!’ probably have the same intent. Or that ’10s’ and ‘ten seconds’ are equivalent. Or that ‘yesterday’ is a date. And so forth. Wow.

Anyway, I’m still deciding whether or not I should write that bot for unreliable tests, but at this point the bot itself is not a challenge anymore. The only difficult part would be writing connectors for the bug tracker, git client and unit test results storage, so that in addition to talking the bot could actually do something.

Caveman’s brief look into modern front-end
https://codeblog.dotsandbrackets.com/cavemans-brief-look-modern-front-end/
Thu, 05 Apr 2018 02:31:00 +0000

Well, it might seem surprising, given what this blog is usually about, but during most of my career my main focus was… front-end development. Yup, JavaScript and friends. It wasn’t the only thing I did, but definitely the biggest one. After moving to Canada the focus shifted a little: I still do occasional front-end tasks for our web project, which started back in 2009, but basically for the last two years I’ve been on the server side.

If your project started nine years ago and it’s about the web, most of your tools are probably obsolete. It began before Angular (the first one!), before React, Vue or Ember. We did have jQuery, but got rid of it several years later. We built our own toolkit and avoided rewriting the app for every new-and-finally-done-right library that came along. Not everyone was happy, but the product works and sells, so the approach worked.

However, in order to understand where the industry was headed, I still tried to take at least a tutorial or two for every major tool that came along. After a two-year pause doing other things, I’m refreshing my front-end knowledge again, and this time it feels… different.

Checking out Webpack

The last time I was actively front-end’ing, Webpack was already there. It wasn’t really clear, though, if it was going to be a thing. Grunt, Gulp, TypeScript and SystemJS could also bring multiple JavaScript modules together, so why another tool? It looks like Webpack has won, and after taking a Pluralsight course on it I’m… confused.

What new value exactly does it provide? Or old value, but in a more efficient manner? That was supposed to be the thing, but seeing how more and more unrelated settings go into webpack.config.js, all I see is duct tape and glue trying to attach a cat to a potato. Why? Well, they happened to be nearby, so they must be related.

I admit, my rant might be caused by an aging brain forced to learn a thing it thought it already knew. But it feels OK learning everything else! And indeed, why should it enjoy solving the same task, bundling modules, again? Wasn’t it solved a few times already?

The amusing thing was that the Webpack configuration examples I saw were written for version 3. Six months later, the current Webpack version is 4.4, which… is not entirely compatible with v3.

Seriously? So it takes five years and four versions to find out how the final configuration format should look? That’s depressing.

Checking out Vue.js

On the bright side, I’m also taking a few courses on Vue.js, and it’s a surprisingly enjoyable tool to use. Vue has also been around for some time, and it also wasn’t clear if it was going to be a thing. It did become one, and in this case I feel that the industry actually made a noticeable step forward.

Somehow it’s harder to enumerate what I like in Vue, but generally I appreciate that it stays simple for simple scenarios and adds new levels of abstraction only as they’re needed. It doesn’t feel like it’s trying to do some sort of black magic under the hood; it’s pretty easy to predict how the code will behave. That’s not always the case with other frameworks.

And I really like how they managed to make the declarative part of the framework feel that natural. The declarative approach itself is not something new, but here my mind almost bent, because suddenly I could see another set of ideas and building blocks to compose my solutions from. It’s like discovering functional programming all over again.

Conclusion

So far this whole front-end refreshing endeavour feels a little confusing. On the one hand there’s Vue, and I’m glad somebody made it. On the other hand there’s Webpack, and it’s a reminder that some areas of our industry are going in circles, not in an ascending spiral.

What I don’t understand is why it only feels this weird now. After all, front-end was always like this. Maybe I got spoiled by two years of solid back-end, containers and occasional DevOps, where the tools, even the ones that are not quite production-ready, kind of make sense both individually and when assembled together into an app. I never completely got that feeling with front-end.