In OpenShift Aggregated Logging (https://github.com/openshift/origin-aggregated-logging) the Fluentd pipeline tries very hard to ensure that the data is correct, because it depends on having clean data in the output section in order to construct the index names for Elasticsearch. If the fields and values are not correct, the index name construction will fail with an unhelpful error like this:

There is no context about what field might be missing, what tag is matching, or even which plugin it is (the operations output or the applications output), although you do get the plugin_id, which could be used to look up the actual plugin information if the Fluentd monitoring is enabled.

One solution is to just edit the logging-fluentd ConfigMap and add a stdout filter in the right place.
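A minimal sketch of such a filter (exactly where it belongs in the ConfigMap's fluent.conf, and whether matching all tags with ** is appropriate, depends on your configuration):

  <filter **>
    @type stdout
  </filter>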

This dumps the time, tag, and record just before the outputs. The problem with this is that it will cause a feedback loop, since Fluentd is reading from its own pod log. The solution is to also throw away Fluentd's own pod logs.
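A sketch of such a discard rule, placed before the stdout filter (the tag pattern for the Fluentd pod's own container logs is an assumption and will depend on how container logs are tagged in your deployment):

  <match kubernetes.var.log.containers.logging-fluentd-**>
    @type null
  </match>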

Now, if you see a record that is missing @timestamp, or a record from a pod that is missing kubernetes.namespace_name or kubernetes.namespace_id, you know that the exception is caused by one of these missing fields.

Look for records that have "status": "red" and an "unassigned_shards" with a value of 1 or higher. IF YOU DON’T NEED THE DATA ANYMORE, AND ARE SURE THAT THIS DATA CAN BE LOST, then it might be easiest to just delete these using the REST API.
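For example, a deletion might look roughly like this, run against the Elasticsearch pod (the pod name, index name, and admin cert/key paths are assumptions; adjust them to your deployment, and note it assumes curl is available in the Elasticsearch container):

  oc exec -c elasticsearch $ES_POD -- \
    curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
         --key /etc/elasticsearch/secret/admin-key \
         -XDELETE "https://localhost:9200/$INDEX_NAME"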

Unfortunately, the documentation is pretty scant, and some of the useful, interesting endpoints and options are not documented. I've captured some of that missing information below, and shown how it can be used to monitor the Elasticsearch output plugin.

Endpoints

/api/plugins

Provides information about each plugin in a text-based columnar format.
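Assuming the Fluentd monitor_agent input has been enabled on its default port (24220), you can query it from inside the Fluentd pod like this:

  curl -s http://localhost:24220/api/plugins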

@id

Search for a plugin by @id. For example, in the above output there is "plugin_id": "object:1dce4b0". Once you have identified the id, you can use it to display only the information for that particular plugin.
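For example (the id value here is the one from the error output above; the @id query parameter is passed straight through to monitor_agent):

  curl -s 'http://localhost:24220/api/plugins?@id=object:1dce4b0'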

tag

Match the tag and get the info from the matched output plugin. This only works on output plugins. I unfortunately don't have a real example, but I suppose you could use something like this to find the output plugins whose match block matches **_sendtoforwarder_**.
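Something like this, perhaps (the tag value is made up purely to match a **_sendtoforwarder_** pattern; monitor_agent runs the given tag through the output plugins' match rules and reports the plugin that would handle it):

  curl -s 'http://localhost:24220/api/plugins?tag=some.tag_sendtoforwarder_test'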

This tells me that the plugin is working, the queues are being flushed regularly, and the emit count (roughly, the number of times Fluentd flushes the queued output, i.e. the number of times a request is made to Elasticsearch) is steadily increasing.
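To watch that over time you can simply poll the monitor_agent output for the Elasticsearch output plugin, for example (field names such as emit_count, retry_count, and buffer_queue_length are what monitor_agent reports, but they vary somewhat between Fluentd versions):

  while true; do
    curl -s 'http://localhost:24220/api/plugins?@id=object:1dce4b0'
    sleep 60
  done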


WARNING - THE FOLLOWING MAY BE INSECURE!

This has been fixed in openshift-elasticsearch-plugin 2.4.1.2, which is used with the 1.4.1/3.4.1 versions of OpenShift. If you are using an earlier version, the warning below applies.

There is no actual authentication check performed on each request. Do not use the below if you need actual security and authentication - for throwaway dev environments only. If you need a secure method, you must instead use a passthrough route, and use mutual (i.e. client cert) authentication.

Setup

The Elasticsearch deployed with OpenShift aggregated logging is not accessible externally, outside the logging cluster, by default. The intention is that Kibana will be used to access the data, and the various ways to deploy/install OpenShift with logging allow you to specify the externally visible hostname that Kibana (including the separate operations cluster) will use. However, there are many tools that want to access the data from Elasticsearch. This post describes how to enable a route for external access to Elasticsearch.

You will first need an FQDN for the Elasticsearch (and a separate FQDN for the Elasticsearch ops instance if using the separate operations cluster). I am testing with an all-in-one (OpenShift master + node + logging components) install on an OpenStack machine, which has a private IP and hostname, and a public (floating) IP and hostname. In a real deployment, the public IP addresses and hostnames for the elasticsearch services will need to be added to DNS.

The route is a reencrypt route. The --dest-ca-cert argument value below is the CA cert for the CA that issued the Elasticsearch server cert, used to re-encrypt the connection from the router to Elasticsearch. In this case, it is the same as the admin-ca cert, so we can just use that (using the method to extract it from the previous posting). By default, the route will use the server cert created by the OpenShift master CA. If you want to have a real server cert with the actual external Elasticsearch hostname, you will need to create one. An example of how to do this with OpenShift is described below (marked #optional). The route allows us to use username/password/token authentication to Elasticsearch - the auth is proxied through the router to SearchGuard/Elasticsearch.
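A sketch of creating such a route (the hostname, project name, and secret key name are assumptions, and the optional step of creating a real server cert is omitted here):

  # get the CA cert that signed the Elasticsearch server cert
  oc extract secret/logging-elasticsearch --keys=admin-ca --to=. -n logging

  # create a reencrypt route for the logging-es service
  oc create route reencrypt logging-es \
      --service=logging-es --port=9200 \
      --hostname=es.example.test \
      --dest-ca-cert=admin-ca \
      -n logging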

I'm trying to convert gems to rpms. Unfortunately, gem2rpm -d does not separate/classify the dependencies. What I really need is a separate list of run-time dependencies. I can get this with gem spec --ruby. For example:
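(The gem name below is just an illustration; any downloaded .gem file works. The runtime dependencies show up as add_runtime_dependency lines in the generated spec.)

  gem fetch fluent-plugin-elasticsearch
  gem spec --ruby fluent-plugin-elasticsearch-*.gem | grep add_runtime_dependency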


Modern environments become more and more complex every year. When many applications and services collaborate to perform a single task, finding the cause of a problem is like looking for a needle in a haystack. Good tools are needed to help. There are some that do a very good job of collecting logs, alerts, or notifications, but they focus on a specific problem and not on the problem space as a whole. Collecting just logs, alerts, or statistical data is not enough. There needs to be a way to combine the data and let it speak, so that data from many different applications can be correlated end-to-end, and from high to low levels. ViaQ is a new project that aims at creating a framework for connecting data aggregation, processing, and analytic technologies that already exist into a coherent and flexible solution adaptable to multiple use cases.

There are some efforts that we want to leverage:

OpenShift has begun shipping an EFK stack as containers - we want to leverage this work to provide our solution as containers, but perhaps not dependent on OpenShift

There has been a lot of investigation into collecting event data such as logs using a message bus and feeding that data into analysis tools such as Apache Storm and Apache Spark - we would like to use a message-bus-based approach so that we can not only feed data to an EFK stack but also, at the same time, feed data to analytics tools, data warehouses, or any other application requiring a live stream of data

There has been a lot of work done to describe a common data format so that logs from OpenStack (all of the various components and log formats if different from oslo logging), Ceph/Gluster, and syslog can be correlated together (e.g. timestamps, hostnames, node identifiers, etc.)

Use the new CentOS infrastructure to build upstream images based on CentOS, use the CentOS CI, and eventually use the CentOS container image build and repository systems

Demonstration

This is a demonstration of how to use RHEL Identity Management to automatically join VMs created with OSP7 (OpenStack) Nova to the IdM domain, to automatically assign new VMs to hostgroups, and to automatically create DNS records when a floating IP address is assigned to a VM.

NOTE: The demo shows the ipaotp in the server instance metadata. The latest code at https://github.com/richm/rdo-vm-factory/blob/master/rdo-ipa-nova uses the inject_files method to inject a file into the new VM containing the OTP, which means the OTP is not available to be queried, and the VM can erase it as soon as possible.

How it works

OpenStack Nova provides hooks (http://docs.openstack.org/developer/nova/hooks.html) which allow developers to create custom code using the internal Nova APIs to perform actions based on Nova actions. The demonstration makes use of the build_instance and the instance_network_info hooks. Here is the source of the hook implementation: https://github.com/richm/rdo-vm-factory/blob/master/rdo-ipa-nova/novahooks.py.

The build_instance.pre hook calls Identity Management with the host-add command. This will essentially "reserve" a slot for the new host, but the new host will not be fully joined (i.e. able to use Kerberos, SSH, SSSD, etc.) until the ipa-client-install completes. The build_instance.pre hook then creates the parameters that it needs to specify as arguments for the host-add command. It generates a One Time Password (OTP), and stores the OTP as a file named "/tmp/ipaotp" in the list of injected files in the new VM. This allows the VM to specify the OTP as the -w argument of ipa-client-install, then delete the OTP after it has been used. The OTP is used as the userpassword parameter for the host-add call.

The ipaclass metadata item was set by using the --property argument with openstack server create. The value of that item is set to be the value of the userclass parameter for the host-add call, which in the demo is used to automatically assign the new VM to a hostgroup. The fully qualified hostname is constructed by using the VM name as the leftmost component of the FQDN, and the domain used is the Nova dhcp_domain setting if available, or an IPA-specific domain configuration parameter. The force parameter is set to True because we want host-add to add the host even though we don't have a "real" public IP address yet, only the private IP address assigned by OpenStack networking. The other parameters are provided to show what options are available when calling host-add.
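The host-add call made by the hook is roughly equivalent to the following CLI invocation (the hostname, class value, and OTP are made-up examples, and the hook actually uses the JSON-RPC API rather than the ipa CLI):

  ipa host-add vm1.example.test \
      --password="$OTP" \
      --class=webserver \
      --force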

The VM image provided in the demo uses cloud-init, and Nova has been set up to provide certain data for the VM to use with cloud-init to call ipa-client-install with the OTP. The demo sets up the Nova vendordata_jsonfile_path with a JSON file containing the list of Identity Management client packages to install in the VM, and a runcmd to run a shell script that will run ipa-client-install. The build_instance.pre hook has been configured to add that shell script in the list of injected files in the new VM. The shell script extracts the OTP from /tmp/ipaotp, erases the file, then runs ipa-client-install -w $ipaotp -U. Once this command completes successfully, the VM is fully joined to Identity Management, and users can SSH into the new machine.
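The injected script boils down to something like this (the script's own file name is whatever the hook configuration injects; the OTP path and the ipa-client-install invocation are as described above):

  #!/bin/sh
  otp=$(cat /tmp/ipaotp)
  rm -f /tmp/ipaotp
  ipa-client-install -w "$otp" -U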

The instance_network_info.post hook is called after Nova handles networking-related events. If the hook detects that there is a floating IP assignment, it calls dnsrecord-add to add the record for the floating IP address to the host in Identity Management.
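The dnsrecord-add call is roughly equivalent to this CLI invocation (the zone, host name, and address are made-up examples):

  ipa dnsrecord-add example.test vm1 --a-rec=10.8.0.15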

Configuration

The hook uses a file called /etc/nova/ipaclient.conf to store its configuration. It requires the following configuration parameters:

service_name - The name of the Kerberos principal of the Identity Management HTTP JSON API service

url - The URL of the Identity Management HTTP JSON API service

cacert - The name of the file containing the certificate of the CA of the Identity Management HTTP JSON API service

keytab - The hook requires a user account in Identity Management that has the ability to add hosts and create DNS records. The hook must be provided with a keytab file for this user.

connect_retries - How many times the hook will retry an API call

json_rpc_version - The version of the Identity Management HTTP JSON API that the hook is using

inject_files - Files to inject into the VM. The format is "/localpath/to/file[ /path/to/file/invm]". If /path/to/file/invm is not given, then the path in the VM is assumed to be the same as the path in the local machine.
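Putting those together, an /etc/nova/ipaclient.conf might look roughly like this (all values are made-up examples, and the exact option syntax may differ from the actual hook):

  service_name = HTTP/ipa.example.test@EXAMPLE.TEST
  url = https://ipa.example.test/ipa/json
  cacert = /etc/ipa/ca.crt
  keytab = /etc/nova/ipauser.keytab
  connect_retries = 3
  json_rpc_version = 2.156
  inject_files = /etc/nova/setup-ipa-client.sh /tmp/setup-ipa-client.sh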