Send Flink's logs to Elasticsearch using Log4j

November 29, 2017

Flink uses slf4j as its logging façade and log4j
as the default logging framework (logback is supported too). Logs are accessible via Flink's UI
in the JobManager tab, which is fine for short-lived jobs but unusable for long-lived streaming
applications. You probably want your logs somewhere else; here's how you can
send them to Elasticsearch so you can explore
them with, say, Kibana.
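Routing happens entirely in Flink's log4j configuration: you register an extra appender alongside the default file appender. A minimal sketch of `conf/log4j.properties` is below; the appender class name (`org.example.Log4jElasticsearchAppender`) and its property names are placeholders for whatever the appender project you build actually exposes.

```properties
# Keep Flink's default file appender, and add an Elasticsearch appender.
log4j.rootLogger=INFO, file, elastic

# Flink's stock file appender (unchanged)
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Hypothetical Elasticsearch appender -- class and property names are
# assumptions; check the appender project's README for the real ones.
log4j.appender.elastic=org.example.Log4jElasticsearchAppender
log4j.appender.elastic.url=http://elasticsearch.example.com:9200
log4j.appender.elastic.index=flink-logs
```

Both the JobManager and the TaskManagers pick this file up, so logs from every node end up in the same index.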

You have to compile the project into a JAR and place it in Flink's lib folder. By default, the project
compiles a plain JAR with no dependencies, which is inconvenient because it also depends
on elasticsearch and jest. I made a fork that
uses Gradle plus the Shadow plugin to build a fat JAR with everything you need:
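For reference, the fat-JAR setup is just the Shadow plugin applied on top of a plain Java build. A sketch of the relevant `build.gradle`, with illustrative plugin and dependency versions:

```groovy
// Sketch of a build.gradle that bundles the appender with its dependencies.
// Versions and coordinates are illustrative, not pinned to the fork.
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '2.0.1'
}

repositories {
    mavenCentral()
}

dependencies {
    compile 'io.searchbox:jest:2.0.0'   // HTTP client used to talk to Elasticsearch
    compile 'log4j:log4j:1.2.17'        // provided by Flink at runtime, needed to compile
}
```

Running `./gradlew shadowJar` then produces a `-all.jar` under `build/libs/`, which is the file you drop into Flink's `lib/` folder.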

AWS users: how to make it work on EMR and Amazon's Elasticsearch Service

Typically, Amazon-managed Elasticsearch clusters are configured with an
access policy that restricts access either by IP or by IAM user/role. If
your Flink cluster is running on Amazon's EMR, you need a little
extra work to make this work:
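For context, an IAM-based access policy on the Elasticsearch domain typically looks like the sketch below, granting the EMR instance role access to the domain. The account ID, region, role name, and domain name are placeholders (`EMR_EC2_DefaultRole` is EMR's default instance role, but yours may differ).

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/EMR_EC2_DefaultRole"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"
    }
  ]
}
```

Note that when the policy is IAM-based, requests to the domain must be signed; a plain HTTP client pointed at the endpoint will be rejected, which is why the extra work below is needed.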