Integrating with service-oriented architectures
This page shares best practices for integrating applications that use Log4j Core with service-oriented architectures, along with guides for some popular scenarios.
Motivation
Most modern software is deployed in service-oriented architectures. This is a very broad domain that can be realized in an amazingly large number of ways. Nevertheless, all of these realizations redefine the notion of an application:
- Deployed in multiple instances
- Situated in multiple locations; either in the same rack, or in data centers on different continents
- Hosted on multiple platforms; hardware, virtual machines, containers, etc.
- Polyglot; a product of multiple programming languages
- Scaled on demand; instances come and go over time
Naturally, logging systems have also evolved to accommodate these needs. In particular, the old practice of "monoliths writing logs to files rotated daily" has changed in two major respects:
- Application delivers logs differently
Applications no longer write logs to files, but encode them structurally and deliver them to a centrally managed external system. Most of the time this is a proxy (a library, a sidecar container, etc.) that takes care of discovering the log storage system and determining the right external service to forward the logs to.
- Platform stores logs differently
There is no longer a /var/log/tomcat/catalina.out combining all logs of a monolith. Instead, the software runs in multiple instances, each potentially implemented in a different language, and instances get scaled on demand (i.e., new ones get started, old ones get stopped). To accommodate this, logs are persisted in a central storage system (Elasticsearch, Google Cloud Logging, etc.) that provides advanced navigation and filtering capabilities.
Log4j Core not only adapts to this evolution, but also strives to provide best-in-class support for it. Below, we explore how to integrate Log4j with service-oriented architectures.
Best practices
Independent of the service-oriented architecture you choose, there are certain best practices we strongly encourage you to follow:
Encode logs using a structured layout
We can’t emphasize this enough: use nothing but a structured layout to deliver your logs to an external system. We recommend JSON Template Layout for this purpose:
- JSON Template Layout provides full customizability and contains several predefined layouts for popular log storage services.
- JSON is accepted by every log storage service.
- JSON is supported by logging frameworks in other languages. This makes it possible to agree on a common log format with non-Java applications.
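For illustration, a log4j2.yaml excerpt along the following lines (in the same snippet style as the examples further below) wires JSON Template Layout to a console appender; the appender name is a placeholder, and the eventTemplateUri shown is one example of selecting a predefined template:
Console:
  name: "CONSOLE"
  JsonTemplateLayout:
    # Optionally select one of the predefined templates shipped with the layout;
    # EcsLayout.json, for example, targets the Elastic Common Schema.
    eventTemplateUri: "classpath:EcsLayout.json"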
Use a proxy for writing logs
Most of the time it is not a good idea to write to the log storage system directly; instead, delegate that task to a proxy. This design decouples the applications' log target from the log storage system and, as a result, enables each to evolve independently and reliably (i.e., without downtime). For instance, it allows the log storage system to scale or migrate to a new environment while proxies take care of the necessary buffering and routing.
This proxy can appear in many forms, for instance:
- The console can act as a proxy. Logs written to the console can be consumed by an external service. For example, The Twelve-Factor App and the Kubernetes Logging Architecture recommend this approach.
- A library can act as a proxy. It can tap into the logging API and forward log events to an external service. For instance, Datadog’s Java Log Collector uses this mechanism.
- An external service can act as a proxy, which applications can write logs to. For example, you can write to Logstash, a Kubernetes logging agent sidecar, or a Redis queue over a socket.
What to use as a proxy depends on your deployment environment. Consult your colleagues to find out whether there is already an established logging proxy convention. Otherwise, we strongly encourage you to establish one in collaboration with your system administrators and architects.
Configure your appender correctly
Once you decide on the log proxy to use, the choice of appender pretty much becomes self-evident. Nevertheless, there are some tips we recommend you put into practice:
- For writing to the console, use a Console Appender and make sure to set its direct attribute to true for maximum efficiency (see the sketch after this list).
- For writing to an external service, use a Socket Appender, make sure to set the protocol to TCP, and configure the null delimiter of the associated layout. For instance, see the nullEventDelimiterEnabled configuration attribute of JSON Template Layout.
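For the console case, a minimal log4j2.yaml excerpt might look as follows; the appender name is a placeholder:
Console:
  name: "CONSOLE"
  # Write straight to the file descriptor, bypassing java.lang.System.out,
  # for maximum throughput.
  direct: true
  JsonTemplateLayout: {}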
Avoid writing to files
As explained in Motivation, in a service-oriented architecture, log files are
- Difficult to maintain – writable volumes must be mounted to the runtime (container, VM, etc.), rotated, and monitored for excessive usage
- Difficult to use – multiple files need to be manually combined while troubleshooting; there is no central navigation point
- Difficult to interoperate – each application needs to be individually configured to produce the same structured log output to enable interleaving of logs from multiple sources while troubleshooting distributed issues
In short, we don’t recommend writing logs to files.
Separate logging configuration from the application
We strongly advise you to separate the logging configuration from the application and couple them in an environment-specific way. This will allow you to
- Address environment-specific configurations (e.g., the logging verbosity needs of test and production can be different)
- Ensure Log4j configuration changes apply to all affected Log4j-using software without the need to manually update their Log4j configurations one by one
How to implement this separation pretty much depends on your setup. We will share some recommended approaches to give you an idea:
- Choosing configuration files during deployment
Environment-specific Log4j configuration files (log4j2-common.xml, log4j2-local.xml, log4j2-test.xml, log4j2-prod.xml, etc.) can be provided in one of the following ways:
- Shipped with your software (i.e., accessible in the classpath)
- Served from an HTTP server
- A combination of the first two
Depending on the deployment environment, you can selectively activate a subset of them using the log4j2.configurationFile configuration property.
Spring Boot allows you to configure the underlying logging system. Just like any other Spring Boot configuration, logging-related configuration can also be provided in multiple files split by profiles matching the environment: application-common.yaml, application-local.yaml, etc. Spring Boot’s Externalized Configuration system will automatically load these files depending on the active profile(s).
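For instance, a minimal sketch of the Spring Boot route, assuming a hypothetical log4j2-prod.xml shipped on the classpath, uses logging.config – Spring Boot’s standard property for pointing the logging system at a configuration file:
# application-prod.yaml – loaded by Spring Boot only when the "prod" profile is active
logging:
  config: "classpath:log4j2-prod.xml"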
- Mounting configuration files during deployment
Many service-oriented deployment architectures offer solutions for environment-specific configuration storage; Kubernetes' ConfigMap, HashiCorp’s Consul, etc. You can leverage these to store environment-specific Log4j configurations and mount them to the associated runtime (container, VM, etc.) at deployment, as sketched below.
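For instance, a minimal Kubernetes sketch along the following lines stores the Log4j configuration in a ConfigMap; all names and paths are placeholders:
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-log4j2
data:
  log4j2.xml: |
    <Configuration>
      <!-- environment-specific Log4j configuration goes here -->
    </Configuration>
The pod template of the associated workload can then mount it and point Log4j at it – LOG4J_CONFIGURATION_FILE is the environment-variable form of the log4j2.configurationFile property:
    spec:
      containers:
        - name: my-app
          env:
            # Environment-variable form of log4j2.configurationFile
            - name: LOG4J_CONFIGURATION_FILE
              value: "/etc/my-app/log4j2.xml"
          volumeMounts:
            - name: log4j2
              mountPath: "/etc/my-app"
      volumes:
        - name: log4j2
          configMap:
            name: my-app-log4j2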
Log4j Core can poll configuration files for changes (see the monitorInterval configuration attribute). You need to be careful with this mechanism to not shoot yourself in the foot: imagine publishing an incorrect configuration and having every running instance pick it up automatically.
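For reference, enabling polling is a one-attribute change in the configuration file; a log4j2.yaml excerpt (the 30-second interval is an arbitrary example):
Configuration:
  # Re-read this file every 30 seconds and apply changes on the fly
  monitorInterval: 30
  # ... Appenders and Loggers as usual ...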
Guides
In this section, we will share guides on some popular integration scenarios.
Docker
See Log4j Docker for Docker-specific Log4j features, e.g., Docker Lookup. We also strongly advise you to check the extensive logging integration offered by Docker itself.
Kubernetes
Log4j Kubernetes (containing Kubernetes Lookup) is distributed as part of Fabric8’s Kubernetes Client; refer to its website for details.
Elasticsearch & Logstash
Elasticsearch, Logstash, and Kibana (a.k.a. the ELK Stack) is probably the most popular logging system solution. In this setup,
- Elasticsearch is used for log storage
- Logstash is used for transformation and ingestion into Elasticsearch from multiple sources (files, sockets, etc.)
- Kibana is used as a web-based UI to query Elasticsearch
To begin with, JSON is the de facto messaging format used across the entire Elastic platform. Hence, as stated earlier, we strongly advise you to configure a structured encoding, i.e., JSON Template Layout.
Logstash as a proxy
While using the ELK Stack, there are numerous ways you can write your application logs to Elasticsearch. We advise you to always employ a proxy while doing so; in particular, we recommend Logstash for this purpose. In a modern software stack, the shape and accessibility of logs vary greatly: some applications write to files (be it legacy or new systems), some don’t provide a structured encoding, etc. Logstash excels at ingesting logs from a wide range of sources, transforming them into the desired format, and writing them to Elasticsearch.
While setting up Logstash, we recommend using the TCP input plugin in combination with the Elasticsearch output plugin to accept logs over a TCP socket and write them to Elasticsearch:
logstash.conf snippet for accepting JSON-encoded log events over TCP and writing them to Elasticsearch:
input {
  tcp {                                          # (1)
    port => 12345                                # (2)
    codec => "json"                              # (3)
  }
}
output {
  # stdout { codec => rubydebug }                # (4)
  # Modify the hosts value to reflect where Elasticsearch is installed.
  elasticsearch {                                # (5)
    hosts => ["http://localhost:9200/"]          # (6)
    index => "app-%{application}-%{+YYYYMMdd}"   # (7)
  }
}
1. Using the TCP input plugin to accept logs from Log4j’s Socket Appender
2. Setting the port Logstash will bind to for accepting TCP connections to 12345 – adapt the port to your setup
3. Setting the payload encoding to JSON
4. Uncomment this line while troubleshooting your Logstash configuration
5. Using the Elasticsearch output plugin to write logs to Elasticsearch
6. The list of Elasticsearch hosts to connect to
7. The name of the Elasticsearch index to write to
Refer to the official documentation for details on configuring a Logstash pipeline.
For the sake of completeness, the following Log4j configurations (in XML, JSON, YAML, and properties formats) write to the TCP socket Logstash accepts input from:
log4j2.xml
<Socket name="SOCKET" host="localhost" port="12345">
  <JsonTemplateLayout nullEventDelimiterEnabled="true"/>
</Socket>
log4j2.json
"Socket": {
"name": "SOCKET",
"host": "localhost",
"port": 12345,
"JsonTemplateLayout": {
"nullEventDelimiterEnabled": true
}
}
log4j2.yaml
Socket:
  name: "SOCKET"
  host: "localhost"
  port: 12345
  JsonTemplateLayout:
    nullEventDelimiterEnabled: true
log4j2.properties
appender.0.type = Socket
appender.0.name = SOCKET
appender.0.host = localhost
appender.0.port = 12345
appender.0.layout.type = JsonTemplateLayout
appender.0.layout.nullEventDelimiterEnabled = true
We don’t recommend writing logs to files. If writing to files is a necessity in your logging setup for some reason, we recommend you check Filebeat. It is a data shipper agent for forwarding logs to Logstash, Elasticsearch, etc.
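For illustration, a minimal filebeat.yml sketch along these lines tails JSON-encoded log files and forwards them to Logstash; the path and host are placeholders, and note that the Logstash side would use the beats input plugin rather than the tcp one shown above:
filebeat.inputs:
  - type: log
    paths:
      - "/var/log/my-app/*.json"    # placeholder path to your JSON-encoded log files
    json:
      keys_under_root: true         # decode each line as JSON instead of shipping a raw string
output.logstash:
  hosts: ["localhost:5044"]         # adapt to where Logstash’s beats input listens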