Improve Log4j 2 example #222


Open · wants to merge 3 commits into base: main
69 changes: 35 additions & 34 deletions docs/index.asciidoc
@@ -28,31 +28,41 @@ Like stdin and file inputs, each event is assumed to be one line of text.
Can either accept connections from clients or connect to a server,
depending on `mode`.

===== Accepting log4j2 logs

Log4j2 can send JSON over a socket, and we can use that combined with our tcp
input to accept the logs.

First, we need to configure your application to send logs in JSON over a
socket. The following log4j2.xml accomplishes this task.

Note, you will want to change the `host` and `port` settings in this
configuration to match your needs.

    <Configuration>
      <Appenders>
        <Socket name="Socket" host="localhost" port="12345">
          <JsonLayout compact="true" eventEol="true" />
        </Socket>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="Socket"/>
        </Root>
      </Loggers>
    </Configuration>

To accept this in Logstash, you will want tcp input and a date filter:
===== Accepting Log4j 2 logs

Log4j 2 can write ECS-compliant JSON-formatted log events to a TCP socket.
We can combine this with the TCP input to accept logs from applications using Log4j 2.

First, we need to configure your application to write JSON-formatted logs to a TCP socket:

.Example `log4j2.xml` configuration for writing JSON-formatted logs to Logstash TCP input
[source,xml]
----
<Configuration xmlns="https://logging.apache.org/xml/ns"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="
                   https://logging.apache.org/xml/ns
                   https://logging.apache.org/xml/ns/log4j-config-2.xsd">
  <Appenders>
    <Socket name="SOCKET" host="localhost" port="12345"> <!--1-->
      <JsonTemplateLayout <!--2-->
          eventTemplateUri="classpath:EcsLayout.json" <!--3-->
          nullEventDelimiterEnabled="true"/> <!--4-->
    </Socket>
  </Appenders>
  <Loggers>
    <Root level="INFO">
      <AppenderRef ref="SOCKET"/>
    </Root>
  </Loggers>
</Configuration>
----
<1> Using Socket Appender to write logs to a TCP socket – make sure to *change the `host` attribute* to match your setup
<2> Using https://logging.apache.org/log4j/2.x/manual/json-template-layout.html[JSON Template Layout] to encode log events in JSON
<3> Using the ECS (Elastic Common Schema) layout bundled with JSON Template Layout
<4> Terminating each written log event with a null (i.e., `\0`) character
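What the Socket appender sends down the wire can be sketched in a few lines of Python. This is a toy stand-in for Log4j 2, not its implementation; the field names are a simplified subset of what the bundled `EcsLayout.json` template produces, and the host/port values are placeholders to match your setup.

```python
import json
import socket
from datetime import datetime, timezone

def ecs_event(level: str, message: str) -> dict:
    """Build a minimal ECS-style log event, a rough subset of EcsLayout.json output."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "log.level": level,
        "message": message,
    }

def send_event(event: dict, host: str = "localhost", port: int = 12345) -> None:
    # One JSON document per line, which is what a newline-splitting
    # codec on the Logstash side expects.
    payload = json.dumps(event) + "\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload.encode("utf-8"))
```

Each event is a single newline-terminated JSON document, so the receiving side can frame events simply by splitting on `\n`.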
@yaauie (Contributor) commented on Feb 18, 2025:
This is surprising to me, and results in errors.

From the JsonTemplateLayout docs, `nullEventDelimiterEnabled=true` will cause a null to be appended after each `eventDelimiter`.

When the TCP input is configured with `codec => json` or `codec => json_lines`, it uses the `json_lines` codec, which splits the input on newline. This means that the post-`eventDelimiter` null byte will be the first byte of each subsequent sequence passed to the codec, which results in JSON parse failures.

For example, if I start Logstash minimally:

    bin/logstash -e 'input { tcp { codec => json port => 9887 } }'

And then send it JSON that is both newline- and null-delimited:

    echo $'{"this":"that"}\n\0{"foo":"bar"}\n\0' | nc 127.0.0.1 9887

The null bytes cause `_jsonparsefailure` tags:

    {
          "@version" => "1",
        "@timestamp" => 2025-02-18T17:37:51.852917Z,
              "this" => "that"
    }
    {
           "message" => "\u0000{\"foo\":\"bar\"}",
              "tags" => [
            [0] "_jsonparsefailure"
        ],
          "@version" => "1",
        "@timestamp" => 2025-02-18T17:37:51.855102Z
    }
    {
           "message" => "\u0000",
              "tags" => [
            [0] "_jsonparsefailure"
        ],
          "@version" => "1",
        "@timestamp" => 2025-02-18T17:37:51.855359Z
    }

These issues do not occur if we do not have the post-`eventDelimiter` null byte:

    echo $'{"this":"that"}\n{"foo":"bar"}\n' | nc 127.0.0.1 9887

    {
          "@version" => "1",
        "@timestamp" => 2025-02-18T17:47:34.570363Z,
              "this" => "that"
    }
    {
          "@version" => "1",
        "@timestamp" => 2025-02-18T17:47:34.570605Z,
               "foo" => "bar"
    }
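The failure mode described above can be reproduced with a toy newline-splitting decoder in Python. This is a sketch standing in for the `json_lines` codec's framing behavior, not Logstash's actual implementation:

```python
import json

def json_lines_decode(buffer: bytes):
    """Naive newline-delimited JSON decoder: split on \\n, parse each chunk.

    Chunks that fail to parse are collected separately, analogous to events
    that Logstash would tag with _jsonparsefailure."""
    events, failures = [], []
    for chunk in buffer.split(b"\n"):
        if not chunk:
            continue
        try:
            events.append(json.loads(chunk.decode("utf-8")))
        except ValueError:
            failures.append(chunk)
    return events, failures

# With the post-delimiter null byte, every chunk after the first
# starts with \0 and fails to parse:
events, failures = json_lines_decode(b'{"this":"that"}\n\0{"foo":"bar"}\n\0')

# Without the null byte, both documents parse cleanly:
ok_events, ok_failures = json_lines_decode(b'{"this":"that"}\n{"foo":"bar"}\n')
```

Splitting on `\n` leaves the trailing `\0` attached to the front of the next chunk, which is why the second and third parses fail in the null-delimited case.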

@vy (Author) replied on Feb 18, 2025:

You're right – I've confused it with the GELF Input Plugin, which requires null termination.

As a matter of fact, we have an IT for ELK:

Also, in apache/logging-log4j2@fef8af8, I fixed our SOA page guiding users on ELK integration.

Removed the mention of null delimiter from this PR in 5116d68.


To accept this in Logstash, you will want a TCP input:

    input {
      tcp {

@@ -61,15 +71,6 @@ To accept this in Logstash, you will want tcp input and a date filter:

      }
    }

and add a date filter to take log4j2's `timeMillis` field and use it as the
event timestamp

    filter {
      date {
        match => [ "timeMillis", "UNIX_MS" ]
      }
    }
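The conversion the date filter's `UNIX_MS` pattern performs (milliseconds since the epoch to an event timestamp) can be illustrated with a small Python helper. This is a sketch of the arithmetic, not the filter's implementation:

```python
from datetime import datetime, timezone

def unix_ms_to_timestamp(time_millis: int) -> str:
    """Convert a Log4j 2 timeMillis value (milliseconds since the Unix
    epoch) into an ISO-8601 UTC timestamp string."""
    dt = datetime.fromtimestamp(time_millis / 1000, tz=timezone.utc)
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")
```

For instance, `unix_ms_to_timestamp(0)` yields the epoch itself, `1970-01-01T00:00:00.000Z`.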

[id="plugins-{type}s-{plugin}-ecs_metadata"]
==== Event Metadata and the Elastic Common Schema (ECS)
