Mailing List Archive

[Bug 2610] Json log format
https://bugs.exim.org/show_bug.cgi?id=2610

- <jori.hamalainen@teliacompany.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |jori.hamalainen@teliacompany.com

--- Comment #3 from - <jori.hamalainen@teliacompany.com> ---
NDJSON http://ndjson.org/

--
You are receiving this mail because:
You are on the CC list for the bug.
--
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
[Bug 2610] Json log format
https://bugs.exim.org/show_bug.cgi?id=2610

Phil Pennock <pdp@exim.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |pdp@exim.org

--- Comment #4 from Phil Pennock <pdp@exim.org> ---
The use of application/ld+json or application/x-jsonlines or
application/x-ndjson doesn't really matter much: they're all logging one
"object" per line as condensed JSON, no trailing comma after the object.
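As a minimal sketch of the one-object-per-line layout described above: the
record fields below ("event", "message_id", and so on) are purely hypothetical
and are not a proposed Exim schema.

```python
import io
import json

# Hypothetical log records; the field names are illustrative only,
# not an Exim schema.
records = [
    {"event": "received", "message_id": "msg-0001", "size": 1234},
    {"event": "delivered", "message_id": "msg-0001", "recipient": "bob@example.org"},
]


def dump_ndjson(items, stream):
    """Write each item as one compact JSON object per line, no trailing comma."""
    for item in items:
        stream.write(json.dumps(item, separators=(",", ":")) + "\n")


def load_ndjson(stream):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


buffer = io.StringIO()
dump_ndjson(records, buffer)
ndjson_text = buffer.getvalue()

buffer.seek(0)
parsed = load_ndjson(buffer)
```

Each line parses on its own, so a reader can process the file as a stream
without loading it whole.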

The issue is not the file format here, it's the schema to be used to describe
logging: if there is a standard schema already then it's better to adhere to it
unless there's a reason to diverge, as this increases the likelihood that
generic email log processors would handle much of the data automatically.

[Bug 2610] Json log format
https://bugs.exim.org/show_bug.cgi?id=2610

Andrew Aitchison <exim@aitchison.me.uk> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |exim@aitchison.me.uk

--- Comment #5 from Andrew Aitchison <exim@aitchison.me.uk> ---
I have provoked a discussion on this in the "mailop" list.
Archived at (members-only, sorry):
https://list.mailop.org/private/mailop/2020-November/018004.html

Some respondents felt that this was the responsibility of the log reader
and posted links to code which converts exim or postfix logs to elasticsearch.

Tim Bray pointed out that if you use the one-JSON-document-per-line approach,
you need special handling for line-breaks within a single log entry.
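One possible form of that handling, sketched with Python's standard json
module: the encoder escapes a line-break inside a string value, so the record
still occupies one physical line. The entry and its field names are
hypothetical.

```python
import json

# Hypothetical entry whose text spans two lines; field names are illustrative.
entry = {"message_id": "msg-0002", "text": "first line\nsecond line"}

# json.dumps escapes the newline as a backslash followed by "n",
# so the encoded record contains no raw line break.
encoded = json.dumps(entry)

# Decoding restores the original multi-line text.
decoded = json.loads(encoded)
```

This only covers line-breaks inside string values; it assumes the writer never
pretty-prints an object across several lines.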

[Bug 2610] Json log format
https://bugs.exim.org/show_bug.cgi?id=2610

Graeme Fowler <graeme@graemef.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |graeme@graemef.net

--- Comment #6 from Graeme Fowler <graeme@graemef.net> ---
Disclosure: I wrote the original patterns and recipes for ingesting Exim logs
into logstash/elasticsearch and then utilising the data in Kibana, and wrote a
couple of blog articles about it (which are trivially searchable).

I haven't maintained it because we moved (at work) to Splunk, in which I have
done something similar.

One repeating issue which makes post-processing Exim's logs difficult
regardless of format is the lack of continual state and the handling of
messages at various phases by different processes. The way Exim is currently
designed, without one single management process, means the logs need to be
processed outside of Exim rather than having Exim keep an ever-growing state
table for a specific message which can be output when message processing
completes.

As an example, a message to multiple recipients could take hours or days (or
longer!) to complete delivery and could have a variety of responses to the
various transports used to send it to remote hosts, local mailboxes, or other
local processes via sockets/pipes etc. Whilst Exim keeps some message state, it
doesn't keep the log entries for everything to do with a message within the
queued message object - yes, there are per-message-logs if you enable that, but
even then the great variety within that makes processing them non-trivial.

It is my opinion that the log format should remain "as-is", because it is well
documented and understood, and that any work to produce some downstream JSON
format is the work of other tools/scripts.

[Bug 2610] Json log format
https://bugs.exim.org/show_bug.cgi?id=2610

--- Comment #7 from Jeremy Harris <jgh146exb@wizmail.org> ---
On single identifiers for message lines -
- I think HS had a notion of allocating the message_id earlier in the receive
  sequence, which would help here.
- It still wouldn't tie together pre-MAIL-command actions: things done in
  connect ACL, helo ACL, authentication...
  I guess we could invent a connection identifier, and log that also.
- It wouldn't tie together multiple messages on a connection, if that was
  wanted.

On "output when message processing completes" - you'd then have no visibility
of any message with a recipient still on the queue. I don't think that's really
what you want?
It's also moot for the one-JSON-document-per-line approach (treating certain
current multi-line log items as single "logical" lines for this purpose).
It'd be up to still-required postprocessing to tie together all the records
for a message for this approach.
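A rough sketch of what that postprocessing might look like, assuming every
record carries a "message_id" key. That key, the "event" values, and the
sample lines are all hypothetical, not anything Exim emits today.

```python
import json
from collections import defaultdict

# Hypothetical one-JSON-document-per-line log; records for two messages
# are interleaved, as they would be with several processes writing.
ndjson_log = "\n".join([
    '{"message_id":"msg-A","event":"received"}',
    '{"message_id":"msg-B","event":"received"}',
    '{"message_id":"msg-A","event":"delivered","recipient":"x@example.org"}',
    '{"message_id":"msg-A","event":"completed"}',
])


def group_by_message(text):
    """Collect records per message_id, preserving log order within each group."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Records without a message_id (e.g. pre-MAIL actions) land under None.
        grouped[record.get("message_id")].append(record)
    return dict(grouped)


by_message = group_by_message(ndjson_log)
```

Records logged before a message_id exists would all collect under the None
key here, which is where a separate connection identifier would be needed.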

There's a problem with logging done by ACL log_message/logwrite too; unless we
invent reams of specialised stuff for it, the best we could do is a freeform
string item tagged with the message_id - the config-file facilities currently
know nothing of JSON facilities for logging. Designing syntax for saying "use a
JSON object with <blah> name" for this log string sounds hard.

The hardwired log items are where the advantage, such as it is, comes. Exim
knows the semantics of each element of the log "line", and can label JSON
objects unambiguously (vs. the current text lines where not all items have a
LABEL= leader, and some can have embedded spaces).
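To illustrate the ambiguity, here is a small sketch. The text line is a
simplified, made-up stand-in rather than real Exim output, and the JSON key
names are hypothetical.

```python
import json

# Simplified, made-up text line (not real Exim output): the first two items
# have no LABEL= leader, and the last value contains an embedded space.
text_line = 'msg-0003 alice@example.org H=mail.example.org T="Re: hello"'

# Naive whitespace splitting cuts the quoted value in two.
naive_fields = text_line.split(" ")

# The same information as a labelled JSON object; key names are assumptions.
json_line = json.dumps({
    "message_id": "msg-0003",
    "sender": "alice@example.org",
    "host": "mail.example.org",
    "subject": "Re: hello",
})
labelled = json.loads(json_line)
```

With the text form a parser has to know the position and quoting rules of
every item; with the JSON form each value is recovered by name.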


None of the above addresses defining the schema, which is a significant effort.
We'd also have to maintain both log formats, making the choice configurable.
