Fluentd: Pitfalls of custom log format

One day, I realized that the fluentd on our Kubernetes cluster was logging a lot of "invalid date" errors like this.

1#0 dump an error event: error_class=ArgumentError error="invalid date" location="/usr/local/bundle/gems/fluent-plugin-elasticsearch-4.1.1/lib/fluent/plugin/out_elasticsearch.rb:519:in `parse'" tag="Fluent::ElasticsearchOutput::TimeParser.error" time=2020-07-21 14:59:59.894536631 +0000 record={"tag"=>"hoge.fuga.kubernetes", "time"=>2020-07-21 14:59:52.000000000 +0000, "format"=>nil, "value"=>"1595343592"}

When ingesting HAProxy’s logs which I defined as JSON format, the error caused. I couldn’t understand why it happened because the above log said “TimeParser.error time=2020-07-21 14:59:59.894536631”. It looks the correct format that I expected. And also, other application logs like NGINX were no problem. Just this pod which has the HAProxy only.

I checked the log format again in the HAProxy configuration file and it was the follows.

1    log global
2    log-format '{"host":"%H","ident":"haproxy","pid":%pid,"time":"%Ts","haproxy":{"conn":{"act":%ac,"fe":%fc,"be":%bc,"srv":%sc},"queue":{"backend":%bq,"srv":%sq},"time":{"tq":%Tq,"tw":%Tw,"tc":%Tc,"tr":%Tr,"tt":%Tt},"termination_state":"%tsc","retries":%rc,"network":{"client_ip":"%ci","client_port":%cp,"frontend_ip":"%fi","frontend_port":%fp},"ssl":{"version":"%sslv","ciphers":"%sslc"},"request":{"method":"%HM","uri":"%[capture.req.uri,json(utf8s)]","protocol":"%HV","header":{"host":"%[capture.req.hdr(0),json(utf8s)]","xforwardfor":"%[capture.req.hdr(1),json(utf8s)]","referer":"%[capture.req.hdr(2),json(utf8s)]","user-agent":"%[capture.req.hdr(3),json(utf8s)]","content-length":"%[capture.req.hdr(4),json(utf8s)]"}},"name":{"backend":"%b","frontend":"%ft","server":"%s"},"response":{"status_code":%ST,"header":{"server":"%[capture.res.hdr(0),json(utf8s)]","content-type":"%[capture.res.hdr(1),json(utf8s)]","transfer-encoding":"%[capture.res.hdr(2),json(utf8s)]","cache-control":"%[capture.res.hdr(3),json(utf8s)]","via":"%[capture.res.hdr(4),json(utf8s)]","location":"%[capture.res.hdr(5),json(utf8s)]","content-encoding":"%[capture.res.hdr(6),json(utf8s)]"}},"bytes":{"uploaded":%U,"read":%B}}}'

After a few minutes, I realized the value of the “time” column was quoted by double quotations. So, I deleted it and deployed the HAProxy again then the errors disappeared. Finally I found out that the time column in the log format must not be quoted because that will be ingested as String not Date.