Kibana query language cheat sheet for web service
I use the ELK stack to collect and search logs from web servers, CDNs, load balancers, and so on. Here is my cheat sheet of the Kibana queries I use most often. I’ll keep updating it.
Hide empty lines
If the logs are not well formatted, some empty lines appear, which is annoying. The query below matches them, so they can be hidden by negating the filter in Kibana or by moving the clause under "must_not".
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "message.keyword": "\n"
          }
        }
      ]
    }
  }
}
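Since the query above matches the empty lines rather than hiding them, one way to get the hiding form is to wrap it in "must_not". Here is a minimal Python sketch of that idea. The helper name `negate` is hypothetical, and it assumes the query has the top-level "query" key used throughout this post.

```python
import json


def negate(query: dict) -> dict:
    """Wrap the body of a {"query": ...} object in bool.must_not."""
    return {"query": {"bool": {"must_not": [query["query"]]}}}


# The "empty line" query from above, written as a Python dict.
empty_line = {
    "query": {
        "bool": {
            "must": [
                {"regexp": {"message.keyword": "\n"}}
            ]
        }
    }
}

# Same condition, but now it excludes the matching logs.
hide_empty_line = negate(empty_line)
print(json.dumps(hide_empty_line, indent=2))
```

The same helper should work for the other matching queries in this post, since they all share the same outer shape.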
Redirected HTTP access
Redirecting HTTP to HTTPS at the load balancer is a common setup because of AOSSL (Always On SSL). It produces a lot of redirect logs, but most of the time they are unnecessary for debugging or investigating issues. The query below matches them, so negating it as a filter hides them.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "cs-protocol.keyword": "http"
          }
        },
        {
          "match": {
            "sc-status": "301"
          }
        }
      ]
    }
  }
}
Must and Must Not
Sometimes it’s necessary to combine several conditions. For example, the following query picks up the logs that have "JP" in the "geoip_country_code" field and whose status is neither 2xx nor 3xx.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "geoip_country_code": "JP"
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "status": {
              "gte": 200,
              "lt": 400
            }
          }
        }
      ]
    }
  }
}
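To make the boundaries of the "range" clause concrete, here is a small Python sketch that mimics the same condition on plain dictionaries. It only illustrates the logic and is not how Elasticsearch evaluates the query (for example, it compares the country code exactly). The function name `keep` and the sample records are made up.

```python
def keep(log: dict) -> bool:
    """Mimic the bool query: must be JP, must_not have 200 <= status < 400."""
    is_jp = log.get("geoip_country_code") == "JP"
    # "gte": 200 includes 200, "lt": 400 excludes 400.
    is_2xx_or_3xx = 200 <= log.get("status", 0) < 400
    return is_jp and not is_2xx_or_3xx


# Hypothetical log records, just to exercise the boundaries.
samples = [
    {"geoip_country_code": "JP", "status": 200},
    {"geoip_country_code": "JP", "status": 399},
    {"geoip_country_code": "JP", "status": 400},
    {"geoip_country_code": "JP", "status": 503},
    {"geoip_country_code": "US", "status": 503},
]

for log in samples:
    print(log, "->", "kept" if keep(log) else "dropped")
```

Only the JP records with status 400 and 503 are kept. The 200 and 399 ones fall inside the excluded range, and the US one fails the "must" clause.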
Exclude bot logs by UA
Public web servers and CDNs get a lot of access from bots. When debugging, that makes it annoying to search for specific logs. In that situation, I use the following query to filter by user agent; it matches any of the listed patterns, so negate it to exclude the bots. Of course, it’s not perfect, because there are millions of bots around the world and some malicious bots send fake user agents, but it’s better than nothing.
{
  "query": {
    "bool": {
      "should": [
        {
          "regexp": {
            "cs(User-Agent)": ".*bot.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*spider.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*google.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*Google.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*curl.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*python.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*Photon.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*cortex.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*crawler.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*Crawler.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*daum.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*Hatena.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*ltx71.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": "newspaper/.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*Qwantify.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*ubermetrics-technologies.com.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": "WF%2520search/Nutch-.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": "Apache-HttpClient.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": "Mozilla.*Yeti/.*https://naver.me/spd.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": ".*TrendsmapResolver.*"
          }
        },
        {
          "regexp": {
            "cs(User-Agent).keyword": "-"
          }
        }
      ]
    }
  }
}
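Maintaining a list this long by hand gets tedious, so here is a rough Python sketch that generates the same kind of "should" list from a plain list of patterns. The names `build_bot_query`, `UA_FIELD` and `BOT_PATTERNS` are hypothetical, the field name is copied from the query above, and only a few of the patterns are shown.

```python
import json

# Field name taken from the query above; adjust it to your own mapping.
UA_FIELD = "cs(User-Agent).keyword"

# A shortened copy of the patterns above; add one string per new bot.
BOT_PATTERNS = [
    ".*spider.*",
    ".*google.*",
    ".*Google.*",
    ".*curl.*",
    "Apache-HttpClient.*",
    "-",
]


def build_bot_query(patterns, field=UA_FIELD):
    """Build a bool/should query with one regexp clause per pattern."""
    should = [{"regexp": {field: pattern}} for pattern in patterns]
    return {"query": {"bool": {"should": should}}}


bot_query = build_bot_query(BOT_PATTERNS)
print(json.dumps(bot_query, indent=2))
```

The printed JSON has the same shape as the hand-written query, so it can be used in the same place. Adding a new bot then only means adding one string to the list.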