Reading from `/dev/stdin` is very slow #685

mpol · 2022-10-10T11:03:21Z

Processing this 7.5M line file normally takes ~2-3 minutes on my laptop:

$ time ./mtail -emit_prog_label=false -one_shot -one_shot_format=prometheus -logtostderr -progs=lc -logs access-ssl.log.6
I1010 12:23:55.906123  179121 main.go:117] mtail version 3.0.0-rc50 git revision 89b462d7a83bab283e84a3b6e592a964090f7f4c go version go1.18.3 go arch amd64 go os linux
I1010 12:23:55.906203  179121 main.go:118] Commandline: ["./mtail" "-emit_prog_label=false" "-one_shot" "-one_shot_format=prometheus" "-logtostderr" "-progs=lc" "-logs" "access-ssl.log.6"]
I1010 12:23:55.906590  179121 store.go:189] Starting metric store expiry loop every 1h0m0s
I1010 12:23:55.907672  179121 runtime.go:188] Loaded program lc.mtail
I1010 12:23:55.907708  179121 runtime.go:84] unmarking lc.mtail
I1010 12:23:55.907763  179121 logstream.go:61] Parsed url as /tmp/access-ssl.log.6
I1010 12:23:55.907871  179121 filestream.go:278] signalling stop at next EOF
I1010 12:23:55.907896  179121 tail.go:287] Tailing /tmp/access-ssl.log.6
I1010 12:23:55.907986  179121 tail.go:343] No polling loop in oneshot mode.
I1010 12:23:55.908012  179121 mtail.go:126] Listening on [::]:3903
I1010 12:23:55.908022  179121 tail.go:315] No gc loop in oneshot mode.
I1010 12:26:51.737588  179121 runtime.go:288] END OF LINE
I1010 12:26:51.737615  179121 vm.go:1059] VM "lc.mtail" finished
# HELP lines defined at lc.mtail:1:9-13
# TYPE lines counter
lines 7.519629e+06

real	2m55,853s
user	3m37,175s
sys	0m29,575s

But when it is piped on standard input like this:

$ cat access-ssl.log.6 | time ./mtail -emit_prog_label=false -one_shot -one_shot_format=prometheus -logtostderr -progs=lc -logs /dev/stdin

it takes forever, reading ~2.5k lines/minute:

$ http :3903/metrics|grep stdin; sleep 5m;  http :3903/metrics|grep stdin
mtail_log_lines_total{logfile="/dev/stdin"} 16550
mtail_log_lines_total{logfile="/dev/stdin"} 29579

Tested on the latest relase downloaded from github and the simplest line counter program:

$ cat lc/lc.mtail 
counter lines

/$/ {
 lines++
}

The text was updated successfully, but these errors were encountered:

jaqx0r · 2023-01-01T05:55:24Z

Cool bug, thanks for the report!

If someone picks this up before me, I'd try to attach a profiler to see what's taking so long.

jvgutierrez · 2024-04-23T14:39:20Z

we've detected the same issue with mtail 3.0.0-rc50 (shipped with Debian bookworm) and latest version available (3.0.0-rc55)

jvgutierrez · 2024-04-23T14:58:22Z

perf top report looks like this:

  14.22%  mtail    mtail              [.] regexp.(*Regexp).tryBacktrack                                                                                                          
   8.83%  mtail    [kernel.kallsyms]  [k] finish_task_switch.isra.0                                                                                                              
   3.73%  mtail    mtail              [.] runtime.mallocgc                                                                                                                       
   3.58%  mtail    [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore                                                                                                            
   2.79%  mtail    mtail              [.] regexp/syntax.(*Inst).MatchRunePos                                                                                                     
   2.71%  mtail    mtail              [.] github.com/google/mtail/internal/runtime/vm.(*VM).execute                                                                              
   2.68%  mtail    mtail              [.] github.com/google/mtail/internal/tailer/logstream.decodeAndSend                                                                        
   2.50%  mtail    mtail              [.] runtime.usleep.abi0                                                                                                                    
   2.39%  mtail    mtail              [.] runtime.futex.abi0                                                                                                                     
   1.56%  mtail    mtail              [.] bytes.(*Buffer).WriteByte                                                                                                              
   1.38%  mtail    mtail              [.] regexp.(*inputString).step                                                                                                             
   1.34%  mtail    mtail              [.] unicode/utf8.DecodeRune                                                                                                                
   1.27%  mtail    mtail              [.] bytes.(*Buffer).WriteRune                                                                                                              
   1.19%  mtail    mtail              [.] runtime.(*mspan).writeHeapBitsSmall                                                                                                    
   1.16%  mtail    mtail              [.] runtime/internal/syscall.Syscall6                                                                                                      
   1.12%  mtail    mtail              [.] runtime.(*_panic).nextDefer                                                                                                            
   1.01%  mtail    mtail              [.] runtime.memclrNoHeapPointers                                                                                                           
   0.98%  mtail    [vdso]             [.] 0x000000000000072f                                                                                                                     
   0.98%  mtail    mtail              [.] runtime.findRunnable                                                                                                                   
   0.87%  mtail    mtail              [.] runtime.lock2                                                                                                                          
   0.80%  mtail    mtail              [.] runtime.stealWork                                                                                                                      
   0.72%  mtail    [kernel.kallsyms]  [k] syscall_enter_from_user_mode                                                                                                           
   0.72%  mtail    mtail              [.] runtime.scanobject                                                                                                                     
   0.69%  mtail    mtail              [.] runtime.chanrecv                                                                                                                       
   0.69%  mtail    mtail              [.] runtime.growslice                                                                                                                      
   0.69%  mtail    mtail              [.] runtime.unlock2                                                                                                                        
   0.65%  mtail    mtail              [.] runtime.greyobject                                                                                                                     
   0.58%  mtail    [vdso]             [.] __vdso_clock_gettime                                                                                                                   
   0.58%  mtail    mtail              [.] runtime.casgstatus                                                                                                                     
   0.58%  mtail    mtail              [.] runtime.chansend                                                                                                                       
   0.51%  mtail    [kernel.kallsyms]  [k] exit_to_user_mode_prepare                                                                                                              
   0.51%  mtail    mtail              [.] runtime.deferreturn                                                                                                                    
   0.51%  mtail    mtail              [.] runtime.scanblock                                                                                                                      
   0.51%  mtail    mtail              [.] runtime.selectgo

in our setup, mtail 3.0.0~rc43 (shipped with Debian bullseye) was working as expected

jaqx0r · 2024-05-26T16:24:34Z

latest mtail at head:

jaq% time ./mtail -progs examples/rsyncd.mtail -one_shot -logs internal/mtail/testdata/rsyncd.log
0.01s user 0.01s system 111% cpu 0.017 total

jaq% time ./mtail -logs - -progs examples/rsyncd.mtail < internal/mtail/testdata/rsyncd.log
0.04s user 0.00s system 3% cpu 1.276 total

Wild guess that the read buffer is too small on pipes, so we're doing a lot of syscalls.

jaqx0r · 2024-05-26T16:29:09Z

Increasing the read buffer from 4096 to 65536 had this effect:

jaq% time ./mtail -logs - -progs examples/rsyncd.mtail < internal/mtail/testdata/rsyncd.log
0.01s user 0.01s system 8% cpu 0.269 total

total time drops, CPU increased, so less time waiting on syscalls (again guessing, but seems obvious -- i could hook up a trace and get a profile for time spent blocked on syscalls to confirm)

jaqx0r · 2024-05-26T16:31:50Z

A 128K buffer seems pretty good. I can imagine an improvement where mtail uses getsockopt to ask the kernel what the read buffer size is.

Fewer syscalls reading from large stdins means we spend more time processing and less time waiting for the kernel. Issue: #685

jaqx0r · 2024-05-27T09:41:01Z

Pipes use fcntl(.. F_GETPIPE_SZ..)

Sockets use getsockopt(... SO_RCVBUF_SZ)

Datagrams use whatever you set net.core.rmem_default to, but (wild guess) you can use getsockopt here too.

Regular files don't seem to have a read buffer (because the entire file is available to the VFS I presume) and anecdotal data on the net says maybe 4GiB.

Dealing with buffer sizes on Linux seems tractable but tedious; I looked into Windows support and couldn't get a satifsfying answer.

Increasing buffers per stream is also going to increase RAM usage so I don't want to go wild here without better reasoning.

Finally, the stdin read rate is still 10x slower than direct file reads. I presume this is because the pipe buffer requires copying into kernel memory and back out again, while direct file i/o has fewer (maybe zero) copies.

jaqx0r added bug This is considered a bug and shall get fixed mtail-Log Tailing Issues related to log polling and tailing labels Jan 1, 2023

jaqx0r added a commit that referenced this issue May 27, 2024

fix: Set a read buffer size on pipes to 128KiB.

e7d279d

Fewer syscalls reading from large stdins means we spend more time processing and less time waiting for the kernel. Issue: #685

jaqx0r mentioned this issue May 27, 2024

fix: Raise the read buffer sizes to 128KiB. #869

Merged

jaqx0r closed this as completed in #869 May 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reading from `/dev/stdin` is very slow #685

Reading from `/dev/stdin` is very slow #685

mpol commented Oct 10, 2022

jaqx0r commented Jan 1, 2023

jvgutierrez commented Apr 23, 2024 •

edited

jvgutierrez commented Apr 23, 2024

jaqx0r commented May 26, 2024 •

edited

jaqx0r commented May 26, 2024

jaqx0r commented May 26, 2024

jaqx0r commented May 27, 2024 •

edited

Reading from /dev/stdin is very slow #685

Reading from /dev/stdin is very slow #685

Comments

mpol commented Oct 10, 2022

jaqx0r commented Jan 1, 2023

jvgutierrez commented Apr 23, 2024 • edited

jvgutierrez commented Apr 23, 2024

jaqx0r commented May 26, 2024 • edited

jaqx0r commented May 26, 2024

jaqx0r commented May 26, 2024

jaqx0r commented May 27, 2024 • edited

Reading from `/dev/stdin` is very slow #685

Reading from `/dev/stdin` is very slow #685

jvgutierrez commented Apr 23, 2024 •

edited

jaqx0r commented May 26, 2024 •

edited

jaqx0r commented May 27, 2024 •

edited