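Diagnosing the HTTP error when a non-root user starts supervisor (with strace)
The error
At our company, developers have access to production through a regular user, www, and log in to production machines via a bastion host.
By default supervisor is started by root, so developers have no permission to edit its configuration or operate the processes it manages. Starting supervisor as the www user everywhere takes care of that.
Today, though, working on production produced a baffling error. Our SRE colleague fiddled with the configuration for a long time and still couldn't work out why www simply could not start supervisor!
This is the error: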
[www@**************** ~]$ supervisord -c /etc/supervisord.conf
Error: Cannot open an HTTP server: socket.error reported errno.EACCES (13)
For help, use /usr/bin/supervisord -h
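We combed through Google and Baidu, and none of the material we found actually solved the problem. We were at our wits' end.
Then I remembered the strace command, so I ran it like this instead:
[www@**************** ~]$ strace supervisord -c /etc/supervisord.conf
And a miracle happened!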
read(6, "\0S(\1\0\0\0N(\3\0\0\0R'\0\0\0R \0\0\0R\24\0\0\0(\1\0\0"..., 4096) = 1381
read(6, "", 4096) = 0
close(6) = 0
close(5) = 0
close(4) = 0
getpid() = 3656
unlink("/var/run/supervisor/supervisor.sock.3656") = -1 ENOENT (No such file or directory)
socket(AF_UNIX, SOCK_STREAM, 0) = 4
bind(4, {sa_family=AF_UNIX, sun_path="/var/run/supervisor/supervisor.sock.3656"}, 42) = -1 EACCES (Permission denied)
unlink("/var/run/supervisor/supervisor.sock.3656") = -1 ENOENT (No such file or directory)
write(2, "Error: Cannot open an HTTP serve"..., 75Error: Cannot open an HTTP server: socket.error reported errno.EACCES (13)
) = 75
write(2, "For help, use /usr/bin/superviso"..., 38For help, use /usr/bin/supervisord -h
) = 38
close(4) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fa06ce997e0}, {sa_handler=0x7fa06d1b0750, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fa06ce997e0}, 8) = 0
close(3) = 0
close(11) = 0
exit_group(2) = ?
+++ exited with 2 +++
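So it was a permissions problem all along: the bind() call on the socket file under /var/run/supervisor fails with EACCES, because www lacks permission on that directory. Moving the sock file to a location owned by the www user solved it.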
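Binding a Unix socket creates a file, so the user needs write and search permission on the directory that holds it. As the trace shows, supervisord first binds a temporary name (supervisor.sock.<pid>) in that same directory. Before pointing the `file` setting under [unix_http_server] in supervisord.conf at a new location, a quick check like the one below can save another round of guessing. This is only a rough sketch: can_create_socket is a helper name I made up, and the /home/www/run path in the comments is an assumed example, not the path we actually used.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the current user could create a socket
# file at the given path, i.e. the parent directory exists and is both
# writable and searchable for this user.
can_create_socket() {
    sock_dir=$(dirname "$1")
    [ -d "$sock_dir" ] && [ -w "$sock_dir" ] && [ -x "$sock_dir" ]
}

# Example, run as the www user (the second path is an assumption):
#   can_create_socket /var/run/supervisor/supervisor.sock || echo "not writable"
#   can_create_socket /home/www/run/supervisor.sock && echo "ok"
```

The check only looks at directory permissions. It says nothing about SELinux or other access controls, so strace remains the final word on what the kernel actually refused.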
About the strace command
If strace isn't installed on your server yet, you can install it with the following commands:
# Ubuntu / Debian
sudo apt install strace
# CentOS
yum install -y strace
If a process is already running, you can trace its system calls by PID with -p. Press CTRL + C to stop tracing:
[root@**************** ~]# strace -p 16701
strace: Process 16701 attached
restart_syscall(<... resuming interrupted read ...>) = 0
sendto(8, "*2\r\n$4\r\nLLEN\r\n$52\r\nhorizon:comma"..., 73, MSG_DONTWAIT, NULL, 0) = 73
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
sendto(9, "*2\r\n$4\r\nLLEN\r\n$14\r\nqueues:defaul"..., 35, MSG_DONTWAIT, NULL, 0) = 35
poll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=9, revents=POLLIN}])
recvfrom(9, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
sendto(8, "*3\r\n$4\r\nHGET\r\n$21\r\nhorizon:queue"..., 55, MSG_DONTWAIT, NULL, 0) = 55
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, "$18\r\n1.6690604487864007\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 25
wait4(16711, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(15, [12 14], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
wait4(16712, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(17, [13 16], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
wait4(16713, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(19, [15 18], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
sendto(8, "*14\r\n$5\r\nHMSET\r\n$54\r\nhorizon:sup"..., 500, MSG_DONTWAIT, NULL, 0) = 500
sendto(8, "*4\r\n$4\r\nZADD\r\n$19\r\nhorizon:super"..., 99, MSG_DONTWAIT, NULL, 0) = 99
sendto(8, "*3\r\n$6\r\nEXPIRE\r\n$54\r\nhorizon:sup"..., 85, MSG_DONTWAIT, NULL, 0) = 85
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, "+OK\r\n:0\r\n:1\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 13
sendto(8, "*3\r\n$5\r\nSETNX\r\n$29\r\nhorizon:moni"..., 58, MSG_DONTWAIT, NULL, 0) = 58
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffeaafa54a0) = 0
sendto(8, "*2\r\n$4\r\nLLEN\r\n$52\r\nhorizon:comma"..., 73, MSG_DONTWAIT, NULL, 0) = 73
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
wait4(16711, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(15, [12 14], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
wait4(16712, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(17, [13 16], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
wait4(16713, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0
select(19, [15 18], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
sendto(8, "*14\r\n$5\r\nHMSET\r\n$54\r\nhorizon:sup"..., 500, MSG_DONTWAIT, NULL, 0) = 500
sendto(8, "*4\r\n$4\r\nZADD\r\n$19\r\nhorizon:super"..., 99, MSG_DONTWAIT, NULL, 0) = 99
sendto(8, "*3\r\n$6\r\nEXPIRE\r\n$54\r\nhorizon:sup"..., 85, MSG_DONTWAIT, NULL, 0) = 85
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, "+OK\r\n:0\r\n:1\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 13
sendto(8, "*3\r\n$5\r\nSETNX\r\n$29\r\nhorizon:moni"..., 58, MSG_DONTWAIT, NULL, 0) = 58
poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}])
recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 16701 detached
 <detached ...>
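With the -c option, strace prints a summary instead: the time spent in each system call, the number of calls, and the number of errors: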
[root@**************** ~]# strace -p 16701 -c
strace: Process 16701 attached
^Cstrace: Process 16701 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.00    0.000216           5        39           sendto
 11.25    0.000045           1        25           recvfrom
  9.25    0.000037           1        25           poll
  8.25    0.000033           5         6           nanosleep
  6.50    0.000026           1        21           select
  5.75    0.000023           1        21           wait4
  3.50    0.000014           1        14           rt_sigprocmask
  1.50    0.000006           0         7           rt_sigaction
  0.00    0.000000           0         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000400                   159           total
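The summary is ordered by time, but sometimes I care more about which calls are made most often. A small filter can re-sort a saved summary by call count. This is only a sketch: calls_by_count is a name I made up, and it assumes the column layout shown above (calls in the fourth column, syscall name in the last).

```shell
#!/bin/sh
# Hypothetical helper: read an `strace -c` summary on stdin and print
# "calls syscall" pairs, busiest first. The header, separator, and total
# rows are skipped because their 1st or 4th field is not a plain number,
# or their last field is "total".
calls_by_count() {
    awk '$1 ~ /^[0-9.]+$/ && $4 ~ /^[0-9]+$/ && $NF != "total" { print $4, $NF }' |
        sort -rn
}

# Example (the file name is an assumption):
#   strace -c -p 16701 -o /tmp/strace-summary.txt    # stop with CTRL + C
#   calls_by_count < /tmp/strace-summary.txt
```

Writing the summary to a file with -o first, rather than piping strace straight into the filter, avoids CTRL + C killing the filter before the summary is printed.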
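The -tt option prefixes each line with a timestamp (microsecond precision), and -T appends the time spent in each system call. Used together, the output looks like this: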
[root@**************** ~]# strace -p 16701 -tt -T
strace: Process 16701 attached
23:05:15.019519 restart_syscall(<... resuming interrupted read ...>) = 0 <0.915045>
23:05:15.934770 sendto(8, "*2\r\n$4\r\nLLEN\r\n$52\r\nhorizon:comma"..., 73, MSG_DONTWAIT, NULL, 0) = 73 <0.000080>
23:05:15.934975 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}]) <0.000022>
23:05:15.935065 recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4 <0.000022>
23:05:15.935215 wait4(16711, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0 <0.000023>
23:05:15.935301 select(15, [12 14], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout) <0.000026>
23:05:15.935390 wait4(16712, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0 <0.000018>
23:05:15.935446 select(17, [13 16], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout) <0.000018>
23:05:15.935503 wait4(16713, 0x7ffeaafa566c, WNOHANG|WSTOPPED, NULL) = 0 <0.000023>
23:05:15.935574 select(19, [15 18], [], [], {tv_sec=0, tv_usec=0}) = 0 (Timeout) <0.000023>
23:05:15.935731 sendto(8, "*14\r\n$5\r\nHMSET\r\n$54\r\nhorizon:sup"..., 500, MSG_DONTWAIT, NULL, 0) = 500 <0.000059>
23:05:15.935848 sendto(8, "*4\r\n$4\r\nZADD\r\n$19\r\nhorizon:super"..., 99, MSG_DONTWAIT, NULL, 0) = 99 <0.000046>
23:05:15.935937 sendto(8, "*3\r\n$6\r\nEXPIRE\r\n$54\r\nhorizon:sup"..., 85, MSG_DONTWAIT, NULL, 0) = 85 <0.000041>
23:05:15.936028 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}]) <0.000019>
23:05:15.936088 recvfrom(8, "+OK\r\n:0\r\n:1\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 13 <0.000018>
23:05:15.936278 sendto(8, "*3\r\n$5\r\nSETNX\r\n$29\r\nhorizon:moni"..., 58, MSG_DONTWAIT, NULL, 0) = 58 <0.000057>
23:05:15.936385 poll([{fd=8, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=8, revents=POLLIN}]) <0.000019>
23:05:15.936441 recvfrom(8, ":0\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4 <0.000017>
23:05:15.936505 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 <0.000017>
23:05:15.936564 rt_sigaction(SIGCHLD, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0 <0.000023>
23:05:15.936624 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000017>
23:05:15.936672 nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 16701 detached
 <detached ...>
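In the trace above, nearly every call finishes in tens of microseconds, while restart_syscall took 0.915045 seconds. On a longer trace, picking out such outliers by eye gets tedious, so a filter on the trailing <seconds> field helps. Here is a minimal sketch. slow_calls is a hypothetical helper name and the 0.1 second default threshold is an arbitrary choice of mine.

```shell
#!/bin/sh
# Hypothetical helper: read `strace -T` output on stdin and print only the
# lines whose elapsed time (the trailing <seconds> field) is at least the
# given threshold in seconds (default 0.1).
slow_calls() {
    awk -v min="${1:-0.1}" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the surrounding angle brackets to get the number.
            elapsed = substr($0, RSTART + 1, RLENGTH - 2)
            if (elapsed + 0 >= min + 0) print
        }'
}

# Example (the file name is an assumption):
#   strace -tt -T -p 16701 -o /tmp/strace-timing.txt    # stop with CTRL + C
#   slow_calls 0.5 < /tmp/strace-timing.txt
```

A slow call is not necessarily a problem. Blocking calls such as nanosleep or poll are expected to wait, so treat the result as a list of places to look, not a verdict.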
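Those are pretty much all the options you'll use regularly. If you can't remember them, just check the help (strace -h or man strace) :)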
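One more habit that pairs well with these options: when a program dies with a vague error, as supervisord did here, I filter the trace down to the calls that failed. The sketch below does this with plain grep so it works with any strace version. only_failures is a name I made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and keep only the
# syscalls that returned -1 with an errno name, e.g. "= -1 EACCES (...)".
only_failures() {
    grep -E '= -1 E[A-Z]+'
}

# Example: strace writes to stderr, so merge it into stdout first.
# -f also follows child processes, which helps with daemons that fork.
#   strace -f supervisord -c /etc/supervisord.conf 2>&1 | only_failures
```

Expect some noise. ENOENT results from unlink or from probing for optional files are usually harmless, as in the trace from this incident, where the EACCES on bind was the line that mattered.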