we have strange problem on our web-project.
we use:
2 intel(r) xeon(r) cpu e5520 @ 2.27ghz
12 gb memory
have 20 hits per seconds. 4-5 requests per second heavy – search requests.
use nginx + php-fpm (5.3.22)
mysql server installed on machine.
of time have load average less 10 , cpu usage 50%
cpu usage 95% , after load average grows 50 , more!!!
can see load average , cpu usage here (my reputation low send images here)
load average
cpu usage
we have reload php-fpm ( /etc/init.d/php-fpm reload) normalize situation.
can happens 4-5 times per day.
tried use strace exam situation.
sorry long logs! output of command strace -cp pid
pid – random php-fpm process id (we start 100 php-fpm processes).
2 results in moment high cpu usage.
process 17272 attached - interrupt quit process 17272 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 65.56 0.008817 267 33 munmap 13.38 0.001799 900 2 clone 9.66 0.001299 2 589 read 7.43 0.000999 125 8 mremap 2.84 0.000382 1 559 96 access 0.59 0.000080 40 2 waitpid 0.29 0.000039 0 627 gettimeofday 0.16 0.000022 0 346 write 0.04 0.000006 0 56 getcwd 0.04 0.000005 0 348 poll 0.00 0.000000 0 55 open 0.00 0.000000 0 69 close 0.00 0.000000 0 17 chdir 0.00 0.000000 0 189 time 0.00 0.000000 0 28 lseek 0.00 0.000000 0 2 pipe 0.00 0.000000 0 17 times 0.00 0.000000 0 8 brk 0.00 0.000000 0 8 getrusage 0.00 0.000000 0 18 setitimer 0.00 0.000000 0 8 flock 0.00 0.000000 0 1 nanosleep 0.00 0.000000 0 11 rt_sigaction 0.00 0.000000 0 13 rt_sigprocmask 0.00 0.000000 0 6 pread64 0.00 0.000000 0 7 pwrite64 0.00 0.000000 0 33 mmap2 0.00 0.000000 0 18 4 stat64 0.00 0.000000 0 34 lstat64 0.00 0.000000 0 92 fstat64 0.00 0.000000 0 63 fcntl64 0.00 0.000000 0 53 clock_gettime 0.00 0.000000 0 1 socket 0.00 0.000000 0 1 1 connect 0.00 0.000000 0 9 accept 0.00 0.000000 0 1 send 0.00 0.000000 0 21 recv 0.00 0.000000 0 9 1 shutdown 0.00 0.000000 0 1 getsockopt ------ ----------- ----------- --------- --------- ---------------- 100.00 0.013448 3363 102 total [root@hp-php ~]# strace -cp 30767 process 30767 attached - interrupt quit process 30767 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 52.88 0.016926 220 77 munmap 29.06 0.009301 2 4343 read 8.73 0.002794 466 6 clone 3.59 0.001149 0 5598 time 3.18 0.001017 0 3745 write 1.12 0.000358 0 7316 gettimeofday 0.64 0.000205 1 164 fcntl64 0.39 0.000124 21 6 waitpid 0.22 0.000070 0 1496 326 access 0.13 0.000041 0 3769 poll 0.03 0.000009 0 151 close 0.02 0.000008 0 114 clock_gettime 0.02 0.000007 0 110 getcwd 0.00 0.000000 0 112 open 0.00 0.000000 0 38 chdir 0.00 0.000000 0 47 lseek 0.00 0.000000 0 6 pipe 0.00 0.000000 0 38 times 0.00 0.000000 0 135 brk 0.00 0.000000 0 3 ioctl 0.00 0.000000 0 14 getrusage 0.00 0.000000 0 38 setitimer 0.00 0.000000 0 19 flock 0.00 0.000000 0 40 mlock 0.00 0.000000 0 40 munlock 0.00 0.000000 0 6 nanosleep 0.00 0.000000 0 27 rt_sigaction 0.00 0.000000 0 31 rt_sigprocmask 0.00 0.000000 0 13 pread64 0.00 0.000000 0 18 pwrite64 0.00 0.000000 0 78 mmap2 0.00 0.000000 0 111 10 stat64 0.00 0.000000 0 49 lstat64 0.00 0.000000 0 182 fstat64 0.00 0.000000 0 8 socket 0.00 0.000000 0 8 5 connect 0.00 0.000000 0 19 accept 0.00 0.000000 0 7 send 0.00 0.000000 0 66 recv 0.00 0.000000 0 3 recvfrom 0.00 0.000000 0 20 1 shutdown 0.00 0.000000 0 5 setsockopt 0.00 0.000000 0 4 getsockopt ------ ----------- ----------- --------- --------- ---------------- 100.00 0.032009 28080 342 total
yes, out scripts reads information. normal.
why munmap works long??!! , when have problem munmap always in top!
comparison result of trace random php-fpm process in regular situation:
process 28606 attached - interrupt quit process 28606 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.72 0.001816 1 2601 read 32.88 0.001306 435 3 clone 9.19 0.000365 0 2175 write 6.95 0.000276 0 7521 time 2.24 0.000089 0 4158 gettimeofday 2.01 0.000080 1 114 brk 0.28 0.000011 0 2166 poll 0.20 0.000008 0 833 155 access 0.20 0.000008 0 53 recv 0.18 0.000007 2 3 waitpid 0.15 0.000006 0 18 munlock 0.00 0.000000 0 69 open 0.00 0.000000 0 96 close 0.00 0.000000 0 29 chdir 0.00 0.000000 0 36 lseek 0.00 0.000000 0 3 pipe 0.00 0.000000 0 29 times 0.00 0.000000 0 10 getrusage 0.00 0.000000 0 5 munmap 0.00 0.000000 0 1 ftruncate 0.00 0.000000 0 29 setitimer 0.00 0.000000 0 1 sigreturn 0.00 0.000000 0 11 flock 0.00 0.000000 0 18 mlock 0.00 0.000000 0 5 nanosleep 0.00 0.000000 0 19 rt_sigaction 0.00 0.000000 0 24 rt_sigprocmask 0.00 0.000000 0 6 pread64 0.00 0.000000 0 12 pwrite64 0.00 0.000000 0 69 getcwd 0.00 0.000000 0 5 mmap2 0.00 0.000000 0 35 7 stat64 0.00 0.000000 0 41 lstat64 0.00 0.000000 0 96 fstat64 0.00 0.000000 0 108 fcntl64 0.00 0.000000 0 87 clock_gettime 0.00 0.000000 0 5 socket 0.00 0.000000 0 4 4 connect 0.00 0.000000 0 16 2 accept 0.00 0.000000 0 8 send 0.00 0.000000 0 15 shutdown 0.00 0.000000 0 4 getsockopt ------ ----------- ----------- --------- --------- ---------------- 100.00 0.003972 20541 168 total process 29168 attached - interrupt quit process 29168 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 54.81 0.002366 1 1717 read 26.41 0.001140 1 1696 poll 8.29 0.000358 0 1662 write 7.37 0.000318 2 131 121 stat64 1.53 0.000066 0 3249 gettimeofday 1.18 0.000051 0 746 525 access 0.23 0.000010 0 27 fcntl64 0.19 0.000008 0 62 brk 0.00 0.000000 0 1 restart_syscall 0.00 0.000000 0 7 open 0.00 0.000000 0 16 close 0.00 0.000000 0 3 chdir 0.00 0.000000 0 1039 time 0.00 0.000000 0 1 lseek 0.00 0.000000 0 3 times 0.00 0.000000 0 3 ioctl 0.00 0.000000 0 1 getrusage 0.00 0.000000 0 4 munmap 0.00 0.000000 0 3 setitimer 0.00 0.000000 0 1 sigreturn 0.00 0.000000 0 1 flock 0.00 0.000000 0 1 rt_sigaction 0.00 0.000000 0 1 rt_sigprocmask 0.00 0.000000 0 2 pwrite64 0.00 0.000000 0 3 getcwd 0.00 0.000000 0 4 mmap2 0.00 0.000000 0 7 fstat64 0.00 0.000000 0 9 clock_gettime 0.00 0.000000 0 6 socket 0.00 0.000000 0 5 1 connect 0.00 0.000000 0 3 2 accept 0.00 0.000000 0 5 send 0.00 0.000000 0 64 recv 0.00 0.000000 0 3 recvfrom 0.00 0.000000 0 2 shutdown 0.00 0.000000 0 1 getsockopt ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004317 10489 649 total
and can see munmap not in top.
don’t have ideas how solve problem :(
examined next potential problems , answers "no":
- additioan user activity
- long scripts execution (several seconds)
- using swap
can us?
Comments
Post a Comment