nagios: Warning: The check of host '****' could not be performed due to a fork() error [message #40380]
Posted by romeor on Mon, 16 August 2010 06:22

 Hello,
 I've installed Nagios with the NagVis plugin in a container, and from time to time it stops responding, although I can still vzctl enter the container.
 I get this message in /var/log/messages:
 
 Warning: The check of host '****' could not be performed due to a fork() error: 'Cannot allocate memory'.
 
 Here is the configuration of this container:
 
 # Primary parameters
 NUMPROC="8000:8000"
 NUMTCPSOCK="9223372036854775807:9223372036854775807"
 NUMOTHERSOCK="9223372036854775807:9223372036854775807"
 VMGUARPAGES="603785:9223372036854775807"
 
 # Secondary parameters
 KMEMSIZE="9223372036854775807:9223372036854775807"
 OOMGUARPAGES="603785:9223372036854775807"
 PRIVVMPAGES="603785:664163"
 TCPSNDBUF="9223372036854775807:9223372036854775807"
 TCPRCVBUF="9223372036854775807:9223372036854775807"
 OTHERSOCKBUF="9223372036854775807:9223372036854775807"
 DGRAMRCVBUF="9223372036854775807:9223372036854775807"
 
 # Auxiliary parameters
 NUMFILE="9223372036854775807:9223372036854775807"
 NUMFLOCK="9223372036854775807:9223372036854775807"
 NUMPTY="512:512"
 NUMSIGINFO="1024:1024"
 DCACHESIZE="9223372036854775807:9223372036854775807"
 LOCKEDPAGES="20126:20126"
 SHMPAGES="9223372036854775807:9223372036854775807"
 NUMIPTENT="9223372036854775807:9223372036854775807"
 PHYSPAGES="0:9223372036854775807"
 
 # Disk quota parameters
 DISKSPACE="10485760:11534336"
 DISKINODES="2000000:2200000"
 QUOTATIME="0"
 QUOTAUGIDLIMIT="0"
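
To see at a glance which of these parameters are real limits and which are effectively "unlimited", the barrier:limit pairs can be parsed with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the sample holds just four of the lines from the configuration above.

```python
# 9223372036854775807 is 2**63 - 1, the value used above to mean "no limit".
UNLIMITED = 2**63 - 1

# A few lines copied from the container configuration above.
SAMPLE = '''\
# Primary parameters
NUMPROC="8000:8000"
VMGUARPAGES="603785:9223372036854775807"
PRIVVMPAGES="603785:664163"
KMEMSIZE="9223372036854775807:9223372036854775807"
QUOTATIME="0"
'''

def parse_ubc_conf(text):
    """Return {NAME: (barrier, limit)} for every NAME="barrier:limit" line."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        value = value.strip('"')
        if ":" not in value:
            continue  # single-valued settings such as QUOTATIME="0"
        barrier, limit = (int(v) for v in value.split(":"))
        params[name] = (barrier, limit)
    return params

def finite_limits(params):
    """Keep only the parameters whose limit is not the 'unlimited' value."""
    return {name: bl for name, bl in params.items() if bl[1] != UNLIMITED}

if __name__ == "__main__":
    for name, (barrier, limit) in finite_limits(parse_ubc_conf(SAMPLE)).items():
        print(name, barrier, limit)
```

With the sample above, only NUMPROC and PRIVVMPAGES come out as finite, which matches where the container can actually hit a wall.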
 
 and the current beancounters (columns: resource, held, maxheld, barrier, limit, failcnt):
 
 sisemon:~# cat /proc/bc/105/resources
 kmemsize                 30949316             42515317  9223372036854775807  9223372036854775807                    0
 lockedpages                     0                    0                20126                20126                    0
 privvmpages                327761              1143370               603785               664163                98961
 shmpages                      671                  687  9223372036854775807  9223372036854775807                    0
 numproc                        41                   92                 8000                 8000                    0
 physpages                  293667               294962                    0  9223372036854775807                    0
 vmguarpages                     0                    0               603785  9223372036854775807                    0
 oomguarpages               293668               294963               603785  9223372036854775807                    0
 numtcpsock                     16                   19  9223372036854775807  9223372036854775807                    0
 numflock                       17                   21  9223372036854775807  9223372036854775807                    0
 numpty                          1                    2                  512                  512                    0
 numsiginfo                      0                   20                 1024                 1024                    0
 tcpsndbuf                  320000              1016320  9223372036854775807  9223372036854775807                    0
 tcprcvbuf                  262144               210688  9223372036854775807  9223372036854775807                    0
 othersockbuf                19968                81920  9223372036854775807  9223372036854775807                    0
 dgramrcvbuf                     0               497920  9223372036854775807  9223372036854775807                    0
 numothersock                   19                   39  9223372036854775807  9223372036854775807                    0
 dcachesize                1405444              1438558  9223372036854775807  9223372036854775807                    0
 numfile                     67141                67298  9223372036854775807  9223372036854775807                    0
 numiptent                      14                   14  9223372036854775807  9223372036854775807                    0
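
A quick way to spot the counters that are actually failing is to parse this output and keep the rows with a non-zero last column. The sketch below assumes the usual column order (held, maxheld, barrier, limit, failcnt) and uses three rows copied from the output above; the helper names are only illustrative.

```python
# Rows copied from the /proc/bc/105/resources output above.
SAMPLE = """\
kmemsize 30949316 42515317 9223372036854775807 9223372036854775807 0
privvmpages 327761 1143370 603785 664163 98961
numproc 41 92 8000 8000 0
"""

# Assumed column order after the resource name.
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")

def parse_resources(text):
    """Return {resource: {field: int}} for each well-formed row."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row is a name followed by five integer columns.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue
        table[parts[0]] = dict(zip(FIELDS, map(int, parts[1:])))
    return table

def failing(table):
    """Names of resources whose failcnt is non-zero."""
    return [name for name, row in table.items() if row["failcnt"] > 0]

if __name__ == "__main__":
    print(failing(parse_resources(SAMPLE)))
```

In the output above, privvmpages is the only resource with a non-zero failcnt, which fits the 'Cannot allocate memory' error from fork().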
 
 My kernel is 2.6.24-11 and the containers are managed by Proxmox.
 The top output is:
 
 
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 533 nagios    20   0 1229m 1.0g 2292 S 112.7 44.7   1060:29 nagios
 1 root      20   0 10360  752  628 S  0.0  0.0   0:10.18 init
 29 root      20   0   100   12    4 S  0.0  0.0   0:28.20 init-logger
 79 root      16  -4 12616  672  352 S  0.0  0.0   0:00.00 udevd
 344 root      20   0  5920  664  532 S  0.0  0.0   0:01.00 syslogd
 366 root      20   0 62636 1212  652 S  0.0  0.1   0:00.00 sshd
 375 root      20   0 21652  916  704 S  0.0  0.0   0:00.00 xinetd
 408 root      20   0 11936 1408 1168 S  0.0  0.1   0:00.00 mysqld_safe
 458 mysql     20   0  233m  25m 4936 S  0.0  1.1   6:19.71 mysqld
 491 root      20   0 62808 2328  804 S  0.0  0.1   0:27.76 sendmail
 499 smmsp     20   0 57704 1772  616 S  0.0  0.1   0:00.02 sendmail
 509 root      20   0  251m  10m 5964 S  0.0  0.4   0:48.22 httpd
 544 root      20   0 20876 1160  580 S  0.0  0.0   0:00.12 crond
 562 xfs       20   0 20264 1244  752 S  0.0  0.1   0:00.08 xfs
 570 root      20   0 46744  828  428 S  0.0  0.0   0:00.00 saslauthd
 571 root      20   0 46744  560  160 S  0.0  0.0   0:00.00 saslauthd
 622 root      20   0 86072 3352 2608 S  0.0  0.1   0:01.70 sshd
 630 root      20   0 12200 1808 1300 S  0.0  0.1   0:00.26 bash
 10143 apache    20   0  323m  15m 3124 S  0.0  0.7   0:27.96 httpd
 16508 apache    20   0  253m 9380 2364 S  0.0  0.4   0:00.00 httpd
 16512 apache    20   0  253m 9380 2364 S  0.0  0.4   0:00.02 httpd
 16513 apache    20   0  253m 9388 2364 S  0.0  0.4   0:00.00 httpd
 16514 apache    20   0  315m 9436 2364 S  0.0  0.4   0:00.02 httpd
 16527 apache    20   0  253m 9408 2364 S  0.0  0.4   0:00.02 httpd
 16534 apache    20   0  251m 4940  628 S  0.0  0.2   0:00.00 httpd
 22034 apache    20   0  324m  17m 3132 S  0.0  0.7   2:18.70 httpd
 25014 root      20   0 12620 1192  920 R  0.0  0.0   0:00.00 top
 31795 apache    20   0  322m  16m 3124 S  0.0  0.7   1:30.98 httpd
 
 Considering this line:
 privvmpages 327761 1143370 603785 664163 98961
 it seems there is a memory leak somewhere. Why does it want to use 4.4 GB of RAM?
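
For reference, the 4.4 GB figure comes from converting the maxheld page count to bytes. The sketch below assumes a 4096-byte page, which is typical for x86 but is not stated anywhere in the post.

```python
PAGE_SIZE = 4096  # bytes per page; an assumption, not taken from the post

def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a page count to GiB."""
    return pages * page_size / 2**30

# Values from the privvmpages row: held, maxheld, barrier, limit.
ROW = {"held": 327761, "maxheld": 1143370, "barrier": 603785, "limit": 664163}

if __name__ == "__main__":
    for field, pages in ROW.items():
        print(f"{field:8s} {pages_to_gib(pages):.2f} GiB")
```

Under that assumption maxheld works out to about 4.36 GiB and the limit to about 2.53 GiB. As far as I understand, privvmpages counts allocated private virtual memory rather than memory actually in use, so it can be well above the RES column in top.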
  
 
 Hmm, it seems I've figured it out: the total memory I guaranteed to the VEs is more than I physically have. How can I reset those failcnt values to zero, so that changes are easier to monitor?
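
As far as I know, failcnt cannot be zeroed by hand while the container is running, so one workaround is to save a baseline snapshot and watch only the differences. The sketch below compares two snapshots of the resources output. The first row is taken from the post; the "later" snapshot is a made-up example, and the function names are hypothetical.

```python
def failcnt_of(text):
    """Map resource name -> failcnt (the last of the six columns)."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 6 and parts[-1].isdigit():
            counts[parts[0]] = int(parts[-1])
    return counts

def failcnt_delta(baseline, current):
    """Return {resource: increase} for counters that changed since baseline."""
    return {
        name: current[name] - baseline.get(name, 0)
        for name in current
        if current[name] != baseline.get(name, 0)
    }

# Baseline rows copied from the output above.
BASELINE = """\
privvmpages 327761 1143370 603785 664163 98961
numproc 41 92 8000 8000 0
"""

# Hypothetical later snapshot, invented for illustration only.
LATER = """\
privvmpages 330000 1143370 603785 664163 98975
numproc 41 92 8000 8000 0
"""

if __name__ == "__main__":
    print(failcnt_delta(failcnt_of(BASELINE), failcnt_of(LATER)))
```

In practice the baseline would be the saved contents of /proc/bc/105/resources at the time monitoring starts, and the later snapshot a fresh read of the same file.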
 
 [Updated on: Mon, 16 August 2010 09:08]