On Tue, 2008-07-01 at 01:00 +0400, Kir Kolyshkin wrote:
> First, I hope you know the difference between memory shortage and memory
> used for buffer/cache. In short: Linux tries to use all unused memory
> for cache -- and if more memory is needed cache is shrunk.
Sure. But in practice it never uses more than a couple hundred megs, even
when more is available (certainly not 6GB). Also, in this case buffer/cache
was driven down to under 100K and the server was using a couple hundred
megs of swap (which is unexpected given that it has 8GB of RAM and uses
only ~2GB under normal circumstances).
Here's what top currently shows:
Mem: 8168892k total, 2161572k used, 6007320k free, 144216k buffers
Swap: 15631236k total, 0k used, 15631236k free, 1125972k cached
which is what I would expect. I don't have exact numbers now, but
"used" was over 7.6GB at the time.
> Second, can you give us more details regarding "Athlon died after
> exhausting memory"? If it was an oops -- we need its text.
No oops, just lots of processes killed by the OOM killer, including sshd.
> Third, are you using x86_64 kernel? What is your exact kernel version?
Yes, x86_64, kernel version 2.6.18-028.049 (Gentoo).
Running 2.6.24 seems to have fixed it so far (but I may have to give
it longer to be sure).
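Going back to the top output above, the arithmetic for separating "used" from buffers/cache can be sketched roughly as follows. This is only an illustration in Python: the helper names (parse_meminfo, used_excluding_cache) are made up for this example, and it assumes the usual /proc/meminfo fields (MemTotal, MemFree, Buffers, Cached), reported in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    The memory fields are reported in kB; lines without a leading
    integer value are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_excluding_cache(used_kb, buffers_kb, cached_kb):
    """Memory in use once reclaimable buffers and page cache are subtracted."""
    return used_kb - buffers_kb - cached_kb


def used_excluding_cache_live(path="/proc/meminfo"):
    """Same calculation against the running system (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    used = info["MemTotal"] - info["MemFree"]
    return used_excluding_cache(used, info["Buffers"], info["Cached"])


# Figures from the top output above, in kB.
print(used_excluding_cache(2161572, 144216, 1125972))
```

With the figures from the top output this comes to 891384k. The case worth worrying about is the opposite one: "used" close to total RAM while buffers and cached are near zero.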
> Cliff Wells wrote:
> > Hi,
> > I've been running 2.6.18 releases on two servers and both of them have
> > suffered a slow but steady memory leak. The first server is a dual
> > Athlon MP with 3GB of RAM, the second a quad Opteron 275 with 8GB of
> > RAM. I first noticed the issue on the Opteron a few days ago when I
> > realized it was using 7.6GB of RAM and had started using swap (it's only
> > got about 2GB allocated to VE's). Today the Athlon died after
> > exhausting memory. Both had been up for around 70 days.
> > When investigating the situation on the Opteron, I stopped all the VE's
> > but no memory was reclaimed. Nothing in "top" showed any significant
> > memory consumption. I wasn't able to investigate on the Athlon system
> > as it was in an unusable state.
> > My immediate solution was to upgrade the Opteron system to 2.6.24 and
> > boot the Athlon system into a stock kernel (I'm not currently using
> > OpenVZ on it).
> > Anyway, my concern is that I've seen no mention of similar issues from
> > anyone else (and in fact, until the Athlon server failed today, I was
> > inclined to believe it was a configuration issue on the Opteron), so I
> > have a fear that if there is a leak, it might still exist in newer
> > kernels.
> > I'm going to babysit these machines to see if the problem reappears, but
> > has anyone else noticed similar patterns of memory consumption and is
> > there anything I can do to track this down?
> > Regards,
> > Cliff
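One low-tech way to watch for a slow leak like the one described above is to sample /proc/meminfo at intervals and see which figure grows. The sketch below is an illustration only, not an OpenVZ tool: the function names, the hourly interval, and the choice of fields are all assumptions. It tracks memory that is neither free, buffers, nor page cache, plus the Slab figure (kernel slab allocations), since the growth reported here did not show up in any process in top.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(info):
    """Reduce a parsed meminfo dict to the figures worth tracking (kB)."""
    used = info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]
    # Slab is treated as optional in case the field is missing.
    return {"used_no_cache": used, "slab": info.get("Slab", 0)}


def growth_per_day(first, last, seconds_apart):
    """Average growth in kB/day of each tracked figure between two snapshots."""
    days = seconds_apart / 86400.0
    return {key: (last[key] - first[key]) / days for key in first}


def sample_forever(path="/proc/meminfo", interval=3600):
    """Print one timestamped snapshot per interval; redirect output to a log."""
    while True:
        with open(path) as f:
            print(int(time.time()), snapshot(parse_meminfo(f.read())), flush=True)
        time.sleep(interval)
```

Comparing two snapshots taken days apart with growth_per_day() gives a rough leak rate. If "slab" accounts for most of the growth, /proc/slabinfo is the next place to look.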