Hi,
We have several R420 servers with Intel 2.5 GHz hex-core processors with hyper-threading. They are close to 1 year old.
16 / 32 GB RAM
1) Of these, 4 servers run Linux, with no RAID:
primary HDD: 600 GB, 15k RPM
3 data drives, 2 TB each
used as a mail server with medium / high traffic
2) Windows VPS servers:
600 GB x 2 drives, RAID 1 (host OS)
600 GB x 2 drives, RAID 1 (guest OS)
We have two guest VPS per server.
Our issue is that on all these servers, at random intervals (anywhere from once in 15 days to once in a few months), the server becomes sluggish and shows high CPU usage for all processes.
This issue is seen only on the Linux and VPS servers, and it happens even outside office hours when there is no usage on the servers.
The higher the I/O usage of a server, the more frequently the issue occurs.
There is no antivirus on these servers.
The important point I noticed is that even ping to 127.0.0.1 is slow on both the Linux and Windows servers.
On Linux, ping to 127.0.0.1 gives a response time of 0.3 to 0.4 ms, whereas it should normally be around 0.03 ms, i.e. it is taking 10 times longer than normal.
On the VPS servers, ping to 127.0.0.1 shows 1 ms (should be < 1 ms), and ping time from host to guest is 3-4 ms (should be < 1 ms).
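To compare loopback latency between a good period and a bad period on the same machine, a small script along these lines could be run on both Linux and Windows. This is only a rough sketch: it times UDP round trips over 127.0.0.1 instead of ICMP (which would need raw sockets and root/administrator rights), so the absolute numbers will not match ping exactly. The function names loopback_rtt_ms and summarize are made up for this example.

```python
import socket
import statistics
import time


def loopback_rtt_ms(samples=200, payload=b"x" * 56):
    """Return a list of UDP round-trip times over 127.0.0.1, in milliseconds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(2.0)
    client.settimeout(2.0)
    try:
        # Port 0 lets the OS pick any free port on the loopback interface.
        server.bind(("127.0.0.1", 0))
        client.bind(("127.0.0.1", 0))
        server_addr = server.getsockname()
        rtts = []
        for _ in range(samples):
            start = time.perf_counter()
            client.sendto(payload, server_addr)   # request
            data, peer = server.recvfrom(2048)
            server.sendto(data, peer)             # echo back to the client
            client.recvfrom(2048)
            rtts.append((time.perf_counter() - start) * 1000.0)
        return rtts
    finally:
        server.close()
        client.close()


def summarize(rtts):
    """Reduce a list of round-trip times to min / median / max."""
    return {"min": min(rtts), "median": statistics.median(rtts), "max": max(rtts)}


if __name__ == "__main__":
    stats = summarize(loopback_rtt_ms())
    print("loopback UDP RTT ms: min=%.3f median=%.3f max=%.3f"
          % (stats["min"], stats["median"], stats["max"]))
```

Running it once while the server is healthy and once while it is sluggish would show whether the roughly 10x slowdown seen with ping also appears here.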
There is plenty of spare RAM and disk space; only CPU usage is high.
Hard disk read/write performance is normal though, i.e. comparable with other servers.
The network card is a Broadcom 1 Gbps.
We checked using hard disk, RAM and network performance monitors on Windows and collectl on Linux. Other than above-normal CPU usage, everything else is normal.
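Since the only abnormal reading is CPU usage, it may help to see which kind of CPU time is growing (user, system, iowait, irq, softirq). The sketch below is one possible way to do that on the Linux servers by sampling the aggregate "cpu" line of /proc/stat twice. The helper names parse_cpu_line, cpu_breakdown and read_cpu are hypothetical, and the 5 second interval is an arbitrary choice.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("not an aggregate cpu line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Return the percentage of elapsed CPU time spent in each state."""
    deltas = {k: after[k] - before[k] for k in before}
    total = sum(deltas.values())
    if total <= 0:
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * v / total for k, v in deltas.items()}


def read_cpu():
    """Read the current aggregate counters (Linux only)."""
    with open("/proc/stat") as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = read_cpu()
    time.sleep(5)
    second = read_cpu()
    for name, pct in cpu_breakdown(first, second).items():
        print("%-8s %5.1f%%" % (name, pct))
```

A high share in system, irq or softirq during a sluggish period would point more towards the kernel, drivers or interrupt handling than towards the mail processes themselves.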
IMPORTANT NOTE: after a few hours the issue fixes itself automatically, without us doing anything.
If I stop the I/O-intensive operations like SpamAssassin and antivirus scanning, the issue fixes itself faster.
We recently had to move the VPS guests from a host that was facing this problem to a different server, where they worked fine. The host server fixed itself a few hours after all the I/O stopped.
Any help would be appreciated.
Rajesh