Archive for October, 2008

Swap file problems in CentOS (Rocks Cluster)

Thursday, October 16th, 2008

We’ve been experiencing an interesting problem on our cluster nodes that causes them to freeze up.  It appears to be related to the way the Linux kernel in CentOS deals with memory allocation requests.  The freeze happens when the swap partition on a machine fills completely: any attempt to start a new process hangs, waiting for swap space to become available (which it never does).  There are several ways of trying to deal with this.  The first is to rely on the OOM killer, the kernel's out-of-memory handler, which steps in when memory is about to run out and kills a process it decides can be sacrificed for the greater good.  This is a good description of the memory issues and how to test for them.
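Since the freeze only shows up once swap is completely full, one way to get some warning is to watch swap usage on each node. The following is a minimal sketch, not what we actually run: it assumes the standard SwapTotal and SwapFree fields of /proc/meminfo, and the function names and the 90% threshold are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo):
    """Return the fraction of swap in use, or 0.0 if no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    free = meminfo.get("SwapFree", 0)
    if total == 0:
        return 0.0
    return (total - free) / total


def check_swap(text, threshold=0.9):
    """Return (used_fraction, alarm); the 0.9 threshold is an assumed value."""
    used = swap_used_fraction(parse_meminfo(text))
    return used, used >= threshold


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory statistics (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

On a node, calling check_swap(read_meminfo()) from a cron job or a monitoring script would return the current swap usage and whether it has crossed the threshold, which leaves time to act before the machine stops responding.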

Oct 14th to 18th

Thursday, October 16th, 2008

The SSL certificate installation was successful on the mail server, so I will be rolling out certificates from the same authority for the web server and other web applications we host.  We will also be trying an upgrade of the cluster operating system, as well as moving the CS home directories to a new disk array.

  • Cluster Upgrade
  • CS Disk space move
  • Active Directory Testing
  • Windows Software Patches and Updates
  • Continued refresh of installed Windows software (Firefox, possibly MATLAB)