At work, some of our Red Hat (RHEL5) servers with 32GB of RAM were configured with only 2GB swap files. For some workloads this might be fine, but a combined Apache and MySQL server is not one of them.
Category Archives: Sysadmin
Adding a swap file on Linux
How not to troubleshoot an unexplained server reboot
We asked our provider to investigate why one of our servers rebooted last night. In the process they accidentally rebooted it again… this is root’s bash_history just before it happened, note line 971:
954 2011-08-17_15:10:39 sar -q
955 2011-08-17_15:10:59 sar -q|less
956 2011-08-17_15:11:09 sar -r|less
957 2011-08-17_15:11:24 last -x|less
958 2011-08-17_15:11:49 history |grep -i shutd
959 2011-08-17_15:11:21 history
960 2011-08-17_15:11:32 date
961 2011-08-17_15:13:52 cd /var/log/
962 2011-08-17_15:13:53 ls
963 2011-08-17_15:13:54 ls -lah
964 2011-08-17_15:13:58 less audit/
965 2011-08-17_15:14:04 less audit/audit.log
966 2011-08-17_15:14:25 less secure
967 2011-08-17_15:15:15 grep -v nagios secure | less
968 2011-08-17_15:16:11 dmesg
969 2011-08-17_15:17:57 sar -r
970 2011-08-17_15:18:19 dmesg
971 2011-08-17_15:18:30 dmesg | reboot
972 2011-08-17_16:20:20 [LOGOUT]: xxxx pts/2 2011-08-17 15:27 (xxx.xxx.xxx.xxx)
Getting the most out of Terminator
Terminator is a must-have tool for Linux administrators. It’s a terminal emulator that supports multiple terminals via tabs, but also by dividing up its window with horizontal and vertical splits.
The user documentation is a bit sparse; in fact, what you see in the man page is what you get. In this post I’ll take you through the features that I think make Terminator the best terminal emulator around.
Recovering a RAID5 mdadm array with two failed devices
I got into an interesting situation with my parents’ home server today (Ubuntu 10.04). Hardware-wise it’s not the best setup – two of the drives are in an external enclosure connected with eSATA cables. I did encourage Dad to buy a proper enclosure, but was unsuccessful. This is a demonstration of why eSATA is a very bad idea for RAID devices.
What happened was that one of the cables had been bumped, disconnecting one of the drives. The array had therefore been running in a degraded state for over a month – not good. I noticed this when logging in one day to fix something else. The device wasn’t visible, so I told Dad to check the cable, but when he went to secure it he must have somehow disconnected another one. This caused a second drive to fail, so the array immediately stopped.
Despite there being no hardware failure, the situation is similar to someone replacing the wrong drive in a RAID array. Recovering it was an interesting experience, so I’ve documented the process here.
Splitting files with dd
We have an ESXi box hosted with Rackspace. It took a bit of pushing to get them to install ESXi in the first place, as they tried to get us to use their cloud offering. But this is a staging environment, and we want something dedicated on hardware we control so we can get an idea of performance without other people’s workloads muddying the water.
Anyway, I’ve been having a bit of fun getting our server template uploaded to it, which is only 11GB compressed – not exactly large, but apparently large enough to be inconvenient.
In my experience the datastore upload tool in the vSphere client frequently fails on large files. In this case I was getting the “Failed to log into NFC server” error, which is probably due to a required port not being open. I didn’t like that tool anyway, so moving on.
The trusty-but-slow scp method was also failing however. Uploads would start but consistently stall at about the 1GB mark. Not sure if it’s a buffer or something getting filled in dropbear (which is designed to be a lightweight ssh server and really shouldn’t need to deal with files this large), but Googling didn’t turn up much.
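Since the uploads stalled at around the 1GB mark, one way around it is to cut the file into pieces smaller than that, copy them individually, and join them again on the other side. The following is only a minimal sketch of that idea, assuming bash and a dd that accepts `bs=1M` (GNU dd does). The function name `split_with_dd`, the `.partNNN` suffix, and the small 5 MiB test file are made up for illustration; a real template would use a much larger chunk size.

```shell
#!/bin/bash
# Sketch: split a file into fixed-size chunks with dd, then rejoin and verify.
# split_with_dd and the .partNNN naming are hypothetical, not from the post.

split_with_dd() {
    local src=$1 chunk_mb=$2
    local size chunk_bytes parts i
    size=$(wc -c < "$src")
    chunk_bytes=$((chunk_mb * 1024 * 1024))
    # Round up so a trailing partial chunk still gets its own file.
    parts=$(( (size + chunk_bytes - 1) / chunk_bytes ))
    for ((i = 0; i < parts; i++)); do
        # skip= is counted in units of bs, so chunk i starts at i * chunk_mb MiB.
        dd if="$src" of="$(printf '%s.part%03d' "$src" "$i")" \
           bs=1M skip=$((i * chunk_mb)) count="$chunk_mb" 2>/dev/null
    done
    echo "$parts"
}

# Demo on a small random file: 5 MiB split into 2 MiB chunks.
tmpdir=$(mktemp -d)
dd if=/dev/urandom of="$tmpdir/template.img" bs=1M count=5 2>/dev/null
parts=$(split_with_dd "$tmpdir/template.img" 2)

# The zero-padded suffixes make the shell glob expand in the right order.
cat "$tmpdir"/template.img.part* > "$tmpdir/rejoined.img"

if cmp -s "$tmpdir/template.img" "$tmpdir/rejoined.img"; then
    echo "split into $parts parts, rejoined copy is identical"
fi
```

On the receiving end only `cat` is needed to reassemble the parts, and comparing a checksum of the original with one of the rejoined file confirms nothing was lost in transit.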
Likewise Open – problems rejoining domain after upgrade
There seems to be a common problem with Likewise Open not upgrading gracefully on Ubuntu, e.g. when upgrading a system from the distribution-supplied Likewise Open 5 in Ubuntu 10.10 to the latest packages from the Likewise website (Likewise 6.0 at the time of writing).
The system in this case was an old Ubuntu 9.10 server using Likewise Open 5. After some patching and an update to the current VMware Tools it started failing to authenticate domain users, so I decided to upgrade to the latest version. However, after the upgrade I was getting an error when trying to join the domain:
Error: ERROR_FILE_NOT_FOUND code 0x00000002
The obvious solution is to remove all Likewise packages and purge the config, but that didn’t seem to work either. What DID work was removing the packages and purging the config, manually removing a few directories that were not empty, purging a few other seemingly related packages which were marked as no longer required after the uninstall, and finally reinstalling.
Bash script to alert when memory gets low
We have a web server that’s running out of memory about once every couple of weeks. The problem is that when it happens the swap file is totally full, the system is unresponsive and it usually needs a hard reboot, so it’s a bit difficult to debug. To avoid digging through log files I don’t understand, I elected to put a script in /etc/cron.hourly which checks the total amount of free memory (physical plus swap). If there is less than 256MB free (this server has 512MB of RAM and 1GB of swap, so at this point the situation is serious), it dumps the process list to /tmp/processes.txt and sends me an email with it attached.
Note that mutt must be installed (‘apt-get install mutt’ on Debian/Ubuntu, or ‘yum install mutt’ on RedHat/CentOS).
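#!/bin/bash
# Total free memory (RAM + swap) in MB, from the "Total" row of free.
free_mb=$(free -mt | awk '/^Total/ {print $4}')
if [ "$free_mb" -lt 256 ]; then
    # Dump the process list so it can be attached to the alert.
    ps -eo %mem,pid,user,args > /tmp/processes.txt
    # "--" ends the attachment list; newer mutt versions require it before the recipient.
    echo "Warning, free memory is ${free_mb}MB" | mutt -s "Server alert" -a /tmp/processes.txt -- email@me.com
fi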
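Then of course make it executable and symlink it into cron.hourly. The link name has no extension because run-parts on Debian/Ubuntu skips file names containing a dot:
chmod +x /etc/scripts/memalert.sh
ln -s /etc/scripts/memalert.sh /etc/cron.hourly/memalert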
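Before relying on the alert, it helps to check the threshold logic without waiting for the server to actually run low. The sketch below is one way to do that, and it is a variation rather than the script above: it reads MemFree and SwapFree from a meminfo-style file, which should give roughly the same figure as the Total row of `free -mt`. The function name `free_total_mb`, its optional path argument, and the sample numbers are all made up for illustration.

```shell
#!/bin/bash
# Sketch: compute free RAM + free swap in MB from a meminfo-style file.
# free_total_mb and its optional path argument are hypothetical names.
free_total_mb() {
    local meminfo=${1:-/proc/meminfo}
    # MemFree and SwapFree are reported in kB; sum them and convert to MB.
    awk '/^(MemFree|SwapFree):/ { kb += $2 } END { print int(kb / 1024) }' "$meminfo"
}

# Exercise the threshold against canned values instead of the live system.
sample=$(mktemp)
cat > "$sample" <<'EOF'
MemTotal:         524288 kB
MemFree:          102400 kB
SwapTotal:       1048576 kB
SwapFree:          51200 kB
EOF

free_mb=$(free_total_mb "$sample")
if [ "$free_mb" -lt 256 ]; then
    echo "would alert: ${free_mb}MB free"
else
    echo "ok: ${free_mb}MB free"
fi
```

With the sample values this prints `would alert: 150MB free`. Called with no argument, the function reads /proc/meminfo and reports the live figure.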
