Migration – Week 1
What a week. What…A…Week. The phrase for this week has been that we are ‘In The Trenches’!
We are in full swing with our migration now. 220,000 accounts have been moved over to our new Zimbra setup. There have been a number of challenges and things that have been overlooked, but nothing that has stopped the show or caused any major impact.
Moving to postfix has brought some interesting challenges. We have used qmail-ldap for many, many years and have got the tuning and performance enhancements down to a fine art. We have a 1.3m user qmail-ldap installation, so we know a thing or two about it! The ironic thing is, about a year before I joined TI, I was turned down for a position at a company – Easynet in London – for not knowing enough about qmail!
The challenge for this week will be to complete the migration and then start the tidying up. We will also have to increase our level of monitoring and performance analysis. Another challenge will be to start fine-tuning the postfix servers. Queue management and understanding of the queue structure will be priority number 1!
I have also been impressed with the level of effort and commitment that the whole team has shown this past week. It is amazing how everyone can come together when their backs are against the wall!
So far, so good.
UPDATE – Sat 1:10pm SAST
I have been monitoring the utilization across all 4 of our mailstores and have made an alarming observation about the disk I/O:
Device:         rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvda              0.00    1.00    0.00    1.00     0.00    16.00     16.00      0.00    0.00   0.00   0.00
xvda1             0.00    0.00    0.00    0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
xvda2             0.00    1.00    0.00    1.00     0.00    16.00     16.00      0.00    0.00   0.00   0.00
xvdp              0.00    0.00    8.00    0.50     8.00     0.50      1.00      0.00    0.47   0.47   0.40
xvdb              0.00   11.00  137.00   85.00  4636.00  3920.00     38.54      1.40    6.35   2.98  66.20
The /dev/xvdb device holds the /opt/zimbra/db/data directory and, as you can see, it is being hit hard! Remember also that there are 4 mailstores – all performing like this. This is going to pose a dilemma over the coming days. As this device is a LUN on the SAN and is exposed to the mailstore as a virtual disk, there is no quick way of increasing its I/O capabilities.
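Rather than eyeballing this across four mailstores, the extended iostat output can be checked from a script. The sketch below is purely illustrative (the helpers `parse_iostat_x` and `hot_devices` are mine, not part of our monitoring, and the 50% threshold is an arbitrary assumption); it just assumes the `iostat -x` column layout shown above.

```python
def parse_iostat_x(report: str) -> dict:
    """Parse the device lines of an `iostat -x` report into
    {device: {column: value}}. The first line is the column header."""
    lines = [l for l in report.strip().splitlines() if l.strip()]
    cols = lines[0].split()[1:]  # drop the leading "Device:" token
    stats = {}
    for line in lines[1:]:
        parts = line.split()
        stats[parts[0]] = dict(zip(cols, map(float, parts[1:])))
    return stats

def hot_devices(stats: dict, util_threshold: float = 50.0) -> list:
    """Devices whose %util exceeds the threshold -- I/O trouble candidates."""
    return [dev for dev, s in stats.items() if s["%util"] > util_threshold]
```

Fed the report above, this would single out xvdb (66.20 %util) on each store, which could then be alerted on.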
The service is still 99% POP3, so the disks are being hammered by constantly changing data – lots of random reads and writes. I am thinking that getting users to change their habits and use the web front end instead will actually help bring down the number of required writes/sec. Getting them to do that is going to be the big challenge. Some people just cannot live without their Outlook Express!