January 21, 2013

AWS EC2 EBS Partition Resize

I recently noticed I was getting low on space on one of the EBS volumes I mount on my EC2 server. After doing some research (aka Googling) it seemed like a simple procedure: umount, snapshot, create volume, resize, and mount. I even found a concise page that detailed resizing an EBS volume with sample commands. However, when I got to the resize step I ran into an error along the lines of "The filesystem is already XXX blocks long. Nothing to do!". Turns out I hadn't mounted the complete EBS volume as a drive but had instead partitioned it. The snapshot and create volume steps had increased the size of the EBS volume but not my partition. Without a graphical interface the prevailing recommendation to use gparted didn't help. Thankfully I stumbled upon another blog posting about resizing partitions using fdisk without losing data. While that post dealt with a different virtual server service, the same basic principles applied. After carefully deleting and recreating the partition on my EBS volume, keeping the starting sector and partition type the same, resize2fs finally worked and I was off to the races. In retrospect it makes sense that the partition table hacking would work, but it isn't something I would have thought to try.
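For the record, here is a rough sketch of the on-instance half of that procedure. The device name and mount point are assumptions (yours will likely differ), and the fdisk step is interactive, so the comments just describe what to answer:

  umount /mnt/data
  fdisk /dev/xvdf        # delete the data partition, then recreate it with the SAME
                         # starting sector and partition type, letting it span the
                         # full (now larger) volume, and write the table back out
  e2fsck -f /dev/xvdf1   # force a filesystem check before resizing
  resize2fs /dev/xvdf1   # grow the ext filesystem to fill the enlarged partition
  mount /dev/xvdf1 /mnt/data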

Tags: aws ebs ec2 neophi

March 28, 2011

NeoPhi Breathing Again

On Tuesday the 15th, upon arriving at work, I got a monitoring alert saying that my blog's RSS feed file wasn't accessible over the Internet. Given some of the networking issues RCN had been having lately, I figured this was another one of them. However, this time that turned out not to be the case, as my housemate was showing up on GMail chat, which meant our home network was still online. I had him try to reboot the server as it had wedged in the past due to kernel panics. Unfortunately after a few attempts it just wouldn't start up. No video output, no nothing, just a repeating cycle of two short beeps every few seconds.

I was slated to head to a friend's house after work to check out his iRacing setup, but I figured I'd swing by home first to get a better idea of what troubleshooting lay ahead. It seemed though that I was just warming up for a bad day, as a few blocks from my house my back tire went flat. Turned out I had gotten a carpet tack squarely wedged in it. No puncture-resistant tire can quite handle that. To top it off my bicycle pump started freaking out while trying to fill up my tire. I still have no idea what was up with that. In any case I did finally get my flat tire fixed but made no progress troubleshooting my system.

Since I wasn't sure how long my system would be down, I switched the email accounts that were being delivered directly to my server over to my Google Apps hosted domain and headed out for some iRacing fun. Man did I suck at that. I hadn't realized quite how much the seat-of-the-pants feel of driving factors into car control.

After waking up Wednesday morning I noticed one of my typical daily morning emails wasn't there. I realized that with my server off the net since the previous morning, my secondary DNS server was no longer picking up SOA records, which of course meant there was no MX record for the domain, hence no email. I tried to promote my secondary DNS at easyDNS to become primary but became completely frustrated with the UI and inconsistent help. I bailed on them and set up new primary DNS hosting with DNS Made Easy, which was a snap. A couple of hours later my whois record reflected the new DNS servers and mail started flowing again.

After work I bailed on what looked like an awesome BFPUG talk to spend more time troubleshooting. No dice. My motherboard manual indicated that a series of short repeating beeps meant a power issue. The power supply checked out fine and removing the power load by unplugging a few of the RAIDed drives didn't help. In the meantime I decided that even if I could get the server up it was time to raise the priority of virtualizing it. Adding fuel to the fire maybe wasn't the best thing to do but it was something I had wanted to do anyway.

Having used Amazon's EC2 environment extensively at my last job I knew I could get a new server up and running quickly. Combined with RightScale, 30 minutes later I had an EBS volume attached to a generic RightScale server template and started working on getting my data transferred and services restored.

On the data backup side I'd been using my MacBook Pro as an onsite backup for my server and used Time Machine and Jungle Disk to back up that data along with the rest of the files on my laptop. Worst case scenario, if the old server was completely dead and all the data was lost, I'd lose at most 9 days, that being the time since the last backup. While I'd verified that I could restore files back to my Mac, I had never tried using Jungle Disk to get them directly back onto a Linux server... I think everyone can see where this is going :) It should have worked, and the support ticket I opened with Jungle Disk indicated that it was possible, but despite trying for a couple of hours I couldn't get their graphical setup tool to work over X11 port forwarding (the version of Jungle Disk I'd used to make my backup wasn't Linux command line compatible). As a result I started the much longer process of using my home Internet connection to push files up to my new server, instead of pulling them directly onto the new server from S3, which is where my Jungle Disk backup lives.

Thursday after work I made a run to MicroCenter to procure a new motherboard, as everything indicated that was the most likely failed component. Turns out my server is at least 3 generations behind the times and MicroCenter had nothing that I could plug my existing components into. I looked at picking up both a new motherboard and CPU, but given that I was going virtual this felt like a waste of money. I left empty-handed knowing that I was probably going to end up with a few days of lost data.

When I got home I checked that my transfer was still running and was disheartened to see that, based on how much rsync had already transferred, the expected completion time was 7 days. It was then I remembered that when I first ran Jungle Disk I had to leave my laptop on for a week straight for it to finish the first backup. While updating people on the status of the server at game night, as we put a take-out order together, I had the idea that I could use a Windows instance running on EC2 to run Jungle Disk and restore my files, which would then be an internal Amazon-to-Amazon transfer onto my new server. Post games, RightScale's Windows Server Template allowed me to get a server up and running in about 30 minutes. I had a few hiccups and unexplained operational states during that process though. I'm going to attribute most of the oddness to the EBS issue Amazon was experiencing during that time. With Remote Desktop Connection I was able to control my new Windows instance (remember to update your security group to allow the RDP protocol in), install the Jungle Disk software, and start the restore. Estimated at 12 hours, this was much better than transferring from my local machine.
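For anyone trying the same thing, opening up RDP was a one-liner with the EC2 API tools of the day; something along these lines should do it, though the group name and source address here are placeholders for whatever your setup uses:

  ec2-authorize default -P tcp -p 3389 -s 203.0.113.10/32   # allow RDP (3389) in, only from your own IP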

The restore was done by the time I woke up on Friday, at which point I started the transfer to the new server. Just to review the path this data has taken: Linux to Mac to S3 to Windows to Linux. The last bit there pretty much meant any user/group/permission settings that existed on the original Linux box had been lost. Turns out some important information had also been lost on the original Linux to Mac step. By default the Mac filesystem is case preserving but case insensitive, so whereas on Linux two directories called Mail and mail are separate, only one of them wins when landing on the Mac. This unfortunately meant that getting the original server back up was now a requirement, otherwise any data that collided when I made the original backup from Linux to Mac would be lost.

Post work on Friday I grabbed a much needed drink with friends at Green Street Grill, and when I got home worked on getting the correct users and groups set up on my new system so that I could fix the users/groups/permissions on the subset of files that had been restored. I went to bed with plans to trek out to MicroCenter on Saturday to buy a cheap desktop that I could use to pull the data off the old server.

I woke up Saturday morning long before MicroCenter would open, so I decided to take one last stab at trying to fix the problem. After ripping out the RAID card, which I really hoped wasn't the problem since it cost as much as the rest of the computer, the system didn't beep when I turned it on. Looking into the troubleshooting guide for the RAID controller, it turned out the beeping was the "bus reset signal" (I've not bothered to look up exactly what that means). Needless to say, with this new piece of diagnostic information I went back to searching the Internet and ran across a few posts about my motherboard not POSTing being related to bad memory. I unseated two of the DIMMs with no change in behavior. When I switched to having the other two unseated, the system started and the BIOS setup utility kicked in.

Bad memory all along. Which might explain some of the kernel panics I'd seen in the past. I plugged the RAID controller back in and everything still booted. Alas, when I went into the hardware RAID setup only 2 of the 5 drives showed up. Given that I run RAID 6 with a hot spare, those two drives were enough to ensure that I had no data loss. Looking in the case again I noticed that in the course of mucking with the memory and the general tight space of the server I'd knocked loose one drive's power, another drive's SATA cable, and lastly one drive's cable from the controller. Thankfully everything was hot-pluggable and within a minute the RAID controller saw everything.

One very long reboot later (fscking is slow) my old server was up and running, albeit with half its memory gone. Having committed to making the virtualization switch, I started the process of syncing the missing data up to the new server. Thankfully, with the bulk of the data already there, the deltas didn't take long. The rest of Saturday was spent getting MySQL and Apache running again and configured correctly to handle the fact that the root filesystem is ephemeral. For a user machine like this I could technically use an EBS backed root filesystem, but that would mean I can't quickly boot a new instance that has a fresh and clean OS on it. Nor does it really speak to the real goal of virtualization, which is to be able to spin up new cloned instances to handle increased traffic, though I don't expect that to happen anytime soon with my server.
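The gist of the reconfiguration, in case it helps anyone, was pointing everything stateful at the EBS mount instead of the ephemeral root. A minimal sketch, assuming the volume is mounted at /vol (my actual paths and layout may differ):

  # my.cnf: keep the databases on the persistent EBS volume
  datadir = /vol/mysql

  # httpd.conf: serve site content from the persistent EBS volume too
  DocumentRoot /vol/www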

Needless to say, 5 days later, with a few late nights thrown in and having spent the previous weekend at the No Fluff Just Stuff conference, I'm ready for next weekend already.

Tags: life neophi

November 17, 2008

Nude Riding a Bicycle and In Bed with Google

It feels like ages ago that I made a post even though it's only been a few weeks. I've been a little busy and writing a blog entry was fairly low on the list. After last night's activities I feel compelled to fill the gap.

A while ago Allurent moved and in the process I acquired a mannequin. At the time I solicited suggestions but nothing inspired me. Earlier this week Annalisa sent me information about a Halloween Bike Ride that was taking place around Boston. Given that I hadn't come up with any other Halloween plans it sounded like a grand idea. It also served as the inspiration for decorating my bike with, what else but, a mannequin. Given the reactions I got while bringing it home on my back, I figured a more elaborate setup (and one that I could ride with for the night) was in order.

TR gave me the idea of mounting it to my bike rack, which would better support the weight than strapping it to my bike like I did to get it home. The tricky part was figuring out how to get it on the bike so that it made some sense and was still rideable. Thankfully with some help from Clara, 3M Packaging Tape, and a little ingenuity it came together. Hence was born "Nude Riding a Bicycle". The ride itself was a blast with over 240 riders starting it off. Some dropped off as the route meandered around Boston but there was a strong presence to the end. My hat's off to the organizers for a great night and to New England weather for making the night tolerable.

In other news I've gotten in bed with Google more than I thought I ever would. For a long time I've been running this little neck of the Internet known as NeoPhi, including my own web server, mail server, etc. It's given me the freedom to do just about whatever I want. This month I changed some of that. In particular I changed my mobile phone. While for some this may be a common occurrence, it isn't for me. This is only my third mobile phone, ever. My previous two were each with me about 5 years. It's not that I haven't thought about getting a new phone, I just didn't find anything worth switching to.

I seriously considered the iPhone but couldn't justify the price and frankly wasn't that interested in switching from T-Mobile. I looked into some other 3G phones that debuted in European markets, but T-Mobile's 3G service either wasn't compatible (since it used a different spectrum) or didn't exist where I would have been using it. Flash forward to a couple of months ago when the G1 was announced. A form factor that was close to my old phones, a platform I could easily develop for if I wanted to, it worked with my provider of choice, and it had some cool apps. Without ever seeing one in person I pre-ordered it and picked it up the night of Oct 21.

My impressions so far have been very favorable! My biggest complaint is that with only some of the features turned on the battery drains pretty quickly. My last phone would last at least 3 days between charges while with the G1 I'm having to charge it every night. The touch screen is great and having access to a full physical QWERTY keyboard has made working with it a breeze. The web browser has handled various sites I've thrown at it and I've already downloaded and used some other applications from the Android Market.

In order to start using the phone though you must tie it to a Google account. While I do have a GMail account, it isn't my primary email address. I mostly created it to get access to Google features before they let you sign up with any email address. Given the tight integration between the phone and Google I thought it worth considering what life with Google would be like. I previously used a Palm to track my calendar and contacts (those details that my phone couldn't handle like addresses). With my new smart phone it felt like overkill to carry around both devices. The biggest hurdle though was email.

I figured my insistence on using procmail, SpamAssassin, and Pine to read my email, while serving me well for the past 13 years, may have been due for an update. To enact the change I signed up for a Google Apps Premier account, which means that Google and Postini now handle email for my domain. The premier account also lets you import old mail via IMAP, which made migrating 13 years and a couple gigs of email a breeze. Well, not an entire breeze. My local cataloging system didn't directly translate into Google's label system, so after all of my email was in my GMail account I did have to spend a few hours massaging labels to make it work. Now that it's all done though I must say GMail is a reasonable replacement for my old system. More importantly I have complete and easy access to all of it from my new phone.

My contacts and calendar, on the other hand, are an ongoing effort to migrate to Google. In particular, getting the data out of Palm Desktop into a format that could be imported into Google was not straightforward. While there are some applications that can synchronize between the two, I didn't want to shell out the cash for them since I wasn't planning on continuing to use my Palm once the transition was done.

I ended up using Apple's Address Book and iCal to read in vCal and vCard exports from Palm Desktop. I then used a 3rd party utility to export my address book into a Google friendly format, along with iCal's native export format. Like my email, the translation process isn't perfect and I suspect there will be a couple hours of cleanup for each data set before it's Google friendly. In retrospect though it's also prompting me to do some house cleaning of the data, which is something I've been trying to do in general.

In the process of moving my digital life to Google I also moved my computing life to a new MacBook Pro. While trying to do development on PPE, the speed of Java on my old G5 PPC was really starting to annoy me. Conveniently, the rumor mill said that Apple was about ready to release some new Mac laptops. Turns out the rumors were true and on Oct 14th that happened. I'm very pleased with the new solid case design. While at first I lamented the lack of a matte finish for the screen, given when I most often use my laptop it hasn't bothered me as much as I thought it would.

Needless to say, between the new laptop, phone, visiting Elissa, having the house de-leaded, and migrating my digital life to Google, my free time has been minimal. For me busy is happiness, which has made these past few weeks fly by nicely. I'm hoping to get back to PPE as I really want to move that along so it doesn't become an abandoned project of mine. My closing hint is that when copying files from a Mac with rsync be sure to add the -E flag. Thankfully I had backups and you should too. Time Capsule is overpriced but the convenience of wireless peace of mind backups is great.
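To be concrete, the rsync Apple ships uses -E to copy extended attributes and resource forks, so a copy off the Mac looks something like this (the paths are just illustrative):

  rsync -avE ~/Pictures/ user@server.example.com:/backup/pictures/   # -E preserves Mac extended attributes and resource forks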

Tags: android g1 google life mac neophi

November 17, 2008

Tripping on NeoPhi

As most regular readers of this blog know, my housemates and I are having the house de-leaded. This unfortunately means temporarily vacating part of the house while it is worked on. Thankfully, despite the age of Gilman Manor, it had very little lead. One area that did have lead was near where NeoPhi lives. This of course necessitated moving NeoPhi. That went smoothly, NeoPhi and its UPS moved into another room, and it should have only been seen as a short network blip. Alas, the short blip turned into a two hour nightmare for me: once it was in its new home, while trying to get something I forgot I was going to need, I knocked NeoPhi's power cord out of the UPS.

Needless to say it wasn't happy. The fsck during reboot was taking an extremely long time and I thought that it had died on me. When I was about to give up on it, it started beeping horribly. A sound I'd never heard before and hope to never hear again. In a panic I killed the power again and the beeping stopped. I turned it back on and during the RAID firmware startup the beeping started again. After another quick reboot I went into the RAID BIOS. It was in the middle of a rebuild and the alarm was the indication that bad stuff was going on. No problem. While I could have let the system start up and run in a degraded mode while rebuilding, I figured it was best under the circumstances to finish the rebuild in the BIOS.

That looked like it was going to take about an hour to do. Fine. I finished moving stuff around, read my email, and worked on other stuff to pass the hour. Upon returning it was stuck at 90.1% complete on the rebuild. I waited, still at 90.1%. I waited some more. Finally, after 10 minutes, it ticked up to 90.2%. I'm like, this can't be good. I went into the event log and it was slowly filling up with Read Error on Channel 4. Great. Thankfully, since I'm running RAID 6, I wasn't too concerned about lost data. Maybe just some corrupt data due to the power loss.

Looking around the BIOS some more though, I couldn't find anything about my hot spare drive. Instead I found 2 RAID sets, one of which was incomplete, while the other had only 4 disks (one of which was my hot spare). Guess this is why they highly recommend buying the battery backup for your RAID controller. Given how long the read failures for drive 4 were taking, I took an unorthodox step and just yanked its cable from the controller. The event log quickly reported that the device had been detached and in a couple of minutes the last 10% of the rebuild finished.

At this point I was able to boot up in single user mode and ran a manual fsck on all the partitions. I got some really nasty errors from fsck that I've never seen before. A little research said there wasn't much hope of really fixing them so I just said Yes and let fsck do its thing. Once all disks passed fsck cleanly I rebooted again. I deleted the single drive that was part of the incomplete RAID set (which was really one of the primary drives of the original RAID set), added it as a hot spare and rebooted again.

NeoPhi is now back up and running. While I type this, the RAID is doing another rebuild in the background and I have the one drive that was getting read errors sitting next to me, soon to be replaced with a new drive from Newegg. Unfortunately, along the way I again learned how limited RAID controller support is in OpenBSD, as I couldn't even use the alarm silence function of bioctl to shut the damn thing up. I must say though that the Areca support under OpenBSD at least exists, unlike with my previous 3Ware card. Now I can at least get the status of the rebuild even if some of the other features, like seeing the event log, don't seem possible.
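For reference, the parts of bioctl that do work look roughly like this; the device name is whatever your controller or volume shows up as (arc0 is a guess on my part based on the driver name), and the second command is the one that got me nowhere:

  bioctl arc0              # show volume and disk status, including rebuild progress
  bioctl -a silence arc0   # supposed to silence the alarm; this is the bit that didn't work for me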

I'll probably do a reboot when the new drive comes in since I don't think hot plugging an internal drive is the best thing to do, even if I did unplug one that way. The thing that gets me the most about this little incident is that a friend of mine had serious problems with his VPS at Dreamhost. Seems like no matter what you do, eventually hardware or software failures will catch up with you.

Update: Seems the network port and/or cable that I plugged NeoPhi into last night degraded overnight, or possibly something got fried when the house lost power for 20 minutes this morning. In either case, slightly before the construction crew got fully set up, I swapped out the cable and that seems to have made the network happier.

Tags: gilmanmanor neophi

September 22, 2008

If it isn't broke don't fix it

I've wasted too many hours over the last couple of days playing with MovableType 4.2, only to find out that I don't really need it. Granted, there are some nifty new features in it, but it feels much slower. I attempted to speed it up using mod_perl but met great difficulty. I didn't even get around to making sure that all of my plugins were doing the right thing, since I think I saw half of the ones I had on the don't-use-with-MT4 list. Thankfully I did all my upgrade testing in a separate copy of my database and published to a new location so nothing got lost.

If I was starting a brand new blog I'd still strongly consider using MovableType but given all of the customization of templates, plug-ins, and pages I've done to my existing version it was turning into too big of a task without enough gain. Too bad I didn't have that foresight yesterday when I started down this doomed path.

Tags: movabletype neophi

February 25, 2008

Getting My Geek On

I just flipped the switch and NeoPhi is now running on new hardware. This project started last Friday, Jan 25th, when the new hardware came in. I ordered everything individually since I had a specific system in mind. In particular I wanted to increase my data redundancy by switching to RAID 6. The hard part is that since I run OpenBSD not that many RAID cards are supported. Thankfully I found one that was, and built a system up around it.

This was the second time I'd done a whole hog hardware transfer. The last time was about four years ago and I think I'd forgotten what a nuisance such a migration is. Anyway it's done, it seems to be working, and boy am I tired right now.

Tags: neophi

October 24, 2007

Grrrrrr

It seems RCN's outbound mail server died, which means that all of my outbound mail since around 8am this morning has not been delivered... I called them a few hours ago and after being on hold for 25 minutes finally spoke to someone who said "the issue had been resolved". Well, it still looks like no outbound email has gone out based on my own tests. Up to this point, besides some static IP issues I had when I first got going, RCN had been rock solid. Needless to say, if you were expecting an email from me, it probably didn't get sent...

And for those that are curious I don't send email out directly from my box since many of the blacklists by default include all cable IPs. This caused problems with email being bounced in the past.

Tags: email neophi rcn

March 11, 2007

NeoPhi Upgrade 2

Most of this weekend was spent upgrading NeoPhi to OpenBSD 4.0. I figured since I was going to patch my machine for the timezone fix I might as well just patch everything. Just as last time, I did the less recommended upgrade-without-install-media approach. It really isn't that bad. While I did learn a few tricks from the last time, I managed to make some new mistakes...

First I forgot one of the userland packages. I guess at some point I added xbase to the set of userland packages I had installed. The problem only showed up when one of the optional packages failed to upgrade due to a missing dependency. Thankfully I found a post that spoke to the issue, installed both revs of it, and things started working again. Guess I just need to keep better track of what I have installed.

I again had postfix upgrade issues. Thankfully this time I had shut off incoming requests at the firewall so no messages were put into limbo. The problem is that during the upgrade of either the userland packages or optional packages the preferred mail server gets reset. As a result you need to rerun the postfix-enable script.
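In other words, post upgrade something like the following puts Postfix back in charge (the path is where the OpenBSD package's helper script lands on my system; adjust as needed):

  /usr/local/sbin/postfix-enable   # rewrites mailer.conf to point sendmail at postfix
  postfix stop && postfix start    # restart so everything picks up the change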

Mailman didn't upgrade properly and I also forgot to test it post upgrade. As a result this generated some bounced messages. I finally tracked it down to a combination of a configuration and a permission/ownership issue, which means I should be fine for future upgrades. I added an update to my previous upgrade post that includes the error message and other details.

I have to say the new package update scheme rocks. Using both PKG_PATH and PKG_CACHE I was able to simulate what I'd previously done manually. One problem is the pair.com OpenBSD mirror doesn't play well with pkg_add. I had to switch to another mirror to actually get it to work. I thought I was being clever and teeing the output of the pkg_add command. Alas, it uses some funky screen redraws and my output file contained almost no useful information. This is kind of bad, since unless you are paying attention there is some information that scrolls by that you would otherwise miss. Next time I'll have to look at other switches to pkg_add to maybe change its behavior. Probably another reason to do a fresh install.
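For anyone curious, the setup I used was roughly the following; the mirror URL and cache directory are placeholders, so substitute your own:

  export PKG_PATH=ftp://mirror.example.org/pub/OpenBSD/4.0/packages/i386/
  export PKG_CACHE=/var/db/pkg-cache   # every package pkg_add downloads also gets copied here
  pkg_add -ui                          # -u updates installed packages, -i prompts when there's ambiguity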

Once everything was updated I applied the most recent patches based on the errata list. For some reason though I didn't think to reboot. As a result, running applications didn't pick up the new timezone information, which was the main reason I went through the entire exercise to begin with...

During the course of the many reboots I also checked in on my RAID disks. Turns out I don't think they have ever been doing the right thing! While OpenBSD does support 3Ware, the drivers don't do everything that a 3Ware setup really needs. There is a funky RAID array rebuilding operation (which my disks are currently in) that has to be initiated by the driver. 3Ware has never supplied the OpenBSD developers with the documentation needed to add this support to the driver. So, should your machine ever experience a power outage, you'll need to reboot into some other OS that 3Ware fully supports in order to actually get your RAID array working again. Ludicrous!

Tags: neophi openbsd

May 31, 2006

Blacklists

I've recently been having issues sending mail to various domains from NeoPhi since it is currently on an RCN static IP. It seems various email blacklist sites have started adding cable modem IPs. RCN technical support was no help in trying to resolve the issues. In the end I decided to use RCN's mail server as my outgoing relay in the hopes that at least that RCN machine isn't blacklisted.
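For a Postfix setup like mine that's a one-line change in main.cf; the relay hostname below is an assumption on my part, so check your ISP's documentation for the actual server:

  relayhost = [smtp.rcn.com]   # in /etc/postfix/main.cf; the brackets skip the MX lookup on the relay host

  postfix reload               # pick up the change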

Tags: blacklist mail neophi rcn

May 31, 2006

You can't read this

It seems RCN has lost my static IP. As a result NeoPhi is off the net, so you can't read this. You can try if you happen to know the IP that it is currently listening on, but that seems to keep changing, so that won't do either of us much good. Hopefully the problem will be fixed in the near future, but I don't have my hopes set very high; the people I talked to on the phone weren't really sure what was going on.

Tags: ip neophi rcn

April 30, 2006

Gallery Upgrades

I decided it was time to upgrade to the latest versions of Gallery. I have to say that Gallery definitely gets high marks in my book for their Gallery 2 upgrade process. After a simple file unpack the entire process is handled through a slick browser based wizard. Painless.

The latest Gallery 1 release also has a smooth upgrade process, but not nearly as slick as Gallery 2. I think it might be time to look at migrating everything over again. Hopefully they have some more of the kinks worked out.

Tags: gallery neophi pictures upgrade

February 28, 2006

NeoPhi Upgrade

I spent most of today upgrading NeoPhi from OpenBSD 3.4 to OpenBSD 3.8. When I had the box in the colocation facility I wasn't too keen on attempting a remote upgrade and based on today's experiences that turned out to be a good idea. The first problem I encountered was that my server doesn't have a cdrom drive in it. It's only a 1U rack unit and the cdrom space is taken up by a second hard drive and RAID-1 controller.

I popped the case, hooked up a cdrom drive, and attempted to boot off of the OpenBSD 3.5 cdrom. The first time the boot hung. All subsequent tries it went into a reboot loop. It would start to boot from the cdrom and then reboot the machine. I also tried the 3.6 cdrom and didn't have any luck. At that point I switched to the less recommended in-place upgrade. After booting into single user mode I removed all of the packages I had previously installed. One of the upgrades I was planning changed gcc versions, which was sure to introduce some incompatibilities. After removing about 60 packages I started the upgrade process.

I was very pleased with the instructions. Even with all of the warnings that they give you, the process is straightforward. Really the only tricky part comes in merging changes to /etc, which isn't that bad if you haven't done much customization. I've tried to follow good practices by keeping my changes in .local files or linked somewhere else entirely. As a result I only had a few changes that I had to manually fix in /etc.

Since I was planning on using the boot images and just downloading install files on the fly, I had to sidetrack for a bit to pull down the files for OpenBSD 3.6, 3.7, and 3.8. I already had 3.5 from when I previously looked at what was involved in upgrading. Thanks to pair.com for having a fast OpenBSD mirror. Each version upgrade required two reboots and a bunch of waiting for files to unpack. Besides that it was pretty smooth (minus the /etc merging) and I think it only took about two and a half hours to do the four upgrades.

At this point the machine was in a state where I could get core services like sshd up and running. Considering that my machine was reporting a system temperature of 60 degrees, I was happy to get out of the basement and back up to my normal computer. That was when I had my first scare. I couldn't connect via ssh. Back downstairs I quickly found out that my problem was that I had uninstalled the tcsh package, so I had no shell. I grabbed and re-added that package. Now that I could really log in remotely, it was back up to the warm part of the house.

Since the versions changed on many of the packages I previously had installed, and I like having a local copy, I spent the next hour just downloading updated packages. Some of the dependencies had changed (yes, I should use the automatic dependency handler, but I'm still leery of those for some reason), which meant downloading additional packages to complete the install of the ones I already had. That ended up taking another couple of hours. At this point I had all of the software I was supposed to and just needed to make sure everything still worked.

Happily most things just did. Jumping all the way from 3.4 to 3.8 did produce these problems:

  • /usr/local/bin/safe_mysqld became /usr/local/bin/mysqld_safe
  • ntpdate handling in /etc/rc.local changed
  • /usr/lib/apache/modules/libphp4.so changed to /usr/local/lib/php/libphp4.so

The next big problem I had was this nasty error message when trying to send email to a mailman-controlled list:

"/usr/local/lib/mailman/mail/mailman post test". Command output: Group mismatch error. Mailman expected the mail wrapper script to be executed as group "_mailman", but the system's mail server executed the mail script as group "nobody". Try tweaking the mail server to run the script as group "_mailman", or re-run configure, providing the command line option `--with-mail-gid=nobody'.

I double checked that I had grabbed the correct packages. In this case it was mailman-2.1.6p1-postfix.tgz. Searching didn't turn up anything of interest. The only thing that I ran across was a comment in /usr/local/share/doc/mailman/README.OpenBSD:

Problem: I use Postfix for my MTA and the mail wrapper programs are logging complaints about the wrong GID. Solution: Install mailman with the following command: % FLAVOR=postfix make install

I pulled down the ports package and can see that in the mailman Makefile, if the flavor is postfix, it sets up "--with-mail-gid=nobody". Since there wasn't a non-postfix mailman-2.1.6p1 package I decided to pull down the 2.1.6p0 package. That installed and ran fine. I now need to look into what changed going from p0 to p1 and also figure out why the gid stuff changed. Maybe I missed a mod to /etc along the way?

Update: It turns out I'm not completely following the standard install, probably because I upgraded instead of doing a fresh install. My mailman aliases were merged in with my regular aliases. Since postfix looks at the owner/group of the aliases file to determine how the wrapper script gets run, this was leading to the problem. I must have missed where this was spelled out in the docs. Anyway, I put my mailman aliases in a different file, followed the setup instructions, and everything is working with the latest mailman package.
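Concretely, the layout that fixed it looks roughly like this; the paths follow the OpenBSD mailman package as best I recall, so double-check them on your install:

  # /etc/postfix/main.cf: keep mailman's aliases in their own file, owned by the _mailman group
  alias_maps = hash:/etc/mail/aliases, hash:/usr/local/lib/mailman/data/aliases

  newaliases                                       # rebuild the system aliases database
  postalias /usr/local/lib/mailman/data/aliases    # rebuild mailman's aliases database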

With mailman now working, I had one last issue. A custom compiled httpd was failing to start. A configtest against it said everything was okay. There were no error messages when trying to start it up or in the logs, yet nothing was running. Finally when I tried to run it in single threaded mode it core dumped. At this point I figured that some of the libraries had probably gotten out of sync enough that it just needed to be recompiled. Since it also needed SSL added to it I figured now was as good a time as any.

After pulling down apache, mod_ssl, and mod_perl I went through the standard combination build: patch apache with mod_ssl, then build it all from mod_perl. No dice. Trying to build it shared gave me unresolved symbols in mod_perl and the resulting httpd also core dumped. I ended up having to do a mod_ssl and apache build and then doing an APXS build of mod_perl. Weird stuff. One thing I did run across was the need to set SSL_BASE when building. In a bash shell something like "SSL_BASE=/usr ./configure (...)" should do the trick.
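For my own future reference, the build order that finally produced a working httpd was roughly the following; the version numbers and paths are placeholders, so adjust them to whatever you have unpacked:

  cd mod_ssl-2.8.x-1.3.x
  ./configure --with-apache=../apache_1.3.x        # apply the mod_ssl/EAPI patches to the apache source
  cd ../apache_1.3.x
  SSL_BASE=/usr ./configure --enable-module=ssl    # point the apache build at the system OpenSSL
  make && make install
  cd ../mod_perl-1.x
  perl Makefile.PL USE_APXS=1 WITH_APXS=/usr/local/apache/bin/apxs EVERYTHING=1   # APXS (DSO) build of mod_perl
  make && make install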

Needless to say the upgrade took the normal 80/20 rule. What I thought was going to be 80% of the pain ended up only taking 20% of the time while the last 20% took 80% of the time. The end result is that the system is upgraded and everything appears to be working fine. Yay!

Tags: neophi openbsd