Where I’ve been…

Yesterday was the first full day off I’ve taken since I got back from vacation. I basically told work to stick it and signed off. Laurie and I took advantage of it and went for a little drive: down Highway 1 through Big Sur to San Simeon, and then back home via 101.

A nice but rather gray day (though nothing like the weather San Jose had; we were surprised by the rain and lightning on our return), so we didn’t do any photography. But the drive was gorgeous, and technical enough to keep Laurie (and the BMW) happy. The Highway 1 trip has been on the docket for years. I’m a firm believer that everyone who lives in California needs to do it once, so they don’t feel the need to do it twice… It’s pretty, it’s technical, but once you get past that, it’s mostly driving. Not that I’m complaining. Now’s a good time, too, because of the relative lack of RVs that can make the road a living (and 20 MPH) hell.

It’s been a long, occasionally nasty run. We had a short-notice need to scale capacity (again). The project has just gotten amazingly popular, so we keep having to make it handle more things and more people. So far, we’ve avoided the choke-on-success problem, which is a much more fun problem to have than, say, the one Steve Ballmer has right now.

But it creates challenges — and not ones you can necessarily schedule.

For the last year, my project’s been basically four people: myself; my programmer; Michelle, who’s acted as program and project manager (and first-line help desk, and QA, and doc writer, and trainer, and… cat herder and chuqui-tamer and… pretty much everything else); and Deborah, who’s been our admin, handling email and making sure stuff that needs doing gets done.

Deborah got pregnant (about which we are all thrilled), but she’s now headed out on maternity leave. Her backfill starts soon, but isn’t here yet.

Michelle moved back to be with her family a few months back, and has been running things remotely for us since. We knew it was only a matter of time before she found something local she wanted to do, and I’m thrilled to note she’s moved on and started a great job (for her) at Razorfish. Unfortunately, that leaves a bit of a hole in the organization to fill. We’ve brought in Colleen, who worked with us in the early days of the project, to help out (and even better, she doesn’t need to be trained in chuqui-herding; she’s a natural), but still, I’m taking on some of the stuff Michelle did, as is the rest of the team.

This, of course, all happened while we had a peak volume period: in about a week, we needed to manage a transaction volume that was about 1.2x our record volume for an entire month. The sound you hear in the background is Scotty yelling “Cap’n! The warp drives can’t handle it!” But somehow, we managed. And there was much rejoicing, except among the Klingons.

And to add to the fun, we’re expanding. My job is being split into more or less three pieces, so we’ve brought on Dean (the new liaison to the data center for managing hardware, installs, upgrades, etc., who will also act as a toolsmith for the team in build and test automation, and other ‘stuff’) and Alan (my new development lead, who’s going to take over the development…). Which leaves me, um, the easy stuff: figuring out how to continue to scale the beast (we expect to grow at least another 2.5x next year, front-loaded, and our growth assumptions have always been woefully conservative); dealing with issues like Disaster Recovery, Fault Tolerance, High Availability, Redundancy, and Global Access; acting as lead architect (I get to do the 30,000-foot version, Alan gets to worry about the details); and dealing with the business owners and our various client groups. So we’ve been bringing everyone up to speed, or trying to. Oh, and all this while I’ve had a writer living in my cube with me the last few weeks bringing the project docs up to speed (which, in reality, means writing them from scratch, and asking lots of questions, which is why he’s living in my cube with me…).

All while managing the day-to-day operations (and glitches). And did I mention I was trying to get in a database upgrade? That went in over the weekend: we moved from a single Xserve running MySQL 4.0 to two boxes in a replicated environment using MySQL 4.1. That seems like a simple upgrade (and in theory, it is), but our data set is now around 100 gigs, and simply coordinating the logistics and moving that much data around turned Saturday into a 14-hour day. The good news, though, is that it was a complete success (and ended 4 hours before the year-end freeze in the data center stops everything in its tracks for two weeks…). It ended a multi-week string of days like that, with weekends running 8-10 hours, weekdays more like 14, with a few stretching closer to 18.

Ya know? I’m not 20 any more. I admit it: I hit the wall (and crawled over, and faceplanted on the other side), and somehow kept going (mostly). Ultimately, I found myself so chronically tired that I started getting insomnia and started binge-eating carbs (up four pounds, after 18 months of maintaining weight despite everything). I’ve pushed myself hard before, but never this hard. And it shows, and I feel it. On the other hand, tonight’s the first night in a while that I haven’t felt like a zombie, and I feel up to complex tasks like, oh, paying bills and doing more than sitting and staring at stuff. I’ve almost reset the sleep cycle, and that’s really the key to getting things back to normal.

Was it worth it? Hell, I don’t know. Right now, I don’t think so — but I also know if we hit another crunch time, I’ll dig in and fight to get it done. But the hope is (and I expect it’s correct this year) that the added folks will make that less necessary, both short-term (once everyone’s up to speed), and down the road. My stated goal is, well, to get my weekends back, without anyone else having to sacrifice theirs. We’ll see.

Once Alan and I get the development moving again, we’ll be adding 2-3 contract programmers, plus we’re working to not only replace Michelle, but add a second project admin. And probably a QA person (and a test developer, since we need to automate our testing a lot more). And, well, we’ll see. (And I’ve got 30 more Xserves in the budget for next year. So far.)

But I never want to be that tired again, ever. For now. I guess.