Saturday, June 5, 2010

Get out of the sysadmin firefighting business

A while back there was a post on the lopsa-discuss mailing list about time management.  If you read it and the ensuing thread there are a number of really good suggestions about how to more effectively handle your work time so that you are more productive, less harried and start to really gain a sense of situational awareness about your environment.  It's all good stuff and I have used many of the suggestions in that thread with great success.  If you're a system administrator and feel that you need 36 hours in a day, go read that thread and then do at least one of the recommendations.  You'll never look back.

However, there was one particular bit from the original post that really has been hanging out in the back of my mind, bugging me:
I frequently find myself dealing with so many little things throughout the day that by the end of the day I feel like I've been busy but can't really point at what I've done during the day.
So the entire day is running around "fighting fires?"  Time management can't fix that problem, trust me, I've tried.  It can help and it's a great first step, you should do it.  But at some point you need to stop looking for better firefighting techniques to fix problems and start looking at fireproofing things so they don't catch on fire in the first place.  You might think that's a really hard (or even impossible) thing to do and that asbestos underwear is itchy.  Luckily, you'd be wrong on the first part of that thought, and I'd like to talk about some high level, introductory concepts that can help you get started fireproofing quickly.

And, no, I don't really want to talk about your underwear.

The first step for me is always to fix the flare-ups, the small recurring fires.  If you're constantly fighting the same fire, over and over again, then it's time you showed up with something more than a garden hose.  You'll be happy, your users will be happy, your bosses will be happy.  And as a wonderful side effect you'll have more time to manage because you won't be in a reactive mode all of the time fixing things!

In a perfect world you'd see the problem at its very core, tackle it with precision, and resolve the problem once and for all.  Pesky print server?  Replace it!  Unhappy database server?  Upgrade it!

We don't live in a perfect world though, so sometimes the only real fix is to manage the problem such that the pain it causes stays at a bearable level until you can handle the problem correctly.  One method is to isolate the problem so that when it explodes it can't take anything else out.  For example, take that troublesome application that locks up whatever hardware it's on and forces a reboot, and move it to a dedicated host.  That way the reboot only affects the application instead of everything on the server.  Another method is to install some sprinklers that will automatically put out the fire for you.  Got a service that likes to leak memory?  Automate a restart during the lowest usage period so that the leak doesn't cause problems during peak usage times.
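
Here's a rough sketch of what one of those sprinklers might look like: a small script, run from cron in the quiet hours, that restarts a service only when its memory use has crossed a limit.  The service name `leakyd`, the 512 MB limit, the script path, and the `service ... restart` command are all made up for illustration, and `ps -C` assumes the Linux procps version of `ps`.

```shell
#!/bin/sh
# Sketch: restart a leaky service from cron during off-peak hours.
# SERVICE and LIMIT_KB are hypothetical defaults; adjust for your environment.
SERVICE="${SERVICE:-leakyd}"
LIMIT_KB="${LIMIT_KB:-524288}"   # 512 MB of resident memory

# Print the total resident memory (in KB) of all processes with this name.
# Prints 0 when no such process is running.
rss_kb() {
    ps -C "$1" -o rss= 2>/dev/null | awk '{ total += $1 } END { print total + 0 }'
}

# Succeed (exit 0) when the measured value is at or above the limit.
over_limit() {
    [ "$1" -ge "$2" ]
}

# Restart only when needed; DRY_RUN=1 just reports what would happen.
maybe_restart() {
    used=$(rss_kb "$SERVICE")
    if over_limit "$used" "$LIMIT_KB"; then
        echo "restarting $SERVICE (rss ${used}KB >= ${LIMIT_KB}KB)"
        [ "${DRY_RUN:-0}" = "1" ] || service "$SERVICE" restart
    else
        echo "$SERVICE ok (rss ${used}KB)"
    fi
}

maybe_restart

# Example crontab entry (4:15 every morning, path is hypothetical):
#   15 4 * * * /usr/local/sbin/restart-leakyd.sh
```

Run it by hand with `DRY_RUN=1` first to see what it would do before letting cron loose with it.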

That's all fine for technical issues.  If you're constantly putting out fires from end-user questions and tickets there are some other strategies that can help.  Documentation is one method, but self-service documentation portals are only so useful.  Often we forget to update the docs so they're a little bit wrong, users don't follow directions carefully, some just don't want to, etc.  So in addition to documentation, I take a three-pronged approach to handling fires from users:
  • Educate: Try to educate your users when you can so they understand the problem they're having.  If you explain it well enough, they can synthesize the information and use it to help themselves later.  Better yet, if you have desktop support or helpdesk staff, educate them so they can fix the problem on first contact with the user so everyone walks away happy.
  • Automate: Accountants are not sysadmins.  They do not want to follow a 12 step process to reset their passwords.  Automate the things people do frequently that cause problems so they're easy and less error prone.
  • Facilitate: Some people just are not reasonable.  Facilitate their needs by getting it done without argument or hassle so everyone can get on with their life.  Often just doing whatever it is will take less time than arguing about it anyway, so skip the drama and suck it up.
A similar strategy can work for management-initiated fires too, though with a heavier dose of facilitation.

The takeaway here is that if you're fixing the same thing over and over again, you're not really fixing it.  Step back, look at the problem from all sides, examine the pain points and find a way to get the fire under control enough so you get some time and sanity back and so that your users don't feel like they need those pitchforks and torches.  If you can put the fire out once and for all, even better.  If not, you're probably dealing with a big fire which takes a separate type of attention.


A side effect of fixing the flare-ups is that the air is a lot clearer to see the smoke from the real fires.  So my second step in fireproofing is to start looking for that smoke and, if possible, the flames at the source.  In order to see the fire before your users do, start monitoring the performance, capacity, and availability of your environment.

If you don't have monitoring in place already, put some in and start with monitoring something about everything.  Don't spend huge amounts of time or money on monitoring at this point because you'll have no idea what you really need.  Stand up some cheap and easy monitoring solution and start tossing stuff into it and see what's useful.  If something breaks, put in a monitor for it.  Eventually you'll have enough monitoring in place (and experience from it) to make an educated and well formed decision about what you need to do in order to get to a point of comprehensive and useful monitoring.  And be sure to do that evaluation, otherwise....
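
To give a feel for how cheap "cheap and easy" can be, here's a minimal sketch of a disk usage check.  The function names and the 80/90 percent thresholds are my own placeholders, and the OK/WARNING/CRITICAL wording just borrows the vocabulary that Nagios-style checks use; nothing here is tied to a particular monitoring package.

```shell
#!/bin/sh
# Sketch: a tiny disk usage check.  Names and thresholds are illustrative.

# Map a percentage to a status word: OK, WARNING, or CRITICAL.
classify() {
    pct=$1
    warn=$2
    crit=$3
    if [ "$pct" -ge "$crit" ]; then
        echo CRITICAL
    elif [ "$pct" -ge "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Print the used percentage (digits only) for the filesystem holding a path.
# df -P keeps each filesystem on one line so field 5 is the capacity column.
disk_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Check one path and print a one-line status.
# Usage: check_disk PATH [WARN_PCT] [CRIT_PCT]
check_disk() {
    used=$(disk_pct "$1")
    echo "$(classify "$used" "${2:-80}" "${3:-90}") $1 ${used}% used"
}

check_disk /
```

Drop something like this into cron, mail yourself the lines that don't start with OK, and you already know more than you did yesterday.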

If you do have monitoring, fix it.  Seriously.  If you just had to fix a series of flare-ups and suffer from interruptions every minute of the day because something is broken or needs attention, and you weren't able to get it resolved before the users took notice, then something is fundamentally broken with your monitoring.  Evaluate what you monitor, how you monitor it, and what you monitor it with.  Look to see where the breakdown is.  Too many fine grained monitors make even a server reboot look like a calamity?  Add in some dependencies.  Monitoring package doesn't monitor services well?  Add something else that does.  Is it really hard to set up proper monitoring because each machine needs a finicky client installed?  Find something new.
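
The dependency idea is simple enough to sketch in a few lines: don't bother running (or alerting on) a child check when the thing it depends on is already known to be down.  This is only an illustration of the concept, not how any particular monitoring package implements it; the function name and the example host are hypothetical.

```shell
#!/bin/sh
# Sketch: suppress a child check when the thing it depends on is down.
# Both arguments are shell command strings; exit status 0 means healthy.
check_with_dependency() {
    parent_cmd=$1
    child_cmd=$2
    if ! sh -c "$parent_cmd" >/dev/null 2>&1; then
        # One alert for the parent instead of one per dependent service.
        echo "PARENT DOWN, child check skipped"
        return 2
    fi
    if sh -c "$child_cmd" >/dev/null 2>&1; then
        echo "OK"
    else
        echo "CHILD FAILED"
        return 1
    fi
}

# Hypothetical usage: only look at the database port if the host answers ping.
#   check_with_dependency "ping -c 1 db01.example.com" "nc -z db01.example.com 5432"
```

With that in place a rebooting server produces one "parent down" message rather than a page for every service that lives on it.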


People laugh and think I'm joking when I say "Monitoring is a journey, not a destination" but I'm not.  Things change and your monitoring will need to change along with those things.  As a system administrator, it is the single most useful thing you can have in your arsenal.


So that's my simple two step, two minute introduction to getting out of the sysadmin firefighting business.  I don't maintain that these suggestions will put out every fire you may have or come across.  I do think they offer a good place to start.  In future posts I'd like to examine how to deal with the larger fires that arise, tire fires, better fireproofing through design, and what kind of tools are out there to help you fight the fires.

Saturday, May 1, 2010

Backup Applications for the Mac

I love backups.  They let me sleep well at night, they make me feel good in the morning, and that little pit of despair deep in my soul gets a little smaller every time I see my Time Machine icon spin.  So imagine my reaction when my wife forwarded me an email from her campus IT folks that had this to say about Time Machine after she inquired about why it didn't seem to be installed on her Mac:
"Time Machine is not an enterprise product so is basically banned on campus. It works well and is great for home use but the security issues on campus comes from the fact that it backs up without you thinking about it.  If someone sends you a file with SSNs in it, Time Machine backs it up.  If you delete that file and empty your trash, Time Machine still retains a copy of it.  Time Machine retains a lot of things that you intended to be deleted and never want back even."
The multiple layers of wrongness in that statement have confounded me for days.  Even better, the small pit of despair is now growing again.  So I'm spending my Saturday afternoon evaluating other backup applications for my wife's Time Machine-less laptop.

So far the leading contender is ChronoSync to do regular data protection tasks (backups, archives) of her home directory and Carbon Copy Cloner on a semi-regular basis for disaster recovery images.  CCC is awesome and has been forever but this is the first time I've ever tried ChronoSync.  It looks pretty nice.  I'm interested to see how it does after a week or two on my wife's laptop going back and forth to work every day when the storage device it wants to use isn't available.  The one thing I have found is that it can't automatically mount a remote share and use a disk image that's on the share.  It's not a big deal, but it would be nice to have.  Synk doesn't seem to support this either, so maybe it's not an oft requested feature.

Procedural note: At my wife's campus backups aren't banned, the genre of software known as "backup software" or "data protection software" as far as I can find isn't banned, and the IT folks helpfully suggested to my wife that manually copying important data to a USB drive was an appropriate data protection method, so clearly copying data to some other location isn't banned.  As a result I don't think that by setting this up for her at her request I'm causing my wife to violate any work policies surrounding data storage, protection, or retention.  For those of you following along at home: always check the fine print before futzing with a machine that isn't yours.

System Administration isn't doomed, but it's going to be hella different real soon

Let's compare and contrast these two articles:
My personal opinion is that cloud computing isn't going to take over the enterprise any time real soon. Most companies don't want their data floating on someone else's machines. And in terms of context and locality, it doesn't make sense to have a "print server in the cloud" much less a file share (and yes, there will always be file shares).

I do think that the concept and architecture of "cloud computing" is going to change the I.T. industry and the profession of system administration a whole lot though. My particular vision goes something like:
  • Various {system,application,database,network} administration roles will blend more. In a cloud, it's much harder to build walls between hardware, network, applications, and "other stuff" because they all depend on each other. To a degree, this has already begun: witness SOA.
  • Scale will start to bite these fine folks. The tools will get better as a direct result.
  • Day to day "I.T. Guy" tasks will be automated/documented/made easy enough for them to be pushed down to the users, or at least desktop/helpdesk staff. "I.T. Guy" either moves laterally to another career, or refocuses as desktop/helpdesk or upper level administration.

Sunday, April 11, 2010

Selling out, or selling up? Yes, I'm Enabling Ads

When I first started trying to write and post regularly I found it difficult to do so.  Part of my difficulty was trying to publish content that wasn't just a whiny rundown of my regular activities -- since my life is pretty boring, that kind of rundown really isn't very interesting either.  The other difficulty I had was finding the will to regularly post.  Two years later I think I'm starting to get the hang of the writing part and since I need to keep doing it in order to get better I'm looking for some extra motivation.

Love, happiness, and a sense of safety and well-being are all great motivators; but the Blogger "Monetize" tab looked much easier and quicker.  So, I'm going to try serving some ads along with content in order to see if having a positive financial impact will spur me into posting more regularly.  To keep my karma levels even I will apply the first $75 of revenue to a Kiva loan.  Judging by the (lack of) traffic I get that will take a while to happen.  Also, since my traffic levels are so low I'm not even going to worry about planning past that first $75.  My goal is to post and write, so I'm really less concerned about money and more concerned about just producing something worthwhile.  As things progress, I will post updates on the effectiveness of this scheme and link to whatever Kiva loans are made as a result.  Hey, look!  It's already working, I have more stuff to write about!

Since I just enabled ads my request hasn't been fully processed yet.  So in a wonderful twist of irony, this post to my blog about enabling ads on my blog will go live with, you got it, no ads.

Fedora 12: Usage review

With Fedora 13 Alphas being released it seems like as good a time as any to plonk down some of my thoughts on Fedora 12.
  • Gnome Shell preview is really awesome.  I like it, it works, and I'm excited to see what else might be in store.  Hopefully a few keyboard shortcuts for flipping desktops.
  • NetworkManager continues to make the user networking experience good.  At this point it's about as reliable and useful as my Mac.
  • Video/Media codec support still sucks.  This is not Fedora's fault, it sucks on Mac too.  Codecs just plain suck.  Case in point?  I installed mythfrontend during a demo at MilwaukeeLUG and it broke Totem's ability to play an Ogg file.  After a few logouts and some more installs it started to work.  Then broke again.  Folks, this is why Flash video is successful.  HTML5 video may standardize on a few codecs but I'm not terribly sure that will help.  This has been a problem for close to a decade and I'm sometimes shocked it's still an issue.
  • Firefox is still slower than Firefox on my Mac.
  • For the most part, things just work.
A big thanks to the Fedora developers and community for making great stuff.

Monday, April 5, 2010

ldap time to perl time()

This might save 30 seconds for someone else who wrongly thought that DateTime::Format::ISO8601 would parse an LDAP timestamp.

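# Takes an ISO8601ish LDAP timestamp (YYYYMMDDhhmmssZ) and converts it into a
# perl time() compatible value (seconds since the epoch, UTC).  Returns nothing
# if the timestamp doesn't match.  Requires Time::Local.
use Time::Local qw(timegm);

sub ldap2time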
{
    my $ldap_ts = shift;
    return unless $ldap_ts =~ /(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})Z/;
    my ($year, $mon, $day, $hour, $min, $sec) = ($1, $2, $3, $4, $5, $6);
    return timegm($sec, $min, $hour, $day, ($mon-1), $year);
}

Monday, March 8, 2010

Cricket Modem On Linux

I have a Cal-Comp A600 USB 3G modem from Cricket as my backup internet connection.  It's a "flip-flop" USB device that presents a small disk when first plugged in that has all the drivers you need for Windows and Mac.  Once the drivers are installed, they know how to frob the device to make it look like a modem, and not just a USB stick.

Getting this to work in Linux seems really poorly documented, so here's my attempt to save some poor bastard in the future a few minutes of pain:

# Flip this device
usb_modeswitch -v 0x1f28 -p 0x0021 -m 0x08 -M 55534243b82e238c24000000800108df200000000000000000000000000000
# Hold a moment for things to catch up
sleep 5
# Reset the device so it will come back as a modem
usb_modeswitch -v 0x1f28 -p 0x0020 -R 1

After that, NetworkManager should see a CDMA modem and offer to set it up for you (at least on Fedora 11 and 12, and I guess other recent distros with a recent NetworkManager).

Sunday, February 21, 2010

The Great Start Page Debate and How to Abuse Firefox with Tree Style Tab

Every now and again I see an online discussion about what the best start page is.  Some people like a blank page, some folks like iGoogle, some folks like pictures of bunnies and kittens, etc....  I guess I'm odd or something.  I have three separate start pages that I use on a daily basis.

That's right, three start pages.  AT THE SAME FREAKIN' TIME.  Suck it.

Since this topic comes up often enough in various online forums, it seems like I should briefly discuss my setup.  I don't think it's perfect, it most certainly isn't for everyone and it does assume your browser is one of the primary applications you work with all day (AdiumX and iTerm are the other two I use constantly).  But I do think my setup is unique enough to warrant some mention.  So, my three start pages are:
  • One for my personal stuff that points at my personal iGoogle page
  • One for work related services, that points at the start page for our Google Apps
  • One for work related links I use every day (ticketing, wiki, monitoring, etc..)
That's a lot of pages and links.  I probably could condense that down into one local page someplace if I really cared to.  Frankly though, I like having things separated out since it allows me to visually segregate work related tabs versus personal related tabs.  This setup also helps me stay focused on what I need to be doing without pulling in lots of other distractions from unrelated items.  For example, I can't get distracted [1] by the TwitterGadget on my personal iGoogle page when looking at my work start pages.  This also lets me have really focused start pages with minimal information that doesn't require a scroll down to see everything at a glance.

If visually segregating tabs sounds odd to you, then you haven't discovered the Tree Style Tab extension for Firefox.  For someone like me who regularly has 75 browser tabs open on any given day and clicks a few hundred links in a day, it's a life-saver.  Tree Style Tab lets me have a nice vertical list of tabs on the side of the window that are indented with parent-child relationships.  For the context of this post: the links I click on from one start page are created as new tabs under the start page from where they came, so everything stays grouped together instead of becoming a jumbled mess.  When combined with the Vimperator extension, which removes nearly all of the chrome from Firefox and gives me great keyboard shortcuts for nearly everything, I have tons of room on my widescreen monitor for the tabs listing.  Tree Style Tab also allows me to only need one open browser window, which is even more helpful since I use Alt-Tab as my primary method for switching between applications and in Mac OS X you can't target a specific window to switch to, just the most recently focused one for the app.

Have I sold you on Tree Style Tab yet?  No?  Well then perhaps a picture is worth more than the words in this post.  Here's an example screenshot of my browser window right before I started writing:

The top three non-indented tabs on the left are my start pages, the tabs under them are related to the start page (and so on down the tree), and the things at the bottom are tabs I opened directly.

1 - Yes, I am easily distracted by shiny .... oooohhhhh

Unspamming facebook

So, I have this blog imported into my Facebook account. Sadly, long rants about Linux and System Administration tend to seem silly in the light of Facebook, so I'm going to only import posts with the "facebook" tag now to try to not spam folks.