Here is an excerpt from a chat I had with a few colleagues. Names and details may have been changed.
[18:16] Fred> I just got my nose rubbed in an important systems update rule
of thumb
[18:17] tk> rtfm?
[18:17] tk> Or not before a holiday weekend?
[18:17] Fred> yep you hit it with your 2nd guess
[18:18] Fred> fallout not complete yet, but fortunately I'm not the one who
actually made the mistake
[18:18] tk> I avoid upgrading anything, except when I've got time
[18:19] tk> what did you upgrade?
[18:19] Fred> we have about a dozen checkpoint edge devices and push policy out
to them from a central server
[18:19] Fred> on one of them, the policy did not install correctly
[18:19] tk> let me guess, not all in 1 location
[18:20] Fred> oh no, they're scattered all over the county, physically
[18:20] Fred> when this happens instead of using the previous policy they
behave as though no policy is present at all
[18:20] tk> Joy
[18:20] Fred> yep
[18:20] Fred> so, after determining the problem, I pushed the policy out to
that one box, and things started behaving normally
[18:21] tk> nice
[18:21] Fred> but, the device had been down for 4 days over the holidays
[18:21] tk> eek
[18:21] KL> good thing you have a layered defense model....
[18:21] Fred> I know - the PD being (ahem) protected was not happy
[18:22] Fred> they should have a fallback
[18:22] Fred> like, EVDO cards
[18:23] tk> or Dial up :)
[18:23] Fred> well, several things happened to make this last longer than it
should have
[18:23] Fred> one, the on-call pager person did not follow up on the initial
report
[18:24] Fred> this was a new guy, doing it for the first time, so he's
probably going to get off lightly
[18:25] Fred> but, zero, the policy should not have been pushed just before
a major holiday weekend
[18:25] Fred> that's the fundamental rule that was broken
A few "Rules" for Updating systems
- Do not update before a holiday weekend, vacation or business trip, unless you plan on working
- Communicate with remote locations; make sure they are aware of the upgrade
- Plan for stuff to go wrong; it probably will
- If your sites are spread over multiple locations, have a plan to remedy the situation in a timely manner.
- Make sure all on-site techs know that an upgrade is planned and that issues related to the update need to be addressed promptly.
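The first rule is easy to automate as a guard in front of whatever does the push. Here is a minimal sketch in Python, not anything we actually run: the holiday dates, the `push_allowed` name and the two-day "watch" window are all assumptions made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar (assumption); replace with your own dates.
HOLIDAYS = {date(2024, 12, 25), date(2025, 1, 1)}


def is_non_working(day, holidays):
    """True for a weekend day or a listed holiday, when nobody is watching."""
    return day.weekday() >= 5 or day in holidays


def push_allowed(push_day, holidays=HOLIDAYS, watch_days=2):
    """Allow a push only if the push day and the following `watch_days`
    days are all normal working days, so someone is around for the fallout."""
    return not any(
        is_non_working(push_day + timedelta(days=offset), holidays)
        for offset in range(watch_days + 1)
    )


if __name__ == "__main__":
    for day in (date(2024, 12, 16), date(2024, 12, 20), date(2024, 12, 24)):
        verdict = "ok to push" if push_allowed(day) else "hold the push"
        print(day.isoformat(), verdict)
```

A guard like this only covers the calendar. Vacations, business trips and who is actually on call still need a human check.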
Kracomp.com
2 comments:
and: never rely on just one layer of defense. Assume it will fail and have compensating controls in place.
Also test the deployment, as thoroughly as possible.
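One cheap form of testing is a check after every push that each device actually reports the policy you just sent. The sketch below assumes you can already collect a policy version string from each device with your vendor's own tooling (not shown here); the function name, device names and version strings are hypothetical.

```python
def find_unprotected(expected_version, reported):
    """Return the sorted names of devices whose reported policy differs
    from the expected one.

    `reported` maps device name -> policy version string, or None when the
    device reports no policy at all (the failure mode from the chat above).
    """
    return sorted(
        name for name, version in reported.items()
        if version != expected_version
    )


if __name__ == "__main__":
    # Hypothetical data: one device kept an old policy, one has none at all.
    reported = {
        "edge-01": "policy-42",
        "edge-02": "policy-41",
        "edge-03": None,
    }
    for name in find_unprotected("policy-42", reported):
        print("needs attention:", name)
```

Running something like this right after the push, and again on a schedule, would turn a four-day outage into a page within minutes, provided someone follows up on the page.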
(10:33 PM) Bob> I just noticed yesterday that websense mysteriously stopped filtering web traffic after a maintenance window in November.
(10:33 PM) tk> oops
(10:33 PM) Bob> speaking of controls....
(10:33 PM) tk> and testing
(10:33 PM) Bob> mmmhmmm
(10:34 PM) Bob> and lack of staff to ensure that everything is running as it should
(10:34 PM) tk> *GASP* you're not properly staffed
(10:34 PM) Bob> i know.....*shock*