Monday, January 28, 2008

Recovery complete!

Having backups is essential. Reviewing backups is critical.

It is not enough to be able to restore individual files in a timely manner; you need to consider how long it will take to restore 100% of the backed-up data. I'd recommend testing a restore of at least 25% of the data, then multiplying that out to get a decent estimate of how long a full restore will take. Also make sure to back up the configs of every program in some manner, either in writing or in an online backup of some sort. Make it part of your SOP to document and review backups whenever a new program is installed.
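To make the "multiply it out" step concrete, here is a minimal sketch in Python. The function name and parameters are my own, and it assumes restore time scales linearly with the amount of data, which is only a rough approximation (lots of small files or a slow tape seek can throw it off).

```python
def estimate_full_restore_hours(sample_hours, sample_fraction):
    """Scale a timed partial restore up to the full data set.

    sample_hours:    how long the test restore took, in hours
    sample_fraction: share of the data that was restored (0.25 for 25%)
    Assumes restore time grows linearly with data size.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be greater than 0 and at most 1")
    if sample_hours < 0:
        raise ValueError("sample_hours cannot be negative")
    return sample_hours / sample_fraction


# A 25% test restore that took an hour and a half suggests about 6 hours total.
print(estimate_full_restore_hours(1.5, 0.25))  # prints 6.0
```

Treat the result as a floor rather than a promise, and pad it before quoting it to anyone.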

Databases are special files that cannot be backed up by "copying the files" while they are in use. Remember to stop the database service or use a special tool to back them up.
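As one example of a "special tool", here is a rough sketch using SQLite, which I am picking purely for illustration; your database will have its own backup utility. Python's standard sqlite3 module exposes the engine's online backup call, which copies a consistent snapshot even while the database is in use. The function name and file paths are hypothetical.

```python
import sqlite3


def backup_sqlite(source_path, dest_path):
    """Copy a live SQLite database using the engine's online backup API.

    Unlike copying the file by hand, this gives a consistent snapshot
    even if other connections are using the database at the time.
    """
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)  # available in Python 3.7 and later
    finally:
        dst.close()
        src.close()


# Hypothetical usage:
# backup_sqlite("app.db", "app-backup.db")
```

Whatever tool you use, restore the copy somewhere and open it before you trust it.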

Specialized programs, vertical apps, or custom software require special attention: they may have files scattered about the machine, or be a database in disguise.

When deciding which software program(s) to use to back up your data, be sure to consider open files, bare metal recovery, system state, domain state and databases. Open files are files that are in use while the backup is running. Bare metal recovery applies when the machine has had a total meltdown and either needs everything re-installed or must be replaced by a new machine. System state is the information about the system itself; in Windows this includes the registry and other vital information. Domain state is the information about all the users, passwords, machines, printers, etc. in the domain, or "on the network". Databases are stores of data in a special format, and special care must be taken when backing them up: a database exists both in memory and on the hard drive, so if the files on the hard drive are backed up while it is running, critical pieces will probably be missing because some of the data is still in memory.

-- Tim

Saturday, January 12, 2008

SHTF, During a move

No matter how thoroughly you think you prepare for a move, moving equipment is always risky, especially old equipment that has been running for many years. The last long-term shutdown on many of these boxes was during Hurricane Wilma in 2005, iirc. Needless to say, there were some problems. Disaster recovery is in play, so I get to see how good the plan was and where things need to be improved. A 6-8 hour move has already turned into 14+. So far one server is partially up, and the second should be recovered sometime tomorrow (fingers crossed).

-- Tim

Sunday, January 6, 2008

Computer Security is Much Like Paintball

My son and I got a chance to go and play paintball again this weekend. We were playing some scrimmage matches on different fields (Hyperball, Spool, x-ball and the wood cross) and I was trying some new tactics; sometimes they failed and sometimes they succeeded. Different tactics work against different teams, different numbers, or with different markers. Which got me thinking: how often do we change our tactics for protecting our computers and data? The virus, malware and insider threats are constantly changing.

There are a few different types of administrators and security coordinators: the proactive ones, the passive ones, the reactive ones, and the ones in denial. There are also funding levels, which can help or hinder the reactions: fully funded, partially funded, justify the cost, underfunded, and no funding.

Over the next few days we will discuss some of the differences and how they can affect your business.

-- Tim Krabec

Thursday, January 3, 2008

Plan your Updates carefully

Here is an excerpt from a chat I had with a few colleagues. Names and details may have been changed.

[18:16] Fred> I just got my nosed rubbed in an important systems update rule
of thumb
[18:17] tk> rtfm?
[18:17] tk> Or not before a holiday weekend?
[18:17] Fred> yep you hit it with your 2nd guess
[18:18] Fred> fallout not complete yet, but fortunately I'm not the one who
actually made the mistake
[18:18] tk> I avoid upgrading anything, except when I've got time
[18:19] tk> what did you upgrade?
[18:19] Fred> we have about a dozen checkpoint edge devices and push policy out
to them from a central server
[18:19] Fred> on one of them, the policy did not install correctly
[18:19] tk> let me guess, not all in 1 location
[18:20] Fred> oh no, they're scattered all over the county, physically
[18:20] Fred> when this happens instead of using the previous policy they
behave as though no policy is present at all
[18:20] tk> Joy
[18:20] Fred> yep
[18:20] Fred> so, after determining the problem, I pushed the policy out to
that one box, and things started behaving normally
[18:21] tk> nice
[18:21] Fred> but, the device had been down for 4 days over the holidays
[18:21] tk> eek
[18:21] KL> good thing you have a layered defense model....
[18:21] Fred> I know - the PD being (ahem) protected was not happy
[18:22] Fred> they should have a fallback
[18:22] Fred> like, EVDO cards
[18:23] tk> or Dial up :)
[18:23] Fred> well, several things happened to make this last longer than it
should have
[18:23] Fred> one, the on-call pager person did not follow up on the initial
[18:24] Fred> this was a new guy, doing it for the first time, so he's
probably going to get off lightly
[18:25] Fred> but, zero, the policy should not have been pushed just before
a major holiday weekend
[18:25] Fred> that's the fundamental rule that was broken

A few "Rules" for Updating systems

  1. Do not update before a holiday weekend, vacation or business trip, unless you plan on working.
  2. Communicate with remote locations and make sure they are aware of the upgrade.
  3. Plan for stuff to go wrong; it probably will.
  4. If your sites are spread over multiple locations, have a plan to remedy the situation in a timely manner.
  5. Make sure all on-site techs know that an upgrade is planned and that issues related to the update need to be addressed promptly.
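
Rule 1 is easy to turn into a sanity check you run before pushing anything. Below is a minimal sketch in Python; the function name, the three-day lookahead, and the holiday list are all assumptions of mine, so adjust them to your own calendar and on-call coverage.

```python
from datetime import date, timedelta


def is_risky_update_day(day, holidays, lookahead_days=3):
    """Return True if an update pushed on `day` runs into a weekend or holiday.

    day:            the date you plan to push the update
    holidays:       a collection of date objects when nobody is in the office
    lookahead_days: how many days after the push you want staff available
    """
    # Friday (4), Saturday (5) or Sunday (6): the fallout lands on the weekend.
    if day.weekday() >= 4:
        return True
    # Any holiday on the push day or inside the lookahead window is a problem.
    window = {day + timedelta(days=n) for n in range(lookahead_days + 1)}
    return any(h in window for h in holidays)


holidays = {date(2008, 1, 1)}
print(is_risky_update_day(date(2007, 12, 31), holidays))  # prints True
print(is_risky_update_day(date(2008, 1, 8), holidays))    # prints False
```

It will not catch vacations or business trips, so the human check in rule 1 still applies.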
-- Tim Krabec