#45415
05/30/2012 2:20 PM
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
It's not been a good week for our server. Lots of NTFS errors saying the disk is corrupt and unusable and to run the chkdsk utility. I run the chkdsk utility and it gets stuck in an endless loop trying to fix one particular error. Spent all day Saturday on the phone with Dell trying to repair, but in the end they say a clean install is what I'm going to have to do. Sigh.
The IT guy who set this up for us got another job in another town, but I emailed him yesterday and he insists this is exactly what the Bare Metal Restore is for. Dell says it's the partition and the BMR won't help. Should I try it anyway? Do I have anything to lose?
The weird part about all this - if not for a chkdsk message that popped up on a restart I wouldn't know there was a problem. The server's running fine, everyone can work off it, the data is there, SBS Console shows no errors, and I'm getting regular backups. I'm not sure what to make of it.
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Jun 2009
Posts: 1,811
Member
|
Good news, bad news first. If there are disk errors, the good news is that you caught it before the disk became BROKEN [non-technical term]. The bad news is that you need new drive(s). Most of us would recommend at least two drives, but that is another subject.
Triage: Get multiple good backups, as well as a copy of the Imported Items directory [as a minimum], onto an external disk. There are probably other files you need, so a full backup would be preferable.
Next up, buy the drives you need, and when they are in hand, pull the old drives out (labeling them) and install the new drives. Then try the BMR - worst case the BMR goes sideways, and you go back to your other drives.
That way you have a path back, even if it is imperfect and won't last forever.
Suggestion: also buy a cabled webcam that you can attach to another machine (laptop) so that you can get remote eyes-on.
You can PM me if you need someone standing by during the 'surgery'.
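For the triage step above, here is a minimal Python sketch of copying a folder to an external disk and then confirming that every file in the copy matches the original. The function names are made up for illustration, the paths in the usage note are placeholders rather than the real locations on this server, and it assumes nothing is writing to the folder while the copy runs. It supplements a full server backup; it does not replace one.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> list[str]:
    """Copy `source` to `destination`, then return the relative paths that differ.

    An empty list means every file in the copy hashes the same as the original.
    `destination` must not exist yet (shutil.copytree refuses to overwrite).
    """
    shutil.copytree(source, destination)
    mismatches = []
    for original in sorted(source.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(source)
        copy = destination / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(original):
            mismatches.append(str(relative))
    return mismatches
```

Usage would look something like `copy_and_verify(Path(r"D:\Imported Items"), Path(r"E:\Backup\Imported Items"))`, with both paths hypothetical. An empty result means the copy checked out; anything listed should be copied again before the old drives are pulled.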
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
Thanks Indy!
We actually have 2 drives. Dell says the problem is with "the logical volume - the C partition" because they ran diagnostics and it didn't show problems with the drives. Think I could pull out one of the drives and try the BMR on the second one only?
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Jun 2009
Posts: 1,811
Member
|
Quote: Thanks Indy! We actually have 2 drives. Dell says the problem is with "the logical volume - the C partition" because they ran diagnostics and it didn't show problems with the drives. Think I could pull out one of the drives and try the BMR on the second one only?
So you have two physical drives? Or do you have two logical drives on one physical drive? The point about two [or more] drives is having hardware redundancy; thus begins the conversation of RAID 1, 5, 10, etc. If you have looked inside the machine and you can see two physical drives, that will be a way to know if the BIOS boot screen isn't familiar to you.
Drives themselves are not very expensive; here are better-quality 300GB drives for $150. At 10K RPM, they are between desktop and enterprise in performance. You could also go with RAIDed SSDs using an appropriate RAID controller, but that increases the complexity. http://www.amazon.com/gp/product/B003FW9T0M/
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
Know that I'm a software girl. I can make Excel turn somersaults, but when you start talking about BIOS I'm lost. So even though I know the following, I don't really know what the following means:
We have a Dell PowerEdge 310. SBS 2011 Standard. Two physical drives, which I believe are of the hot swap variety. RAID 1. Mirror image.
I'm of the mindset that I'm going to have to do a clean install this weekend, so having to do anything less than that will make me a very happy camper. Dell seemed to go to the clean install option quicker than I would have liked, but they obviously know a lot more than I do.
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
Quote: We actually have 2 drives. Dell says the problem is with "the logical volume - the C partition" because they ran diagnostics and it didn't show problems with the drives. Think I could pull out one of the drives and try the BMR on the second one only?
I would rather go buy one than break my RAID array unnecessarily. So to be clear, Dell said the physical disks were fine? And what did ChkDsk say exactly?
BMR can reformat and repartition entire volumes. You'll want a disk that is at least the same size as your current disk. Don't jump the gun on the 300GB VelociRaptor just yet. Dell usually offers 500GB drives minimum with their servers, so you may not be able to restore to a 300GB drive.
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
Quote: Next up, buy the drives you need, and when they are in hand, pull the old drives out (labeling them) and install the new drives. Then try the BMR - worst case the BMR goes sideways, and you go back to your other drives.
This is the smart move. I say plug the new drive into a different port entirely, just to be safe. You can then test your BMR. If successful, you can restore to that.
|
|
|
|
Joined: Jun 2009
Posts: 1,811
Member
|
Quote: Know that I'm a software girl. I can make Excel turn somersaults, but when you start talking about BIOS I'm lost. So even though I know the following, I don't really know what the following means: We have a Dell PowerEdge 310. SBS 2011 Standard. Two physical drives, which I believe are of the hot swap variety. RAID 1. Mirror image. I'm of the mindset that I'm going to have to do a clean install this weekend, so having to do anything less than that will make me a very happy camper. Dell seemed to go to the clean install option quicker than I would have liked, but they obviously know a lot more than I do.
So, it would take 2-day shipping, but I would think hard about ordering two additional Dell hot-swap drives to do the BMR. That gives you spare drives on hand, as well as a road back. The good news is that hot-swap drives are easy to change out, no tools required.
There are also software tools that allow you to 'image' your system, then just write that image to the new drives. That is faster than installing, configuring, and patching, but it requires greater technical skills.
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
They said "the problem is with the C: partition, not the actual disk drives".
They've sent me another diagnostics tool to run, so I'll do that once I can get everyone out of here and see what it says.
First sign of trouble was when I rebooted last week. I got a message saying "One of your disks needs to be checked for consistency" and it suggested I run Chkdsk. In stage 2, the following message appeared over and over again on screen: "Correcting error in index $I30 for file 206436". Had to power off and start again, this time bypassing Chkdsk.
I'm seeing two NTFS errors in the Event logs now: "The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume \Device\HarddiskVolume2"
"The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume C:"
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
Buying two extra drives makes sense. I just wish I didn't have to keep telling our provider she needs to spend more money on more hardware for a server she already spent a lot of money on less than a year ago.
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
I'm wondering if they're going to make you run the full diagnostic. Event ID 55 I'm guessing?
Could be indicative of bad sectors. I'd try to use the drive manufacturer's utility to check for bad sectors.
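To confirm the Event ID guess without scrolling through Event Viewer, something like the following Python sketch could pull the most recent matching entries from the System log. It wraps the built-in Windows `wevtutil` tool; the function names and the default of 10 events are assumptions for illustration, and it has to run on the server itself from an elevated prompt.

```python
from __future__ import annotations

import subprocess


def event_query_command(event_id: int, max_events: int = 10) -> list[str]:
    """Build a wevtutil command that lists recent System-log events with this ID."""
    return [
        "wevtutil", "qe", "System",
        f"/q:*[System[(EventID={event_id})]]",  # XPath filter on the event ID
        f"/c:{max_events}",                     # stop after this many events
        "/rd:true",                             # newest events first
        "/f:text",                              # plain-text output instead of XML
    ]


def recent_events(event_id: int, max_events: int = 10) -> str:
    """Run the query on a Windows machine and return wevtutil's text output."""
    result = subprocess.run(
        event_query_command(event_id, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `print(recent_events(55))` would show when the NTFS corruption events were logged, which helps line them up with reboots or power events.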
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
Anyway, these CHKDSK errors are usually signs that hard drive failure is imminent. I wouldn't want to keep my data on hard drives with bad sectors, so try to replace them ASAP. Have you experienced any power outages or unexpected restarts?
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
This is where Amazon Prime is useful. One-day shipping at $3.99/item is nothing, and 2-day is free. Without Prime, the shipping would probably cost you the price of the Prime membership anyway.
What is the make, model, and capacity of the current drives? E.g. Western Digital RE4 500GB
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
Yes, Event ID 55. I've seen talk that this is an indication that the hard drives will fail. Only Dell seems to be unconcerned.
500GB 7.2K RPM SATA 3.5" Hot Plug Hard Drive
Will purchase tonight.
The part I'm still confused about is whether I'll be able to use the Bare Metal Restore for this, or if that's going to be corrupt too.
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Jun 2009
Posts: 1,811
Member
|
Quote: Buying two extra drives makes sense. I just wish I didn't have to keep telling our provider she needs to spend more money on more hardware for a server she already spent a lot of money on less than a year ago.
If Dell is providing support, then my guess is that the failing drive(s) should be covered under warranty. Having at least one spare drive is just good process for just such an eventuality. Dell should not be charging you to replace them. Sadly, I have recently had two different clients with Dell machines have hardware failures at the 12-14 month point. Both were repaired under warranty, but it is still tremendously disruptive.
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
Quote: They said "the problem is with the C: partition, not the actual disk drives".
I am not sure I see the difference between the two. Do you have S.M.A.R.T. enabled in the BIOS? I am guessing the bare metal restore will be fine. Sandeep, what do you think of all the Steve Gibson (Gibson Research Corporation) stuff? I have actually used SpinRite twice to save drives. They weren't mission critical, though.
Bert Pediatrics Brewer, Maine
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
I hear good things about SpinRite. I would only use it to extend the amount of time I have to complete a recovery. I don't like the idea of "refreshing" bad sectors.
I like to try file-based rescue systems before sector-based ones. Sector-based recovery takes a long time, but if the data is absolutely necessary and there are no backups, you might as well.
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
I don't really understand the difference either. The guy yesterday said it's not the drives, but the controller, even though there are no errors in those logs. I suspect Dell probably wants me to do everything possible before they have to throw their money at it, while I want them to do everything possible so I can avoid having to start from ground zero.
My focus right now, however, is why AC's backup isn't working. When it rains, it pours...
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
I generally can help with AC backup.
Dell wouldn't refuse to do it just to save money. They are very, very good. Usually, if you are under warranty and you call at 5 pm, the part will be at your office by 10 am. Also, look and see if you have something like 5/24 onsite support, and then just call and say, "I want a tech at my office tomorrow." Basically, they have to.
On the other hand, if they don't cooperate (which they will), just go to Best Buy and pick one up. Or pay Dell. You can argue with them later. However, when it is mission critical and you have the onsite protection, tell them you want someone out.
Bert Pediatrics Brewer, Maine
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
I'm confused as to why they would suggest a full reinstall if the controller is bad. That wouldn't change the fact that the controller is bad and it will happen again. I think we're missing something here.
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
If I were in your shoes, here's what I would do:
1) Remove the drives and note the make and model.
2) Go to the manufacturer's site and download the boot-time diagnostic tool (SeaTools for Seagate, etc.).
3) Install the drive in another computer and boot to the diagnostic tool.
4) See if there are bad sectors.
What you learn: whether the drive is good or bad. If the drive is bad, replace it and rebuild the array. If the drive is good, there is likely a problem with your controller and/or controller driver. Make sure you have the latest driver for your RAID controller, or replace the RAID controller.
A bad RAID controller could report that an array's integrity is good when it is not.
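The reasoning in that last step boils down to a small decision rule, sketched here in Python. The function name and drive labels are made up for illustration; the input is simply whether the manufacturer's diagnostic reported bad sectors on each drive.

```python
from __future__ import annotations


def next_steps(bad_sectors_by_drive: dict[str, bool]) -> list[str]:
    """Map diagnostic results to the follow-up suggested in the post above.

    `bad_sectors_by_drive` maps a drive label to True if the diagnostic
    tool found bad sectors on that drive.
    """
    bad = [label for label, has_bad in bad_sectors_by_drive.items() if has_bad]
    if bad:
        # At least one drive failed the test: the drive itself is the suspect.
        return [f"Replace {label} and rebuild the array" for label in bad]
    # Every drive passed: suspicion shifts to the controller or its driver.
    return ["Update the RAID controller driver, or replace the RAID controller"]
```

For a two-drive mirror, `next_steps({"drive 0": False, "drive 1": True})` points at drive 1 only, while two clean results point at the controller.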
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
I think she was asking whether, if she fixed or replaced the hard drive, she would be able to do a bare metal restore.
I thought about putting the drive in another computer.
Bert Pediatrics Brewer, Maine
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
OK, here's what I'm doing:
I've hired a new hardware IT guy. We're closing early and he'll be here at Noon tomorrow. He came highly recommended, I've worked with him already on a couple projects, and I was planning on hiring him anyway. So I'll let him take over and I'll get my checkbook out. It's not my favorite solution, but I'll be the first to admit chasing hardware issues is not my thing. That's become crystal clear these last few days.
Bert - Looks like we don't have onsite support. And AC Support is telling me I need to disable SQL Server in order to do a manual copy of the databases?
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
Sounds like a plan. Can you get a new drive by then?
You probably know this, but when they say disable SQL Server, they are simply asking you to turn off the service. Of course if you do a complete restore, it's not going to matter.
If you want to be really safe, you could have my friend Raja back up all the databases directly through SSMS. Then reinstall if necessary and restore the SQL databases.
You can move the databases by either turning off the service or using Advanced in Amazing Utilities.
Bert Pediatrics Brewer, Maine
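A backup taken through SSMS comes down to a T-SQL `BACKUP DATABASE` statement, which works while the SQL Server service is still running. As a rough sketch only, this Python helper builds such a statement; the helper name, database name, and file path are hypothetical, and the real database names would have to come from AC Support or from SSMS itself.

```python
def backup_statement(database: str, backup_file: str) -> str:
    """Return a T-SQL statement that takes a full backup of one database."""
    db = database.replace("]", "]]")       # escape for a [bracketed] identifier
    path = backup_file.replace("'", "''")  # escape for an N'...' string literal
    return (
        f"BACKUP DATABASE [{db}] TO DISK = N'{path}' "
        # COPY_ONLY leaves the regular backup chain alone, INIT overwrites
        # any existing file, CHECKSUM verifies pages as they are written.
        "WITH COPY_ONLY, INIT, CHECKSUM;"
    )
```

The resulting text can be pasted into an SSMS query window, for example `print(backup_statement("ExampleDb", r"E:\Backups\ExampleDb.bak"))`, with the target pointing at the external disk.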
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
We had to do the re-install. All the big stuff is done, but I'm still working on all the little things (redirected folders, recreating staff shortcuts, deleting Bing off the face of the Earth, etc). It's slow-going, mostly because I'm taking fastidious notes along the way for future reference, since I had little to look at this time around.
On the plus side, unless his bill is out of control, the new hardware IT guy is a keeper. And a HUGE shout out to Nick at AC Support, who made sure we had all the files we needed backed up before the restore, then got back in afterwards to re-install AC and the data. Still knocking on wood, but as far as I can see we didn't lose anything.
Thanks a bunch for all the suggestions. Keep your fingers crossed for me that whatever happened the first time around doesn't decide to repeat itself.
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
Glad to hear it worked out. Nothing like being forced to do a reformat rather than plan it.
Bert Pediatrics Brewer, Maine
|
|
|
|
Joined: Apr 2011
Posts: 2,316 Likes: 2
G Member
|
Was the root issue identified? Was it the controller, the drive, or the file system?
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
It sounds like she just put in another drive and re-installed.
Bert Pediatrics Brewer, Maine
|
|
|
|
Joined: Nov 2009
Posts: 87 Likes: 1
Member
|
OP
Root issue was not identified, which bugs the heck out of me, as I have no guarantee it won't fail again. Bert - I remember your "Backup, Backup, Backup" mantra from ACUC a couple years ago, and I think even you would find it amusing how many backups I have at the moment. Server, extra hard drive, local computers, external hard drives, flash drives - I wasn't about to lose anything without a fight. And I didn't. 
Anne-Marie Family Medicine Whatever Someone Else Isn't Handling Manager
|
|
|
|
Joined: Sep 2003
Posts: 12,884 Likes: 34
Member
|
Excellent. I only wish I could have finished it. 
Bert Pediatrics Brewer, Maine