
Ideas if this might be a SATA controller failure?

choucove

New Member
In 2008 I built a new RIP computer for our office in Hays. The platform for the system is two quad-core AMD Opteron Barcelona processors at 2 GHz on a dual-socket ASUS L1N64-SLI WS/B socket 1207 (F) motherboard with 4 GB of DDR2 667 ECC Registered memory. The system also uses a PNY Quadro FX570 and has two 75 GB Raptor hard drives.

Recently, the computer seems to be having some irregular issues. Beginning about six months ago, it would occasionally lose "connection" with the hard drives. The two WD Raptor drives were set up in RAID 0 for the best performance, but once every couple of months, when they came in and powered on the system, it would display an NVStripe error message. It was easy for me to fix, at least, as all I had to do was disconnect and reconnect each drive separately for the computer to recognize it. Then, after connecting both back up and booting, it would work like before.

However, this became more and more of a problem, and as no one at their office is really tech savvy enough to deal with this, it required me to drive nearly two hours each way to fix a problem that only takes about ten to fifteen minutes of actual tech work. So, the last time it happened, I instead disabled the NVStripe and just ran the hard drives as separate IDE drives, which worked fine. Then I got a call that the computer was still having issues, so obviously it isn't just the system not liking RAID 0 arrays; possibly there is an actual failure occurring in the SATA controller onboard the motherboard.

I was able to talk them through how to fix it, but it's obviously becoming more of an issue and everyone has been frustrated with it, so I may try to replace that computer and find a different use for the rest of the platform.

I'm wondering if others out there might have some helpful input, though, on whether this might actually be a SATA controller failure on the motherboard. I'm fairly sure it's not the hard drives themselves because, again, after reconnecting the drives and a couple of reboots it comes right back up, and checkdisk finds no errors. You can't even buy a new one of these ASUS boards, and almost all of the socket 1207 motherboards are now gone, so replacing it is going to cost a ton and there aren't a lot of options out there.
 

Techman

New Member
I have never heard of a SATA failure like this. They either work or not. Check the drivers.
 

SignBurst PCs

New Member
You are getting the message during the NV RAID "POST" screen, right? Before it ever gets a chance to boot into the OS? If that is the case, I wouldn't worry about drivers. I think you are correct to look at hardware.

It is hard to say, but it could be anything from a faulty controller to a crappy connection on one of the SATA cables. There is even the possibility that the PS is not giving the drives (or motherboard) the correct volts/amps consistently. PS issues often give a wide array of symptoms that can be boggling at times, especially if the issues are intermittent.

I am sorry that I am not on the forum as much as I would like anymore. If you want to give me a call, I might be able to give you some other ideas.
 

choucove

New Member
A power supply issue isn't something that I had originally thought of, as I hadn't seen it crop up in this kind of way, but it could be. Out here in Western Kansas we go through so many weird power issues all the time. It has a Thermaltake 750 Watt PSU in the system, and I have heard from others online who own this motherboard and platform that it likes to use a TON of power, but I still estimated that 750 Watts would be more than enough.

I also had not really looked in detail at changing around the SATA ports on the board to see if it's just a certain port causing the problem. It does have 12 onboard SATA ports, so that might be a fix I need to look at next time I am out that way. I also thought it might be just a CMOS battery going bad and causing it to lose the configuration information in BIOS about the boot drive order and all that, but I doubt that is the case. It's a quick test, though, by checking if the date and time in BIOS are still correct, and a quick fix if that is the problem!
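For anyone who wants to run the same date-and-time sanity check without going into BIOS, here is a minimal sketch in Python. It assumes that a clock reading earlier than the year the machine was built (2008 for this system) means the real-time clock lost power and reset. The function name and cutoff are made up for illustration, and the result only means something if the OS has not already corrected the clock from a network time server.

```python
from datetime import datetime

# Assumption: the system was built in 2008, so any reading before then
# suggests the real-time clock lost battery power and fell back to a default.
EARLIEST_PLAUSIBLE = datetime(2008, 1, 1)


def clock_looks_reset(now, earliest=EARLIEST_PLAUSIBLE):
    """Return True if `now` is earlier than the earliest plausible date."""
    return now < earliest


if __name__ == "__main__":
    now = datetime.now()
    if clock_looks_reset(now):
        print(f"Clock reads {now:%Y-%m-%d %H:%M}; the CMOS battery may be failing.")
    else:
        print(f"Clock reads {now:%Y-%m-%d %H:%M}; no sign of a clock reset.")
```

A plausible clock does not rule out the battery entirely, since a weak one can hold the time and still drop settings, so this is just a first pass before swapping the battery.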

In the end, just to try and make sure the system can function for its full intended life cycle, I may do as Jiarby suggested and just go with a separate RAID controller. I do have a spare simple HighPoint RocketRAID PCI-Express X4 4-port card that I can try, though it's definitely not as much of a beast as some of the cards you can get out there!
 

choucove

New Member
Yikes!

So we went in today while the shop was closed for the holiday to see if we could do some work on this RIP computer and find a workaround, or at least figure out what was causing the problem. We ended up with more headaches than we started with! First, we imaged the hard drives to make sure we had good backups. From there we installed a dedicated RAID controller card (HighPoint RocketRAID 4-port PCI-e card), which caused the system to freeze during boot and finally kept the system from even reaching POST! We removed the RAID card and still the system would not boot properly. It seemed almost as if the PCI-Express channels would not respond, as the video card would not power up even though the rest of the computer seemed to come to life.

After we cleared the CMOS and removed the CMOS battery, it would finally POST properly, and we could reconnect the original hard drives and reconfigure all the proper settings in BIOS. The system would then boot up completely (9 times out of 10 at least), but within Windows we began seeing some weird program exceptions when trying to open just about anything, which crashed Windows. We reinstalled some buggy USB drivers, and then suddenly were unable to get Flexi to open properly! It would crash during startup with "an unrecoverable exception". Reinstalling the SafeNet Sentinel drivers didn't fix the issue, so we finally did a repair installation of the entire FlexiSign software, which fixed it, and the program would come up completely. All the custom settings were lost, so I'm going to get an earful for that of course, but it was the only possible solution to allow the program to load.

Now I have a lot of issues to discuss with the crew, though, as it was evident that there were malicious applications and games downloaded onto the computer that should NOT have been there.

In the end, we are just going to replace the computer. The rest of the whole office is getting ready for a full upgrade to new hardware and software, so it was decided to fix the issue with a new platform identical to all the rest. And I can pretty much guarantee that once we completely rebuild this system with a clean installation of Windows we won't have any more problems with it, but it will be relocated to a design station.
 