Yesterday my Intel SSD 320 failed horribly and destroyed all the data on my RAID system. I still do not understand how it happened. I was sure my storage system was very safe.
My system was configured as a mirrored RAID with two 1 TB HDDs. The RAID was created with Intel Rapid Storage Technology (my motherboard is an ASUS P8Z77-V PRO). The failed SSD was used as a RAID cache, configured with Intel Smart Response Technology.
I was wary of using an SSD directly as the main system drive, because I had heard of the "BAD_CTX 13F" error that affects Intel 320 drives. My hope was that if the SSD was used only as a cache, the data would still be safe if the SSD failed. Since this error reportedly occurs only after a power loss, I set up a UPS. But none of these precautions helped.
Yesterday I was browsing the web with Google Chrome when my computer suddenly became unresponsive. At first only Chrome was unusually slow while other open programs worked normally, but within a few minutes the computer froze completely. I was forced to press reset, and on restart Windows automatically entered a non-interruptible "recovery mode". After more than 24 hours the OS reported that "further recovery is impossible", and the RAID became unbootable. The SSD serial number had changed to "BAD_CTX 0000013F", the signature of the famous "8MB bug". Interestingly, in my case the bug was not preceded by any power outage, unless pressing the reset button counts, and I don't think that counts as a power loss.
I took one HDD out of the RAID and connected it to another computer to save critical data, but without success. At first sight the whole file system looked correct, and I even managed to copy all my recent data files, but when I looked inside those files they were a total mess. Each file consisted of arbitrary chunks of unrelated files mixed in random order: a bit of some executable file, several lines of my project source code, followed by a chunk of some unknown XML configuration file, followed by random bytes, and so on.
I still don't understand the cause of such spectacular data corruption. I have three hypotheses:
1. The SSD cache sent incorrect data to the RAID on write (two months ago I switched the SSD cache from "enhanced" to "maximized" mode, in which writes go to the SSD first and only later to the RAID disks).
2. The Intel RAID controller went crazy due to a software error.
3. Windows corrupted the data during the non-interruptible "recovery" phase.
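To make the first hypothesis more concrete, here is a toy Python sketch of the difference between a write-through cache (which is roughly what "enhanced" mode is) and a write-back cache (roughly "maximized" mode). This is only an illustration under my own assumptions, not how Intel Smart Response Technology is actually implemented: the class `CachedVolume`, its methods, and the `simulate` function are all hypothetical names. It only shows why data that is still waiting in a write-back cache never reaches the disks if the cache dies; it does not reproduce the scrambled files I saw.

```python
class CachedVolume:
    """Toy model of a disk volume with an SSD cache in front of it.

    Hypothetical illustration only; not Intel's actual design.
    """

    def __init__(self, write_back):
        self.write_back = write_back
        self.disk = {}      # block number -> data on the mirrored HDDs
        self.cache = {}     # block number -> data held on the SSD cache
        self.dirty = set()  # blocks written to the cache but not yet to disk

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            # Write-back: the disks get the data later, on flush().
            self.dirty.add(block)
        else:
            # Write-through: the disks get the data immediately.
            self.disk[block] = data

    def flush(self):
        """Copy all pending blocks from the cache to the disks."""
        for block in self.dirty:
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    def lose_cache(self):
        """Simulate the SSD dying; return blocks whose latest data is lost."""
        lost = sorted(self.dirty)
        self.cache.clear()
        self.dirty.clear()
        return lost


def simulate(write_back):
    """Write a block, flush, overwrite it, then lose the cache before a flush."""
    vol = CachedVolume(write_back)
    vol.write(0, "old")
    vol.flush()
    vol.write(0, "new")
    lost = vol.lose_cache()
    return vol.disk.get(0), lost


if __name__ == "__main__":
    print("write-through:", simulate(write_back=False))
    print("write-back:   ", simulate(write_back=True))
```

In this model the write-through volume still holds the latest data on the disks after the cache is lost, while the write-back volume is left with the older version of the block. If many blocks, including file system metadata, were in that state at once, the disks alone could no longer be a consistent copy of the volume.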
The moral is that even an Intel SSD with a UPS is not safe, and a mirrored RAID cannot protect data from such errors.