Update 1/11/2019
If you’ve reached this article by Googling how to recover a RAID array, I suggest you don’t use this guide. The Linux RAID wiki has far more accurate, complete, and authoritative information. In retrospect I was very lucky not to destroy the data, but this article remains an interesting account of how I did it.
Update
Before reading this article you should know that it is now quite old and there is a better method: ‘mdadm --assemble --force’ (it may have been there all along). This will try to assemble the array by marking previously failed drives as good. From the man page:
If mdadm cannot find enough working devices to start the array, but can find some devices that are recorded as having failed, then it will mark those devices as working so that the array can be started.
I would however strongly suggest that you first disconnect the drive that failed first. If you need to discover which device failed first, or assemble doesn’t work and you need to manually recreate the array, then read on.
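A rough sketch using the device names from this article (yours will differ): stop the half-assembled array if necessary, then force-assemble it from the freshest members, leaving out sdc1, the drive that failed first:

root@server:~# mdadm --stop /dev/md0
root@server:~# mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1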
I found myself in an interesting situation with my parents’ home server today (Ubuntu 10.04). Hardware-wise it’s not the best setup: two of the drives are in an external enclosure connected with eSATA cables. I did encourage Dad to buy a proper enclosure, but was unsuccessful. This is a demonstration of why eSATA is a very bad idea for RAID devices.
What happened was that one of the cables had been bumped, disconnecting one of the drives, so the array had been running in a degraded state for over a month. Not good. Anyway, I noticed this when logging in one day to fix something else. The device wasn’t visible, so I told Dad to check the cable, but unfortunately when he went to secure the cable he must have somehow disconnected the other one. This caused a second drive to fail, and the array immediately stopped.
Despite there being no hardware failure, the situation is similar to someone replacing the wrong drive in a RAID array. Recovering it was an interesting experience, so I’ve documented the process here.
YOU CAN PERMANENTLY DAMAGE YOUR DATA BY FOLLOWING THIS GUIDE. DO NOT PERFORM THIS OPERATION ON THE ORIGINAL DISKS UNLESS THE DATA IS BACKED UP ELSEWHERE.
Gathering information
The information you’ll need should be contained in the superblocks of the RAID devices. First you need to find out which drive failed first, using the mdadm --examine command. My example is a RAID5 array of 4 devices: sdb1, sdc1, sdd1 and sde1:
root@server:~# mdadm --examine /dev/sdb1
mdadm: metadata format 01.02 unknown, ignored.
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 87fa9a4d:d26c14f1:01f9e43d:ac30fbff (local to host server)
  Creation Time : Mon Oct 11 00:13:02 2010
     Raid Level : raid5
  Used Dev Size : 625128960 (596.17 GiB 640.13 GB)
     Array Size : 1875386880 (1788.51 GiB 1920.40 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Mon Mar 21 00:03:26 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 0
       Checksum : 713f331d - correct
         Events : 3910

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       0        0        3      faulty removed
Look at the last part. Here we can see that this drive is in sync with /dev/sdd1 but out of sync with the other two (sdc1 and sde1) – the data indicates that sdc1 and sde1 have failed. These drives are the two in the external enclosure… but I digress.
Performing an examine on sdc1 shows “active sync” for all four drives; clearly this disk has no idea what’s going on. Also note the update time of February 5 (it is now March!!):
root@server:~# mdadm --examine /dev/sdc1
[...]
    Update Time : Sat Feb 5 11:22:29 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 7105b39b - correct
         Events : 218

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
This indicates that it was the first drive to be disconnected, as the drives were all in sync the last time this drive was part of the array. That leaves sde1:
root@server:~# mdadm --examine /dev/sde1
[...]
    Update Time : Sun Mar 20 23:53:07 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 713f30d1 - correct
         Events : 3904

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
When this drive was last part of the array, sdc1 was faulty but the other two were fine. This indicates that it was the second drive to be disconnected.
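A quick way to line up this information across all the members is to pull just the update time and event count out of each superblock; the member with the lowest event count dropped out first. A sketch (adjust the device list for your array):

root@server:~# for d in /dev/sd[b-e]1; do echo "== $d =="; mdadm --examine $d | grep -E 'Update Time|Events'; done

In my case, sdc1’s February update time and event count of 218 (versus roughly 3900 on the others) mark it clearly as the first drive out.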
Scary stuff
Despite being marked as faulty, /dev/sde1 must be assumed to be crash-consistent with sdb1 and sdd1, as the array stopped immediately upon the failure. The original array won’t start because it only has two active devices, but we can create a new array with 3 of the 4 drives as members and one marked missing.
This sounds scary, and it should. If you have critical data that you’re trying to recover from this situation, I would honestly buy a whole new set of drives, clone the data across to them and work from those. Having said that, the likelihood of permanently erasing the data is low if you’re careful and don’t trigger a rebuild with an incorrectly configured array (like I almost did).
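If you do clone the drives first, GNU ddrescue (mentioned by a commenter below) copes with read errors where plain dd gives up. A sketch only; the target devices sdf1 and sdg1 are hypothetical and must be at least as large as the sources:

root@server:~# ddrescue -f /dev/sdb1 /dev/sdf1 sdb1.map   # -f forces writing to a block device
root@server:~# ddrescue -f /dev/sdd1 /dev/sdg1 sdd1.map   # the .map file lets an interrupted copy resume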
Important information to note is the configuration of the array, in particular the device order, layout and chunk size. If you’re using the defaults (in hindsight probably a good idea, to lessen the chance of something going wrong in situations like this), you don’t need to specify them. However you’ll note that in my example the chunk size is 512K, which differs from the default of 64K.
Update 2012/01/04
When reading the following notes you should note that the default chunk size in more recent versions of mdadm is 512K. In addition, ensure that you are using the same metadata (superblock) version as the original array by specifying it with -e 0.90 or -e 1.2. If you are using the same distribution of mdadm as the array was created with, and didn’t manually specify a different version, you should be safe. However, when dealing with RAID arrays it always pays to double check. The metadata version information should be in the output of mdadm --examine or in mdadm.conf. Thanks to Neil Walfield for the info!
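To pull the critical parameters out of a superblock in one go, something like this works (output shown for my sdb1):

root@server:~# mdadm --examine /dev/sdb1 | grep -E 'Version|Layout|Chunk Size'
        Version : 00.90.00
         Layout : left-symmetric
     Chunk Size : 512K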
Creating a new array with old data
Here is the command I used to recreate the array:
root@server:~# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 /dev/sdd1 /dev/sde1 missing
mdadm: metadata format 01.02 unknown, ignored.
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: size set to 625128960K
Continue creating array? y
mdadm: array /dev/md1 started.
Oops.
Can you see what I did there? I created the array with the missing drive at position [3], when in actual fact the missing drive was at position [1] (device numbering starts at 0). Thus when I tried to mount:
root@server:/# mount -r /dev/md1p1 /mnt -t ext4
mount: wrong fs type, bad option, bad superblock on /dev/md1p1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
!!
Upon realising this I looked at mdstat, then stopped the array:
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde1[2] sdd1[1] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>

root@server:/# mdadm -D /dev/md1
mdadm: metadata format 01.02 unknown, ignored.
/dev/md1:
        Version : 00.90
  Creation Time : Mon Mar 21 02:00:54 2011
     Raid Level : raid5
     Array Size : 1875386880 (1788.51 GiB 1920.40 GB)
  Used Dev Size : 625128960 (596.17 GiB 640.13 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Mar 21 02:00:54 2011
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           UUID : e469103f:2ddf45e9:01f9e43d:ac30fbff (local to host server)
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       0        0        3      removed

root@server:/# mdadm --stop /dev/md1
I then recreated the array with the missing drive in the correct position:
root@server:/# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 missing /dev/sdd1 /dev/sde1
mdadm: metadata format 01.02 unknown, ignored.
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: size set to 625128960K
Continue creating array? y
mdadm: array /dev/md1 started.
And examined the situation:
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde1[3] sdd1[2] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>

root@server:/# fdisk /dev/md1
GNU Fdisk 1.2.4
Copyright (C) 1998 - 2006 Free Software Foundation, Inc.
This program is free software, covered by the GNU General Public License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

Using /dev/md1
Command (m for help): p

Disk /dev/md1: 1920 GB, 1920389022720 bytes
255 heads, 63 sectors/track, 233474 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/md1p1              1      233475  1875387906   83  Linux
Warning: Partition 1 does not end on cylinder boundary.

Command (m for help): q

root@server:/# mount -r /dev/md1p1 /mnt
root@server:/# ls /mnt
Alex  Garth  Hamish  Jenny  lost+found  Public  Simon
root@server:/# umount /mnt
Phew!
So despite creating a bad array, I was still able to stop it and create a new array with the correct configuration. I don’t believe there is any corruption, as no writes occurred and the array didn’t rebuild.
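If you want more reassurance than a successful mount and ls, a forced read-only filesystem check is worth considering. A sketch, assuming ext4 as in my case (-n opens the filesystem read-only and answers no to all questions, -f forces the check even if the filesystem looks clean):

root@server:/# fsck.ext4 -n -f /dev/md1p1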
Adding the first-disconnected drive back in
The array is of course still in a degraded state at this point, and no more secure than RAID0. We still need to add the disk that was disconnected first back into the array. Compared to the rest of the saga, this is straightforward:
root@server:/# mdadm -a /dev/md1 /dev/sdc1
mdadm: metadata format 01.02 unknown, ignored.
mdadm: added /dev/sdc1
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc1[4] sde1[3] sdd1[2] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
      [>....................]  recovery =  0.0% (442368/625128960) finish=164.7min speed=63196K/sec

unused devices: <none>
Here we can see a happily rebuilding RAID5 array. Note that you will need to update the /etc/mdadm/mdadm.conf file with the new UUID; the line can simply be generated with:
root@server:/# mdadm --detail --scan
mdadm: metadata format 01.02 unknown, ignored.
ARRAY /dev/md1 level=raid5 num-devices=4 metadata=00.90 spares=1 UUID=7271bab9:23a4b554:01f9e43d:ac30fbff
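One way to apply it (a sketch; afterwards check the file and delete any stale ARRAY line carrying the old UUID):

root@server:/# mdadm --detail --scan | grep '^ARRAY /dev/md1' >> /etc/mdadm/mdadm.conf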
You can keep an eye on the rebuild with ‘watch cat /proc/mdstat’.
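If you’d rather block until the rebuild completes (handy in a script), mdadm can wait on it; a sketch:

root@server:/# mdadm --wait /dev/md1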
I’d broaden a bit and say eSATA is a risky choice for any permanent use – RAID or not.
As someone who used much of your prior ubuntu server post as reference, I decided to go with RAID6 instead. Even though I’m only running 4 drives at the moment and RAID6 causes me to sacrifice 2 of 4, the redundancy of RAID5 is not sufficient for me. Given most RAIDs are built with drives of the same model, similar age, often the same Lot #, and experience nearly identical usage, multiple simultaneous failures are not that farfetched. I believe the odds of two drives dying at the exact same moment are low, but a full rebuild will stress-test the remaining drives at a time that I can least afford to have a second drive go.
I do like eSATA for performing back-ups. I’d be interested in a solution that can back up the entire RAID. Is it reasonable to run a tape drive at home?
Totally agree re eSATA.
For me however the security of the daily backup offsets the risk of multiple drives failing, so while one failing might indicate that an additional failure from the same batch is more likely, at most you lose a day’s worth of data.
IMHO, the only reasons to go with tape are portability and durability of the media. You can get more storage on a hard drive these days for much lower cost, and the speed and flexibility of the backups is incomparable (can’t rsync to a tape…). If you need to keep your backups for a long time and your data set isn’t too large, tapes can make sense, but for someone who just wants to ensure their data is safe a couple of 2TB (or 3TB) hard drives on rotation with the aforementioned RAID array is hard to beat.
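To be concrete, the sort of backup I have in mind is just a nightly rsync to whichever external drive is currently plugged in; a minimal sketch with hypothetical paths:

root@server:/# rsync -a --delete /mnt/raid/ /media/backup-a/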
This was great, thanks! This is exactly what I needed to help me recover from multiple drive failures.
I linked back to your blog in my write up at http://axlecrusher.blogspot.com/2011/10/recovering-raid5-from-multiple.html
Thanks again!
Cheers! Glad it was useful
Exactly the reassurance and mdadm –create command I needed to get my RAID5 array back together. Thank you a ton for posting this.
Pingback: Recovering Linux software RAID, RAID5 Array - MySQL Performance Blog
This guide literally saved my life. (Ok, not literally.) I had a SATA controller with two drives attached die on me. When I replaced the controller the RAID would not start because both drives were marked as faulty. Using this guide, I thought of recreating the array with both drives since the --examine details were the same for both, but somehow I feared there would be inconsistency between the two, so I chickened out and created the array with just one of them. My data (family photos) was preserved. I added the other “faulty” drive later without issue. THANKS!!!!
This was a very helpful post. Thanks!
I have two comments:
mdadm’s default chunk size recently changed from 64k to 512k. This makes your warning even more relevant.
Second, the default superblock format has changed! There is a new version, 1.2. Many people are likely using 0.90. They need to specify -e 0.90 or risk losing data!
Thanks for the information!
The warning should be a general one to ensure the config of the new array matches the old… any one change could result in scrambled data.
I’ll update the post now.
The following happened to me: I have a 4 disk RAID-5. A disk failed, which I promptly replaced. I added the new disk to the array, but it was not integrated because the recovery exposed a bad sector on a second disk. Double disk failure! Ouch. To get the array back, I had to recover one of the failed disks. To do this I used gddrescue, a dd-like tool with error recovery capabilities. After copying one of the failed disks to the new disk, I was able to recover the array using this guide.
Very nice blog
you have been most helpful! thank you :)
This was very helpful. Thank you very much!
This is very helpful, thank you for sharing.
P.S. I lost a whole 1TB volume a few years ago due to:
– ext4 still being unstable
– raid10 volume corruption
– kernel panics
– 3 x power failure during rebuild
Fortunately, it was just a test server.
No drive failed, all were fine. Just filesystem corruption.
Then I learned how to recover data from mdadm RAID arrays using TestDisk.
Mdadm has a mail notification daemon built in.
You just have to set up Exim or Sendmail as a smarthost (if you already have a mail server you can use) or as a send-only system.
Mdadm sends mail on every event… even testing!
This post gave me the courage to pull the trigger on my failed array. Thank you, 16TB family RAID5 array rebuilding now!!! Now if I can only remember to keep the bloody server chassis locked with a two-year-old around…
A 2nd follow up. You can use mdadm --assemble --force rather than --create to put the RAID back together. With this method you don’t have to specify the missing drive. I also don’t think you need to specify the drives in order, though I did anyway. It will also mark failed drives as clean because of --force.
Thanks for the tip! Haven’t tested but this sounds like a better method, as it preserves the existing config.
Alex, could you add on top of your post that “mdadm --assemble --force” should be tried first?
I used the re-creation method suggested in this post and damaged my data, because the data offset differs between mdadm versions. This is very dangerous.
Of course this is dangerous; I probably could have stressed this more, but it worked for me, and the difference between versions IS mentioned (although that was a later update, thanks to feedback). The assemble force method is clearly better however, so I’ve added a note to the top.
Hi I need some help.
I got:
root@server:~# mdadm --examine /dev/md0
/dev/md1:
Version : 00.90
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Raid Devices : 4
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Fri Jan 17 12:11:13 2014
State : active, FAILED, Not Started
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Events : 3910
Layout : left-symmetric
Chunk Size : 512K
Events : 0.251432
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 22 1 active sync /dev/sdb6
2 8 38 2 active sync /dev/sdc6
3 0 0 3 removed
If I look at /dev/sda6 I got:
root@server:~# mdadm --examine /dev/sda6
/dev/sda6:
Magic : a92b4efc
Version : 00.90.00
UUID :
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Array Size : 2166629952 (2066.26 GiB 2218.63 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Update Time : Sun Dec 30 11:35:30 2012
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : f38f0a97 - correct
Events : 127013
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 6 0 active sync /dev/sda6
0 0 8 6 0 active sync /dev/sda6
1 1 8 22 1 active sync /dev/sdb6
2 2 8 38 2 active sync /dev/sdc6
3 3 8 54 3 active sync /dev/sdd6
If I look at /dev/sdd6 I got:
root@server:~# mdadm --examine /dev/sdd6
/dev/sdd6:
Magic : a92b4efc
Version : 00.90.00
UUID :
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Array Size : 2166629952 (2066.26 GiB 2218.63 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 1
Update Time : Fri Jan 10 17:43:59 2014
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : f584d13d - correct
Events : 251427
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 54 3 active sync /dev/sda6
0 0 8 6 0 removed
1 1 8 22 1 active sync /dev/sdb6
2 2 8 38 2 active sync /dev/sdc6
3 3 8 54 3 active sync /dev/sdd6
But whatever I do, it will not rebuild. The RAID0 I have on the same disks is still working; that is why the computer was able to boot.
Please help, I’m lost and want my data back.
I did:
mdadm --stop /dev/md1
and:
mdadm --verbose --create /dev/md1 --chunk=64 --level=5 --raid-devices=4 /dev/sd[a,b,c,d]6
which gave me:
root@server:/# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active (auto-read-only) raid5 sdd6[4](S) sdc6[2] sdb6[1] sda6[0]
2166626688 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md0 : active raid0 sdd5[3] sdc5[2] sdb5[1] sda5[0]
39069696 blocks 64k chunks
but fdisk /dev/md1 gives me “unknown partition table”
Now what?
I stopped the array again and did
mdadm --verbose --create /dev/md1 --chunk=64 --metadata=0.90 --level=5 --raid-devices=4 /dev/sd[a,b,c,d]6
but still gives “unknown partition table”
please help!!
It’s been a while since I’ve worked with mdadm, so I’m not the best person to get advice from, but it looks like you’re trying to create a new array with all the devices of the old one present. The article above describes how to create a new array with only the drive that failed first missing, which should result in a readable array if your data hasn’t been corrupted somehow.
Thanks for your reply, Alex.
It seems that I lost a drive in December 2012 but did not know it, and then lost a second one last weekend. This one I noticed because the data was only partly accessible. I shut down the server, disconnected all the HDDs and connected them again, and all 4 were seen by the BIOS; md0 worked again because it booted in safe mode. So I have good hope that most of my data is still fine if I make the right rebuild.
Now I have to know what the right rebuild could / should be.
In the article above I used:
# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 missing /dev/sdd1 /dev/sde1
You’ll need to adapt this for your own scenario, it looks like /dev/sda6 failed first so I’d suggest creating it with that one missing.
One thing I just noticed; your /dev/md1 says chunk size is 512K but your drives say 64K. I can only guess as to how this happened, but make sure you use whatever the array was originally created as. Figuring out the default for your version of mdadm might help with this.
I tried the following
mdadm --verbose --create /dev/md1 --chunk=64 --metadata=0.90 --level=5 --raid-devices=4 missing /dev/sdb6 /dev/sdc6 /dev/sdd6
which seemed to work
Then I repaired the file system with fsck.ext4 -cDfty -C 0 /dev/md1
Checked that the data was recovered, and it was; then I added the missing drive with
mdadm -a /dev/md1 /dev/sda6
which led to a rebuild of the array, a booting and working server, and all my data back.
Is it possible for mdadm to send a message when something happens to the array, or do I have to move to a hardware array? A RocketRAID 2720SGL costs around €180, so that is not as expensive as it used to be.
Glad you got your data back.
Mdadm can absolutely send email alerts, but you have to configure the address, and have a working MTA.
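A minimal sketch, assuming Debian/Ubuntu file locations and a placeholder address:

# In /etc/mdadm/mdadm.conf, set the destination for alerts:
#   MAILADDR you@example.com
# Then send a one-off test alert for every array to confirm mail delivery works:
root@server:/# mdadm --monitor --scan --oneshot --test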
Your instructions saved my data. Thank you. I learned something valuable about RAID5 today :-)
Glad to hear it!
Thanks for your post. I had to resuscitate an old machine decommissioned a long time ago, and this post helped a lot.
Pingback: RAID5 degradado y sin superbloque md en una de las unidades restantes