Update 1/11/2019
If you’ve reached this article by Googling how to recover a RAID array, I suggest you don’t use this guide. The Linux RAID wiki has far more accurate, complete, and authoritative information. In retrospect I was very lucky not to destroy the data, but this article remains an interesting account of how I did it.
Update
Before reading this article you should know that it is now quite old and there is a better method: ‘mdadm --assemble --force’ (it may have been there all along). This will try to assemble the array by marking previously failed drives as good. From the man page:
If mdadm cannot find enough working devices to start the array, but can find some devices that are recorded as having failed, then it will mark those devices as working so that the array can be started.
I would however strongly suggest that you first disconnect the drive that failed first. If you need to discover which device failed first, or assemble doesn’t work and you need to manually recreate the array, then read on.
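A rough sketch using the device names from this article (yours will differ): stop the half-assembled array if necessary, then force-assemble it from the freshest members, leaving out sdc1, the drive that failed first:

root@server:~# mdadm --stop /dev/md0
root@server:~# mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1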
I found myself in an interesting situation with my parents’ home server today (Ubuntu 10.04). Hardware-wise it’s not the best setup: two of the drives are in an external enclosure connected with eSATA cables. I did encourage Dad to buy a proper enclosure, but was unsuccessful. This is a demonstration of why eSATA is a very bad idea for RAID devices.
What happened was that one of the cables had been bumped, disconnecting one of the drives, so the array had been running in a degraded state for over a month. Not good. Anyway, I noticed this when logging in one day to fix something else. The device wasn’t visible, so I told Dad to check the cable, but unfortunately when he went to secure the cable he must have somehow disconnected the other one. This caused a second drive to fail, and the array immediately stopped.
Despite there being no hardware failure, the situation is similar to someone replacing the wrong drive in a RAID array. Recovering it was an interesting experience, so I’ve documented the process here.
YOU CAN PERMANENTLY DAMAGE YOUR DATA BY FOLLOWING THIS GUIDE. DO NOT PERFORM THIS OPERATION ON THE ORIGINAL DISKS UNLESS THE DATA IS BACKED UP ELSEWHERE.
Gathering information
The information you’ll need should be contained in the superblocks of the RAID devices. First you need to find out which drive failed first, using the mdadm --examine command. My example is a RAID5 array of 4 devices: sdb1, sdc1, sdd1 and sde1:
root@server:~# mdadm --examine /dev/sdb1
mdadm: metadata format 01.02 unknown, ignored.
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 87fa9a4d:d26c14f1:01f9e43d:ac30fbff (local to host server)
  Creation Time : Mon Oct 11 00:13:02 2010
     Raid Level : raid5
  Used Dev Size : 625128960 (596.17 GiB 640.13 GB)
     Array Size : 1875386880 (1788.51 GiB 1920.40 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Mon Mar 21 00:03:26 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 0
       Checksum : 713f331d - correct
         Events : 3910

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       0        0        3      faulty removed
Look at the last part. Here we can see that this drive is in sync with /dev/sdd1 but out of sync with the other two (sdc1 and sde1) – the data indicates that sdc1 and sde1 have failed. These drives are the two in the external enclosure… but I digress.
Performing an examine on sdc1 shows “active sync” for all four drives; clearly this disk has no idea what’s going on. Also note the update time of February 5 (it is now March!!):
root@server:~# mdadm --examine /dev/sdc1
[...]
    Update Time : Sat Feb 5 11:22:29 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 7105b39b - correct
         Events : 218

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
This indicates that it was the first drive to be disconnected, as the drives were all in sync the last time this drive was part of the array. That leaves sde1:
root@server:~# mdadm --examine /dev/sde1
[...]
    Update Time : Sun Mar 20 23:53:07 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 713f30d1 - correct
         Events : 3904

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
When this drive was last part of the array, sdc1 was faulty but the other two were fine. This indicates that it was the second drive to be disconnected.
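A quick way to line up this information across all the members is to pull just the update time and event count out of each superblock; the member with the lowest event count dropped out first. A sketch (adjust the device list for your array):

root@server:~# for d in /dev/sd[b-e]1; do echo "== $d =="; mdadm --examine $d | grep -E 'Update Time|Events'; done

In my case, sdc1’s February update time and event count of 218 (versus roughly 3900 on the others) mark it clearly as the first drive out.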
Scary stuff
Despite being marked as faulty, /dev/sde1 must be assumed to be crash-consistent with sdb1 and sdd1, as the array stopped immediately upon the failure. The original array won’t start because it only has two active devices, but we can create a new array with 3 of the 4 drives as members and one marked missing.
This sounds scary, and it should. If you have critical data that you’re trying to recover from this situation, I would honestly buy a whole new set of drives, clone the data across to them and work from those. Having said that, the likelihood of permanently erasing the data is low if you’re careful and don’t trigger a rebuild with an incorrectly configured array (like I almost did).
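If you do clone the drives first, GNU ddrescue (mentioned by a commenter below) copes with read errors where plain dd gives up. A sketch only; the target devices sdf1 and sdg1 are hypothetical and must be at least as large as the sources:

root@server:~# ddrescue -f /dev/sdb1 /dev/sdf1 sdb1.map   # -f forces writing to a block device
root@server:~# ddrescue -f /dev/sdd1 /dev/sdg1 sdd1.map   # the .map file lets an interrupted copy resume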
Important information to note is the configuration of the array, in particular the device order, layout and chunk size. If you’re using the defaults (in hindsight probably a good idea, to lessen the chance of something going wrong in situations like this), you don’t need to specify them. However you’ll note that in my example the chunk size is 512K, which differs from the default of 64K.
Update 2012/01/04
When reading the following notes you should note that the default chunk size in more recent versions of mdadm is 512K. In addition, ensure that you are using the same metadata (superblock) version as the original array by specifying it with -e 0.90 or -e 1.2. If you are using the same distribution of mdadm as the array was created with, and didn’t manually specify a different version, you should be safe. However, when dealing with RAID arrays it always pays to double check. The metadata version information should be in the output of mdadm --examine or in mdadm.conf. Thanks to Neil Walfield for the info!
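To pull the critical parameters out of a superblock in one go, something like this works (output shown for my sdb1):

root@server:~# mdadm --examine /dev/sdb1 | grep -E 'Version|Layout|Chunk Size'
        Version : 00.90.00
         Layout : left-symmetric
     Chunk Size : 512K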
Creating a new array with old data
Here is the command I used to recreate the array:
root@server:~# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 /dev/sdd1 /dev/sde1 missing
mdadm: metadata format 01.02 unknown, ignored.
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Oct 11 00:13:02 2010
mdadm: size set to 625128960K
Continue creating array? y
mdadm: array /dev/md1 started.
Oops.
Can you see what I did there? I created the array with the missing drive at position [3], when in actual fact the missing drive was at position [1] (device numbering starts at 0). Thus when I tried to mount:
root@server:/# mount -r /dev/md1p1 /mnt -t ext4
mount: wrong fs type, bad option, bad superblock on /dev/md1p1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
!!
Upon realising this I looked at mdstat, then stopped the array:
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde1[2] sdd1[1] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>

root@server:/# mdadm -D /dev/md1
mdadm: metadata format 01.02 unknown, ignored.
/dev/md1:
        Version : 00.90
  Creation Time : Mon Mar 21 02:00:54 2011
     Raid Level : raid5
     Array Size : 1875386880 (1788.51 GiB 1920.40 GB)
  Used Dev Size : 625128960 (596.17 GiB 640.13 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Mar 21 02:00:54 2011
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           UUID : e469103f:2ddf45e9:01f9e43d:ac30fbff (local to host server)
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       0        0        3      removed

root@server:/# mdadm --stop /dev/md1
I then recreated the array with the missing drive in the correct position:
root@server:/# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 missing /dev/sdd1 /dev/sde1
mdadm: metadata format 01.02 unknown, ignored.
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Mar 21 02:00:54 2011
mdadm: size set to 625128960K
Continue creating array? y
mdadm: array /dev/md1 started.
And examined the situation:
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde1[3] sdd1[2] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>

root@server:/# fdisk /dev/md1
GNU Fdisk 1.2.4
Copyright (C) 1998 - 2006 Free Software Foundation, Inc.
This program is free software, covered by the GNU General Public License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

Using /dev/md1
Command (m for help): p

Disk /dev/md1: 1920 GB, 1920389022720 bytes
255 heads, 63 sectors/track, 233474 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/md1p1              1      233475  1875387906   83  Linux
Warning: Partition 1 does not end on cylinder boundary.

Command (m for help): q

root@server:/# mount -r /dev/md1p1 /mnt
root@server:/# ls /mnt
Alex  Garth  Hamish  Jenny  lost+found  Public  Simon
root@server:/# umount /mnt
Phew!
So despite creating a bad array, I was still able to stop it and create a new array with the correct configuration. I don’t believe there is any corruption, as no writes occurred and the array didn’t rebuild.
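If you want more reassurance than a successful mount and ls, a forced read-only filesystem check is worth considering. A sketch, assuming ext4 as in my case (-n opens the filesystem read-only and answers no to all questions, -f forces the check even if the filesystem looks clean):

root@server:/# fsck.ext4 -n -f /dev/md1p1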
Adding the first-disconnected drive back in
The array is of course still in a degraded state at this point, and no more secure than RAID0. We still need to add the disk that was disconnected first back into the array. Compared to the rest of the saga, this is straightforward:
root@server:/# mdadm -a /dev/md1 /dev/sdc1
mdadm: metadata format 01.02 unknown, ignored.
mdadm: added /dev/sdc1
root@server:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc1[4] sde1[3] sdd1[2] sdb1[0]
      1875386880 blocks level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
      [>....................]  recovery =  0.0% (442368/625128960) finish=164.7min speed=63196K/sec

unused devices: <none>
Here we can see a happily rebuilding RAID5 array. Note that you will need to update the /etc/mdadm/mdadm.conf file with the new UUID; the line can simply be generated with:
root@server:/# mdadm --detail --scan
mdadm: metadata format 01.02 unknown, ignored.
ARRAY /dev/md1 level=raid5 num-devices=4 metadata=00.90 spares=1 UUID=7271bab9:23a4b554:01f9e43d:ac30fbff
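One way to apply it (a sketch; afterwards check the file and delete any stale ARRAY line carrying the old UUID):

root@server:/# mdadm --detail --scan | grep '^ARRAY /dev/md1' >> /etc/mdadm/mdadm.conf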
You can keep an eye on the rebuild with ‘watch cat /proc/mdstat’.
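If you’d rather block until the rebuild completes (handy in a script), mdadm can wait on it; a sketch:

root@server:/# mdadm --wait /dev/md1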
I’d broaden a bit and say eSATA is a risky choice for any permanent use – RAID or not.
As someone who used much of your prior ubuntu server post as reference, I decided to go with RAID6 instead. Even though I’m only running 4 drives at the moment and RAID6 causes me to sacrifice 2 of 4, the redundancy of RAID5 is not sufficient for me. Given most RAIDs are built with drives of the same model, similar age, often the same Lot #, and experience nearly identical usage, multiple simultaneous failures are not that farfetched. I believe the odds of two drives dying at the exact same moment are low, but a full rebuild will stress-test the remaining drives at a time that I can least afford to have a second drive go.
I do like eSATA for performing back-ups. I’d be interested in a solution that can back up the entire RAID. Is it reasonable to run a tape drive at home?
Totally agree re eSATA.
For me however the security of the daily backup offsets the risk of multiple drives failing, so while one failing might indicate that an additional failure from the same batch is more likely, at most you lose a day’s worth of data.
IMHO, the only reasons to go with tape are portability and durability of the media. You can get more storage on a hard drive these days for much lower cost, and the speed and flexibility of the backups is incomparable (can’t rsync to a tape…). If you need to keep your backups for a long time and your data set isn’t too large, tapes can make sense, but for someone who just wants to ensure their data is safe a couple of 2TB (or 3TB) hard drives on rotation with the aforementioned RAID array is hard to beat.
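To be concrete, the sort of backup I have in mind is just a nightly rsync to whichever external drive is currently plugged in; a minimal sketch with hypothetical paths:

root@server:/# rsync -a --delete /mnt/raid/ /media/backup-a/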
This was great, thanks! This is exactly what I needed to help me recover from multiple drive failures.
I linked back to your blog in my write up at http://axlecrusher.blogspot.com/2011/10/recovering-raid5-from-multiple.html
Thanks again!
Cheers! Glad it was useful
Exactly the reassurance and mdadm –create command I needed to get my RAID5 array back together. Thank you a ton for posting this.
Pingback: Recovering Linux software RAID, RAID5 Array - MySQL Performance Blog
This guide literally saved my life. (Ok, not literally.) I had a SATA controller with two drives attached die on me. When I replaced the controller the RAID would not start because both drives were marked as faulty. Using this guide, I thought of recreating the array with both drives since the --examine details were the same for both, but somehow I feared there would be inconsistency between the two, so I chickened out and created the array with just one of them. My data (family photos) was preserved. I added the other “faulty” drive later without issue. THANKS!!!!
This was a very helpful post. Thanks!
I have two comments:
mdadm’s default chunk size recently changed from 64k to 512k. This makes your warning even more relevant.
Second, the default superblock format has changed! There is a new version, 1.2. Many people are likely using 0.90. They need to specify -e 0.90 or risk losing data!
Thanks for the information!
The warning should be a general one to ensure the config of the new array matches the old… any one change could result in scrambled data.
I’ll update the post now.
The following happened to me: I have a 4 disk RAID-5. A disk failed, which I promptly replaced. I added the new disk to the array, but it was not integrated because the recovery exposed a bad sector on a second disk. Double disk failure! Ouch. To get the array back, I had to recover one of the failed disks. To do this I used gddrescue, a dd-like tool with error recovery capabilities. After copying one of the failed disks to the new disk, I was able to recover the array using this guide.
Very nice blog
you have been most helpful! thank you :)
This was very helpful. Thank you very much!
This is very helpful, thank you for sharing.
P.S. I lost a whole 1TB volume a few years ago due to:
– ext4 still being unstable
– raid10 volume corruption
– kernel panics
– 3 x power failure during rebuild
Fortunately, it was just a test server.
No drive failed, all were fine. Just filesystem corruption.
Then I learned how to recover data from mdadm RAID arrays using TestDisk.
Mdadm has a mail notification daemon built in.
You just have to set up Exim or Sendmail as a smarthost (if you already have a mail server you can use) or as a send-only system.
Mdadm sends mail on every event… even testing!
This post gave me the courage to pull the trigger on my failed array. Thank you, 16TB family RAID5 array rebuilding now!!! Now if I can only remember to keep the bloody server chassis locked with a two-year-old around…
A 2nd follow up. You can use mdadm --assemble --force rather than --create to put the RAID back together. With this method you don’t have to specify the missing drive. I also don’t think you need to specify the drives in order, though I did anyway. It will also mark failed drives as clean because of --force.
Thanks for the tip! Haven’t tested but this sounds like a better method, as it preserves the existing config.
Alex, could you add on top of your post that “mdadm --assemble --force” should be tried first?
I used the re-creation method suggested in this post and damaged my data, because the data offset differs between mdadm versions. This is very dangerous.
Of course this is dangerous; I probably could have stressed this more, but it worked for me, and the difference between versions IS mentioned (although that was a later update, thanks to feedback). The assemble force method is clearly better however, so I’ve added a note to the top.
Hi I need some help.
I got:
root@server:~# mdadm --examine /dev/md0
/dev/md1:
Version : 00.90
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Raid Devices : 4
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Fri Jan 17 12:11:13 2014
State : active, FAILED, Not Started
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Events : 3910
Layout : left-symmetric
Chunk Size : 512K
Events : 0.251432
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 22 1 active sync /dev/sdb6
2 8 38 2 active sync /dev/sdc6
3 0 0 3 removed
If I look at /dev/sda6 I got:
root@server:~# mdadm --examine /dev/sda6
/dev/sda6:
Magic : a92b4efc
Version : 00.90.00
UUID :
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Array Size : 2166629952 (2066.26 GiB 2218.63 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Update Time : Sun Dec 30 11:35:30 2012
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : f38f0a97 - correct
Events : 127013
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 6 0 active sync /dev/sda6
0 0 8 6 0 active sync /dev/sda6
1 1 8 22 1 active sync /dev/sdb6
2 2 8 38 2 active sync /dev/sdc6
3 3 8 54 3 active sync /dev/sdd6
If I look at /dev/sdd6 I got:
root@server:~# mdadm --examine /dev/sdd6
/dev/sdd6:
Magic : a92b4efc
Version : 00.90.00
UUID :
Creation Time : Sun Oct 12 18:16:02 2008
Raid Level : raid5
Used Dev Size : 722209984 (688.75 GiB 739.54 GB)
Array Size : 2166629952 (2066.26 GiB 2218.63 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 1
Update Time : Fri Jan 10 17:43:59 2014
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : f584d13d - correct
Events : 251427
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 54 3 active sync /dev/sda6
0 0 8 6 0 removed
1 1 8 22 1 active sync /dev/sdb6
2 2 8 38 2 active sync /dev/sdc6
3 3 8 54 3 active sync /dev/sdd6
But whatever I do, it will not rebuild. The RAID0 I have on the same disks is still working; that is why the computer was able to boot.
Please help, I’m lost and want my data back.
I did:
mdadm --stop /dev/md1
and:
mdadm --verbose --create /dev/md1 --chunk=64 --level=5 --raid-devices=4 /dev/sd[a,b,c,d]6
which gave me:
root@server:/# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active (auto-read-only) raid5 sdd6[4](S) sdc6[2] sdb6[1] sda6[0]
2166626688 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md0 : active raid0 sdd5[3] sdc5[2] sdb5[1] sda5[0]
39069696 blocks 64k chunks
but fdisk /dev/md1 gives me “unknown partition table”
Now what?
I stopped the array again and did
mdadm --verbose --create /dev/md1 --chunk=64 --metadata=0.90 --level=5 --raid-devices=4 /dev/sd[a,b,c,d]6
but still gives “unknown partition table”
please help!!
It’s been a while since I’ve worked with mdadm, so I’m not the best person to get advice from, but it looks like you’re trying to create a new array with all the devices of the old one present. The article above describes how to create a new array with only the drive that failed first missing, which should result in a readable array if your data hasn’t been corrupted somehow.
Thanks for your reply, Alex.
It seems that I lost a drive in December 2012 but did not know it, and then lost a second one last weekend. This one I noticed because the data was only partly accessible. I shut down the server, disconnected all the HDDs and connected them again, and all 4 were seen by the BIOS; md0 worked again because it booted in safe mode. So I have good hope that most of my data is still fine if I make the right rebuild.
Now I have to know what the right rebuild could / should be.
In the article above I used:
# mdadm --verbose --create /dev/md1 --chunk=512 --level=5 --raid-devices=4 /dev/sdb1 missing /dev/sdd1 /dev/sde1
You’ll need to adapt this for your own scenario, it looks like /dev/sda6 failed first so I’d suggest creating it with that one missing.
One thing I just noticed; your /dev/md1 says chunk size is 512K but your drives say 64K. I can only guess as to how this happened, but make sure you use whatever the array was originally created as. Figuring out the default for your version of mdadm might help with this.
I tried the following
mdadm --verbose --create /dev/md1 --chunk=64 --metadata=0.90 --level=5 --raid-devices=4 missing /dev/sdb6 /dev/sdc6 /dev/sdd6
which seemed to work
Then I repaired the file system with fsck.ext4 -cDfty -C 0 /dev/md1
Checked that the data was recovered, and it was; then I added the missing drive with
mdadm -a /dev/md1 /dev/sda6
which led to a rebuild of the array, a booting and working server, and all my data back.
Is it possible for mdadm to send a message when something happens to the array, or do I have to move to a hardware array? A RocketRAID 2720SGL costs around €180, so that is not as expensive as it used to be.
Glad you got your data back.
Mdadm can absolutely send email alerts, but you have to configure the address, and have a working MTA.
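A minimal sketch, assuming Debian/Ubuntu file locations and a placeholder address:

# In /etc/mdadm/mdadm.conf, set the destination for alerts:
#   MAILADDR you@example.com
# Then send a one-off test alert for every array to confirm mail delivery works:
root@server:/# mdadm --monitor --scan --oneshot --test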
Your instructions saved my data. Thank you. I learned something valuable about RAID5 today :-)
Glad to hear it!
Thanks for your post. I had to resuscitate an old machine decommissioned a long time ago, and this post helped a lot.
Pingback: RAID5 degradado y sin superbloque md en una de las unidades restantes