Raid 5 inaccessible after defective drive swap

Backup and data protection discussion at its finest.

Moderator: Lillian.W@AST

Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Raid 5 inaccessible after defective drive swap

Post by Richard »

Today I got a mail that one of my drives was defective.
So I checked the NAS, which is an Asustor 304T configured as one volume with RAID 5.

From the control panel I shut down the NAS, swapped the defective drive, and restarted the NAS.
Since then I only get a system error in the overview, and when looking at the volume it says "inaccessible" instead of rebuilding.

I can't access the shares anymore, so even if I had space, I can't make a backup.

Is there another way to rebuild the RAID and make the volume accessible again? Maybe via SSH or telnet with Linux commands?
But I presume the volume would first have to be accessible again? I don't know what to do, and the RAID has 4 disks of 2 TB, of which at least 3.5 TB is used for data.
Nazar78
Posts: 2004
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

Actually you should have done a hot-swap without shutting down the NAS. Refer to scenario B of https://support.asustor.com/index.php?/ ... ts-damaged.

Try restarting the NAS with only 3 disks; if you can see your volume, you may then insert the new drive and manage it from Storage Manager. If you need further assistance, SSH into the NAS as root (enable the terminal in Services). Then let us know the content of:

Code: Select all

cat /proc/mdstat
Or you can contact Asustor Support directly to raise a ticket.
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get a faster response
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

Thank you for your reply.

When I put back the defective drive, it started rebuilding. So the volume was OK again, and then I proceeded with the hot swap.
Since then it keeps giving an error.

At this moment I again have the defective drive inside. It is drive 4.

When starting the NAS with 3 disks, it also shows the system failure in the overview and in the volume:
Volume 1 / Raid 5 / Inaccessible.

I can see all 4 drives; the S.M.A.R.T. page for drive 4 shows the bad status.

I went in via SSH as requested. This is the output of the cat /proc/mdstat command (with the faulty drive 4 present).

Code: Select all

root@AS-304T-4BA9:/etc # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md126 : active raid1 sdb3[0] sdc3[4] sda3[1]
      2096064 blocks super 1.2 [4/3] [UUU_]
      
md0 : active raid1 sdb2[0] sdc2[5] sda2[1]
      2096064 blocks super 1.2 [4/3] [UU_U]
      
unused devices: <none>
I was also able to run mdadm -D /dev/md0, which gave this output:

Code: Select all

root@AS-304T-4BA9:/etc # mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Thu Jan  1 19:02:17 2015
     Raid Level : raid1
     Array Size : 2096064 (2047.28 MiB 2146.37 MB)
  Used Dev Size : 2096064 (2047.28 MiB 2146.37 MB)
   Raid Devices : 4
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Mon Aug 23 17:57:11 2021
          State : clean, degraded 
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : AS-304T-4BA9:0  (local to host AS-304T-4BA9)
           UUID : 21ba27b1:f0620d57:037c740d:3345ae9e
         Events : 195573

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
       4       0        0        4      removed
       5       8       34        3      active sync   /dev/sdc2
And this is the output of mdadm -D /dev/md126

Code: Select all

root@AS-304T-4BA9:/etc # mdadm -D /dev/md126
/dev/md126:
        Version : 1.2
  Creation Time : Thu Jan  1 19:02:48 2015
     Raid Level : raid1
     Array Size : 2096064 (2047.28 MiB 2146.37 MB)
  Used Dev Size : 2096064 (2047.28 MiB 2146.37 MB)
   Raid Devices : 4
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Mon Aug 23 17:28:29 2021
          State : clean, degraded 
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : AS-304T-4BA9:126  (local to host AS-304T-4BA9)
           UUID : 0f784e69:d7b9c2d4:c6303d4e:98be3c11
         Events : 65860

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       8        3        1      active sync   /dev/sda3
       4       8       35        2      active sync   /dev/sdc3
       6       0        0        6      removed
So I'm worried because it says here raid1 in this output, while this should be a raid 5 volume.
Nazar78
Posts: 2004
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

When I put back the defective drive, it started rebuilding. So the volume was OK again, and then I proceeded with the hot swap.
Since then it keeps giving an error.
What error was it? Also what is the first error you received about the array?
So I'm worried because it says here raid1 in this output, while this should be a raid 5 volume.
md0 is the OS and md126 is the swap; they are both raid1. Your volume, however, should be md1, raid5. You can try to force reassemble it. However, before issuing the command, it also depends on what mdstat currently shows: is it rebuilding, with the old disk 4 or the new disk 4? I'm not sure exactly which state it is in, with the old or the new disks.

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-d]4
Note the defective slot is partition 4 of disk sdd.

With the new disk 4, you can try to force assemble only sd[a-c]4 and then add back the 4th disk.

EDITED: Correction, I was replying from my mobile. You can't assemble the 4th (new) disk unless its superblock and partitions are prepared. So first force assemble with 3 disks, then back up any important files from the raid5 if possible. If ADM doesn't automatically add the new 4th disk when you reinsert it, we can proceed further from there.
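
Before forcing anything, it doesn't hurt to confirm which partitions actually carry the raid5 superblock. A minimal read-only check, assuming the data array metadata sits on partition 4 of each remaining disk as described above:

Code: Select all

# read-only: print the md superblock of each remaining data partition
mdadm --examine /dev/sd[a-c]4
# the "Raid Level", "Array UUID" and "Events" lines should agree across all three members
If one member's Events count lags well behind the others, --assemble --force will still try to use it, which is another reason to back up right after the array comes up.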
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

What error was it? Also what is the first error you received about the array?
The very first error I received was in a mail. I don't remember the exact content, but it was a notification that the RAID was degraded because of a faulty disk 4.
The red LED on my Asustor was blinking at that time.

After the hot swap, the error I meant was not really an error I got by mail from the system, but just the message in the Overview in the GUI that the drive was inaccessible.
However, before issuing the command, it also depends on what mdstat currently shows: is it rebuilding, with the old disk 4 or the new disk 4? I'm not sure exactly which state it is in, with the old or the new disks.
The output I gave with the mdstat command is with the old disk. It's doing nothing as far as I can see.
I also can't use the file explorer from the GUI.

Both mdadm -D commands show that the system is clean and degraded.

So I would like to try and use the force command you gave me. But should I do this with the old disk inside or the new disk?
Also, you say you don't know exactly which state it is in. Is there a way or a command I can use to show you the information needed?
What I'm also wondering about: in the outputs I see /dev/sdx2 and /dev/sdx3 (where x is a, b and c), but I don't see any /dev/sdx1. On a 4-disk RAID I would expect to see /dev/sda1 somewhere.

So I tried the fdisk -l command which gave this output:

Code: Select all

root@AS-304T-4BA9:~ # fdisk -l

Disk /dev/mtdblock0: 16 MB, 16777216 bytes
255 heads, 63 sectors/track, 2 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mtdblock0 doesn't contain a valid partition table

Disk /dev/sda: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sda1               1          33      261120  83 Linux
Partition 1 does not end on cylinder boundary
/dev/sda4               1           1        1023+ ee EFI GPT
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Disk /dev/sdc: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdc1               1          33      261120  83 Linux
Partition 1 does not end on cylinder boundary
/dev/sdc4               1           1        1023+ ee EFI GPT
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Disk /dev/sdb: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1               1          33      261120  83 Linux
Partition 1 does not end on cylinder boundary
/dev/sdb4               1           1        1023+ ee EFI GPT
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Disk /dev/md0: 2146 MB, 2146369536 bytes
2 heads, 4 sectors/track, 524016 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md126: 2146 MB, 2146369536 bytes
2 heads, 4 sectors/track, 524016 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md126 doesn't contain a valid partition table
If I do this on my Linux server, it does not say the disks don't contain a valid partition table. Or is this because the RAID system has to be rebuilt?
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

I just read your edit. I didn't think I could rebuild a RAID 5 with 3 disks?

So if I see this command:

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-d]4
Do I have to use this 4 times? And if it's done with 3 disks do I still have to use the 4 at the end?

Like
mdadm --assemble --force --verbose /dev/md1 /dev/sda4
mdadm --assemble --force --verbose /dev/md1 /dev/sdb4
mdadm --assemble --force --verbose /dev/md1 /dev/sdc4
mdadm --assemble --force --verbose /dev/md1 /dev/sdd4

Or... sorry about this, I'm not used to rebuilding RAID systems and am afraid of losing data.

Edit: And does this make a RAID 5 from that RAID 1 again?
Nazar78
Posts: 2004
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

OK, I'm not sure why you have this issue. I have replaced mine with a new disk twice in a year, hot-swapped with no issue, and it rebuilt.
So I would like to try and use the force command you gave me. But should I do this with the old disk inside or the new disk?
Do this without the 4th disk.

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-c]4
What I'm also wondering about: in the outputs I see /dev/sdx2 and /dev/sdx3 (where x is a, b and c), but I don't see any /dev/sdx1. On a 4-disk RAID I would expect to see /dev/sda1 somewhere.
In /dev/sda1, the "a" refers to your physical disk and the "1" to the partition; in this case partition 1 holds no data, just metadata for the RAID config. So on a typical Asustor NAS you would see /dev/sda2 as the OS partition, /dev/sda3 as the swap partition, and finally /dev/sda4 as your data volume partition. Partitions 2 and 3 extend to the other disks b/c/d as a raid1 mirror array; partition 4 is whatever array you've chosen. So in short, your volume data is in a raid5 made of partitions /dev/sda4, /dev/sdb4, /dev/sdc4 and /dev/sdd4.
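
If you want to see that layout on your own unit without installing anything extra, a quick sketch using only tools that already appeared in the outputs above:

Code: Select all

# the kernel's view of every disk and partition (the sdX2 partitions feed md0, sdX3 md126, sdX4 the data array)
cat /proc/partitions
# the md arrays mdadm can currently see, with their member partitions
mdadm --detail --scan --verbose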
Do I have to use this 4 times? And if it's done with 3 disks do I still have to use the 4 at the end?
As mentioned above, the "4" is the partition on each disk, not the number of disks. And you don't have to run it multiple times; it's a one-liner, so you can combine the inputs. Just run the below without the 4th disk:

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-c]4
This command is the same as:

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sda4 /dev/sdb4 /dev/sdc4
What I'm proposing here is:
1. We try to assemble your raid5 array with only 3 disks and make sure it appears in /proc/mdstat.
2. Back up if possible once we can see the raid5 volume.
3. Ask mdadm to purposely fail disk #4 in all its arrays (partitions 2, 3, 4). Edited: We also need to shrink the partition-4 array so we can reduce the disks from 4 to 3.
4. Ask mdadm to purposely remove disk #4 from all the arrays (partitions 2, 3, 4), so your raid5 will then only have 3 disks (rough sketch below).
5. Insert the new disk #4 and use the Asustor Storage Manager to expand the raid5 from 3 to 4 disks.
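
For steps 3 and 4 the commands would look roughly like this. It is only a sketch, not to be run until the 3-disk assembly and the backup have succeeded, and it assumes the old disk 4 is back in and still shows up as sdd, as in your earlier outputs:

Code: Select all

# mark disk 4's member as failed in an array, then drop it - shown here for the data array
mdadm --manage /dev/md1 --fail /dev/sdd4
mdadm --manage /dev/md1 --remove /dev/sdd4
# repeat the pair for /dev/md0 (member sdd2) and /dev/md126 (member sdd3);
# if mdadm -D already lists that slot as "removed", the --fail step isn't needed for that array.
# the partition-4 shrink mentioned in step 3 is a separate mdadm --grow operation and is not shown here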

But if you're not confident, I suggest you ask Asustor Support for assistance.
Last edited by Nazar78 on Tue Aug 24, 2021 2:40 am, edited 1 time in total.
Nazar78
Posts: 2004
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

For the partitions, this command might give you a better picture. I ran it in a chroot, but you can install lsblk from opkg (Entware). Note I've disabled my swap, so you don't see sda3 as swap. I also have a mirror of the ADM OS /dev/md0 on an SSD (external enclosure), hence the 2GB /dev/sde2 partition.

Code: Select all

(Chroot)root@Nimbustor4:~# lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE   MOUNTPOINT
loop0          7:0    0     1M  0 loop
sda            8:0    0   9.1T  0 disk
├─sda1         8:1    0   255M  0 part
├─sda2         8:2    0     2G  0 part
│ └─md0        9:0    0     2G  0 raid1
├─sda3         8:3    0     2G  0 part
└─sda4         8:4    0   9.1T  0 part
  └─md1        9:1    0  18.2T  0 raid10 /share/Video
sdb            8:16   0   9.1T  0 disk
├─sdb1         8:17   0   255M  0 part
├─sdb2         8:18   0     2G  0 part
│ └─md0        9:0    0     2G  0 raid1
├─sdb3         8:19   0     2G  0 part
└─sdb4         8:20   0   9.1T  0 part
  └─md1        9:1    0  18.2T  0 raid10 /share/Video
sdc            8:32   0   9.1T  0 disk
├─sdc1         8:33   0   255M  0 part
├─sdc2         8:34   0     2G  0 part
│ └─md0        9:0    0     2G  0 raid1
├─sdc3         8:35   0     2G  0 part
└─sdc4         8:36   0   9.1T  0 part
  └─md1        9:1    0  18.2T  0 raid10 /share/Video
sdd            8:48   0   9.1T  0 disk
├─sdd1         8:49   0   255M  0 part
├─sdd2         8:50   0     2G  0 part
│ └─md0        9:0    0     2G  0 raid1
├─sdd3         8:51   0     2G  0 part
└─sdd4         8:52   0   9.1T  0 part
  └─md1        9:1    0  18.2T  0 raid10 /share/Video
sde            8:64   0 447.1G  0 disk
├─sde1         8:65   0 445.1G  0 part   /share/Web
└─sde2         8:66   0     2G  0 raid1
sdf            8:80   0   1.8T  0 disk
└─md2          9:2    0   5.5T  0 raid5  /share/Media/MD2-RAID5
sdg            8:96   0   1.8T  0 disk
└─md2          9:2    0   5.5T  0 raid5  /share/Media/MD2-RAID5
sdh            8:112  0   1.8T  0 disk
└─md2          9:2    0   5.5T  0 raid5  /share/Media/MD2-RAID5
sdi            8:128  0   1.8T  0 disk
└─md2          9:2    0   5.5T  0 raid5  /share/Media/MD2-RAID5
mmcblk0      179:0    0   3.5G  0 disk
├─mmcblk0p1  179:1    0     2M  0 part
├─mmcblk0p2  179:2    0   244M  0 part
└─mmcblk0p3  179:3    0   244M  0 part
mmcblk0boot0 179:8    0     2M  1 disk
mmcblk0boot1 179:16   0     2M  1 disk
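(If opkg is not found on the NAS, Entware usually has to be installed first, typically from App Central, after which its tools live under /opt. Roughly, and assuming the feed carries an lsblk package:)

Code: Select all

# only after the Entware app is installed; the /opt paths and package name are assumptions
/opt/bin/opkg update
/opt/bin/opkg install lsblk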
Last edited by Nazar78 on Tue Aug 24, 2021 1:58 am, edited 1 time in total.
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

I thought I'd try something and add the 4th disk.
But stupid me used this command:

mdadm --manage /dev/md0 -a /dev/sdd
instead of mdadm --manage /dev/md0 -a /dev/sdd4

And it started syncing; I stopped it in time, but I can't delete it anymore.

Code: Select all

root@AS-304T-4BA9:~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md126 : active raid1 sdb3[0] sdc3[4] sda3[1]
      2096064 blocks super 1.2 [4/3] [UUU_]
      
md0 : active raid1 sdd[4](F) sdb2[0] sdc2[5] sda2[1]
      2096064 blocks super 1.2 [4/3] [UU_U]
But I can't remove it anymore:

Code: Select all

root@AS-304T-4BA9:~ # mdadm --manage /dev/md0 --remove /dev/sdd
mdadm: Cannot find /dev/sdd: No such file or directory
root@AS-304T-4BA9:~ # mdadm --manage /dev/md0 --remove /dev/sdd4
mdadm: Cannot find /dev/sdd4: No such file or directory
I think I will keep my hands off things until I'm sure what I have to do.
Because I was just thinking: if I force rebuild a RAID 5 on 3 disks, that would mean I would lose free space on those 3 disks, and I don't know if I have enough free space on them.
but you can install lsblk from opkg (Entware).
How do I do this? Because opkg install lsblk says that opkg is not found.
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

I just tried the command you gave, without disk 4 inserted and this was the output:

Code: Select all

root@AS-304T-4BA9:~ # mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-c]4
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda4 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdb4 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdc4 is identified as a member of /dev/md1, slot 2.
mdadm: added /dev/sda4 to /dev/md1 as 1
mdadm: added /dev/sdc4 to /dev/md1 as 2
mdadm: no uptodate device for slot 6 of /dev/md1
mdadm: added /dev/sdb4 to /dev/md1 as 0
mdadm: /dev/md1 assembled from 2 drives and 1 rebuilding - not enough to start the array.
Edit:
"mdadm: no uptodate device for slot 6 of /dev/md1" is maybe because I added the disk the wrong way. I don't have a slot 6, or I misunderstand something.
The output of mdadm -D /dev/md0 now gives this without disk 4 inserted. So maybe I have to remove something somehow?

Code: Select all

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
       4       0        0        4      removed
       5       8       34        3      active sync   /dev/sdc2

       4       8       48        -      faulty
Just wondering, what would happen if I used that force rebuild command with the new disk inserted?
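
For reference: the mdadm man page documents the keywords failed and detached in place of a device name for --remove, meant for exactly this situation where the device node is gone. A sketch, assuming the mdadm build on ADM supports those keywords:

Code: Select all

# remove members whose device node no longer exists (the pulled sdd), then anything still marked failed
mdadm --manage /dev/md0 --remove detached
mdadm --manage /dev/md0 --remove failed
# verify afterwards with: cat /proc/mdstat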