Raid 5 inaccessible after defective drive swap

Backup and data protection discussion at its finest.

Moderator: Lillian.W@AST

User avatar
Nazar78
Posts: 2002
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

Richard wrote:I thought to try something and add the 4th disk.
But stupid me used this command:

mdadm --manage /dev/md0 -a /dev/sdd
instead of mdadm --manage /dev/md0 -a /dev/sdd4

And it started syncing; I stopped it in time, but I can't delete it anymore.

Code: Select all

root@AS-304T-4BA9:~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md126 : active raid1 sdb3[0] sdc3[4] sda3[1]
      2096064 blocks super 1.2 [4/3] [UUU_]
      
md0 : active raid1 sdd[4](F) sdb2[0] sdc2[5] sda2[1]
      2096064 blocks super 1.2 [4/3] [UU_U]
But I can't remove it anymore.

Code: Select all

root@AS-304T-4BA9:~ # mdadm --manage /dev/md0 --remove /dev/sdd
mdadm: Cannot find /dev/sdd: No such file or directory
root@AS-304T-4BA9:~ # mdadm --manage /dev/md0 --remove /dev/sdd4
mdadm: Cannot find /dev/sdd4: No such file or directory
Erm, why did you do this on md0?
I think I will keep off things until I'm sure what I have to do.
Because I was just thinking: if I force rebuild a RAID 5 on 3 disks, that would mean I would lose free space on those 3 disks, and I don't know if I have enough free space on those.
Another thing to think of: before removing, the partition should be shrunk.
but you can install lsblk from opkg entware.
How do I do this? Because opkg install lsblk says that opkg is not found.
Oh yes, your volume1 is not available yet, so opkg/Entware isn't accessible.
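
As an aside, when the device node has already disappeared, mdadm can usually drop the stale member by keyword instead of by name. A minimal sketch based on the standard mdadm manage-mode options (I haven't tested this on ADM specifically):

Code: Select all

# remove any member that is marked faulty in md0
mdadm /dev/md0 --remove failed
# or remove any member whose device node no longer exists
mdadm /dev/md0 --remove detached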
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get faster response
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

Nazar78 wrote:Erm, why did you do this on md0?
I saw this on YT. But as said, this was stupid of me, though I managed to undo it. /dev/md0 now looks like it did before.
Nazar78 wrote:4. Ask mdadm to purposely remove the disk #4 for all the array partitions 2,3,4, so now your raid5 will only have 3 disks.
I'm indeed not confident with that, because a RAID array always needs space. So if you build a RAID 5 on 3 disks, the space it needs is taken from those 3 disks, and I'm not confident I have enough space on those 3 to build a RAID 5 on them.
I'd rather not shrink them, because all my data is on them.

I already have a ticket open with Asustor, but from past experience they don't answer that fast.

So I would like to rebuild, but with all 4 disks, if that is possible.
Last edited by Richard on Tue Aug 24, 2021 2:37 am, edited 1 time in total.
User avatar
Nazar78
Posts: 2002
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

Richard wrote:Just wondering, what would happen if I used that force rebuild command with the new disk inserted?
Don't do this; it will damage your partition, as the array size will end up smaller than the partition. The aim here is for you to be able to see your volume1 so you can back it up.

Can you do a cat /proc/mdstat now?
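
If you want to sanity-check that size concern yourself, here is a rough sketch, assuming the RAID 5 members are the #4 partitions as in your earlier output:

Code: Select all

# compare the sizes each member reports; the "Array Size" after a forced
# rebuild must not end up smaller than what the filesystem on volume1 expects
mdadm --examine /dev/sd[a-c]4 | grep -E 'Array Size|Avail Dev Size|Used Dev Size'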
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get faster response
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

With 3 disks inserted, this is the current cat /proc/mdstat.

Code: Select all

root@AS-304T-4BA9:~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md126 : active raid1 sdb3[0] sdc3[4] sda3[1]
      2096064 blocks super 1.2 [4/3] [UUU_]
      
md0 : active raid1 sdb2[0] sdc2[5] sda2[1]
      2096064 blocks super 1.2 [4/3] [UU_U]
      
unused devices: <none>
I edited my previous messages, as I removed my mistake from md0.
User avatar
Nazar78
Posts: 2002
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

OK, now insert the old disk #4, wait a while, then run cat /proc/mdstat again and post it here so we can see any differences.

If you see md1, quickly back up volume1; if you don't see md1, try to force-assemble it:

Code: Select all

mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-d]4
Remember, this is for md1 and partition #4 on disks a, b, c and d.
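
If md1 does come up, a rough sketch of how you could check it and copy data off (the mount point is just an example, and this assumes volume1 sits directly on md1; ADM normally mounts it by itself once the array is healthy):

Code: Select all

cat /proc/mdstat                 # md1 should now show up as active
mkdir -p /tmp/vol1
mount -o ro /dev/md1 /tmp/vol1   # mount read-only, only to copy the data out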
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get faster response
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

With the old disk 4 inserted, it looks exactly the same:

Code: Select all

root@AS-304T-4BA9:~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md126 : active raid1 sdb3[0] sdc3[4] sda3[1]
      2096064 blocks super 1.2 [4/3] [UUU_]
      
md0 : active raid1 sdb2[0] sdc2[5] sda2[1]
      2096064 blocks super 1.2 [4/3] [UU_U]
      
unused devices: <none>
So I tried the new command you gave and now this is the output:

Code: Select all

root@AS-304T-4BA9:~ # mdadm --assemble --force --verbose /dev/md1 /dev/sd[a-d]4
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda4 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdb4 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdc4 is identified as a member of /dev/md1, slot 2.
mdadm: added /dev/sda4 to /dev/md1 as 1
mdadm: added /dev/sdc4 to /dev/md1 as 2
mdadm: no uptodate device for slot 6 of /dev/md1
mdadm: added /dev/sdb4 to /dev/md1 as 0
mdadm: /dev/md1 assembled from 2 drives and 1 rebuilding - not enough to start the array.
Still no md1 though. I don't understand why it's still giving the "not enough to start the array" notice.
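
For what it's worth, this is how the per-member view can be pulled to see what mdadm thinks of each partition (a sketch using the standard mdadm --examine fields; device names taken from the output above):

Code: Select all

# the event counters and roles should agree on healthy members;
# a member that is far behind or listed as spare is the odd one out
mdadm --examine /dev/sd[a-d]4 | grep -E '/dev/sd|Events|Device Role|Array State'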
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

I'm not really into RAID... but could it be that in your command we should use --level=5 --raid-devices=4 as part of the command line?
Or is that only when a new array is created?

Also, mdadm --detail still says Raid Devices: 4 and Total Devices: 3, and at the bottom it lists one as removed.

Shouldn't the 4th disk just be added again? That would also start an automatic resync.
It also started to resync when I did the stupid thing of adding /dev/sdd to md0, but that was wrong, and luckily I managed to stop the resync instantly.

But I thought that a disk removed by the system had to be added again. I could be mistaken, I'm not sure of anything anymore. ;)

I found some info about the error in the last line and why it happens when drives are marked as removed.
https://www.thegeekdiary.com/not-enough ... aid-array/
User avatar
Nazar78
Posts: 2002
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

Richard wrote:I'm not really into RAID... but could it be that in your command we should use --level=5 --raid-devices=4 as part of the command line?
Or is that only when a new array is created?
It will automatically determine the existing RAID level and number of disks; we only pass those extra options when creating an array. I do this via a script that assembles my external USB enclosure RAID 5 on boot: mdadm --assemble --force --verbose /dev/md2 /dev/sd[f-i] >/dev/null
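
In case it helps later, the script itself is nothing exotic; a sketch of mine (the mount point is specific to my setup, so treat it as an example):

Code: Select all

#!/bin/sh
# reassemble the external USB enclosure RAID 5 at startup
mdadm --assemble --force --verbose /dev/md2 /dev/sd[f-i] >/dev/null 2>&1
# then mount it (example path)
mount /dev/md2 /share/external 2>/dev/null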

Also, mdadm --detail still says Raid Devices: 4 and Total Devices: 3, and at the bottom it lists one as removed.
Because it can't find your /dev/sdd4. This is a data partition created by ADM when a new disk is inserted and initialized (besides the superblock and OS/swap partitions). I also don't recall seeing this disk #4 in your earlier fdisk -l output; with your old disk #4 inserted it should have shown the partitions.
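
A quick way to confirm whether the kernel even sees a data partition on that disk, assuming the old disk still shows up as /dev/sdd:

Code: Select all

ls -l /dev/sdd*                  # partition nodes the kernel created
grep sdd /proc/partitions
mdadm --examine /dev/sdd4        # only works if sdd4 exists and carries a RAID superblock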

Shouldn't the 4th disk just be added again? That would also start an automatic resync.
Yes, and this is weird. The question is: why slot 6? It should be slots 0, 1, 2, 3, where 3 is your last bay. Let Asustor Support answer this.

It also started to resync when I did the stupid thing of adding /dev/sdd to md0, but that was wrong, and luckily I managed to stop the resync instantly.

But I thought that a disk removed by the system had to be added again. I could be mistaken, I'm not sure of anything anymore. ;)
Yes, it should be that way. I think the problem started when the NAS was first shut down rather than hot-swapped, but then again I've also had no issues shutting down a broken array, starting it up and then reassembling as normal. I hope you have an existing backup of the RAID 5 volume1, at least of the important data.
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get faster response
Richard
Posts: 66
Joined: Sat Jan 03, 2015 4:38 am

Re: Raid 5 inaccessible after defective drive swap

Post by Richard »

Nazar78 wrote:I also don't recall seeing this disk #4 in your earlier fdisk -l output; with your old disk #4 inserted it should have shown the partitions.
When I do an fdisk -l with the old disk present, I see something about the 4th disk, but I'm not sure if that is because of my previous action or not.

This is the last part of fdisk -l with the old disk 4 present:

Code: Select all

Disk /dev/sdd: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdd1               1          33      261120  83 Linux
Partition 1 does not end on cylinder boundary
/dev/sdd4               1           1        1023+ ee EFI GPT
Partition 4 does not end on cylinder boundary
Drive /dev/sdd shown above should be my 4th disk, so it seems fdisk can see it, if I'm not mistaken.
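
Side note on that output: the "ee EFI GPT" row is just the protective MBR entry an old fdisk shows for a GPT-partitioned disk, so the real ADM partitions are not visible there. A GPT-aware listing would show them; whether either tool is installed on ADM is an assumption:

Code: Select all

parted -s /dev/sdd print
# or
sgdisk -p /dev/sdd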

Unfortunately I do not have a backup, as the NAS is my backup system. So I still have the original system, but I also used the NAS to keep music and loads of movies, which I don't have a backup for, as I did not expect a RAID 5 to fail. And I wouldn't have enough space anywhere else to back up a 5 TB NAS anyway.

With my servers in the datacenter, on a drive replacement they also shut down the system, replaced the hard disk and started it again, and the RAID rebuilt nicely. So I have no clue why it suddenly became inaccessible on a NAS. I hope Asustor will find a solution.

But that slot 6 is odd indeed; I'm curious what Asustor's answer to that will be.
User avatar
Nazar78
Posts: 2002
Joined: Wed Jul 17, 2019 10:21 pm
Location: Singapore
Contact:

Re: Raid 5 inaccessible after defective drive swap

Post by Nazar78 »

If you have a machine with a USB enclosure, then while waiting for Asustor's reply you can try mounting the disks on that machine using Linux (e.g. Debian/Ubuntu), just to see if you can view the RAID 5. Sorry, I don't mean to add more despair, but RAID is for redundancy, not backup; most datacenters I've been to always have backups on top of the arrays, as part of their Disaster Recovery Plan (DRP). I do hope all ends well.
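
A rough sketch of what that would look like on a Debian/Ubuntu machine (the device names there will certainly differ, so check lsblk or dmesg first; treat the names below as placeholders):

Code: Select all

sudo apt-get install mdadm
# assemble from the four data partitions, whatever names they get on that machine
sudo mdadm --assemble --force --verbose /dev/md1 /dev/sd[b-e]4
sudo mount -o ro /dev/md1 /mnt    # mount read-only, just to copy the data off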
AS5304T - 16GB DDR4 - ADM-OS modded on 2GB RAM
Internal:
- 4x10TB Toshiba RAID10 Ext4-Journal=Off
External 5 Bay USB3:
- 4x2TB Seagate modded RAID0 Btrfs-Compression
- 480GB Intel SSD for modded dm-cache (initramfs auto update patch) and Apps

When posting, consider checking the box "Notify me when a reply is posted" to get faster response