(another) Failed drive, can't replace through documentation thr

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
Hi, all;

I'm using SCALE 22.12.2 on a FreeNAS Mini XL.

I had a drive fail in my pool (red overlay on the drive in the VDEV/Device list), ordered a replacement drive, and went through the documentation steps to replace it. When those didn't work, I found the long list of posts about Failed = Offline, powering off, etc. I replaced the drive and powered back up, and now have this:

1710707961004.png



When I try the replace on the disk info, it lists the former sdd member in the member disk:


1710708346479.png


but this is the opposite of what the docs say (sdd should be the Replacing Disk, and Member Disk should be 1701..etc at least from my reading).

From the other threads, it seems like those people were able to go through the process properly when they brought their systems back up, and mine isn't.

Before I start poking at things, I want advice. What more information would be helpful to have?

Thanks!
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I don't understand the question. you are replacing the missing 1701 with existing sdd.

also, just the model # of a freenas mini xl doesn't really satisfy the forum requirements for listing your hardware. while limited, the minis do still have varying hardware configs - mainly RAM and any add-in card.
 

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
I apologize - I was tired when I wrote the first post, I'll try again.

My pool had started giving me increasing error counts and offline uncorrectable sector error messages. I ordered a replacement drive under warranty. The pool has 4 member drives, and the drive having problems was sdd. The new disk is the same model, capacity, etc. as the other members of the pool.

When the drive failed (the sdd member had a red flagged status and the pool had a yellow degraded flag), I followed the steps in the documentation for a swap. In the step to offline the drive to prep for the swap, the entry for sdd would not go offline.

I looked at the forums and found several posts about this issue, the basic answer being that a failed status was an offline status already. Just power the system off, swap the drives, power back up and continue the process of replacement.

When I did that, the new drive appeared as the 1701etc (first screen shot), and when I tried to place it into the pool the second screen shot made it appear that 1701etc was already a member of the pool and replacement seemed to be the existing sdd.

It seems that the system doesn't recognize the disk as a valid addition to/replacement in the pool, but these are the steps that the manual says to follow: https://www.truenas.com/docs/scale/22.12/scaletutorials/storage/pools/disks/replacingdisks/. I used the "My Disk is faulted. Should I replace it?" step, and that's where I'm stuck. Either I did something incorrectly when I removed the drive, or I'm doing something wrong when trying to add/replace the drive.

I'm sorry about the wrong hardware info. The system has been great until this drive errored, and it doesn't seem to be a system fault. The only change I've made to the stock system is an upgrade to 64 GB of RAM.
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
Can you go to storage - disks and confirm sdd is currently not assigned any pool?

The identifiers sdX change between reboots. It may very well be that sdd is your unused drive. The former member of the pool 1701 is not available anymore, since you replaced it.
You could also verify in the disks section that the serial of sdd matches your new drive.
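
If a shell is available, the same serial check can be scripted. The following is only a sketch: it assumes `lsblk` from util-linux is present on the system and that its JSON output uses the usual lowercase `name`, `serial` and `type` keys. The helper names are made up for illustration.

```python
import json
import subprocess


def parse_lsblk(json_text):
    """Return {device name: serial} for whole disks found in lsblk JSON output."""
    data = json.loads(json_text)
    return {
        dev["name"]: dev.get("serial")
        for dev in data.get("blockdevices", [])
        if dev.get("type") == "disk"
    }


def find_device_by_serial(mapping, serial):
    """Return the current sdX name that carries the given serial, or None."""
    for name, dev_serial in mapping.items():
        if dev_serial == serial:
            return name
    return None


def current_disks():
    """Ask lsblk for whole disks only (-d) as JSON (-J) and parse the result."""
    result = subprocess.run(
        ["lsblk", "-J", "-d", "-o", "NAME,SERIAL,TYPE"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_lsblk(result.stdout)
```

Calling `find_device_by_serial(current_disks(), "<serial printed on the new drive>")` after each boot shows which sdX name the new drive currently has, since that name is not guaranteed to stay the same across reboots.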
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
it looks normal to me; it's telling you that you are replacing 1701 and asking which disk you would like to replace it with. it will not show already used disks in the list. the wording could perhaps be more clear.
 

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
chuck32 said:
Can you go to storage - disks and confirm sdd is currently not assigned any pool?

The identifiers sdX change between reboots. It may very well be that sdd is your unused drive. The former member of the pool 1701 is not available anymore, since you replaced it.
You could also verify in the disks section that the serial of sdd matches your new drive.

Yes, I made sure that the serial of the drive I removed matched the one in the error message. If I had pulled a good/non-errored drive by accident, wouldn't that have faulted the pool as a whole? I can still access the data on it and it is only showing as degraded. I just checked in Storage -> Disks: based on the serial number in the error message, the drive I pulled (the one that was assigned sdd) isn't in the list of drives still installed.
 

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
So in this screen shot:

1711593695197.png


The system thinks that 1701etc is already a member of the pool, even though the disk isn't assigned an id? Ok.
Is there a way to remove or delete it from the list of assets, so I can add it into the pool and restore the full functionality of the pool?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
that is not a drive, that is a placeholder for where a drive used to be but is now gone/missing/etc. you need to replace the placeholder with a drive, which is exactly what the screenshot you originally posted is showing - replace 1701blah with sdd.

it seems like you are making this more complicated than it needs to be. the placeholder varies; /dev/(s)da, /dev/gptid, a string of numbers like here. it doesn't matter what it is, just replace it with a new disk.
truenas will only show unused disks in the replacement UI.
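
To make the placeholder idea concrete, here is a rough sketch that scans `zpool status` style text for member rows that are no longer backed by a working disk. The sample text is hypothetical: the pool name, vdev layout and the `1701xxxxx` entry are stand-ins, not output from the system in this thread, and the function name is made up for illustration.

```python
# States that mean a member row needs attention; ONLINE rows are healthy,
# and DEGRADED is skipped because the pool and vdev rows report it too.
MISSING_STATES = {"FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

# Hypothetical sample, shaped like typical `zpool status` output.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

\tNAME          STATE     READ WRITE CKSUM
\ttank          DEGRADED     0     0     0
\t  raidz1-0    DEGRADED     0     0     0
\t    sda2      ONLINE       0     0     0
\t    sdb2      ONLINE       0     0     0
\t    sdc2      ONLINE       0     0     0
\t    1701xxxxx UNAVAIL      0     0     0  was /dev/sdd2

errors: No known data errors
"""


def find_missing_members(status_text):
    """Return (name, state) pairs for rows in the config section that are missing."""
    missing = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if stripped.startswith("errors:"):
            in_config = False
        if not in_config:
            continue
        fields = stripped.split()
        # A member row looks like: NAME STATE READ WRITE CKSUM [note]
        if len(fields) >= 2 and fields[1] in MISSING_STATES:
            missing.append((fields[0], fields[1]))
    return missing
```

On the sample above, the only row reported is the numeric placeholder. That row is the slot the Replace dialog lists as the member disk, and the unused disk is what gets put in its place.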
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Yes, the dialog is confusing; it's asking you which disk to replace 1701xxxxx etc. with.

The answer is "the only disk which is not in use" which is the only disk in the list... which is sdd
 