Problem expanding pool after drive upgrade

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I don't honestly know which versions of ZFS are on CORE and SCALE; I could look tomorrow. If they are the same version, moving to CORE just to make the VDEV work might be worth it, then go back to SCALE. And SCALE did get a minor update today, 23.10.1.1. The changelog didn't mention anything about ZFS changes.
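For anyone who wants to check rather than wait, the version is one command away on either platform (this assumes OpenZFS 0.8 or later, where the version subcommand exists):
Code:
zfs version
zpool version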
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
Unfortunately, versions went away in favor of feature flags... so far, short of turning up a current Core (which I might do in a bit) and creating a sample pool, I haven't found a good reference on what's supported in Core vs Scale. I'm a little hesitant to try jumping to Core - something is obviously a touch wonky somewhere, and I don't want to upset the balance and wind up with the vdev, and subsequently the pool, going tango uniform on me.
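If I do spin up a Core box, the comparison would presumably be something like the following on each side, then diffing the output (Tier3 is the pool in question):
Code:
# feature flags supported by the running OpenZFS build
zpool upgrade -v

# feature flags currently enabled/active on the pool
zpool get all Tier3 | grep feature@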

I pulled the zpool history and found that the creation date for this pool in its current form was Christmas Eve 2015:
Code:
2015-12-24.04:24:06 zpool create -o cachefile=/data/zfs/zpool.cache -o failmode=continue -o autoexpand=on -O compression=lz4 -O aclmode=passthrough -O aclinherit=passthrough -f -m /Tier3 -o altroot=/mnt
Tier3 raidz2 /dev/gptid/26d2563c-a9f6-11e5-b3e9-002590869c3c /dev/gptid/277b82e6-a9f6-11e5-b3e9-002590869c3c /dev/gptid/28279512-a9f6-11e5-b3e9-002590869c3c /dev/gptid/28d8c104-a9f6-11e5-b3e9-002590869c3c /dev/gptid/298897cc-a9f6-11e5-b3e9-002590869c3c /dev/gptid/2a3c8e32-a9f6-11e5-b3e9-002590869c3c spare /dev/gptid/2af81b3c-a9f6-11e5-b3e9-002590869c3c
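For completeness, these are the sort of checks that should show whether ZFS sees the extra space after the swaps (the device name in the last line is just a placeholder for the replaced disk's entry in zpool status):
Code:
# EXPANDSZ column shows unclaimed capacity per vdev member
zpool list -v Tier3

# autoexpand was set at creation; expandsize is the pool-wide figure
zpool get autoexpand,expandsize Tier3

# manual nudge if autoexpand never fires, per replaced disk
zpool online -e Tier3 <replaced-disk-id-from-zpool-status>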


Unless the iX gurus come up with something, I think I'm just going to say that more than 8 years is a pretty damn good life for a pool and go ahead and rebuild with the new drives. It'll let me clean out some cruft at the same time.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Unless the iX gurus come up with something, I think I'm just going to say that more than 8 years is a pretty damn good life for a pool and go ahead and rebuild with the new drives. It'll let me clean out some cruft at the same time.
I can understand that. It would be nice to know if there is a solution to your issue; however, you have lived with it long enough, so it's time to get on with life.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
I can understand that. It would be nice to know if there is a solution to your issue; however, you have lived with it long enough, so it's time to get on with life.
Yep. I haven't written it off quite yet. The new drives supposedly arrive end of day tomorrow. Then it will be 9-10 days to run badblocks and a long SMART test. The bug report was filed with details and a debug and was very quickly assigned, where others seem to languish, so maybe the iX gurus will have an answer. If nothing else, maybe it will prevent someone else from having the same issue I am.
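(For anyone curious about the burn-in, it's roughly the usual recipe below. It is destructive, so only run it on empty drives; /dev/sdX is a placeholder.)
Code:
# four-pattern destructive write/read test; 4k blocks keep the
# block count under badblocks' 32-bit limit on large drives
badblocks -b 4096 -wsv /dev/sdX

# then an extended SMART self-test, checking results afterwards
smartctl -t long /dev/sdX
smartctl -a /dev/sdX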
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
Because I'm an impatient fsck, I did a fast badblocks run... no errors, so I pressed on. Built a new pool of the 18TBs, replicated the data, exported and imported to rename, confirmed everything working, then built a second vdev with the 20TB drives. Dealt with a little Active Directory issue (unrelated), but now all is well. And I should have enough free space for at least a year or two:
[attached screenshot: new pool capacity]
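(The CLI shape of that shuffle, if anyone wants it, is roughly the sketch below. I'm glossing over details, the temporary pool name is made up, and on TrueNAS you'd normally drive the copy as a replication task in the UI.)
Code:
# snapshot everything and replicate to the pool built on the 18TBs
zfs snapshot -r Tier3@migrate
zfs send -R Tier3@migrate | zfs recv -F NewTier3

# retire the old pool, then rename the new one on import
zpool export Tier3
zpool export NewTier3
zpool import NewTier3 Tier3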
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
And I should have enough free space for at least a year or two:
LOL, you think! ~138TB usable capacity, that is a lot. I'm happy with ~8TB of storage. Of course, my use case is different than yours.

Glad you have everything up and running again.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I think there's a bug in SCALE where the partition created on the replacement disk is the same size as on the old disk--I'm encountering the same issue. Ticket here:
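A quick way to spot an undersized replacement is to compare the data partition against the raw disk (device names below are placeholders):
Code:
# raw disk and partition sizes in bytes
lsblk -b -o NAME,SIZE,TYPE /dev/sdX

# GPT view with start/end sectors for each partition
sgdisk -p /dev/sdX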
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I think there's a bug in SCALE where the partition created on the replacement disk is the same size as on the old disk--I'm encountering the same issue. Ticket here:
That is good to know about, but it sucks to have the bug in the first place.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
sucks to have the bug in the first place.
Agreed. There was a bug in 23.10.0 that was supposed to be fixed in 23.10.1, but apparently it wasn't.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
Agreed. There was a bug in 23.10.0 that was supposed to be fixed in 23.10.1, but apparently it wasn't.
Wonder if I can forward my second Newegg bill to iX for reimbursement?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Wonder if I can forward my second Newegg bill to iX for reimbursement?
You can, and they would probably enjoy the humor it brings to a dark, gloomy day.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
You can, and they would probably enjoy the humor it brings to a dark, gloomy day.
Lol I figured as much. I can't complain... TN (and FN) have served me very well for well over a decade, after I realized that Synology wasn't really all that.

The bug has been acknowledged and assigned highest priority for a fix. Looks like it's slated for 24.04/24.10. Alexander noticed something I didn't... after the resize, the partition start gets changed from sector 4096 to sector 2048. No wonder the drive goes missing. I guess it's exceedingly lucky that the middleware apparently does the resize serially and not in parallel - if it had done that to all 6 drives, it would have been a really bad day, especially on a pool with more than one vdev.
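(The start-sector shift is easy to spot if anyone wants to check their own disks; the partition number and device below are just examples.)
Code:
# per-partition detail, including "First sector"
sgdisk -i 1 /dev/sdX

# or the whole table in sectors
parted -s /dev/sdX unit s print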
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Annoying bug this one. It's one of those that takes forever to convince you that it is a real bug.
Not TrueNAS, but a similar vibe: the other day I spent an afternoon trying to migrate a few SATA boot SSDs to NVMe, but the partitions were always getting messed up, with warnings that they'd exceeded the available LBAs, even though the new disks were larger. Turns out, a while before this, I'd formatted the new SSDs to present 4k blocks instead of 512-byte blocks, since they'd be using ashift=12 anyway. By the time I got around to using the disks, I'd forgotten all about it.
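(If anyone wants to check their own NVMe namespaces, it looks roughly like this with nvme-cli. The right format index varies by model, and reformatting erases the drive.)
Code:
# list supported LBA formats; "(in use)" marks the current one
nvme id-ns -H /dev/nvme0n1

# destructive: switch the namespace to LBA format index 1 (often 4K)
nvme format /dev/nvme0n1 --lbaf=1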
 
Top