Is this a bad sign: smartd: 1 Currently unreadable (pending) sectors....?

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I would plug the new drive into a Windows machine and delete the partition it has. That could be the problem.
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
But wouldn't wiping it (in FreeNAS) have fixed this in the first place?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I suppose that would do the same thing.
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
Then that means it still doesn't want the hard disk; it still says the hard disk is too small.
I am really getting the impression I did something wrong somewhere down the line.

:(

I checked the status with zpool status command and it shows the new drive being offline. I also fooled around a bit by trying to online it from the command prompt, however then it would change to unavail(able). I really expected replacing a faulty harddisk with a new one to be less problematic than this. :(


//edit

This is what zpool status -v says:

[root@freenas] ~# zpool status -v
pool: storage
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 3h42m with 0 errors on Tue Nov 27 12:39:34 2012
config:

NAME STATE READ WRITE CKSUM
storage DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
gptid/19177fb9-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
6201553240551106299 OFFLINE 0 0 0 was /dev/gptid/1a551ccf-25fa-11e2-9ab0-00151736994a
gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0

errors: No known data errors

Probably not useful, but I am getting desperate now in finding a solution...

//update 2

Shouldn't it show ada0 / ada1 / ada2 / etc instead of 'gptid'-etc...?
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
HHawk said:
Then that means it still doesn't want the hard disk; it still says the hard disk is too small.
I am really getting the impression I did something wrong somewhere down the line.

:(

I checked the status with zpool status command and it shows the new drive being offline. I also fooled around a bit by trying to online it from the command prompt, however then it would change to unavail(able). I really expected replacing a faulty harddisk with a new one to be less problematic than this. :(

If there weren't dozens of others having the exact same problem using various versions of ZFS on various platforms, I'd be inclined to agree. It's more likely that you're the unlucky winner in the drive replacement lottery.

If I were the one facing this issue, I'd find/buy a larger drive and install it.

A former Sun employee references the "too small" problem here: Home Server: RAID-GREED and Why Mirroring is Still Best
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
bollar said:
If there weren't dozens of others having the exact same problem using various versions of ZFS on various platforms, I'd be inclined to agree. It's more likely that you're the unlucky winner in the drive replacement lottery.

If I were the one facing this issue, I'd find/buy a larger drive and install it.

A former Sun employee references the "too small" problem here: Home Server: RAID-GREED and Why Mirroring is Still Best

Yes you mentioned this earlier as well, however why did it say the same thing with the original drive after wiping it? It doesn't make any sense. It should have accepted the original / old drive after wiping... Right?

//edit

Also mentioned on the same page you referred me to, and I quote:

Oh, and as a side note: I bought three different 1.5 TB drives to check for the "sorry-your-disk-is-just-a-few-blocks-too-small-error" and can report that the 1.5TB from WD, Samsung (F2) and Seagate (7200 rpm) are indeed the same size and can be used interchangeable.

So I sincerely doubt this is the problem. I think it's something else.
 

bollar

Patron
Joined
Oct 28, 2012
Messages
411
HHawk said:
Yes you mentioned this earlier as well, however why did it say the same thing with the original drive after wiping it? It doesn't make any sense. It should have accepted the original / old drive after wiping... Right?

Maybe/maybe not -- maybe one of the other five disks in the vdev is a different size or something like that. In any event, that's not something that is fixable.

What we do know is that the drives you have aren't being accepted by the array and the zpool status -v looks normal for a degraded array waiting for a disk. And you already know what I'd do if I had this problem.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'd wait and see if paleoN or protosd answers back. They know a boatload of crazy little things that have saved random people here and there. I'm sure one of them will pop in sometime today.

Edit: I'm not convinced that this isn't fixable. But I'm not sure how exactly to fix it.

Also, what version of FreeNAS are you using? I looked through the thread and I don't think you said exactly what version you have.
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
cyberjock said:
I'd wait and see if paleoN or protosd answers back. They know a boatload of crazy little things that have saved random people here and there. I'm sure one of them will pop in sometime today.

Yeah, I will wait before I start doing any permanent damage. Hopefully they can tell what's wrong from today's posts...

And I really think the problem lies somewhere else. When I read up on the FreeNAS docs, they show that it should say it is going to replace some adaX (see here: http://doc.freenas.org/images/e/e3/Replace1a.jpeg) instead of None (as it displays for me). I think that is the problem: if FreeNAS is replacing a disk called 'None', then how would it know its size? And therefore I am getting the error. That's my best guess...

//edit

I am using the latest FreeNAS 8.3.0. I thought I mentioned this before, but apparently I only mentioned 'latest version', my bad.
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
In case paleoN or protosd decide to take another look, I decided to summarize my findings.

Received a replacement harddisk from Western Digital:
Device Model: WDC WD20EARX-008FB0
Original drive was a: Device Model: WDC WD20EARX-00PASB0
Otherwise everything is the same: same capacity, same number of bytes, etc.

More detailed information on both drives can be found here: http://forums.freenas.org/showthrea...pending)-sectors&p=45027&viewfull=1#post45027

I am getting the following error when I select 'Replace' in 'Volume Status':
Dec 3 19:56:15 freenas notifier: swapoff: /dev/ada2p1: Invalid argument
Dec 3 19:56:15 freenas notifier: 1+0 records in
Dec 3 19:56:15 freenas notifier: 1+0 records out
Dec 3 19:56:15 freenas notifier: 1048576 bytes transferred in 0.003348 secs (313184256 bytes/sec)
Dec 3 19:56:16 freenas notifier: dd: /dev/ada2: short write on character device
Dec 3 19:56:16 freenas notifier: dd: /dev/ada2: end of device
Dec 3 19:56:16 freenas notifier: 5+0 records in
Dec 3 19:56:16 freenas notifier: 4+1 records out
Dec 3 19:56:16 freenas notifier: 4284416 bytes transferred in 0.047198 secs (90775262 bytes/sec)
Dec 3 19:56:17 freenas manage.py: [middleware.exceptions:38] [MiddlewareError: Disk replacement failed: "cannot replace 6201553240551106299 with gptid/1dbe3cfd-3d7b-11e2-8af1-00151736994a: device is too small, "]

When selecting replace disk (in Disk Replacement screen) it says:
Replacing disk None
Member disk ada2 (2.0 TB)

Running zpool status -v shows this (don't know if this is normal):

[root@freenas] ~# zpool status -v
pool: storage
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 3h42m with 0 errors on Tue Nov 27 12:39:34 2012
config:

NAME STATE READ WRITE CKSUM
storage DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
gptid/19177fb9-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
6201553240551106299 OFFLINE 0 0 0 was /dev/gptid/1a551ccf-25fa-11e2-9ab0-00151736994a
gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0
gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a ONLINE 0 0 0

errors: No known data errors

Running FreeNAS 8.3.0

I think I have made a nice summary (instead of searching through all posts) in case paleoN or protosd are taking a look in this thread...
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
HHawk said:
I think I have made a nice summary (instead of searching through all posts) in case paleoN or protosd are taking a look in this thread...

Just as I finished reading through the rest of it. The summary would be nice if you used [code][/code] tags, do so for future postings at least.

Output of the following:
Code:
camcontrol identify /dev/ada2

camcontrol identify /dev/ada1
You can drop the Features part. So, that's the replacement drive and one of the existing drives in the pool.

Also:
Code:
gpart show

glabel status
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Oh yeah.. the big guns are here!
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
I was just planning to go to bed and saw you replied paleoN. Thanks!!

Here is the output (in code-tags).

Code:
pass2: <WDC WD20EARX-008FB0 51.0AB51> ATA-8 SATA 3.x device
pass2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)

protocol              ATA/ATAPI-8 SATA 3.x
device model          WDC WD20EARX-008FB0
firmware revision     51.0AB51
serial number         WD-WCAZAJ407237
WWN                   50014ee2079f3b54
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 4096, offset 0
LBA supported         268435455 sectors
LBA48 supported       3907029168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6 


Code:
pass1: <WDC WD20EARX-00PASB0 51.0AB51> ATA-8 SATA 3.x device
pass1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)

protocol              ATA/ATAPI-8 SATA 3.x
device model          WDC WD20EARX-00PASB0
firmware revision     51.0AB51
serial number         WD-WMAZA5281618
WWN                   50014ee0581f67ed
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 4096, offset 0
LBA supported         268435455 sectors
LBA48 supported       3907029168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6 

Seems identical enough.

gpart show output:
Code:
=>        34  3907029101  ada0  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)

=>        34  3907029101  ada1  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)

=>        34  3907029101  ada3  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)

=>        34  3907029101  ada4  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)

=>        34  3907029101  ada5  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)

=>      63  31293377  da0  MBR  (14G)
        63   1930257    1  freebsd  [active]  (942M)
   1930320        63       - free -  (31k)
   1930383   1930257    2  freebsd  (942M)
   3860640      3024    3  freebsd  (1.5M)
   3863664     41328    4  freebsd  (20M)
   3904992  27388448       - free -  (13G)

=>      0  1930257  da0s1  BSD  (942M)
        0       16         - free -  (8.0k)
       16  1930241      1  !0  (942M)

=>        34  3907029101  ada2  GPT  (1.8T)
          34          94        - free -  (47k)
         128    20971520     1  freebsd-swap  (10G)
    20971648  3886057480     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)


And finally glabel status:
Code:
                                      Name  Status  Components
gptid/19177fb9-25fa-11e2-9ab0-00151736994a     N/A  ada0p2
gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a     N/A  ada1p2
gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a     N/A  ada3p2
gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a     N/A  ada4p2
gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a     N/A  ada5p2
                             ufs/FreeNASs3     N/A  da0s3
                             ufs/FreeNASs4     N/A  da0s4
                            ufs/FreeNASs1a     N/A  da0s1a
gptid/54957c6c-3d8e-11e2-8af1-00151736994a     N/A  ada2p2
gptid/548988f0-3d8e-11e2-8af1-00151736994a     N/A  ada2p1


If you need anything else let me know. Thanks.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
HHawk said:
Seems identical enough.

They are. Just wanted to double check. As long as they are both reporting the same number of same-sized sectors, they are the same size.
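
For anyone who wants to check that by hand, here is a rough sketch in Python that turns the sector counts from the two camcontrol outputs into byte capacities. The helper name `capacity_bytes` and the dictionary labels are made up for illustration; the 512-byte figure is the logical sector size both drives report.

```python
# Logical sector size and "LBA48 supported" sector counts, copied from the
# two camcontrol identify outputs posted in this thread.
LOGICAL_SECTOR_BYTES = 512

drives = {
    "WDC WD20EARX-008FB0 (replacement)": 3907029168,
    "WDC WD20EARX-00PASB0 (existing)": 3907029168,
}

def capacity_bytes(sectors, sector_bytes=LOGICAL_SECTOR_BYTES):
    """Raw capacity in bytes for a given number of logical sectors."""
    return sectors * sector_bytes

capacities = {name: capacity_bytes(count) for name, count in drives.items()}

for name, size in capacities.items():
    # Decimal TB is the vendor label; binary TiB is what gpart rounds to 1.8T.
    print(f"{name}: {size} bytes = {size / 1000**4:.2f} TB = {size / 1024**4:.2f} TiB")

# The drives are the same raw size if every computed capacity is identical.
same_size = len(set(capacities.values())) == 1
print("Same raw size:", same_size)
```

Both drives come out at exactly the same byte count, so the raw disk size is not what ZFS is complaining about.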

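The problem is the 10G swap partition on ada2. The other disks have a 2G swap partition, so the freebsd-zfs partition created on the new disk ends up smaller than the ones already in the pool, and that is what "device is too small" refers to. I had a suspicion it might be swap related. Just thought it would be in the other direction. Change swap size to the default 2 under Advanced Settings, try the replace again and let us know how it turns out.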
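
To put numbers on it, here is a small sketch in Python using only the start and size columns from the gpart show output above. The variable and function names are invented for illustration, and 512-byte sectors are assumed because that is the logical sector size the drives report.

```python
SECTOR_BYTES = 512
GIB = 1024**3

# (start, size) in sectors, copied from the gpart show output in this thread.
existing = {"swap": (128, 4194304), "zfs": (4194432, 3902834696)}       # ada0, ada1, ada3, ada4, ada5
replacement = {"swap": (128, 20971520), "zfs": (20971648, 3886057480)}  # ada2 as partitioned for the replace

def gib(sectors):
    """Convert a sector count to GiB."""
    return sectors * SECTOR_BYTES / GIB

# How much smaller the new ZFS partition is than the ones already in the pool.
shortfall_sectors = existing["zfs"][1] - replacement["zfs"][1]
# How much larger the new swap partition is than the existing ones.
extra_swap_sectors = replacement["swap"][1] - existing["swap"][1]

print(f"existing swap:    {gib(existing['swap'][1]):.1f} GiB")
print(f"replacement swap: {gib(replacement['swap'][1]):.1f} GiB")
print(f"ZFS partition shortfall: {shortfall_sectors} sectors = {gib(shortfall_sectors):.1f} GiB")
print("Shortfall equals extra swap:", shortfall_sectors == extra_swap_sectors)
```

The ZFS partition on ada2 is short by exactly the extra swap space, which is why an otherwise identical disk gets rejected.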
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I thought that the swap may be the problem, but I wasn't sure how to fix it or identify if that is the problem and I wasn't about to give him any advice that would jeopardize his data. Figured I'd let the real pros handle it.
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
paleoN said:
They are. Just wanted to double check. As long as they are both reporting the same number of same sized sectors they are the same size.

The problem is the 10G swap partition on ada2. I had a suspicion it might be swap related. Just thought it would be in the other direction. Change swap size to the default 2 under Advanced Settings, try the replace again and let us know how it turns out.

Man... You are a life-saver! That worked. I did what you said and put the swap back to 2 and clicked on 'Replace' and no error messages. It's now resilvering as we speak.

Code:
scan: resilver in progress since Tue Dec  4 08:43:07 2012
49.0G scanned out of 3.18T at 245M/s, 3h43m to go
8.12G resilvered, 1.50% done
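
Out of curiosity, the estimate can be roughly reproduced from the other figures in that output. This is only a sketch in Python: it assumes zpool reports sizes in binary (1024-based) units and that the time remaining is simply the unscanned data divided by the current rate, neither of which is confirmed anywhere in this thread.

```python
# Figures from the resilver status above. Binary (1024-based) units are an assumption.
MIB_PER_GIB = 1024
MIB_PER_TIB = 1024 * 1024

scanned_mib = 49.0 * MIB_PER_GIB   # "49.0G scanned"
total_mib = 3.18 * MIB_PER_TIB     # "out of 3.18T"
rate_mib_per_s = 245               # "at 245M/s"

# Share of the data scanned so far.
percent_done = scanned_mib / total_mib * 100

# Assumed estimate: data still to scan divided by the current rate.
remaining_s = (total_mib - scanned_mib) / rate_mib_per_s
hours, minutes = divmod(int(remaining_s // 60), 60)

print(f"{percent_done:.2f}% scanned, about {hours}h{minutes:02d}m to go at the current rate")
```

Under those assumptions the numbers line up with the 1.50% and 3h43m shown, which also means the estimate moves whenever the rate does.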


Now I need to be a bit patient. I am really 'glad' I waited and didn't follow bollar's advice to go out and buy a bigger hard disk. Money is an 'issue' these days and I can use it for other things.

I guess I should scrub when the resilvering has been completed right? And do I need to do anything else? Remove or detach things after resilvering? I thought I read something about that somewhere. But I am not sure, I read a lot of things the past (few) days.

Once again, many thanks for the help!!

//update

I think my maximum resilvering speed is 336M/s
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
HHawk said:
I guess I should scrub when the resilvering has been completed right? And do I need to do anything else? Remove or detach things after resilvering? I thought I read something about that somewhere.

This is a bit late, but there's no need to scrub afterwards unless you are concerned about the other drives. A resilver is essentially a scrub, except it only goes over the metadata & data necessary to rebuild the array. In other words, you completely "scrubbed" the new drive and the parts of the other drives needed to rebuild the new one. Of course drive replacement is infrequent enough, or damn well better be ;), that you can scrub afterwards if that makes you more comfortable.

8.3 should be better about not needing a manual detach of the old devices.

HHawk said:
I think my maximum resilvering speed is 336M/s

That's quite high, but initial estimates are unreliable & vary quite a bit. What's the average speed?
 

HHawk

Contributor
Joined
Jun 8, 2011
Messages
176
paleoN said:
This is a bit late, but there's no need to scrub afterwards unless you are concerned about the other drives. A resilver is essentially a scrub, except it only goes over the metadata & data necessary to rebuild the array. In other words, you completely "scrubbed" the new drive and the parts of the other drives needed to rebuild the new one. Of course drive replacement is infrequent enough, or damn well better be ;), that you can scrub afterwards if that makes you more comfortable.

8.3 should be better with not needing to detach the old devices.

No problem. I did a scrub anyway, just to see if everything was working without nasty error messages. However, I had to use detach (it was a lucky guess), because the old device was still showing in the FreeNAS GUI. It also didn't say the ZFS pool was healthy. After the detach, it was healthy! :)

paleoN said:
That's quite high, but initial estimates are unreliable & vary quite a bit. What's the average speed?

Yeah, I guess it was the maximum or something. It ended resilvering at 269 M/s.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
HHawk said:
However I had to use detach (was a lucky guess), because it was still showing it in the Freenas GUI. Also it didn't say the ZFS Pool was healthy. After detach, it was healthy! :)

A replace is supposed to be an attach followed by a detach once resilvered. Sometimes you do need to detach manually afterwards. Regardless, you got it done and the array wouldn't be considered healthy until the old device was removed.

HHawk said:
Yeah, I guess it was the maximum or something. It ended resilvering at 269 M/s.

I believe that's still fairly speedy. You can expect that number to drop somewhat as the array fills up and/or becomes more fragmented.
 

uutzinger

Dabbler
Joined
Nov 27, 2011
Messages
43
After getting the same smartd message discussed in this thread on my system, I replaced the disk and am now going through resilvering.

It does not seem to stop. I use FreeNAS-8.2.0-RELEASE-p1-x64.

The disks are still being accessed. It says it has resilvered 3.9TB, but the disk is only 3TB. Will it go through all 18TB (6 drives)? I checked the new drive with a long self-test using smartctl and found no errors.
I would like to avoid going to FreeNAS 8.3, as I have a custom 8.2 build (additional kernel modules) and a custom jail. Also, my disk controller firmware might have issues with 8.3.

Code:
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 30h7m, 100.00% done, 0h0m to go
....
          raidz2                                          DEGRADED     0     0     0
            gptid/d47e8934-2235-11e1-a4ff-002522d93db0    ONLINE       0     0     0
            gptid/d56d25c1-2235-11e1-a4ff-002522d93db0    ONLINE       0     0     0
            replacing                                     DEGRADED     0     0     0
              gptid/d639394d-2235-11e1-a4ff-002522d93db0  OFFLINE      0     0     0
              gptid/634fedd2-3ff7-11e2-9229-002522d93db0  ONLINE       0     0     0  3.94T resilvered
            gptid/d7234ff8-2235-11e1-a4ff-002522d93db0    ONLINE       0     0     0
            gptid/d7f35950-2235-11e1-a4ff-002522d93db0    ONLINE       0     0     0
            gptid/d8db1888-2235-11e1-a4ff-002522d93db0    ONLINE       0     0
 