Disk got strange name after switch

Status
Not open for further replies.

Stenull

Dabbler
Joined
Aug 22, 2011
Messages
45
Hi.
I switched disks a while ago, and after attaching the new disk it appears with a strange name.
The others have names like gpt/disk0, gpt/disk2 and so on. The second disk was originally named gpt/disk1, but since the switch it's named ada2p2.
I did the switch in the GUI.
I'm switching to bigger disks so I can expand.

Could someone explain this to me, and is it dangerous?

Here's the output of "zpool status Tank1":

Code:
  pool: Tank1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        Tank1          ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            ada2p2     ONLINE       0     0     4
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  ONLINE       0     0     0

errors: No known data errors


What more info do you need?
Thanks in advance!
/S
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Looks like you've got a pool that was created in a very old FreeNAS version, something like 8.0-RELEASE?

This is not dangerous. It is just the label of the disk; the gpt/* labels have been dropped in recent releases so as not to confuse the user.

I would recommend running:

# zpool clear Tank1
# zpool scrub Tank1

because you've got checksum errors on one of the disks.
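If you would rather check the counters from a script after the scrub than read them by eye, here is a minimal Python sketch. It only parses text in the layout shown in the first post; the function name `devices_with_errors` and the `SAMPLE` string are my own illustration, not anything shipped with FreeNAS or ZFS.

```python
# Sample text trimmed from the "zpool status Tank1" output in the first post.
SAMPLE = """\
  pool: Tank1
 state: ONLINE
config:

        NAME           STATE     READ WRITE CKSUM
        Tank1          ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            ada2p2     ONLINE       0     0     4
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  ONLINE       0     0     0

errors: No known data errors
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a non-zero counter.

    Counters are kept as strings, so an abbreviated value would still be
    reported as non-zero instead of failing an int() conversion.
    """
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            # Header row of the device table; data rows follow it.
            in_config = True
            continue
        if not in_config or len(fields) < 5:
            continue
        name, counters = fields[0], fields[2:5]
        if not all(c[0].isdigit() for c in counters):
            # Not a device row (for example the "errors:" footer).
            continue
        if any(c != "0" for c in counters):
            flagged[name] = tuple(counters)
    return flagged


print(devices_with_errors(SAMPLE))
```

On a live system the text could come from something like `subprocess.run(["zpool", "status", "Tank1"], capture_output=True, text=True).stdout` (Python 3.7 or later), but I have only tried the parser against the pasted sample.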
 

Stenull

Dabbler
Joined
Aug 22, 2011
Messages
45
Ah, OK, thanks.
So if I replace the other three disks one by one with themselves, they will all end up with ada2p2-style names?

And yes, I get checksum errors occasionally on that disk. I thought it had something to do with me doing something wrong during the replacement, hence the different name.
So I should probably replace the disk if this checksum error keeps coming back?
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Stenull said: "So if I replace the other three disks one by one with themselves, they will all end up with ada2p2-style names?"

Correct.

Stenull said: "So I should probably replace the disk if this checksum error keeps coming back?"

Yes, you should.
Take a look at the output of smartctl -a /dev/ada2; it might be helpful.
 

Stenull

Dabbler
Joined
Aug 22, 2011
Messages
45
I have been trying to analyze the SMART data, but I can't tell if it's good or bad...
Could you take a quick peek?

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar 5K3000
Device Model:     Hitachi HDS5C3020ALA632
Serial Number:    ML0220F312HURD
LU WWN Device Id: 5 000cca 369cf3bab
Firmware Version: ML6OA580
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Sep 18 15:09:56 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (21947) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   136   136   054    Pre-fail  Offline      -       94
  3 Spin_Up_Time            0x0007   133   133   024    Pre-fail  Always       -       410 (Average 412)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       733
  5 Reallocated_Sector_Ct   0x0033   079   079   005    Pre-fail  Always       -       539
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   146   146   020    Pre-fail  Offline      -       29
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       4910
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       39
192 Power-Off_Retract_Count 0x0032   095   095   000    Old_age   Always       -       6624
193 Load_Cycle_Count        0x0012   095   095   000    Old_age   Always       -       6624
194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       35 (Min/Max 22/44)
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       2776
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


There's a lot of "Old_age" and "Pre-fail" going on there.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
"I don't see anything wrong."
:confused: Are we looking at the same drive?

Stenull said: "Could you take a quick peek?"
I'm not that familiar with Hitachi disks, but unless they use Attribute #5 differently than everyone else you have problems. The disk has swapped out 539 bad sectors. I would have replaced the disk already.

If that number is growing at all it would explain your checksum errors and is a sign of impending disk failure. In which case, as William already said, replace the disk.
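To tell whether the number is growing, you can save the attribute table now and compare it with a later reading. Here is a rough Python sketch of that idea. The helper names `raw_values` and `growth` are my own, the sample rows are copied from the output above, and the "later" reading of 541 is made up purely to show the comparison.

```python
import re

# One row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Rows copied from the smartctl output posted above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   079   079   005    Pre-fail  Always       -       539
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       2776
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""


def raw_values(smart_text):
    """Map attribute ID to the leading integer of its RAW_VALUE column."""
    values = {}
    for line in smart_text.splitlines():
        match = ROW.match(line)
        if match:
            values[int(match.group(1))] = int(match.group(9))
    return values


def growth(old, new, ids=(5, 197, 198)):
    """Return {id: increase} for watched attributes whose raw value went up."""
    return {
        i: new[i] - old[i]
        for i in ids
        if i in old and i in new and new[i] > old[i]
    }


earlier = raw_values(SAMPLE)
later = dict(earlier)
later[5] = 541  # hypothetical later reading, not real data from this disk
print(growth(earlier, later))
```

Any non-empty result for #5, #197 or #198 between two readings would back up the case for replacing the disk.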
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
I might be wrong, but AFAIK the sectors counted in #5 were recoverable. I would be worried if they showed up under Current_Pending_Sector or Offline_Uncorrectable.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I'm no expert on SMART myself, but it's my understanding that #5 is simply a count of all the sectors that have been replaced with spares. Given that we are dealing with SMART, things can and do vary between manufacturers, disk models and firmware versions. Even if all the errors were recoverable, it's still a sign the disk is having problems. Rather than waiting until the disk can't keep up, or gets unlucky, I would look to replace it sooner. I'm likely being conservative about this, though.

#197 are questionable sectors that have failed a read. On a subsequent successful read, the drive will either swap the sector, incrementing #5, or even keep it in service. A successful write to the sector results in the same two cases. A failed write will cause the sector to be swapped out assuming there are free spares.
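To make those cases concrete, here is a toy Python model of how I understand the two counters to move. It is only a sketch of the description above, not how any particular firmware works. The function name `resolve_pending` and its parameters are invented for illustration, and two branches (a repeated failed read, and a remap with no spares left) are my assumptions, since I only described the other cases.

```python
def resolve_pending(reallocated, pending, *, write, succeeded,
                    drive_remaps=False, spares_free=True):
    """Return updated (#5, #197) counts after one access to a pending sector.

    write         True for a write access, False for a read.
    succeeded     whether that access succeeded.
    drive_remaps  on success, whether the drive decides to swap the sector
                  out anyway (the firmware's choice, modelled as an input).
    spares_free   whether the drive still has spare sectors.
    """
    if pending < 1:
        raise ValueError("no pending sector to resolve")

    if succeeded:
        # Successful read or write: remap, or keep the sector in service.
        remap = drive_remaps
    elif write:
        # Failed write: the sector is swapped out.
        remap = True
    else:
        # Assumption: another failed read leaves the sector pending.
        return reallocated, pending

    if not remap:
        # Sector kept in service, so it is no longer pending.
        return reallocated, pending - 1
    if not spares_free:
        # Assumption: with no spares left the sector stays pending.
        return reallocated, pending
    return reallocated + 1, pending - 1


# Starting from this disk's #5 value with one hypothetical pending sector:
print(resolve_pending(539, 1, write=True, succeeded=False))
```

The point of the model is simply that every remap shows up in #5, so a pending sector that gets resolved either disappears quietly or adds one to the reallocated count.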

#198 I have never been very clear on. I just know I don't want any.

#196 I am assuming isn't normalized and is vendor specific for this particular disk/firmware. Otherwise I would expect it to closely track #5.
 