11.1 U6 does not let me replace a faulty disk

Status
Not open for further replies.

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
Here is my situation:
I am running the latest stable release of 11.1.
I have a RAIDZ1 volume which consists of 6 WD Red Pro 4TB drives.
ada4 died on me a few days ago with a constantly lit drive status LED.
FreeNAS also started acting up after the drive failed: it stopped serving the GUI even though the server was still visible on the network.
I was going to replace the disk and resilver onto the new one as per the usual method: offlining the faulty disk, powering the server down, swapping the physical disk, and issuing the replace command to start the pool resilver.
However, the server does not boot normally unless I remove the dead disk, and I cannot offline the member disk because it shows as UNAVAIL.
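(For reference, the zpool-level steps behind that usual method would look roughly like this; MediaVolume is my pool, the first gptid is the failed member's, and the new disk's gptid is just a placeholder since the GUI normally handles partitioning the replacement:)

zpool status MediaVolume    # confirm which member is failing and note its gptid
zpool offline MediaVolume gptid/6b7192e8-03aa-11e6-85dc-d0509979a5b9
shutdown -p now             # power off and swap the physical disk
# after booting with the new disk installed and partitioned:
zpool replace MediaVolume gptid/6b7192e8-03aa-11e6-85dc-d0509979a5b9 gptid/<new-disk-gptid>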
I am stuck. Any suggestions?
Thank you.
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
Hi,
MOBO is an AsrockRack Mini ITX E3C236D2I
CPU is Core i3-4170
RAM 16GB
Drives are WD Red Pro (SATA) x6, connected directly to the motherboard

zpool status response is as follows:
pool: MediaVolume
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://illumos.org/msg/ZFS-8000-2Q
scan: scrub repaired 309M in 1 days 15:52:17 with 0 errors on Tue Sep 4 15:52:18 2018
config:

NAME                                            STATE     READ WRITE CKSUM
MediaVolume                                     DEGRADED     0     0     0
  raidz1-0                                      DEGRADED     0     0     0
    gptid/693426d5-03aa-11e6-85dc-d0509979a5b9  ONLINE       0     0     0
    gptid/69c5202b-03aa-11e6-85dc-d0509979a5b9  ONLINE       0     0     0
    gptid/6a584a23-03aa-11e6-85dc-d0509979a5b9  ONLINE       0     0     0
    gptid/6ae6e19d-03aa-11e6-85dc-d0509979a5b9  ONLINE       0     0     0
    3790711850920324544                         UNAVAIL      0     0     0  was /dev/gptid/6b7192e8-03aa-11e6-85dc-d0509979a5b9
    gptid/6bfa3740-03aa-11e6-85dc-d0509979a5b9  ONLINE       0     0     0

errors: No known data errors

pool: freenas-boot
state: ONLINE
scan: scrub repaired 0 in 0 days 00:04:48 with 0 errors on Thu Sep 20 03:49:48 2018
config:

NAME          STATE     READ WRITE CKSUM
freenas-boot  ONLINE       0     0     0
  da0p2       ONLINE       0     0     0

errors: No known data errors

Thank you.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Your pool's data is thankfully healthy, but the pool is in a dangerous state (no redundancy left) since RAIDZ1 was used rather than RAIDZ2.

If you have no OFFLINE option and the drive is showing UNAVAIL, then you should be able to boot with your replacement drive present and the REPLACE option should be there.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Previous threads on this issue have had people resolve it by rebooting their server, at which point the spare drive will become visible in the GUI. I assume you've tried that?

I'm hesitant to recommend forcing it via the CLI.
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
Hi,
I also came across the accounts you mentioned, and I expected the same thing to happen.
My server has 6 drive bays, which were all occupied by the 6 active drives of the pool.
I am assuming I would need one more drive bay with an uninitialized drive as a spare so that I could add it in as a replacement?
Then again, I am not very deep into FreeBSD, or FreeNAS for that matter.
And yes, I have rebooted three times with the same hope :smile:
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
You shouldn't need any additional bays.

Is your additional drive recognized by the system? If you try to create a new pool in the Volume Manager, does it show one 4TB drive present? Does it respond to SMART commands?
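(From a shell, a quick sanity check would look something like this; /dev/ada4 is only a guess at where the new disk lands:)

camcontrol devlist       # does the new disk show up on the SATA bus at all?
geom disk list           # does the kernel expose it as a disk device?
smartctl -i /dev/ada4    # identify info - confirms the drive answers SMART commands
smartctl -H /dev/ada4    # overall SMART health self-assessment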

You can try to export (note: not "delete"!) and then re-import your pool, to see if that makes the FreeNAS GUI figure out "oh, this drive is just completely missing" and give you the option of replacing it with the fresh drive.

If the export/import doesn't work, I'd then say it's time to hit the command-line and try to punt the offending drive from the pool manually.
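(If it does come to the command line, the manual route would be along these lines; the numeric GUID is the one zpool status shows for the UNAVAIL member, and the new partition's gptid is a placeholder - the new disk would first need a GPT layout with a freebsd-zfs partition, which the GUI normally creates:)

# swap the missing member, referenced by its GUID, for the new disk's ZFS partition
zpool replace MediaVolume 3790711850920324544 gptid/<new-disk-gptid>
zpool status MediaVolume    # confirm the resilver has started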
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
I have just discovered that the SMART daemon was not running; it was among the Alert System warnings as: WARNING: Sept. 24, 2018, 9:57 p.m. - smartd_daemon is not running.
I tried to start the service manually by clicking the START NOW button on the Services page, but it did not respond at all.
The new drive is actually a 6TB version of the same family. This should have worked regardless.
The new disk does not appear as an available disk on the Volume Manager pages.
I don't know how to export a pool. Where is this done?
Thanks.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Exporting a pool is done through the Volume Manager - it's the scary looking button with the big X at the bottom of the window when you select the root pool, labeled "Detach volume."

Make absolutely certain that you do not check the option labeled "Mark the disks as new (destroy data)" as it will do exactly as it says it will and destroy your data!

Once that's done, you can hit the "Import Volume" button at the top of the window, and try re-importing the pool.
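(For reference, those two buttons map roughly onto the following zpool operations; on FreeNAS you want the GUI to do this so its config database stays in sync, so this is only to show what is happening underneath:)

zpool export MediaVolume    # what "Detach Volume" (without destroying data) boils down to
zpool import                # list pools available for import
zpool import MediaVolume    # re-import the pool by name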
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
This is what the process returned:

Environment:

Software Version: FreeNAS-11.1-U6 (caffd76fa)
Request Method: POST
Request URL: http://192.168.1.103/storage/detach/1/


Traceback:
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
42. response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _legacy_get_response
249. response = self._get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
178. response = middleware_method(request, callback, callback_args, callback_kwargs)
File "./freenasUI/freeadmin/middleware.py" in process_view
162. return login_required(view_func)(request, *view_args, **view_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
23. return view_func(request, *args, **kwargs)
File "./freenasUI/storage/views.py" in volume_detach
727. cascade=form.cleaned_data.get('cascade', True))
File "./freenasUI/storage/models.py" in delete
439. n.start(svc)
File "./freenasUI/middleware/notifier.py" in start
202. return c.call('service.start', what, {'onetime': onetime}, **kwargs)
File "./freenasUI/middleware/notifier.py" in start
202. return c.call('service.start', what, {'onetime': onetime}, **kwargs)
File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py" in call
429. raise CallTimeout("Call timeout")

Exception Type: CallTimeout at /storage/detach/1/
Exception Value: Call timeout
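(As far as I can tell, the CallTimeout means the GUI's request to the middleware daemon never got an answer. Assuming the stock rc script name on 11.1, I could presumably check it from the console or SSH with something like:)

service middlewared status     # is the middleware daemon running at all?
service middlewared restart    # restart it, then retry the detach from the GUI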
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
I have re-imported the volume and the same thing happened: FreeNAS does not see the new disk.
Tomorrow I will check the physical connection of the ada4 bay.
Thank you very much.
 

tbingel

Dabbler
Joined
Aug 17, 2011
Messages
13
It turned out that the brand-new replacement disk was defective.
I popped in another one and the pool is resilvering now.
Thank you.
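(I'm keeping an eye on the resilver from the shell with something like:)

zpool status MediaVolume    # shows resilver progress and the estimated time remaining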
 
