The boot volume state is DEGRADED - what is the best way to fix this

Status
Not open for further replies.

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
So after I updated FreeNAS I got the following message:
Code:
The boot volume state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.

my boot pool looks like this:
Code:
  pool: freenas-boot
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
	repaired.
  scan: scrub repaired 0 in 0h7m with 0 errors on Wed Sep 28 03:52:37 2016
config:

	NAME                                            STATE     READ WRITE CKSUM
	freenas-boot                                    DEGRADED     0     0     0
	  mirror-0                                      DEGRADED     0     0     0
	    gptid/4efb264a-21cd-11e0-9bf5-3cd92b02910c  ONLINE       0     0     0
	    da0p2

I think that `da0p2` is the thumb drive on my mainboard (there is one USB slot on the mainboard), but to be honest I am not 100% sure, so this is what `geom disk list` gave me:
Code:
Geom name: da0
Providers:
1. Name: da0
   Mediasize: 15504900096 (14G)
   Sectorsize: 512
   Mode: r0w0e0
   descr: Patriot Memory
   ident: 07013833DCB2DE54
   fwsectors: 63
   fwheads: 255

So I guess that this flash drive is just broken and I have to replace it.
Can I just get another one (unformatted), plug it in, and that's it? (Maybe do a scrub?) What is the way to go here?
Also, is there an easy way to take a snapshot of my boot pool and store it temporarily on another pool? It would be very bad if the other drive also failed while I am waiting for my new thumb drive to arrive.
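
For picking the affected member out of longer `zpool status` output, a small filter can help. This is only a sketch: the function name `unhealthy_members` is made up for illustration, and it assumes the usual layout in which the device table starts at the `NAME STATE ...` header row and ends at the next blank line.

```shell
# Print every row of the `zpool status` device table whose STATE is not ONLINE.
# Reads the status text on stdin. `unhealthy_members` is a hypothetical helper name.
unhealthy_members() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # A blank line marks the end of the device table.
        in_config && NF == 0 { in_config = 0 }
        # Report rows that are not ONLINE, including rows with no state column.
        in_config && $2 != "ONLINE" {
            print $1, ($2 == "" ? "(no state shown)" : $2)
        }
    '
}
```

On the NAS it could be used as `zpool status freenas-boot | unhealthy_members`. Note that the pool and the mirror vdev are listed as well, because a degraded member makes its parents DEGRADED too.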
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
If your thumb drive has an activity light,

dd if=/dev/da0 of=/dev/null

will very quickly tell you which one to pull/replace.
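
As written, the `dd` command above reads the whole 14G device before it stops. A bounded variant might look like the sketch below. The function name `blink_drive` and the 512 MiB default are assumptions for illustration; the command only reads from the device and discards the data.

```shell
# Read a limited amount from a device so that its activity light blinks, then stop.
# $1: device path, e.g. /dev/da0 as in the post above
# $2: number of MiB to read (hypothetical default: 512)
blink_drive() {
    dev=$1
    mib=${2:-512}
    # bs is given in bytes (1 MiB) so the same line works with BSD and GNU dd.
    dd if="$dev" of=/dev/null bs=1048576 count="$mib"
}
```

Example: `blink_drive /dev/da0 1024` reads the first 1024 MiB of `da0`.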
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
So I guess that this flash drive is just broken and I have to replace it.
Your guess is correct.
Can I just get another one (unformatted), plug it in, and that's it? (Maybe do a scrub?) What is the way to go here?
The manual has easy-to-follow instructions for replacing a failed drive.
Also, is there an easy way to take a snapshot of my boot pool and store it temporarily on another pool?
Just save a copy of your configuration onto another machine on your LAN BEFORE doing anything.
It would be very bad if the other drive also failed while I am waiting for my new thumb drive to arrive.
No worries, your boot device is totally separate from your data, and your data is in no danger.
 

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
Well, I am talking about my FreeNAS boot pool, where one drive of mirror-0 is failing. So if the other one fails too, that would not be good.
Which manual are you referring to?

And unfortunately my flash drive has no status LED, but I am now pretty sure which one it is.
 

MisterIce

Explorer
Joined
May 21, 2016
Messages
87
Well, I am talking about my FreeNAS boot pool, where one drive of mirror-0 is failing. So if the other one fails too, that would not be good.
Which manual are you referring to?

And unfortunately my flash drive has no status LED, but I am now pretty sure which one it is.
This manual: http://doc.freenas.org/9.10/

And if your second boot drive failed, your FreeNAS would crash, yes, but that would not affect the data on any of your storage pools. The FreeNAS boot pool is not the same pool you use to store your data.
 

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
Thanks. Somehow, when I started my FreeNAS again (it does not run 24/7), the error was gone and it looks like the USB drive works now?
Code:
	NAME                                            STATE     READ WRITE CKSUM
	freenas-boot                                    ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/4efb264a-21cd-11e0-9bf5-3cd92b02910c  ONLINE       0     0     0
	    da0p2                                       ONLINE       0     0     0

What should I do now?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
If you take the wrong one out it won't boot, or at least not without error. So you will know you left the wrong one in. This is exceedingly unlikely to do any harm.
 

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
It should.

That's normal; ZFS errors clear when you reboot. If you do a scrub on the boot pool it will probably come back.

Well, I have scrubbed it several times now, but the error did not come back. Am I doing something wrong?

Is there any other downside to not running it 24/7, apart from not having regular S.M.A.R.T. tests and scrubs?
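
For reference, a scrub of the boot pool can be started with `zpool scrub freenas-boot` and its result read back with `zpool status freenas-boot`. The sketch below pulls the error count out of the `scan:` line. The function name `scrub_errors` is made up, and it assumes the line format shown earlier in this thread ("scrub repaired 0 in 0h7m with 0 errors on ..."); other ZFS versions may word that line differently.

```shell
# Print the error count from the "scan:" line of `zpool status` output on stdin.
# Prints nothing if no completed-scrub line in the expected format is found.
# `scrub_errors` is a hypothetical helper name.
scrub_errors() {
    sed -n 's/^[[:space:]]*scan: scrub repaired .* with \([0-9][0-9]*\) errors.*/\1/p'
}
```

Usage: `zpool status freenas-boot | scrub_errors`. A nonzero count, or nonzero READ/WRITE/CKSUM columns for `da0p2`, would be the thing to watch for after each scrub.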
 

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
Sorry to ask again, but the DEGRADED error still has not come back, so I am wondering what I can do besides a scrub to test whether the drive (flash drive) is OK?
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Sorry to ask again, but the DEGRADED error still has not come back, so I am wondering what I can do besides a scrub to test whether the drive (flash drive) is OK?

I don't really know the answer, so I'll say what I think and we will see if anyone contradicts it. I don't understand SSD SMART test reports, but I would do regular short SMART tests and worry if they say in so many words that the SSD is broken or the test has failed. I would also do regular scrubs (weekly if they don't take too long) and, if no new errors appear, assume that the SSD is good for some time yet.

(Edit: there seems to be some doubt about whether long SMART tests are meaningful with SSDs and whether they could increase 'wear'.)

Edit 2: So we are just talking about flash drives here? Then SMART tests aren't possible. I would save the config file and either replace the dubious drive or, if unsure which one it is, replace both with reputable makes and reinstall. It is hardly worth taking the risk of a problem recurring with such cheap and untestable devices.
 

zimon

Contributor
Joined
Jan 8, 2016
Messages
134
Yeah, it is a flash drive, but if it fails I still have the SSD. Well, I guess I will wait and see whether any further errors show up in the future and, if so, replace it :)
 